Learn Hadoop, MapReduce & BigData

 03-Jun-2015
  •  9 Lessons
1115 Students
Modern companies estimate that only 12% of their accumulated data is analyzed, and IT professionals who are able to work with the remaining data are becoming increasingly valuable to companies. Big data talent requests are also up 40% in the past year. 

Simply put, there is too much data and not enough professionals to manage and analyze it. This course aims to close the gap by covering MapReduce and its most popular implementation: Apache Hadoop. We will also cover the Hadoop ecosystem and the practical concepts involved in handling very large data sets.

Learn and Master the Most Popular Big Data Technologies in this Comprehensive Course. 

  • Apache Hadoop and MapReduce on Amazon EMR
  • Hadoop Distributed File System vs. Google File System
  • Data Types, Readers, Writers and Splitters
  • Data Mining and Filtering
  • Shell Commands and HDFS
  • Cloudera, Hortonworks and Apache Bigtop Virtual Machines

Mastering Big Data for IT Professionals Worldwide

Broken down, Hadoop is an implementation of the MapReduce algorithm, which is used in Big Data to scale computations. A MapReduce job loads a block of data into RAM, performs some calculations, loads the next block, and keeps going until all of the data has been processed from unstructured data into structured data.
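
The block-by-block processing described above can be illustrated with a small word count, the usual first MapReduce example. This is only a single-process sketch in plain Python, not Hadoop code: the function names (`read_blocks`, `map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the `block_size` parameter are made up for illustration, and a real Hadoop job would distribute the blocks across cluster nodes rather than loop over them in one program.

```python
from collections import defaultdict
from itertools import islice


def read_blocks(lines, block_size):
    """Yield successive blocks (lists) of at most block_size lines."""
    iterator = iter(lines)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def map_phase(block):
    """Map: emit a (word, 1) pair for every word in the block."""
    for line in block:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, block_size=2):
    """Process the input one block at a time, then reduce the results."""
    pairs = []
    for block in read_blocks(lines, block_size):
        pairs.extend(map_phase(block))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical sample input: unstructured lines of text.
    sample = ["big data is big", "hadoop scales", "data data"]
    print(word_count(sample))
```

The unstructured input lines come out as a structured table of word counts. In Hadoop the same three roles (map, shuffle, reduce) are spread over many machines, which is what lets the computation scale.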

IT managers and Big Data professionals who know how to program in Java, are familiar with Linux, have access to an Amazon EMR account, and have a working installation of Oracle VirtualBox or VMware will be able to follow the key lessons and concepts in this course and learn to write Hadoop jobs and MapReduce programs.

This course is perfect for anyone in a data-focused IT job who wants to learn new ways to work with large amounts of data.


Contents and Overview 
In over 16 hours of content including 74 lectures, this course covers necessary Big Data terminology and the use of Hadoop and MapReduce.

This course covers the importance of Big Data, how to set up single-node Hadoop pseudo-clusters, the architecture of clusters, running multi-node clusters on Amazon EMR, and working with distributed file systems and their operations, including running Hadoop on the Hortonworks Sandbox and Cloudera.

Students will also learn advanced Hadoop development, MapReduce concepts, how to use MapReduce with Hive and Pig, and the Hadoop ecosystem, among other important lessons.

Upon completion, students will be literate in Big Data terminology, understand how Hadoop can be used to overcome challenging Big Data scenarios, be able to analyze and implement MapReduce workflows, and be able to use virtual machines for code development, testing, and job configuration.

Prerequisites

  • Familiarity with programming in Java.
  • Familiarity with Linux.
  • Oracle VirtualBox or VMware installed and functioning.

Syllabus

1st lesson

Introduction to Big Data

  • Introduction to course
  • Provisioning a VM with Vagrant and Puppet
  • Why Hadoop, Big Data and MapReduce Part - A
  • Why Hadoop, Big Data and MapReduce Part - B
  • Why Hadoop, Big Data and MapReduce Part - C
  • Architecture of Clusters
2nd lesson

Hadoop Architecture

  • Set up a single-node Hadoop pseudo-cluster Part - A
  • Set up a single-node Hadoop pseudo-cluster Part - B
  • Set up a single-node Hadoop pseudo-cluster Part - C
  • Clusters and Nodes, Hadoop Cluster Part - A
  • Clusters and Nodes, Hadoop Cluster Part - B
  • NameNode, Secondary NameNode, DataNodes Part - A
  • NameNode, Secondary NameNode, DataNodes Part - B
  • Running multi-node clusters on Amazon EMR Part - A
  • Running multi-node clusters on Amazon EMR Part - B
  • Running multi-node clusters on Amazon EMR Part - C
  • Running multi-node clusters on Amazon EMR Part - D
  • Running multi-node clusters on Amazon EMR Part - E
3rd lesson

DFS

  • HDFS vs. GFS: a comparison Part - A
  • HDFS vs. GFS: a comparison Part - B
  • Run Hadoop on Cloudera
  • Run Hadoop on Hortonworks
  • File system operations with the HDFS shell Part - A
  • File system operations with the HDFS shell Part - B
  • Advanced Hadoop development with Apache Bigtop Part - A
  • Advanced Hadoop development with Apache Bigtop Part - B
4th lesson

MapReduce Version 1

  • MapReduce Concepts in detail Part - A
  • MapReduce Concepts in detail Part - B
  • Job definition, configuration, submission, execution and monitoring Part - A
  • Job definition, configuration, submission, execution and monitoring Part - B
  • Job definition, configuration, submission, execution and monitoring Part - C
  • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part A
  • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part B
  • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part C
  • The ETL class, Definition, Extract, Transform, and Load Part - A
  • The ETL class, Definition, Extract, Transform, and Load Part - B
  • The ETL class, Definition, Extract, Transform, and Load Part - C
  • The UDF class, Definition, User Defined Functions Part - A
  • The UDF class, Definition, User Defined Functions Part - B
5th lesson

MapReduce with Hive (Data warehousing)

  • Schema design for a Data warehouse Part - A
  • Schema design for a Data warehouse Part - B
  • Hive Configuration
  • Hive Query Patterns Part - A
  • Hive Query Patterns Part - B
  • Hive Query Patterns Part - C
  • Example Hive ETL class Part - A
  • Example Hive ETL class Part - B
6th lesson

MapReduce with Pig (Parallel processing)

  • Introduction to Apache Pig Part - A
  • Introduction to Apache Pig Part - B
  • Introduction to Apache Pig Part - C
  • Introduction to Apache Pig Part - D
  • Pig LoadFunc and EvalFunc classes
  • Example Pig ETL class Part - A
  • Example Pig ETL class Part - B
7th lesson

The Hadoop Ecosystem

  • Introduction to Crunch Part - A
  • Introduction to Crunch Part - B
  • Introduction to Avro
  • Introduction to Mahout Part - A
  • Introduction to Mahout Part - B
  • Introduction to Mahout Part - C
8th lesson

MapReduce Version 2

  • Apache Hadoop 2 and YARN Part - A
  • Apache Hadoop 2 and YARN Part - B
  • YARN Examples
9th lesson

Putting it all together

  • Amazon EMR example Part - A
  • Amazon EMR example Part - B
  • Amazon EMR example Part - C
  • Amazon EMR example Part - D
  • Apache Bigtop example Part - A
  • Apache Bigtop example Part - B
  • Apache Bigtop example Part - D
  • Apache Bigtop example Part - E
  • Apache Bigtop example Part - F
  • Course Summary

Frequently asked questions

What am I going to get from this course?

  • Over 74 lectures and 15.5 hours of content!
  • Become literate in Big Data terminology and Hadoop.
  • Understand distributed file system architecture and implementations such as the Hadoop Distributed File System and the Google File System
  • Use the HDFS shell
  • Use the Cloudera, Hortonworks and Apache Bigtop virtual machines for Hadoop code development and testing
  • Configure, execute and monitor a Hadoop Job

What is the target audience?

  • Big Data professionals who want to Master MapReduce and Hadoop.
  • IT professionals and managers who want to understand and learn this hot new technology

Meet with your teachers

Eduonix Learning Solutions The Knowledge Edge
Eduonix creates and distributes high-quality technology training content. Our team of industry professionals has been training people for more than a decade. We aim to teach technology the way it is used in industry and the professional world. We have a professional team of trainers for technologies ranging from mobility and web to enterprise, database, and server administration.