HADOOP DEVELOPER

Overview:
Hadoop is a distributed computing framework that runs on clusters of commodity hardware and processes data at a scale and speed that traditional database systems find hard to match. As a result, there is strong demand for Hadoop developers who can deploy Hadoop at large scale.

A Hadoop developer is responsible for the actual coding of Hadoop applications. The role involves writing programs according to the system design, so a sound knowledge of programming is required. The work is similar to that of a software developer, but in the Big Data domain. It also includes analyzing problems and devising solutions, design and architecture work, and strong documentation skills.

Training Objectives of Hadoop Developer
The Hadoop Developer online training equips you with the skills needed to take the Professional Hadoop Developer Cloudera Certification. This Hadoop certification training is your passport to the most sought-after jobs in the Big Data world. It gives you a detailed understanding of Big Data and Hadoop. Topics include an introduction to the Hadoop ecosystem and an understanding of HDFS and MapReduce, including MapReduce abstractions. You will also learn to install and work with Hadoop components such as Pig, Hive, Flume, Sqoop and YARN.

Target Students / Prerequisites:
Anyone interested in Hadoop, HDFS and MapReduce.
For those who want to learn MapReduce programming, some object-oriented programming (OOP) knowledge is assumed.
Basic database knowledge (such as joins and normalization) is assumed.

Course Content

Hadoop Architecture
Introduction to
Parallel Computing vs. Distributed Computing
How to install Hadoop on your system
How to install Hadoop cluster on multiple
Hadoop Daemons introduction: NameNode, DataNode, JobTracker, TaskTracker
Exploring HDFS (Hadoop Distributed File System)
Exploring the HDFS Apache Web UI
NameNode architecture (EditLog, FsImage, location of replicas)
Secondary NameNode architecture
DataNode architecture

MapReduce Architecture
Exploring JobTracker/TaskTracker
How a client submits a MapReduce job
Exploring Mapper/Reducer/Combiner
Shuffle: Sort & Partition
Input/output formats
Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)
Exploring the Apache MapReduce Web UI
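
The mapper, shuffle (sort and group), and reducer stages listed above can be illustrated with a small in-memory sketch. This is not the Hadoop API: it is a plain Python simulation of a word count, and the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical names chosen for illustration.

```python
from itertools import groupby


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Sort the intermediate pairs by key and group the values per key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    for key, group in groupby(ordered, key=lambda kv: kv[0]):
        yield key, [value for _, value in group]


def reducer(key, values):
    """Sum all counts emitted for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in a single process."""
    intermediate = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(intermediate))
```

On a real cluster each stage runs on different machines and the shuffle moves data over the network; here everything happens in one process only to show the data flow.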

Hadoop Developer Tasks
Sorting in HDFS
Writing a MapReduce program
Reading and writing data using Java
Hadoop Eclipse integration
Mapper in detail
Reducer in detail
Using Combiners
Reducing Intermediate Data with Combiners
Writing Partitioners for Better Load Balancing
Searching in HDFS
Indexing in HDFS
Hands-On Exercise
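
As a rough illustration of the combiner and partitioner topics above, the following sketch shows how map-side pre-aggregation shrinks the intermediate data and how a hash-based partitioner assigns each key to a reducer. It is a plain Python simulation, not Hadoop code: the names `combine`, `partition`, and `route` are hypothetical, and CRC32 is used here only as a stand-in for a stable hash function.

```python
import zlib
from collections import Counter


def combine(pairs):
    """Pre-aggregate (word, count) pairs on the map side, like a combiner."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def partition(key, num_reducers):
    """Pick a reducer index for a key using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def route(pairs, num_reducers):
    """Combine the pairs, then place each result in its reducer's bucket."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in combine(pairs):
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets
```

Because every occurrence of a key hashes to the same index, all values for that key reach the same reducer. A custom partitioner would replace `partition` with logic that spreads heavily used keys more evenly.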

Hadoop Administrative Tasks
Routine Administrative Procedures
Understanding dfsadmin and mradmin
Block Scanner, Balancer
Health Check & Safe mode
DataNode commissioning/decommissioning
Monitoring and Debugging on a production cluster
NameNode Backup and Recovery
ACL (Access Control List)
Upgrading Hadoop

HBase Architecture
Introduction to HBase
HBase vs. RDBMS
Exploring HBase Master & region server
Column Families and Regions
Basic HBase shell commands

Hive Architecture
Introduction to Hive
HBase vs. Hive
Installation of Hive
HQL (Hive query language)
Basic Hive commands

Pig Architecture
Introduction to Pig
Installation of Pig on your system
Basic Pig commands
Hands-On Exercise

Sqoop Architecture
Introduction to Sqoop
Installation of Sqoop on your system
Import/Export data from RDBMS to HDFS
Import/Export data from RDBMS to HBase
Import/Export data from RDBMS to Hive
Hands-On Exercise

Mini Project / POC (Proof of Concept)
Facebook-Hive POC
Uses of Hadoop/Hive at Facebook
Static & dynamic partitioning
UDF (user-defined functions)