Certificate Program in Big Data Foundation + Engineering

Blended Learning | 6 Months (Inclusive of Project – 1 Month) | INR 1,18,000/-

Why Join the Certificate Program in Big Data Foundation + Engineering

Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions

Function effectively as an individual, and as a member or leader in diverse teams, in the corporate and research worlds of data science

Demonstrate knowledge and understanding of data science and apply it to one’s own work, as a member and leader in a team, to manage projects in multidisciplinary environments

Skills To Be Mastered

Java / Python / SQL / NoSQL / MapReduce / Tableau / R / Hive / Pig / YARN / Oozie

Program Structure

This program is designed to train you to use Hadoop as a data management tool, using languages such as R and Java. More than 70% of the program is hands-on learning of Apache projects such as Hive, Pig, YARN, and Oozie, along with the data visualisation platform Tableau, so that you become comfortable working with Hadoop. An introductory overview of Apache Spark also helps you complete the 1-month capstone project, in which you implement a Hadoop-based research idea in practice.

Program Duration

6 Months (Inclusive of Project - 1 Month)

Mode Of Delivery

Blended Learning

Skillville Certificate


Vidyalankar and Skillville


100% Placement assistance
Profile creation showcasing relevant skill sets
Aptitude, GD and Interview Training


6-Month Certificate Program in Big Data Foundation + Engineering

-Big data processing: value of big data, history, Hadoop development
-Cloud computing with AWS: EC2, S3

-Fundamentals of the Python programming language and setup on Windows
-Hadoop on a local Ubuntu host
-Setting up Hadoop, downloading Hadoop
-Setting up SSH
-Using Hadoop to calculate pi
-Configuring the pseudo-distributed mode
-Changing the base HDFS directory
-Formatting the NameNode
-Starting Hadoop
-Using HDFS
-WordCount, the Hello World of MapReduce
-Using Elastic MapReduce
-WordCount in EMR using the management console
-Comparison of local vs EMR Hadoop
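
To give a feel for the "calculate pi" exercise listed above, here is a minimal pure-Python sketch rather than actual Hadoop code. The example bundled with Hadoop estimates pi with a quasi-Monte Carlo method; this sketch assumes plain pseudo-random sampling instead, and the names `map_task` and `reduce_task` are hypothetical. Each simulated map task counts random points that fall inside a quarter circle, and the reduce step combines the partial counts.

```python
import random

def map_task(seed, samples):
    """One simulated map task: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # seeded so each task is reproducible
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside, samples

def reduce_task(partials):
    """Simulated reduce step: merge the partial counts into one estimate of pi."""
    inside = sum(i for i, _ in partials)
    total = sum(n for _, n in partials)
    # Area of quarter circle / area of unit square = pi / 4
    return 4.0 * inside / total

# Four hypothetical map tasks, each with its own seed and 10,000 samples
partials = [map_task(seed, 10_000) for seed in range(4)]
pi_estimate = reduce_task(partials)
print(pi_estimate)
```

Adding more map tasks or more samples per task should tighten the estimate, which is the same trade-off the Hadoop example exposes through its arguments.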

-Key Value pairs
-Hadoop Java API for MapReduce
-MapReduce program: setting up the classpath
-Implementing WordCount
-Building a JAR file
-Running WordCount on a local Hadoop cluster
-Running WordCount on EMR
-WordCount with a combiner
-Fixing WordCount to work with a combiner
-Hadoop-specific data types
-Using the Writable wrapper classes
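
The combiner topics above can be illustrated outside Hadoop. The following is a minimal pure-Python simulation, not the Hadoop Java API; all function names (`map_words`, `combine`, `reduce_sum`, `reduce_len`, `word_count`) are hypothetical. It shows one common pitfall: a reducer that counts its input values, instead of summing them, gives wrong totals once a combiner has pre-aggregated the mapper output.

```python
from collections import defaultdict

def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def group_by_key(pairs):
    """Simulated shuffle: collect all values that share a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def combine(pairs):
    """Combiner: pre-aggregate the output of a single mapper locally."""
    return [(key, sum(values)) for key, values in group_by_key(pairs).items()]

def reduce_sum(values):
    """Correct reducer: sums the values, so partial sums are handled."""
    return sum(values)

def reduce_len(values):
    """Buggy reducer: counts the values, correct only if every value is 1."""
    return len(values)

def word_count(splits, reducer, use_combiner):
    """Run the simulated job over input splits (one mapper per split)."""
    shuffled = []
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return {key: reducer(values) for key, values in group_by_key(shuffled).items()}

splits = [["the cat sat", "the cat"], ["the end"]]
correct = word_count(splits, reduce_sum, use_combiner=True)
buggy = word_count(splits, reduce_len, use_combiner=True)
print(correct)  # 'the' is counted 3 times
print(buggy)    # 'the' is undercounted as 2
```

The design point is that a combiner may run zero or more times, so the reduce logic has to give the same answer whether it sees raw `1`s or partial sums.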

-Using languages other than Java with Hadoop
-WordCount using Hadoop Streaming
-Analysing a large dataset
-Summarizing the UFO data
-Summarizing the shape data
-Correlating sighting duration to UFO shape
-Performing the shape/time analysis from the command line
-Using ChainMapper for field validation/analysis
-Using the distributed cache to improve location output
-Counters, status and other output: creating counters, task states and writing log output
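
As an illustration of using a language other than Java, here is a minimal sketch of a word count written in the style Hadoop Streaming expects: the mapper emits tab-separated key/value lines, and the reducer receives lines already sorted by key. The function names and the sample input are assumptions made for this sketch, and the sort is simulated locally with `sorted` rather than by a cluster.

```python
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"

def reducer(sorted_lines):
    """Sum the counts per word; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"

# Local simulation of map -> sort -> reduce on hypothetical sample input
sample = ["UFO sighting", "ufo report"]
mapped = sorted(mapper(sample))
result = list(reducer(mapped))
print(result)
```

In a real streaming job the two functions would typically live in separate scripts that read from standard input and write to standard output, with Hadoop performing the sort between them.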

-Simple, advanced, and in-between joins
-Reduce-side joins using MultipleInputs
-Graph algorithms
-Representing the graph
-Creating the source code
-The first run
-The second run
-The third run
-The fourth and last run
-Using language-independent data structures
-Getting and installing Avro
-Defining the schema
-Creating the source Avro data with Ruby
-Consuming the Avro data with Java
-Generating the shape summaries in MapReduce
-Examining the output data with Ruby
-Examining the output data with Java

-Hadoop node failure
-Killing a DataNode process
-The replication factor in action
-Intentionally causing missing blocks
-Killing a TaskTracker process
-Killing the JobTracker
-Killing the NameNode process
-Causing task failure
-Handling dirty data by using skip mode

– Browsing default properties
– Setting up a cluster
– Examining the default rack configuration
– Adding a rack awareness script
– Cluster access control
– Demonstrating the default security
– Managing the NameNode
– Adding an additional fsimage location
– Swapping to a new NameNode host
– Managing HDFS
– MapReduce management
– Changing job priorities and killing a job
– Scaling

– Overview of Hive
– Setting up Hive
– Installing Hive
– Creating a table for the UFO data
– Inserting the UFO data
– SAS user interface
– Validating the table
– Redefining the table with the correct column separator
– Creating a table with correct column separator
– Creating a table from an existing file
– Performing a join
– Using views
– Exporting query output
– Making a partitioned UFO sighting table
– Adding a new user-defined function (UDF)
– Hive on AWS-T
– Running UFO analysis on EMR

– Common data paths
– Installing and setting up MySQL
– Configuring MySQL to allow remote connections
– Setting up the employee database
– Getting data into Hadoop
– Exporting data from MySQL to HDFS
– Exporting data from MySQL into Hive

– AWS Services
– Getting web server data into Hadoop
– Introducing Apache Flume
– Installing and configuring Flume
– Capturing network traffic to a log file
– Logging to the console
– Capturing the output of a command in a flat file
– Capturing a remote file in a local flat file
– Writing network traffic onto HDFS
– Adding timestamps
– Multi-level Flume networks
– Writing to multiple sinks

– HBase
– Oozie
– Pig
– R for Analytics

– Introduction
– Installation of Tableau Desktop
– Tableau architecture
– Tableau server component
– Tableau Environment
– Tableau Workspace
– Building views in Tableau
– Connecting to a data source in Tableau
– Exporting a DB connection in Tableau
– Data blending in Tableau
– Joining tables in Tableau
– Data bins in Tableau
– Creating a dashboard in Tableau
– Tableau Desktop shortcuts cheat sheet

Admission Details


Engineering graduates and postgraduates: Computer Science, Information Technology; Electronics and Telecommunication, Electronics (with prerequisite test)
Science graduates (BSc IT, BCA) and diploma engineers in the above branches (with prerequisite test)
PhD candidates pursuing research in the above-mentioned domains

Fees Details: INR 1,18,000/-

Please consult your Admission Counselor for flexi-EMI options

Contact Us

Please reach out to the admission office if you have any queries


Our placement assistance program offers students one-on-one career counselling, and the chance to work with our corporate partners.

In case you drop out of the course due to a genuine reason, you will have 6 months’ time to return to it. If you fail to return to your course within this period, you will have to start afresh.

You can pay for your chosen course on the website by clicking the Fees tab. Payment is accepted via e-wallets, net banking, credit cards, debit cards, and NEFT/bank transfer.

A refund must be claimed before your batch commences. The application form fee is non-refundable. Skillville will deduct 20% of the program fees paid up to the date of the refund application as administrative charges; the remaining 80% will be refunded within 1 month of the refund being approved by an authorized Skillville representative.

You can catch up with a simultaneous batch in session, or reach out to the respective faculty member to cover the missed class.

Please contact your Admission Counselor or email your query to admissions@skillville.in.

Yes, the certificate will carry the following certification bodies: AIMA, AICTE, and SAS. The tools and technologies learnt will be listed on the certificate of completion along with the grade secured.
