Diploma in Data Science
According to International Data Corporation (IDC), a technology research company, the global data analytics and big data market is likely to reach USD 57 billion by 2020, growing at a CAGR of 23%. This also implies a widening gap between the demand for and supply of data scientists, including professionals in information management, software development, data analytics, artificial intelligence and data discovery. Hence, skills such as Python, R, SAS and machine learning have become the most sought after.
The Diploma in Data Science with R and Python is aimed at equipping IT professionals with know-how in the data science field. The course introduces machine learning as well as neural networks, artificial intelligence and business intelligence. It also helps IT enterprises implement machine learning and develop resources to handle more intellectual work.
Data science is an interdisciplinary field that uses scientific methods, processes and systems to extract knowledge or insights from data in various forms. This program equips the learner with the conceptual and technical skills required for top positions in the analytics industry. The program introduces the learner to business analytics using in-demand analytics technologies such as R, GitHub and Python, and teaches the implementation of data science concepts such as data exploration, visualization and hypothesis testing. Special focus is placed on machine learning techniques used for regression, classification and clustering.
Duration: 6 months (5 months + 1 month off-site project) at 4 hours daily (20 hours per week), or 11 months (10 months + 1 month off-site project) on weekends at 10 hours per week.
Course 1: Introduction to Data Science – 8 Hours
Description: This course gives an overview of the basic concepts of data and the tools that data analysts work with. It covers various methods of obtaining data in various formats, as well as the basics of cleaning and visualizing data.
- Basics of Data
- What is Data Science?
- Data Science and Ethical Issues
- Big Data and Data Science Hype, Datafication
- Understand Data Science Pipeline – Data Wrangling, Exploratory Analysis, Modeling
- Getting and Cleaning Data
- Visualising the Data
- The Data Scientist’s Toolbox
- Applications of Data Science in Business and Industry
- Case Study
Course 2: Statistical Inference – 24 hours
Description: This course presents the fundamentals of statistical inference, helping the learner understand the process of drawing conclusions about populations or scientific truths from data using different modes of inference.
- Introduction to statistics
- Distance Measures – Euclidean, Manhattan, Mahalanobis
- Correlation, Regression
- Hypothesis testing
- Case Study
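As an illustration of the distance measures listed above, here is a minimal sketch in plain Python. The function names are hypothetical, and the Mahalanobis version is restricted to two dimensions with a caller-supplied 2x2 covariance matrix to keep the example short; it is not taken from the course material.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance for 2-D points.

    cov is an assumed 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = p[0] - q[0], p[1] - q[1]
    quad = dx * (inv[0][0] * dx + inv[0][1] * dy) \
        + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(quad)


if __name__ == "__main__":
    print(euclidean((0, 0), (3, 4)))
    print(manhattan((0, 0), (3, 4)))
    print(mahalanobis_2d((2, 0), (0, 0), [[4, 0], [0, 1]]))
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance, which is a quick sanity check for the implementation.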
Course 3: R Programming – 32 Hours
Description: This course provides an in-depth understanding of R, RStudio and R packages. Learners will learn how to program in R with various types of functions and data structures, and how to perform data visualization using the graphics available in R. The course also covers GitHub and how to work with it.
- Introduction to R
- Install R, RStudio, R Package
- Operators in R
- Loops in R
- R Functions
- R Data Structure
- knitr, RPub, R Markdown, swirl, ggplot2
- Introduction to GitHub
- Install GitHub
- Creating GitHub repository
- GitHub Commands
- Demos, sample examples
- Data Science with R programming
Course 4: Python Programming – 48 Hours
Description: This course helps the learner understand the essential concepts of Python programming, such as data types, basic operators and functions. Learners will perform high-level computing using the NumPy and SciPy packages, along with the Pandas package for data analysis and manipulation. Learners will gain expertise in machine learning using the scikit-learn package, the matplotlib library for data visualization, and BeautifulSoup for web scraping.
- Introduction to Python
- Install Python
- Basic types of Python
- Operators and Functions of Python
- Computation with Python – NumPy, SciPy
- Data Manipulation in Python – Pandas
- Understanding DataFrame
- Data Visualisation in Python – matplotlib
- Introduction to Scikit – Machine learning
- Web Scraping in Python – BeautifulSoup
- Integration using PySpark, Hadoop, MapReduce
- Demos, sample examples
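To give a flavour of the basic types, operators and functions covered in this course, the sketch below groups a few records and averages a value per group using only the standard library. The sample records and the helper name `mean_by_key` are made up for illustration; in practice this kind of grouping would typically be done with Pandas.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: a list of dictionaries, one per record.
records = [
    {"city": "Pune", "sales": 120.0},
    {"city": "Mumbai", "sales": 200.0},
    {"city": "Pune", "sales": 80.0},
]


def mean_by_key(rows, key, value):
    """Group rows by `key` and return the mean of `value` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: mean(v) for k, v in groups.items()}


summary = mean_by_key(records, "city", "sales")

if __name__ == "__main__":
    for city, avg in summary.items():
        print(city, avg)
```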
Course 5: Machine Learning – 160 Hours
Description: This course covers Data Processing, EDA, Regression (Linear Regression, Support Vector Machine, Decision Tree, Random Forest), Classification (Logistic Regression, K-NN, SVM, Naïve Bayes, Decision Tree, Random Forest), Clustering (K-Means, Hierarchical), Association Rule Mining (Eclat), Dimensionality Reduction, and Model Selection & Boosting.
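Linear regression, the first regression technique named above, can be sketched for a single feature with ordinary least squares. This is a minimal plain-Python illustration under the assumption of one input variable; the function name `fit_line` and the sample data are hypothetical, and a real project would more likely rely on a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept, predict(5, slope, intercept))
```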
Course 6: Business Intelligence Tools – 16 Hours
Description: This course introduces advanced BI tools such as Power BI or Qlik.
Course 7: Natural Language Processing – 40 Hours
Description: This course covers how to build a spam detector, how to build a Twitter sentiment analyzer, NLTK, and Latent Semantic Analysis.
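A spam detector of the kind mentioned above is often built with a Naïve Bayes classifier over word counts. The sketch below is one possible minimal version in plain Python with add-one (Laplace) smoothing; the training messages, function names and whitespace tokenizer are assumptions made for illustration, and the course may well use NLTK or another library instead.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training set.
model = train([
    ("win a free prize now", "spam"),
    ("free cash offer", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch with the team", "ham"),
])

if __name__ == "__main__":
    print(classify("free prize", model))
    print(classify("team meeting", model))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word that appears in only one class from driving the other class's probability to zero.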
Course 8: Neural Networks – 80 Hours
Description: This course covers Artificial Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Boltzmann Machines, and TensorFlow with Python.
Course 9: Case Studies and Capstone Project – Artificial Intelligence, Automation – 150 Hours (including notional hours)
Program Content Form: Video lectures, presentations, interactive content, industry projects, GitHub, text reading
Eligibility: Fresh graduates with good numerical and analytical skills and a minimum of 65% in the graduate exam.