Data Scientist

Cover the entire job spectrum from the smallest of startups to the largest of organizations. Gain an unbeatable skill set by learning data science with the leading analytical tools – SAS, R, Python and SQL. Whether you are a professional, a graduate or an analyst, rely on the leading analytics training institute to gain expertise in data science.

  • Training Type: Online
  • Course Duration: 6 weeks, 10 hrs/week
  • Price: $900

COURSES INCLUDED

  • Introduction to Data Science
  • Getting To Know Your Data
  • Overview of Tasks & Techniques: Prediction
  • Evaluation and Methodology of Data Science
  • Data Engineering
  • Overview of Tasks & Techniques: Probabilistic Models
  • Overview of Tasks & Techniques: Exploratory Data Mining
  • Case Studies in Data Science

Introduction to Data Science

  1. What is data science, and how does it relate to data mining, machine learning, big data and statistics?
  2. Motivating examples
  3. Why is it interesting?
  4. Several data science settings
  5. Introduction to the WEKA tool
  6. Practical information

Getting To Know Your Data

  1. From data to features
    • Interactive group discussion
    • Representing problems with matrices
    • Representing problems with relations
    • Example: Text with TF-IDF
  2. Computing simple
    • Boxplots
    • Scatterplots
    • Time series
    • Spatial data
  3. Case studies
    • X & Y examples
    • Medical data

Overview of Tasks & Techniques: Prediction

  1. The prediction task
    • Definition
    • Examples
    • Format of input / output data
  2. Prediction algorithms
    • Decision trees
    • Rule learners
    • Linear/logistic regression
    • Nearest neighbour learning
    • Support vector machines
  3. Properties of prediction algorithms and practical exercises
  4. Combining classifiers

Evaluation and Methodology of Data Science

  1. Experimental setup
    • Training, tuning, test data
    • Holdout method, cross-validation, bootstrap method
  2. Measuring performance of a model
    • Accuracy, ROC curves, precision-recall curves
    • Loss functions for regression
  3. Interpretation of results
    • Confidence interval for accuracy
    • Hypothesis tests for comparing models, algorithms

Data Engineering

  1. Attribute selection
    • Filter methods
    • Wrapper methods
  2. Data discretization
    • Unsupervised discretization
    • Supervised discretization
  3. Data transformations
    • PCA and variants
  4. Exercises

Overview of Tasks & Techniques: Probabilistic Models

  1. Introduction
    • Probabilities
    • Bayes' rule and conditional independence
  2. Naive Bayes
    • Application to spam filtering
  3. Bayesian Networks
    • Graphical representation
    • Independence and correlation
  4. Temporal models
    • Markov Chains
    • Hidden Markov Models

Overview of Tasks & Techniques: Exploratory Data Mining

  1. Introduction to Exploratory Data Mining
  2. Association discovery
    • What is association discovery?
    • What are the challenges?
    • In detail: Apriori
  3. Clustering
    • What is clustering?
    • What are the challenges?
    • In detail: agglomerative clustering
  4. Hands-on: clustering in WEKA

Case Studies in Data Science

  1. Eve, the Pharmaceutical Robot Scientist: Data Science for Drug Discovery
  2. Data science for sports analytics
  3. Data science for sensor data (Introduction to challenge)