Course Outline

Introduction to Machine Learning

  • Types of machine learning – supervised vs unsupervised.
  • From statistical learning to machine learning.
  • The data mining workflow: business understanding, data preparation, modeling, deployment.
  • Choosing the right algorithm for the task.
  • Overfitting and the bias-variance tradeoff.

Overview of Python and ML Libraries

  • Why use programming languages for ML.
  • Choosing between R and Python.
  • Python crash course and Jupyter Notebooks.
  • Python libraries: pandas, NumPy, scikit-learn, matplotlib, seaborn.

Testing and Evaluating ML Algorithms

  • Generalization, overfitting, and model validation.
  • Evaluation strategies: holdout, cross-validation, bootstrapping.
  • Metrics for regression: ME, MSE, RMSE, MAPE.
  • Metrics for classification: accuracy, the confusion matrix, and handling imbalanced classes.
  • Model performance visualization: profit curve, ROC curve, lift curve.
  • Model selection and grid search for tuning.
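
The regression metrics listed above can be sketched in plain Python. This is a minimal illustration, not course material: the function name `regression_metrics` and the sample values are assumptions, and ME is taken here as the mean of (actual minus predicted), a sign convention that varies between texts.

```python
import math


def regression_metrics(actual, predicted):
    """Return ME, MSE, RMSE and MAPE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    me = sum(errors) / n                    # mean error (signed bias)
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target
    # MAPE is undefined when an actual value is zero (division by zero).
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"ME": me, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical actual and predicted values.
metrics = regression_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(metrics)
```

In this sample the positive and negative errors cancel, so ME is zero while RMSE is not, which is one reason ME alone is a weak measure of accuracy.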

Data Preparation

  • Data import and storage in Python.
  • Exploratory analysis and summary statistics.
  • Handling missing values and outliers.
  • Standardization, normalization, and transformation.
  • Qualitative data recoding and data wrangling with pandas.
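
To make the difference between standardization and normalization concrete, here is a small sketch using only the standard library. The helper names and the `ages` sample are illustrative assumptions; standardization here uses the population standard deviation, and normalization is taken to mean min-max scaling to the range 0 to 1.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant column")
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 30, 40, 50, 60]  # hypothetical feature column
z_scores = standardize(ages)
scaled = min_max_normalize(ages)
print(z_scores)
print(scaled)
```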

Classification Algorithms

  • Binary vs multiclass classification.
  • Logistic regression and discriminant functions.
  • Naïve Bayes, k-nearest neighbors.
  • Decision trees and tree ensembles: CART, random forests, bagging, boosting, XGBoost.
  • Support Vector Machines and kernels.
  • Ensemble learning techniques.

Regression and Numerical Prediction

  • Least squares and variable selection.
  • Regularization methods: L1, L2.
  • Polynomial regression and nonlinear models.
  • Regression trees and splines.

Unsupervised Learning

  • Clustering techniques: k-means, k-medoids, hierarchical clustering, SOMs.
  • Dimensionality reduction: PCA, factor analysis, SVD.
  • Multidimensional scaling.

Text Mining

  • Text preprocessing and tokenization.
  • Bag-of-words, stemming, and lemmatization.
  • Sentiment analysis and word frequency.
  • Visualizing text data with word clouds.
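
As a rough sketch of tokenization and the bag-of-words representation mentioned above, the following builds a document-term count matrix with the standard library only. The regular expression, the function names, and the two sample sentences are assumptions for illustration; stemming and lemmatization are not shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical documents.
docs = ["The course was great, really great.", "The pace was slow."]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)
print(matrix)
```

Word order is discarded in this representation; only the per-document term counts remain.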

Recommendation Systems

  • User-based and item-based collaborative filtering.
  • Designing and evaluating recommendation engines.
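
One simple form of user-based collaborative filtering can be sketched as follows: measure how similar two users are on the items both have rated, then predict a missing rating as a similarity-weighted average of other users' ratings. The `ratings` data, the function names, and the choice of cosine similarity are all assumptions made for this illustration.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        numerator += sim * their_ratings[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


prediction = predict(ratings, "ann", "D")
print(prediction)
```

The function returns `None` when no other user has rated the item, which is a small-scale version of the cold-start problem.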

Association Pattern Mining

  • Frequent itemsets and the Apriori algorithm.
  • Market basket analysis and lift ratio.
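
The lift ratio used in market basket analysis compares how often two itemsets occur together with how often they would if they were independent. The sketch below computes support and lift by direct counting; the `transactions` data and function names are illustrative assumptions, and the Apriori candidate-pruning step is not implemented.

```python
# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """support(A and B) / (support(A) * support(B)); 1.0 means independence."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / (
        support(antecedent, transactions) * support(consequent, transactions)
    )


bread_milk_lift = lift({"bread"}, {"milk"}, transactions)
print(bread_milk_lift)
```

A lift above 1 suggests the items appear together more often than chance would predict, and a lift below 1 suggests the opposite.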

Outlier Detection

  • Extreme value analysis.
  • Distance-based and density-based methods.
  • Outlier detection in high-dimensional data.
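
As one simple example of extreme value analysis, the sketch below flags points outside the interquartile-range fences (often called Tukey's rule). The function name, the multiplier `k=1.5`, and the sample data are assumptions; distance-based and density-based methods are not shown.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one obvious extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)
print(outliers)
```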

Machine Learning Case Study

  • Understanding the business problem.
  • Data preprocessing and feature engineering.
  • Model selection and parameter tuning.
  • Evaluation and presentation of findings.
  • Deployment.

Summary and Next Steps

Requirements

  • Basic understanding of statistics and linear algebra.
  • Familiarity with data analysis or business intelligence concepts.
  • Some exposure to programming (preferably Python or R) is recommended.
  • Interest in learning applied machine learning for data-driven projects.

Audience

  • Data analysts and scientists.
  • Statisticians and research professionals.
  • Developers and IT professionals exploring machine learning tools.
  • Anyone involved in data science or predictive analytics projects.

Duration

  • 21 hours.
