Getting started with Machine Learning

What are Computer Science, Artificial Intelligence and Machine Learning?

Different types of Machine Learning

Introduction to Python and Jupyter

NumPy, Pandas, Scikit-learn and Matplotlib

Data Scraping

Data scraping: City dataset from Versus.com

Data scraping: Phone dataset from Versus.com

Data scraping: KDnuggets.com’s post statistics

Data scraping: Android App Dataset from Google Play Store

Data Cleaning

Data Cleaning case study: Google Play Store Dataset

Preparatory Phase

Splitting data into a Training set and a Validation set

Imbalanced Learning: sampling techniques

Exploratory Data Analysis

QQ plot versus PP plot versus Probability plot

Multicollinearity (or Collinearity)

Does Causality imply Correlation?

A survey of correlation analysis methods

Subgroup Discovery: Beyond coverage and mean-shift

On ensuring fairness: Statistical parity vs Causal graphs

Feature Engineering

How to deal with missing values (NaNs)

Feature Selection with sklearn

When to do feature centering, scaling and normalization?

How to convert Categorical Variables to Numerical Variables?

Regression Models

Linear Regression

Introduction to Linear Regression

How to make a Linear Regressor? (theory)

Confidence Intervals for Coefficients

Evaluation

Regression Objective and Evaluation Functions

Classification Models

Logistic Regression

An overview of Logistic Regression

Other algorithms

Naive Bayes classifier: a comprehensive guide

Information Gain, Gain Ratio and Gini Index

Ensemble: Bagging, Random Forest, Boosting and Stacking

Evaluation

How to read the Confusion Matrix

ROC curve and AUC: a comprehensive overview

Precision-Recall curve: an overview

Binary Classification Evaluation Summary

Reinforcement Learning

Q Learning, Deep Q Learning introduction with TensorFlow

Reinforcement Learning for the Join Optimization problem in Databases

Deep Learning

Sigmoid, tanh activations and their loss of popularity

ELU activation: A comprehensive analysis

Batch Normalization and why it works

Attention in Deep Learning, your starting point (with code)

The Transformer neural network architecture

Deep Learning normalization methods

A review of pre-trained language models: from BERT, RoBERTa, to ELECTRA, DeBERTa, BigBird, and more

Some thoughts regarding Deep Learning’s achievements, hype, and challenges

Conversational AI: ChatGPT alternatives and their advantages

Advanced (yet simple) Prompt Engineering techniques for Large Language Models

Is ChatGPT as good as humans? A study in the field of programming education

Data Visualization

Charts to show relationships between variables

Charts to compare different samples

Coronavirus cases: an interactive geo-map with Plotly

Text Mining

Optimal aligning – Needleman-Wunsch Algorithm

Shift-AND algorithm for exact pattern matching

Bit-parallel algorithms for generalized string matching

Statistics

Z-score, Z-statistic, Z-test, Z-distribution

T-statistic, T-test and the T-family

Paired Two-sample T-test (Dependent T-test)

Unpaired Two-sample T-test (Independent T-test)

Miscellaneous

Over-fitting and Under-fitting

Parametric vs Non-parametric algorithms

How Intelligent systems damage the data

Information Theory concepts: Entropy, Mutual Information, KL-Divergence, and more

Database

SQL – an introduction to basic SELECT queries

SQL – combining data (UNION, JOIN, etc.)