Getting started with Machine Learning
What are Computer Science, Artificial Intelligence and Machine Learning?
Different types of Machine Learning
Introduction to Python and Jupyter
NumPy, Pandas, Scikit-learn and Matplotlib
Data Scraping
Data scraping: City dataset from Versus.com
Data scraping: Phone dataset from Versus.com
Data scraping: KDnuggets.com’s post statistics
Data scraping: Android App Dataset from Google Play Store
Data Cleaning
Data Cleaning case study: Google Play Store Dataset
Preparatory Phase
Splitting data into a Training set and a Validation set
Imbalanced Learning: sampling techniques
Exploratory Data Analysis
QQ plot versus PP plot versus Probability plot
Multicollinearity (or Collinearity)
Does Causality imply Correlation?
A survey of correlation analysis methods
Subgroup Discovery: Beyond coverage and mean-shift
On ensuring fairness: Statistical parity vs Causal graphs
Feature Engineering
How to deal with missing values (NaNs)
Feature Selection with sklearn
When to do feature centering, scaling and normalization?
How to convert Categorical Variables to Numerical Variables?
Regression Models
Linear Regression
Introduction to Linear Regression
How to make a Linear Regressor? (theory)
Confidence Intervals for Coefficients
Evaluation
Regression Objective and Evaluation Functions
Classification Models
Logistic Regression
An overview of Logistic Regression
Other algorithms
Naive Bayes classifier: a comprehensive guide
Information Gain, Gain Ratio and Gini Index
Ensemble: Bagging, Random Forest, Boosting and Stacking
Evaluation
How to read the Confusion Matrix
ROC curve and AUC: a comprehensive overview
Precision-Recall curve: an overview
Binary Classification Evaluation Summary
Reinforcement Learning
Q Learning, Deep Q Learning introduction with TensorFlow
Reinforcement Learning for the Join Optimization problem in Database
Deep Learning
Sigmoid, tanh activations and their loss of popularity
ELU activation: A comprehensive analysis
Batch Normalization and why it works
Attention in Deep Learning, your starting point (with code)
The Transformer neural network architecture
Deep Learning normalization methods
A review of pre-trained language models: from BERT, RoBERTa, to ELECTRA, DeBERTa, BigBird, and more
Some thoughts regarding Deep Learning’s achievements, hype, and challenges
Conversational AI: ChatGPT alternatives and their advantages
Advanced (yet simple) Prompt Engineering techniques for Large Language Models
Is ChatGPT as good as humans? A study in the field of programming education
Yes, we can now make ChatGPT as good as human tutors for programming education
Data Visualization
Charts to show relationships between variables
Charts to compare different samples
Coronavirus cases: an interactive geo-map with Plotly
Text Mining
Optimal alignment – Needleman-Wunsch Algorithm
Shift-AND algorithm for exact pattern matching
Bit-parallel algorithms for generalized string matching
Statistics
Z-score, Z-statistic, Z-test, Z-distribution
T-statistic, T-test and the T-family
Paired Two-sample T-test (Dependent T-test)
Unpaired Two-sample T-test (Independent T-test)
Miscellaneous
Over-fitting and Under-fitting
Parametric vs Non-parametric algorithms
How Intelligent systems damage the data
Information Theory concepts: Entropy, Mutual Information, KL-Divergence, and more
Database
SQL – an introduction to basic SELECT queries
SQL – combining data (UNION, JOIN, etc.)