Feature Engineering, Machine Learning - Data Mining

Feature selection with sklearn

September 4, 2019 (updated July 31, 2020), Tung.M.Phung

Feature selection is hard but very important.

Machine Learning - Data Mining, Regression Models

Advantages of Linear Regression

October 8, 2019 (updated July 31, 2020), Tung.M.Phung

Linear regression is frequently used in practice, for these 7 reasons.

Machine Learning - Data Mining, Miscellaneous

Over-fitting and Under-fitting

October 5, 2019 (updated July 31, 2020), Tung.M.Phung

Let’s examine everything we need to know about over-fitting and under-fitting.

Machine Learning - Data Mining, Regression Models

Regularization for Linear regression

October 5, 2019 (updated July 31, 2020), Tung.M.Phung

We tackle Regularization for Linear Regression by answering 5 questions: What, When, Where, How, and Why?

DataVisualization, Machine Learning - Data Mining

Charts to show relationships between (or among) variables

September 14, 2019 (updated July 31, 2020), Tung.M.Phung

This is the 4th post in my series on data visualization with charts for specific purposes.

Machine Learning - Data Mining

Data Mining – Machine Learning

August 30, 2019 (updated December 28, 2023), Tung.M.Phung

Getting started with Machine Learning: What are Computer Science, Artificial Intelligence and Machine Learning?; Different types of Machine Learning; Machine Learning Applications; Introduction to Python and Jupyter; Numpy, Pandas, Scikit-learn and Matplotlib
Data Scraping: City dataset from Versus.com; Phone dataset from Versus.com; KDnuggets.com’s post statistics; Android App Dataset from Google Play Store
Data Cleaning: Data Cleaning case study: Google Play Store Dataset
Preparatory Phase: Control Variable; Splitting data into a Training set and a Validation set; Imbalanced Learning: sampling techniques
Exploratory Data Analysis: QQ plot versus PP plot versus Probability plot; Multicollinearity …

Machine Learning - Data Mining

Reinforcement Learning approaches for the Join Optimization problem in Database: DQ, ReJoin, Neo, RTOS, and Bao

March 15, 2022 (updated March 31, 2022), Tung.M.Phung

How can we use Reinforcement Learning for the Join Optimization problem in databases? Let us take a look at 5 recent, outstanding approaches from top researchers: DQ, ReJoin, Neo, RTOS, and Bao.

Deep Learning, Machine Learning - Data Mining, Text Mining

A review of pre-trained language models: from BERT, RoBERTa, to ELECTRA, DeBERTa, BigBird, and more

December 10, 2021 (updated March 26, 2023), Tung.M.Phung

Let us review a list of pretrained language models, including BERT, Transformer-XL, XLNet, RoBERTa, DistilBERT, ALBERT, BART, ELECTRA, ConvBERT, DeBERTa, and BigBird.

Classification Models, Machine Learning - Data Mining, Regression Models

Ensemble: Bagging, Random Forest, Boosting and Stacking

May 4, 2020 (updated July 31, 2020), Tung.M.Phung

An ensemble of trees (in the form of bagging, random forest, or boosting) is usually preferred over one decision tree alone.

Feature Engineering, Machine Learning - Data Mining

How to convert Categorical Variables to Numerical Variables

December 18, 2019 (updated July 31, 2020), Tung.M.Phung

In Machine Learning, while some predictive models allow categorical variables in the data, most require all predictor variables to be continuous.

Machine Learning - Data Mining, Regression Models

Linear Regression in Python

December 6, 2019 (updated July 31, 2020), Tung.M.Phung

Train and cross-validate your Linear regression in Python with pre-defined or customized evaluation functions.

Intro to ML, Machine Learning - Data Mining

Numpy, Pandas, Scikit-learn and Matplotlib

September 20, 2019 (updated July 31, 2020), Tung.M.Phung

Mining data is not a simple task, and the help of libraries makes the process more enjoyable.

Tung M Phung's Blog

Search Results for: feature selection
