# Search Results for: feature selection

# Feature selection with sklearn

Feature selection is hard but very important.

# Advantages of Linear Regression

Linear regression is frequently used in practice because of these 7 reasons.

# Over-fitting and Under-fitting

Let’s examine everything we need to know about over-fitting and under-fitting.

# Regularization for Linear regression

We tackle Regularization for Linear Regression by answering 5 questions: What, When, Where, How, and Why?

# Charts to show relationships between (or among) variables

This is the 4th post in my series on data visualization with charts for specific purposes.

# Data Mining – Machine Learning

Getting started with Machine Learning; What are Computer Science, Artificial Intelligence and Machine Learning?; Different types of Machine Learning; Machine Learning Applications; Introduction to Python and Jupyter; Numpy, Pandas, Scikit-learn and Matplotlib; Data Scraping; Data scraping: City dataset from Versus.com; Data scraping: Phone dataset from Versus.com; Data scraping: KDnuggets.com’s post statistics; Data Scraping: Android App Dataset from Google Play Store; Data Cleaning; Data Cleaning case study: Google Play Store Dataset; Preparatory Phase; Control Variable; Splitting data into a Training set and a Validation set; Imbalanced Learning: sampling techniques; Exploratory Data Analysis; QQ plot versus PP plot versus Probability plot; Multicollinearity; …

# Reinforcement Learning approaches for the Join Optimization problem in Database: DQ, ReJoin, Neo, RTOS, and Bao

How can we use Reinforcement Learning for the join optimization problem in databases? Let us take a look at 5 recent and outstanding approaches from top researchers: DQ, ReJoin, Neo, RTOS, and Bao.

# A review of pre-trained language models: from BERT, RoBERTa, to ELECTRA, DeBERTa, BigBird, and more

Let us review a list of pretrained language models, including BERT, Transformer-XL, XLNet, RoBERTa, DistilBERT, ALBERT, BART, ELECTRA, ConvBERT, DeBERTa, and BigBird.

# Ensemble: Bagging, Random Forest, Boosting and Stacking

An ensemble of trees (in the form of bagging, random forest, or boosting) is usually preferred over one decision tree alone.

# How to convert Categorical Variables to Numerical Variables

In Machine Learning, while some predictive models allow categorical variables in the data, most require all predictor variables to be continuous.

# Linear Regression in Python

Train and cross-validate your linear regression in Python with pre-defined or customized evaluation functions.

# Numpy, Pandas, Scikit-learn and Matplotlib

Mining from data is not a simple task, and the help of libraries makes the process more enjoyable.