Let us discuss the state-of-the-art methods for transforming every kind of input data into fixed-length vectors of continuous values, including Word2Vec, Doc2Vec, Image2Vec, Node2Vec, Edge2Vec, Code2Vec, and Data2Vec. Continue reading Transforming everything to vectors with Deep Learning: from Word2Vec, Node2Vec, to Code2Vec and Data2Vec
The Transformer neural network, explained in detail. Continue reading The Transformer neural network architecture
Introduction to and comparison of Batch Norm, Weight Norm, Layer Norm, Instance Norm, and Group Norm. Continue reading Deep Learning normalization methods
Get your first intuition about Attention right with a minimal code example. Continue reading Attention in Deep Learning, your starting point (with code)
Through various experiments, ELU has been accepted by many researchers as a good successor to the original ReLU. Continue reading ELU activation: A comprehensive analysis
Despite its simplicity, ReLU previously achieved top performance across various modern Machine Learning tasks. Continue reading Rectifier Linear Unit (ReLU)
The sigmoid and tanh activation functions were used very frequently in the past but have been losing popularity in the era of Deep Learning. Continue reading Sigmoid, tanh activations and their loss of popularity
Many people have a tendency to always do feature centering, scaling or normalizing right before applying predictive models to the data… Continue reading When to do feature centering, scaling and normalization?