When browsing through tutorials and reading books, we sometimes see the terms ‘Parametric‘ and ‘Nonparametric‘ algorithms. The authors suggest when we should use one type of algorithm and when to use the other, but how do we tell the two apart? This blog post will show us how!
A quick note before we begin, and hopefully this will not drag your mood down: the line separating parametric from nonparametric is somewhat vague, so the distinction is a bit ambiguous. Nevertheless, it is just “a bit ambiguous”, not “very ambiguous”, so don’t be depressed, ok?
Properties
The most important factor in determining whether an algorithm is parametric is its assumptions: a parametric algorithm makes assumptions about the underlying form of the function it estimates, while a nonparametric algorithm does not.
E.g.
Parametric: Linear Regression assumes that the predictor variables have a linear and additive relationship to the target variable.
Nonparametric: K-Nearest Neighbors makes no assumption about the data; its classification of a data point depends purely on the known points around it.
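To make the contrast concrete, here is a minimal sketch of the nearest-neighbor idea (using k = 1 for brevity, with made-up toy points): the predicted class depends purely on which training point lies closest, and no functional form is assumed anywhere.

```python
import math

def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [math.dist(p, query) for p in train_points]
    return train_labels[distances.index(min(distances))]

# Toy data: two clusters with labels "A" and "B".
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["A", "A", "B", "B"]

print(nearest_neighbor_predict(points, labels, (0.5, 0.5)))  # A
print(nearest_neighbor_predict(points, labels, (5.5, 5.5)))  # B
```

Notice that the "model" is just the stored training data itself, which is exactly why no assumption about the data's shape is needed.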
The second trait is: for parametric algorithms, the number of parameters is usually fixed, while for nonparametric algorithms, it can potentially grow to infinity, depending on the training data.
E.g.
Parametric: Linear Regression has a fixed number of weights, which is determined before we train the model.
Nonparametric: For the Decision Tree, the number of nodes in the tree is not constant. Depending on the size and dimensionality of the input data, the number of nodes can grow indefinitely.
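A tiny sketch of this second trait (the two size functions below are simplified illustrations, not real library calls): a linear model's parameter count depends only on the number of features, while a memory-based model like K-Nearest Neighbors must keep the entire training set, so its size grows with the data.

```python
def linear_model_size(n_samples, n_features):
    # One weight per feature plus a bias: independent of n_samples.
    return n_features + 1

def knn_model_size(n_samples, n_features):
    # Stores every training row, so size scales with n_samples.
    return n_samples * n_features

for n in (100, 10_000):
    print(n, linear_model_size(n, 3), knn_model_size(n, 3))
```

Training on 100 times more rows leaves the linear model's size unchanged at 4 parameters, while the K-NN "model" becomes 100 times larger.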
The third and last property is: the parameters of parametric models usually have a clear meaning (i.e. a role) in explaining how the model works, while this is not true for nonparametric models.
E.g.
Parametric: Each weight of a Linear Regression model represents how a predictor variable affects the target variable. A large absolute weight means the predictor has a big influence on the target, while a nearly-zero weight says that the predictor is hardly important at all.
Nonparametric: For deep neural networks, the values of the weights on the connections and the bias terms do not tell us anything specific on their own.
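A short sketch of why linear-regression weights are interpretable: fitting y = w·x + b by ordinary least squares on toy data generated with a known slope of 2 and intercept of 1 (both made-up values for illustration) recovers exactly those numbers, so the weight directly states how much y changes per unit of x.

```python
def fit_simple_ols(xs, ys):
    """Closed-form ordinary least squares for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]   # data on the exact line y = 2x + 1
w, b = fit_simple_ols(xs, ys)
print(w, b)                    # 2.0 1.0
```

Reading the fitted weight as "each unit of x adds 2 to y" is precisely the kind of explanation that a deep network's millions of weights cannot offer.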
Examples
Some parametric algorithms are:
- Linear regression
- Logistic regression
- Gaussian Naive Bayes
Some nonparametric algorithms are:
- K-nearest neighbors
- Decision tree
- Deep neural networks
- Support vector machine
- Naive Bayes with density estimation
Advantages
Each type of algorithm has its own advantages and disadvantages. Since from this angle there are only two types, the advantages of one type are the disadvantages of the other, and vice versa.
Parametric algorithms usually have the following strengths:
- Simpler and more intuitive.
- Faster to train and make predictions.
- Require less data.
- Interpretable.
On the contrary, nonparametric algorithms thrive on:
- Flexibility: they are more versatile, as they make no assumptions about the data.
- Stronger performance when provided with enough data.
To sum up
In this blog post, we discussed how to differentiate parametric and nonparametric algorithms. In short, parametric algorithms (represented by Linear regression) make strong assumptions about the data, and thus require less data to train and less time to run. On the other hand, nonparametric algorithms (e.g. Deep neural networks) do not assume prior knowledge of the data, which makes them more versatile.