Similarity?¶
What is similarity?¶
In words:
Overview of Methods¶
Family | Example Methods |
---|---|
Linear | \rho V-Coefficient |
Distance-Based | Energy Distance, MMD |
Shannon Entropy | D_\text{KL} |
- All Linear methods are just regression - notebook
Method | Property |
---|---|
Linear | Uniform |
Distance | Adaptive |
Kernels | Adaptive, Smoothing |
Information | Distribution |
Contents¶
Linear Methods¶
In this document, I will go over the main linear ways that we can estimate similarity. They include the classic measures such as covariance and Pearson's correlation coefficient \rho. I will also go over how this can be extended to multivariate datasets with measures such as the congruence coefficient and the \rhoV-Coefficient; multi-dimensional extensions to the original \rho.
- Theory: Document
- Lab: Colab Notebook
Distance-Based¶
This is a non-linear extension to the linear method which attempts to estimate similarity by distance formulas. This gives us a lot of flexibility with the distance metric that we choose (as long as they satisfy some key conditions). In the document, I will go over some of the common ones and how we go from linear to distance-metrics.
- Theory: Document
Kernel Methods¶
The kernel methods acts as a non-linear extension to the linear method and it generalizes the distance-based methods. These utilize kernel functions which act as non-linear functions which operate on the linear method and/or the distance metric. This adds an even higher level of flexibility. This is good or bad depending upon your perspective. There are two main candidates: the HSIC/Kernel Alignment and the Maximum Mean Discrepency (MMD) methods. They are related but the way the methods are calculated vary as will be explained above.
Variation of Information¶
The final method that we investigate is a different approach than the methods listed about: estimating the PDF of the dataset and then calculating Information theoretic measures.
Visualization¶
This is something that I didn't believe was so important until I had to explain my results. It was very clear it my head but I ended up showing a lot of different graphs and it was difficult for everyone to grasp the essence. Taylor Diagrams were originally used for univariate comparisons to show the Pearson correlation coefficient, the norm of the vector and the RMSE all in one plot. I show that this can be extended to the multivariate measures such as the \rhoV-Coefficient, the Kernel Alignment Method and the Variation of Information methods.
Resources¶
- Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions - Sung-Huyk Cha - International Journal of Mathematical Models and Methods in Applied Sciences, (...)
- Multivariate Statistics Summary and Comparison of Techniques - Presenation