How to use t-SNE effectively
I often use t-SNE for projects at work. It’s a great technique for dimensionality reduction and visualization. Because it puts a particular focus on forming groups, it is well suited as a pre-processing step for clustering high-dimensional data. For example, I once used it to generate genome clusters of a metagenome (more info).
Every so often, when I present plots generated by t-SNE, questions come up about how to interpret the results. For example, people who are used to PCA often want to know how to interpret the axes of a t-SNE visualization. In PCA, the axes are the chosen principal components, but in t-SNE they have no inherent meaning. And as soon as you dive deeper into t-SNE, more questions (and possibly misinterpretations!) arise. For example, tuning hyper-parameters is crucial, and one has to be aware of how strongly they influence the resulting representation.
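To make both points concrete, here is a minimal sketch, assuming scikit-learn is installed. The synthetic blob data and the three perplexity values are my own illustrative choices, and perplexity is just one example of a t-SNE hyper-parameter. PCA reports how much variance each axis explains, whereas t-SNE only returns coordinates, and those coordinates change with the hyper-parameter setting.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic stand-in data (an assumption): 150 points, 10 dimensions, 3 blobs.
X, y = make_blobs(n_samples=150, n_features=10, centers=3, random_state=0)

# PCA: each axis is a principal component with a known share of the variance.
pca = PCA(n_components=2).fit(X)
explained = pca.explained_variance_ratio_

# t-SNE: only 2-D coordinates come back; there is no per-axis interpretation.
# Perplexity values are arbitrary picks for illustration.
embeddings = {}
for perplexity in (5, 30, 50):
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=0)
    embeddings[perplexity] = tsne.fit_transform(X)

print("PCA explained variance ratio:", explained)
for perplexity, emb in embeddings.items():
    print("perplexity", perplexity, "-> embedding shape", emb.shape)
```

Scatter-plotting each entry of `embeddings` side by side, colored by `y`, is probably the quickest way to see how much the picture depends on this one setting.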
In this context, I was delighted to see that Distill made the effort to apply t-SNE to a set of toy examples and visualize the results interactively.
How to Use t-SNE Effectively is a great article, and I advise everyone using this technique to read it!