.___.
     /     \
    |-■ ― ■-|
    /  \_/  \ 
  .' /     \ `.
 / _|       |_ \
(_/ |       | \_)
    \       /
   __\_>-<_/__
   ~;/     \;~

How to use t-SNE effectively

I often use t-SNE for projects at work. It is a great technique for dimensionality reduction and visualization. Because it puts a particular focus on forming groups, it is well suited as a pre-processing step for clustering high-dimensional data. For example, I once used it to generate genome clusters of a metagenome (more info).

t-SNE visualization of a biogas metagenome.

Every so often, when I present plots generated by t-SNE, questions arise about how to interpret the results. For example, people who are used to PCA want to know how to interpret the axes of a t-SNE visualization. In PCA, the axes are the chosen principal components; in t-SNE, the axes have no intrinsic meaning. And as soon as you dive deeper into t-SNE, more questions (and possibly misinterpretations!) arise. For example, tuning the hyper-parameters is crucial, and one has to be aware of how strongly they influence the resulting representation.
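
To make both points concrete, here is a minimal sketch using scikit-learn (my choice for illustration; the toy data and the two perplexity values are assumptions, not taken from any of my projects). It fits PCA, whose axes come with an explained-variance ratio, and then runs t-SNE twice on the same data, changing only the perplexity hyper-parameter.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Toy data (an assumption for illustration): 3 Gaussian blobs in 10 dimensions.
X, y = make_blobs(n_samples=90, n_features=10, centers=3, random_state=0)

# PCA axes are interpretable: each axis is a direction in the input space
# and accounts for a known fraction of the total variance.
pca = PCA(n_components=2).fit(X)
explained = pca.explained_variance_ratio_
print("PCA explained variance per axis:", explained)


def embed(data, perplexity, seed=0):
    """Run t-SNE with a fixed seed so only the perplexity differs between runs."""
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(data)


# Same data, same seed, two perplexity values (must be below n_samples).
embeddings = {p: embed(X, p) for p in (5, 30)}

for p, emb in embeddings.items():
    # The coordinates carry no units or meaning; only relative
    # neighborhoods in the plot are informative.
    print(f"perplexity={p}: shape={emb.shape}, axis ranges={np.ptp(emb, axis=0)}")
```

Plotting the two embeddings side by side (for example with matplotlib, colored by `y`) typically shows noticeably different layouts, which is why I would not read anything into a single run with default settings.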

In this context, I was delighted to see that Distill took the time to apply t-SNE to a number of toy examples and visualize the results interactively.

How to Use t-SNE Effectively is a great article, and I advise everyone using this technique to read it!