Explaining Projections of High-Dimensional Data

Zonglin Tian

Research output: Thesis › Doctoral thesis 1 (Research UU / Graduation UU)


Visualization techniques are often a key aid for scientists who aim to form, refine, or invalidate hypotheses about underlying phenomena based on multidimensional datasets. Among such techniques, dimensionality reduction techniques, also called projections, offer significant advantages over other visualization techniques in their computational and visual scalability in both the number of samples and the number of dimensions, and have hence become one of the most widely used visualizations of multidimensional data. However, projections create images that are hard, if not impossible, to interpret in detail without additional visual help. In this thesis, we study how to enrich existing projection techniques with so-called interactive visual explanatory mechanisms. Our first contribution extends an existing family of local explanation techniques to characterize the data in terms of local dimension correlation and intrinsic dimensionality. We show that our extensions, which are simple to implement, efficient to compute, and applicable to any projection method, contribute significantly, when combined with existing explanations, to a better understanding of the visualized dataset. Our second contribution studies projections that create 3D scatterplots, as opposed to the traditional 2D scatterplots that most existing projection methods produce. We show that 3D projections offer only minimal improvements as measured by existing projection quality metrics. However, when annotated with our explanations, and because they can be viewed interactively from multiple viewpoints, 3D projections involve the user more strongly in exploring the depicted data. To further study how the insights obtained from a projection depend on the chosen viewpoint, we propose quality metrics that characterize the structures visible from any given viewpoint, and an interactive tool that guides users in finding good-quality viewpoints. We conduct a user study which shows that our quality metrics agree with the viewpoints that users perceive as useful, both with and without the guidance of our interactive tool. Our final contribution explains multidimensional projections from the perspective of their computational stability. For this, we propose to use a variant of sensitivity analysis, a well-known technique in signal processing that, to our knowledge, has not yet been used to assess projections. We show that a recent deep-learning technique (NNP), which excels in computational speed, simplicity of use, genericity, quality, and out-of-sample ability, also meets the stability requirement: it exhibits only small output changes under significantly large changes of its input data, across a range of perturbations.
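The stability question raised in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: it assumes a PCA-style linear map as a stand-in for an out-of-sample projection such as NNP, Gaussian noise as the only perturbation, and mean 2D displacement (normalized by the projection's bounding-box diagonal) as the sensitivity measure. The names `fit_linear_projection` and `stability_curve` are hypothetical.

```python
import numpy as np

def fit_linear_projection(X, n_components=2):
    """Fit a PCA-style linear map; a stand-in for any out-of-sample projection."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:n_components].T
    # The returned function projects new data without refitting.
    return lambda Z: (Z - mean) @ W

def stability_curve(X, project, noise_levels, seed=0):
    """Mean displacement of projected points per noise level (assumed measure)."""
    rng = np.random.default_rng(seed)
    base = project(X)
    # Normalize by the diagonal of the unperturbed projection's bounding box.
    scale = np.linalg.norm(base.max(axis=0) - base.min(axis=0))
    curve = []
    for sigma in noise_levels:
        noisy = X + rng.normal(0.0, sigma, size=X.shape)
        shift = np.linalg.norm(project(noisy) - base, axis=1).mean()
        curve.append(shift / scale)
    return np.array(curve)

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))
project = fit_linear_projection(X)
levels = [0.0, 0.1, 0.5, 1.0]
curve = stability_curve(X, project, levels)
for sigma, value in zip(levels, curve):
    print(f"sigma={sigma:.1f}  relative displacement={value:.4f}")
```

A projection would count as stable in this sense if the curve grows slowly as the noise level increases. The projection is fitted once and then applied to the perturbed data, so the measurement reflects output sensitivity rather than differences between separate fits.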
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Utrecht University
Supervisors/Advisors
  • Telea, Alex, Supervisor
  • Behrisch, Michael, Co-supervisor
Award date: 18 Oct 2023
Place of Publication: Utrecht
Publication status: Published - 18 Oct 2023


Keywords
  • Visualization techniques
  • Multidimensional datasets
  • Dimensionality reduction
  • Projections
  • Computational and visual scalability
  • Local explanation techniques
  • Intrinsic dimensionality
  • 3D scatterplots
  • Interactive visual explanations
  • Computational stability

