Binarized Johnson-Lindenstrauss embeddings

Sjoerd Dirksen, Alexander Stollenwerk

Research output: Working paperPreprintAcademic

Abstract

We consider the problem of encoding a set of vectors into a minimal number of bits while preserving information on their Euclidean geometry. We show that this task can be accomplished by applying a Johnson-Lindenstrauss embedding and subsequently binarizing each vector by comparing each entry of the vector to a uniformly random threshold. Using this simple construction we produce two encodings of a dataset such that one can query Euclidean information for a pair of points using a small number of bit operations up to a desired additive error - Euclidean distances in the first case and inner products and squared Euclidean distances in the second. In the latter case, each point is encoded in near-linear time. The number of bits required for these encodings is quantified in terms of two natural complexity parameters of the dataset - its covering numbers and localized Gaussian complexity - and shown to be near-optimal.
Original languageEnglish
PublisherarXiv
Pages1-27
Number of pages27
DOIs
Publication statusPublished - 17 Sept 2021

Keywords

  • cs.IT
  • cs.DS
  • math.IT
  • math.MG

Fingerprint

Dive into the research topics of 'Binarized Johnson-Lindenstrauss embeddings'. Together they form a unique fingerprint.

Cite this