TileDB builds a universal storage engine for multidimensional arrays — think columnar databases but generalised to any shape of data. Their vector search product lets you store and query high-dimensional embeddings at scale, which is what everyone needs now that everything is an LLM.
I worked on the vector search engine. The lead engineer who had written most of the codebase left shortly before I joined, so I spent a lot of time getting up to speed on thousands of lines of modern C++ before I could contribute meaningfully.
Integrating distance metrics
Once I had the codebase under my fingers, I integrated inner product and cosine similarity into the engine through a type erasure layer — so callers don't pay for virtual dispatch and the compiler can inline aggressively. No performance compromise from the abstraction.
Memory scaling
During vector ingestion, the distributed execution graph was holding far more live data than it needed to. I redesigned the resource scaling strategy so memory usage tracked the actual working set rather than worst-case allocation. Result: 8× memory reduction with only a 4% latency increase.
Research interest sparked
I spent a lot of time reading about vector search algorithms and distance metrics, and helping my coworkers debug new algorithms. Along the way I wrote a couple of papers on vector databases and became the resident expert in my lab.
Cosine similarity: measures the angle between vectors and ignores magnitude. Two documents with the same topic but different lengths score identically.
Inner product: magnitude matters here. A longer vector in the same direction scores higher. Used when embedding norms carry information.
Euclidean distance: straight-line distance in embedding space. Sensitive to both direction and magnitude. The standard default for spatial problems.
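Each of the three metrics reduces to a few lines of code. A minimal sketch (function names are my own, not TileDB's):

```cpp
#include <cmath>
#include <vector>

// Inner product: sum of elementwise products; sensitive to magnitude.
double inner_product(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Euclidean distance: straight-line distance in embedding space.
double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = a[i] - b[i];
        s += d * d;
    }
    return std::sqrt(s);
}

// Cosine similarity: inner product of the normalised vectors; ignores magnitude.
double cosine(const std::vector<double>& a, const std::vector<double>& b) {
    double na = std::sqrt(inner_product(a, a));
    double nb = std::sqrt(inner_product(b, b));
    return inner_product(a, b) / (na * nb);
}
```

Scaling a vector by 2 doubles its inner-product score but leaves its cosine score unchanged, which is exactly the magnitude distinction described above.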
Hover over the canvases to drag the query vector (blue) and see the score update in real time. The reference vector (red) is fixed. All drawn in 2D for illustration; production vectors typically have 768–3072 dimensions.
- Type erasure via function pointer tables lets the dispatch layer select the right kernel at runtime without virtual calls. Callers see a single typed interface with zero overhead at the abstraction boundary.
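A stripped-down sketch of the pattern (hypothetical names, not the TileDB implementation): the metric is resolved once into a plain function pointer, so the per-query hot loop does no vtable lookup.

```cpp
#include <cstddef>

// Kernel signature shared by all distance metrics.
using DistanceFn = float (*)(const float*, const float*, std::size_t);

// Squared L2 distance kernel.
static float l2_sq(const float* a, const float* b, std::size_t n) {
    float s = 0.f;
    for (std::size_t i = 0; i < n; ++i) { float d = a[i] - b[i]; s += d * d; }
    return s;
}

// Inner product kernel.
static float dot(const float* a, const float* b, std::size_t n) {
    float s = 0.f;
    for (std::size_t i = 0; i < n; ++i) s += a[i] * b[i];
    return s;
}

enum class Metric { L2, InnerProduct };

// Resolve the metric once, outside the hot loop. Callers then hold a plain
// typed function pointer: no virtual call per distance computation.
DistanceFn select_kernel(Metric m) {
    switch (m) {
        case Metric::L2:           return &l2_sq;
        case Metric::InnerProduct: return &dot;
    }
    return nullptr;
}
```

Because the kernel is a concrete function with a known signature, the compiler is free to inline it at call sites where the metric is known at compile time.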
- The ingestion graph scheduler was reworked to track per-node memory high-water marks.
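A toy illustration of the high-water-mark idea (hypothetical, not the production scheduler): each node records its peak live allocation, and the scheduler budgets against that observed working set rather than a worst-case bound.

```cpp
#include <algorithm>
#include <cstddef>

// Tracks live bytes for one graph node and remembers the peak (high-water mark).
class MemoryTracker {
    std::size_t live_ = 0;
    std::size_t peak_ = 0;
public:
    void allocate(std::size_t bytes) {
        live_ += bytes;
        peak_ = std::max(peak_, live_);  // record the highest simultaneous usage
    }
    void release(std::size_t bytes) { live_ -= bytes; }
    std::size_t peak() const { return peak_; }  // observed working set
};
```

A node that allocates 100 MB, frees it, then allocates 40 MB has a 100 MB high-water mark even though far less was live in total at any later point; budgeting against the peak rather than the sum of all allocations is what shrinks the memory footprint.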
- IVF_FLAT (Inverted File index, flat quantisation) partitions the vector space into k Voronoi cells. At query time only the nprobe nearest cells are scanned, cutting the distance computations from O(N) to roughly O(N · nprobe / k), with negligible recall loss at typical probe counts.
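A compact sketch of the IVF_FLAT query path (a toy version under simplifying assumptions: centroids are given rather than trained with k-means, and the search returns a single nearest neighbour):

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Squared L2 distance between two vectors of equal length.
static float l2_sq(const Vec& a, const Vec& b) {
    float s = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) { float d = a[i] - b[i]; s += d * d; }
    return s;
}

struct IvfFlat {
    std::vector<Vec> centroids;           // one per Voronoi cell
    std::vector<std::vector<Vec>> cells;  // vectors assigned to each cell

    std::size_t nearest_centroid(const Vec& q) const {
        std::size_t best = 0;
        for (std::size_t c = 1; c < centroids.size(); ++c)
            if (l2_sq(q, centroids[c]) < l2_sq(q, centroids[best])) best = c;
        return best;
    }

    // Assign each vector to its nearest centroid (flat quantisation: vectors
    // are stored uncompressed inside their cell).
    void build(const std::vector<Vec>& data) {
        cells.assign(centroids.size(), {});
        for (const Vec& v : data) cells[nearest_centroid(v)].push_back(v);
    }

    // Scan only the nprobe nearest cells instead of all N vectors.
    Vec search(const Vec& q, std::size_t nprobe) const {
        std::vector<std::size_t> order(centroids.size());
        for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
        std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
            return l2_sq(q, centroids[a]) < l2_sq(q, centroids[b]);
        });
        Vec best;
        float best_d = std::numeric_limits<float>::max();
        for (std::size_t p = 0; p < nprobe && p < order.size(); ++p)
            for (const Vec& v : cells[order[p]])
                if (float d = l2_sq(q, v); d < best_d) { best_d = d; best = v; }
        return best;
    }
};
```

With nprobe much smaller than the cell count, only a fraction of the dataset is ever compared against the query, which is where the speedup over brute force comes from.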