Vector search has become a foundational tool in modern search and retrieval systems, including the RAG pipelines that power many AI applications. However, as the demands on retrieval systems grow more sophisticated, the limits of relying on a single vector similarity score are becoming apparent.
Vespa is a popular open-source search and data serving engine. Central to Vespa’s architecture is tensor-based retrieval, an approach that represents data as tensors rather than simple vectors. Tensor-based retrieval enables richer mathematical operations and more flexible ranking functions that can overcome the limitations of a single vector similarity score.
Radu Gheorghe is a software engineer at Vespa with a background spanning nearly 12 years of consulting and training on Elasticsearch and Solr. In this episode, Radu joins Sean Falconer to discuss why vector similarity alone falls short in production, how tensor-based retrieval generalizes to support richer ranking functions, the trade-offs in chunking and multi-stage re-ranking architectures, and where AI search is headed next.
Full Disclosure: This episode is sponsored by Vespa.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from AI to quantum computing. Currently, Sean is an AI Entrepreneur in Residence at Confluent, where he works on AI strategy and thought leadership. You can connect with Sean on LinkedIn.
Please click here to see the transcript of this episode.
