ColBERT, ColBERTv2, Late Interaction, and Why They Matter in Search
Jina-ColBERT
https://jina.ai/news/what-is-colbert-and-late-interaction-and-why-they-matter-in-search/
"ColBERT" stands for Contextualized Late Interaction over BERT.
"Late interaction" is the essence of ColBERT. The term comes from the model's architecture and processing strategy: the query and the document are first encoded independently, and their representations interact only afterwards, late in the process. This contrasts with "early interaction" models, where the query and document interact at earlier stages, typically before or during their encoding by the model.
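To make the idea concrete, here is a minimal sketch of a late-interaction score in the style of ColBERT's sum-of-maximum-similarities ("MaxSim") operator: each query token embedding is compared against every document token embedding, the best match is kept, and the per-token maxima are summed. The function names and the tiny hand-written 2-dimensional vectors are illustrative assumptions; a real system would obtain the embeddings from a BERT-based encoder.

```python
def dot(a, b):
    """Dot product of two equal-length vectors (cosine similarity if both are unit-norm)."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_embs, doc_embs):
    """Late-interaction score: for each query token, take the best-matching
    document token, then sum those maxima over all query tokens."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


# Toy, hand-written unit vectors standing in for encoder outputs (assumption).
# The query and each document are "encoded" independently of one another.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # has an exact match for each query token
doc_b = [[0.6, 0.8]]                          # only a partial match for each query token

# The interaction happens only here, after encoding.
score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

Because the document embeddings never depend on the query, they can be computed and stored ahead of time; only the cheap comparison step runs at query time.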
Unlike representation-based approaches that encode each document into a single embedding vector, ColBERT encodes documents (and queries) into bags of embeddings, with each token having its own embedding. As a result, the longer the document, the more embeddings must be stored. This was a pain point of the original ColBERT and was later addressed by ColBERTv2.
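A rough back-of-the-envelope sketch shows how the storage cost grows with document length. The dimensions, byte widths, and token counts below are illustrative assumptions, not figures from any particular ColBERT release.

```python
def single_vector_bytes(dim, bytes_per_value):
    """Storage for one document under a single-vector (representation-based) model."""
    return dim * bytes_per_value


def multi_vector_bytes(num_tokens, dim, bytes_per_value):
    """Storage for one document when every token keeps its own embedding."""
    return num_tokens * dim * bytes_per_value


# Hypothetical settings, chosen only for illustration.
SINGLE_DIM = 768      # assumed size of a single-vector embedding
TOKEN_DIM = 128       # assumed size of each per-token embedding
BYTES_PER_VALUE = 4   # assumed 32-bit floats

for num_tokens in (50, 300, 2000):
    single = single_vector_bytes(SINGLE_DIM, BYTES_PER_VALUE)
    multi = multi_vector_bytes(num_tokens, TOKEN_DIM, BYTES_PER_VALUE)
    print(f"{num_tokens} tokens: single-vector {single} bytes, per-token {multi} bytes")
```

The single-vector cost stays constant regardless of length, while the per-token cost scales linearly with the number of tokens.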