Cross-encoder

Technical implementation · AI Search Infrastructure

Definition

A cross-encoder is the model architecture used in reranking. It takes a query-chunk pair as joint input — processing both together — and outputs a relevance score. Unlike a bi-encoder, it can attend to the interaction between query and chunk, making it dramatically more accurate at relevance scoring.

Why It Matters for AI Search

Cross-encoders are too slow to run over a full index at query time, which is why retrieval pipelines separate first-pass retrieval (bi-encoder speed) from reranking (cross-encoder accuracy). The cross-encoder sees far fewer candidates — the top-k from first pass — and can afford to be thorough. Content that answers a query completely and specifically scores better under cross-encoder reranking than content that is merely topically adjacent.

Definition

Why It Matters for AI Search

Reranking

Bi-encoder

First-pass retrieval

Retrieval pipeline

Cosine similarity

Relevant Plate Lunch Collective Services

​Definition

​Why It Matters for AI Search

​Related Terms

Reranking

Bi-encoder

First-pass retrieval

Retrieval pipeline

Cosine similarity

​Relevant Plate Lunch Collective Services

Definition

Why It Matters for AI Search

Related Terms

Relevant Plate Lunch Collective Services