Plain-English Definition
The maximum amount of text, measured in tokens, that a model can attend to at once. The limit covers both the input it is given and the response it generates.
Category
LLM Retrieval, RAG, and Answer Systems
How AI systems fetch, rank, and compress information—useful for designing pages that survive retrieval and summarization.
Related concepts
Other terms in this category that are worth understanding alongside this one.
A pattern where a model retrieves documents at query-time and generates an answer grounded in those sources.
Selecting candidate documents or passages likely to contain the answer.
A numeric representation of text used to measure semantic similarity for retrieval.
A store optimized for similarity search over embeddings.
Splitting content into smaller passages for retrieval and context packing.
The length of each passage; too small loses context, too large wastes limited context space.
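To make the chunk-size trade-off concrete, here is a minimal sketch of chunking followed by context packing. It is an illustration under stated assumptions, not how any particular system does it: whitespace-separated words stand in for tokens (a real system would count with the model's own tokenizer), the chunks are assumed to arrive already ranked by relevance, and the names `chunk_words`, `pack_context`, and `budget` are hypothetical.

```python
def chunk_words(text, chunk_size):
    """Split text into passages of at most chunk_size words.

    Assumption: words approximate tokens for the sake of the example.
    """
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def pack_context(chunks, budget):
    """Add chunks in order until the next one would exceed the budget.

    Assumption: chunks are already sorted from most to least relevant.
    Returns the packed chunks and the number of words used.
    """
    packed, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())
        if used + cost > budget:
            break  # the next chunk does not fit in the remaining space
        packed.append(chunk)
        used += cost
    return packed, used


# A 25-word stand-in document, split into passages of up to 10 words.
text = " ".join(f"w{i}" for i in range(25))
chunks = chunk_words(text, chunk_size=10)

# Pack into a hypothetical budget of 22 words of context space.
packed, used = pack_context(chunks, budget=22)
print(len(chunks), len(packed), used)
```

With these numbers the third passage is left out even though only part of the budget is used, which is the sense in which overly large chunks waste limited context space.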