A context window is the maximum amount of text — measured in tokens — that a language model can process in a single inference call. Content within the context window is available for the model to reason about; content outside it must be retrieved separately or is unavailable.
Context window size determines how much source material an AI system can process when generating a response. For RAG systems, the context window sets the upper limit on how many retrieved passages can be included before answer synthesis. For content strategy, understanding context windows explains why concise, front-loaded content often outperforms long, detailed content in AI extraction: the model works within a fixed token budget, and content that states its key claims early is more likely to survive truncation than content that builds slowly to its conclusion.
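The budget dynamic described above can be sketched as a greedy packing step: retrieved passages are added in rank order until the token budget is spent, and everything after the cutoff is simply never seen by the model. This is a minimal illustration, not any particular framework's implementation; the whitespace-based token estimate is an assumption for readability, where real systems use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer: one token per
    # whitespace-separated word (an assumption for illustration).
    return len(text.split())


def pack_passages(passages: list[str], budget: int) -> list[str]:
    """Include retrieved passages in rank order until the budget is spent."""
    packed: list[str] = []
    used = 0
    for passage in passages:
        cost = estimate_tokens(passage)
        if used + cost > budget:
            # Lower-ranked passages fall outside the window entirely.
            break
        packed.append(passage)
        used += cost
    return packed


passages = [
    "Key claim stated up front in few words.",
    "Supporting detail that expands on the claim at greater length.",
    "Long background section that builds gradually toward its point.",
]
print(pack_passages(passages, budget=16))
```

Note that a passage whose key information sits at the end loses everything when it misses the cutoff, whereas a front-loaded passage that does fit contributes its main claim immediately; this is the mechanism behind the front-loading advice above.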