Context length
Context length is the maximum number of tokens the model can consider at once; the prompt, the conversation history, and any new output all share this single budget. Larger context windows require more memory.
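Because prompt, history, and output share one window, a useful first check is how much room remains for the model's reply. A minimal sketch with an illustrative window size (the numbers are assumptions, not tied to any specific model):

```python
# Sketch: budgeting a shared context window (illustrative numbers).
CONTEXT_LENGTH = 4096  # total tokens the model can consider at once


def remaining_output_budget(prompt_tokens: int, history_tokens: int) -> int:
    """Tokens left for new output after the prompt and history are counted."""
    used = prompt_tokens + history_tokens
    return max(CONTEXT_LENGTH - used, 0)


print(remaining_output_budget(900, 2700))  # 4096 - 3600 = 496
```

If the remaining budget is too small, you must either truncate history or raise the context length (at a memory cost).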
Why it matters
- Larger contexts let the model keep more of a long conversation, or more retrieved documents (RAG), in view at once
- Memory usage grows roughly linearly with context length, because the model must cache attention state for every token in the window
Configuration
Set the context length through your server's configuration or through per-request model options; the exact setting name and mechanism vary between runtimes, so check your server's documentation.
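As one concrete example, Ollama-style runtimes accept a per-request option commonly named `num_ctx`. A sketch that builds such a request payload; the option name and payload shape are assumptions that may not match your server:

```python
import json


def build_request(prompt: str, context_len: int) -> str:
    """Build a JSON request body carrying a per-request context-length option.

    The option name "num_ctx" follows Ollama's convention; other runtimes
    use different names (e.g. a --ctx-size server flag), so verify against
    your runtime's documentation.
    """
    payload = {
        "model": "example-model",  # placeholder model name
        "prompt": prompt,
        "options": {"num_ctx": context_len},
    }
    return json.dumps(payload)


print(build_request("Hello", 8192))
```

Server-wide configuration (a startup flag or config file) sets a default; per-request options like the one above override it for a single call where supported.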