🦎 Psyllama

Context length

Context length is the maximum number of tokens the model can attend to at once; the prompt, the conversation history, and any newly generated output all share this window. Larger context windows require more memory.

Why it matters

- A larger context window lets the model use more of a long conversation, or more retrieved documents in a RAG pipeline, when generating a response
- Memory usage grows with context length, because the attention KV cache stores state for every token in the window
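The memory cost of the second point comes mostly from the KV cache, which scales linearly with context length. The sketch below estimates its size for a transformer; the model dimensions shown are illustrative (a typical 7B-class configuration), not Psyllama-specific values.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Keys and values each store one vector per token, per layer, per KV head,
    # hence the factor of 2. bytes_per_elem=2 assumes fp16 cache entries.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Illustrative 7B-class model: 32 layers, 32 KV heads, head_dim 128, fp16.
gib = kv_cache_bytes(4096, 32, 32, 128) / 2**30
print(f"{gib:.1f} GiB")  # prints "2.0 GiB"
```

Doubling the context to 8192 tokens doubles this figure to 4 GiB, on top of the model weights themselves.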

Configuration

Set the context length through server configuration or per-model options, depending on your setup; the exact option name depends on how you deploy the model.
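As a minimal sketch of the per-model-options route: the option name `num_ctx` below is an assumption for illustration, not a documented Psyllama parameter, so substitute whatever key your deployment actually exposes.

```python
# Hypothetical request payload: "num_ctx" and "my-model" are placeholders,
# not documented Psyllama names.
payload = {
    "model": "my-model",
    "prompt": "Hello",
    "options": {"num_ctx": 8192},  # requested context window, in tokens
}
print(payload["options"]["num_ctx"])  # prints "8192"
```

Whichever mechanism you use, size the window against available memory: the setting only helps if the KV cache for that many tokens actually fits.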