🦎 Psyllama

Streaming

Psyllama supports streaming tokens as they are generated, over both the CLI and the HTTP API. Streaming improves perceived latency and enables interactive UIs.

CLI

Most interactive commands stream by default.

psyllama run kimi-k2.5:cloud

HTTP API

Set stream to true in the request body on supported endpoints.

curl http://localhost:11434/api/chat \
  -d '{"model":"kimi-k2.5:cloud","stream":true,"messages":[{"role":"user","content":"Hello"}]}'
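The response then arrives as a stream of JSON objects rather than a single document. The exact schema is not specified here; as an illustrative assumption, a newline-delimited stream in the style of similar APIs might look like this (all field names hypothetical):

```json
{"model":"kimi-k2.5:cloud","message":{"role":"assistant","content":"Hel"},"done":false}
{"model":"kimi-k2.5:cloud","message":{"role":"assistant","content":"lo"},"done":true}
```

Each object carries a fragment of the reply; a final object would signal completion so the client knows when to stop reading.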

Consuming streaming output

Read the HTTP response body incrementally rather than waiting for the full response to finish. Each chunk is a JSON object containing partial output; concatenate the partial outputs to reconstruct the full reply.
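A minimal consumer sketch in Python: it reads a newline-delimited stream of JSON chunks and joins the partial outputs. The chunk shape ("message" with "content", a "done" flag) is an assumption about Psyllama's response schema, not confirmed by this page; the stream is simulated with io.BytesIO so the sketch runs without a server.

```python
import io
import json

def consume_stream(body):
    """Accumulate partial outputs from a newline-delimited JSON stream.

    `body` is any file-like object yielding raw response lines.
    The chunk fields used here ("message", "content", "done") are
    illustrative assumptions about the response schema.
    """
    parts = []
    for line in body:
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Simulate a streamed response body with two chunks.
fake_body = io.BytesIO(
    b'{"message":{"content":"Hel"},"done":false}\n'
    b'{"message":{"content":"lo"},"done":true}\n'
)
print(consume_stream(fake_body))
```

With a real request, you would pass the open HTTP response object (which is file-like in most clients) in place of fake_body.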