🦎 Psyllama

Streaming

Psyllama supports streaming tokens as they are generated, over both the CLI and the HTTP API. Streaming improves perceived latency and enables interactive UIs.

CLI

Most interactive commands stream by default.

psyllama run kimi-k2.5:cloud

HTTP API

Set stream to true in the request body on supported endpoints.

curl http://localhost:11434/api/chat \
  -d '{"model":"kimi-k2.5:cloud","stream":true,"messages":[{"role":"user","content":"Hello"}]}'
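The response then arrives as a stream of JSON objects rather than a single document. The exact schema is not specified here; as an illustrative assumption, a newline-delimited stream in the style of similar APIs might look like this (all field names hypothetical):

```json
{"model":"kimi-k2.5:cloud","message":{"role":"assistant","content":"Hel"},"done":false}
{"model":"kimi-k2.5:cloud","message":{"role":"assistant","content":"lo"},"done":true}
```

Each object carries a fragment of the reply; a final object would signal completion so the client knows when to stop reading.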

Consuming streaming output

Read the HTTP response body incrementally rather than waiting for the full response to finish. Each chunk is a JSON object containing partial output; concatenate the partial outputs to reconstruct the full reply.
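A minimal consumer sketch in Python: it reads a newline-delimited stream of JSON chunks and joins the partial outputs. The chunk shape ("message" with "content", a "done" flag) is an assumption about Psyllama's response schema, not confirmed by this page; the stream is simulated with io.BytesIO so the sketch runs without a server.

```python
import io
import json

def consume_stream(body):
    """Accumulate partial outputs from a newline-delimited JSON stream.

    `body` is any file-like object yielding raw response lines.
    The chunk fields used here ("message", "content", "done") are
    illustrative assumptions about the response schema.
    """
    parts = []
    for line in body:
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Simulate a streamed response body with two chunks.
fake_body = io.BytesIO(
    b'{"message":{"content":"Hel"},"done":false}\n'
    b'{"message":{"content":"lo"},"done":true}\n'
)
print(consume_stream(fake_body))
```

With a real request, you would pass the open HTTP response object (which is file-like in most clients) in place of fake_body.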