Hardware support
Psyllama's performance depends on the underlying runtime (llama.cpp) and on your CPU and GPU setup.
NVIDIA
For CUDA acceleration, install an NVIDIA driver that is compatible with your GPU and OS, and make sure the build or runtime you are using has CUDA support enabled.
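
Before checking the build itself, it can help to confirm that the driver is installed and sees the GPU. The following is a minimal sketch, assuming the `nvidia-smi` tool that ships with the NVIDIA driver is on your PATH. The helper names `parse_gpu_csv` and `query_nvidia_gpus` are hypothetical and not part of Psyllama.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader` rows into dictionaries."""
    gpus = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, driver_version, memory_total = fields
        gpus.append(
            {
                "name": name,
                "driver_version": driver_version,
                "memory_total": memory_total,
            }
        )
    return gpus


def query_nvidia_gpus():
    """Return GPUs reported by nvidia-smi, or an empty list if it is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            [
                exe,
                "--query-gpu=name,driver_version,memory.total",
                "--format=csv,noheader",
            ],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # tool present but the driver could not be queried
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_nvidia_gpus():
        print(gpu)
```

An empty result suggests a driver problem rather than a Psyllama problem. A non-empty result only shows that the driver works; it does not show that your build was compiled with CUDA support.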
AMD Radeon
AMD GPU support depends on your platform and on which backends your build includes. On Linux, ROCm or Vulkan may be available, depending on the build.
Apple (Metal)
On macOS, Metal acceleration can provide strong performance on Apple Silicon.
Vulkan
In some environments, Vulkan can be used for GPU offload, provided the build was compiled with Vulkan enabled.
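
To see which of the backends above might apply on a given machine, one rough approach is to look for each vendor's diagnostic tool. The sketch below assumes the standard tools `nvidia-smi`, `rocminfo`, and `vulkaninfo`, and treats macOS on `arm64` as a Metal candidate. The function `candidate_backends` and the backend labels are illustrative names, not a Psyllama API.

```python
import platform
import shutil

# Vendor diagnostic tools mapped to the backend they hint at (labels are illustrative).
TOOL_HINTS = {
    "nvidia-smi": "cuda",
    "rocminfo": "rocm",
    "vulkaninfo": "vulkan",
}


def candidate_backends(system=None, machine=None, which=shutil.which):
    """List backends whose tooling is present; this is a hint, not a guarantee."""
    system = system or platform.system()
    machine = machine or platform.machine()
    found = []
    # Assumption: Apple Silicon reports "Darwin" and "arm64".
    if system == "Darwin" and machine == "arm64":
        found.append("metal")
    for tool, backend in TOOL_HINTS.items():
        if which(tool) is not None:
            found.append(backend)
    return found


if __name__ == "__main__":
    print(candidate_backends())
```

A tool being present only means the driver stack is installed. Whether that backend is actually used still depends on the build options of your runtime.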
Tips
- Use quantized models to reduce memory usage
- Start with smaller models to validate GPU acceleration
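
To illustrate why quantization reduces memory usage, here is a back-of-the-envelope sketch for the weights alone. It assumes memory is roughly parameter count times bits per weight, and it ignores the KV cache, activations, and runtime overhead. Real quantized formats usually store slightly more than their nominal bit width per weight. The 7-billion-parameter model is a hypothetical example.

```python
def estimate_weight_memory_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    n_params = 7e9  # hypothetical 7-billion-parameter model
    for bits in (16, 8, 4):
        size = estimate_weight_memory_gb(n_params, bits)
        print(f"{bits}-bit weights: about {size:.1f} GB")
```

Under these assumptions, 16-bit weights need about 14 GB while 4-bit weights need about 3.5 GB, which is often the difference between a model that fits in GPU memory and one that does not.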