llamafile v0.10.0 alpha

This repository contains a few llamafiles built from our work-in-progress branch. They are "alpha" quality, meant only to showcase our progress while we rebuild llamafile on more recent versions of llama.cpp. They may change or be removed soon, but the good news is that if they do, better versions will be available!

At present, these binaries support two use cases:

  • llamafile TUI
  • HTTP / OpenAI-compatible server (start it with the --server flag), which relies on the llama.cpp server (see the example below)
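
For example, here is a minimal sketch of both modes, assuming you have already downloaded a llamafile named model.llamafile (the name is illustrative):

```sh
# Interactive TUI (presumably the default when no flag is given)
./model.llamafile

# HTTP / OpenAI-compatible server instead
./model.llamafile --server
```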

Click on "Files and Versions" and choose your llamafile! Every file with the .llamafile extension is a pre-packaged llamafile that you can download and execute directly, while llamafile_0.10.0.alpha is the bare main executable, which you can run with custom GGUF files using the --model flag (see the example below).
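
As a sketch of that second option (the GGUF path is illustrative, and on Linux or macOS you may first need to mark the downloaded binary as executable):

```sh
# Make the binary executable after downloading it
chmod +x llamafile_0.10.0.alpha

# Run it against your own GGUF weights
./llamafile_0.10.0.alpha --model path/to/your-model.gguf
```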

These llamafiles run on CPUs and, if available, Metal GPUs. Support for other GPU types and for further CPU optimizations is on our roadmap.

For more information, check out our Christmas update!
