Does Hugging Face have anything like this - LLM Directory: All Local LLMs List
which shows inference requirements?
I don’t think Hugging Face has one central directory exactly like that.
The closest thing I know of on HF is the model memory estimator / model-memory-usage Space. It can give you a decent first estimate for whether a model will fit in VRAM, but it is not really the same as a curated LLM requirements table.
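If you want a quick programmatic version of what that Space does, here is a minimal sketch using huggingface_hub to read a repo's safetensors metadata and apply the weights-only rule of thumb. The repo id is just an example, and this assumes your huggingface_hub version exposes get_safetensors_metadata (added in recent releases); treat it as a starting point, not an exact tool.

```python
# Hedged sketch: read parameter counts from a repo's safetensors metadata.
# Assumes huggingface_hub is recent enough to have get_safetensors_metadata.
from huggingface_hub import get_safetensors_metadata

repo_id = "mistralai/Mistral-7B-v0.1"  # example repo; any safetensors model works

meta = get_safetensors_metadata(repo_id)
total_params = sum(meta.parameter_count.values())  # per-dtype counts, summed

# Weights-only rule of thumb: 2 bytes/param fp16, 1 byte int8, 0.5 byte 4-bit
for label, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{label}: ~{total_params * bytes_per_param / 1e9:.1f} GB weights")
```

That only covers the weights, which is why the estimator Spaces still ask about dtype, context, and framework.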
The annoying part is that “requirements” are not one number. A 7B model can mean very different things depending on precision (fp16, int8, 4-bit), format and runtime (GGUF, vLLM, Transformers, llama.cpp), and serving settings (context length, batch size, KV cache, CPU offload).
So for now I usually treat HF model cards as the source for model details, then use a memory estimator or do the rough math myself. For LLMs, fp16/bf16 is roughly 2 GB per billion parameters just for weights, 8-bit around half of that, 4-bit around a quarter, plus overhead for runtime and context.
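To make that rough math concrete, here is a back-of-envelope estimator in Python. All the defaults are my own illustrative assumptions (roughly Llama-2-7B shaped: 32 layers, 4096 hidden size), it ignores grouped-query attention (which shrinks the KV cache on newer models), and the 20% overhead factor is a guess, so expect it to be approximate at best.

```python
# Back-of-envelope VRAM estimate for LLM inference. Not exact: GB vs GiB are
# mixed, GQA is ignored, and overhead is a flat fudge factor.

BYTES_PER_PARAM = {"fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_vram_gb(
    n_params_b: float,            # parameters in billions, e.g. 7 for a 7B model
    dtype: str = "fp16",
    n_layers: int = 32,           # illustrative default, ~Llama-2-7B
    hidden_size: int = 4096,      # illustrative default, ~Llama-2-7B
    context_len: int = 4096,
    batch_size: int = 1,
    kv_dtype_bytes: float = 2.0,  # KV cache usually kept in fp16
    overhead: float = 1.2,        # ~20% for runtime, activations, fragmentation
) -> float:
    weights_gb = n_params_b * BYTES_PER_PARAM[dtype]
    # KV cache: 2 tensors (K and V) per layer, each [batch, context, hidden].
    # This assumes full multi-head attention; GQA models need much less.
    kv_gb = (2 * n_layers * batch_size * context_len * hidden_size
             * kv_dtype_bytes) / 1024**3
    return (weights_gb + kv_gb) * overhead

# 7B model at 4k context: ~14 GB weights + ~2 GB KV cache, plus overhead
print(f"fp16: {estimate_vram_gb(7, 'fp16'):.1f} GB")  # ~19 GB
print(f"int4: {estimate_vram_gb(7, 'int4'):.1f} GB")  # ~7 GB
```

The KV cache term is why two setups with the same quantization can still need very different amounts of memory once you change context length or batch size.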
It would be nice if HF had this as a first-class field on model pages, even if it was only approximate. Right now that information is scattered across model cards, discussions, Spaces, and external tools.
It’s not actually run by Hugging Face itself (it’s user-generated), but leaderboards might be similar? Spaces - Hugging Face
Also, Eval Results of models: Models – Hugging Face
Spaces - Hugging Face When I select the Open LLM Leaderboard, I get an error.
Oh… Report it to the author via a discussion: open-llm-leaderboard/open_llm_leaderboard · Discussions