LLM Compressor testing - a nm-testing Collection

LLM Compressor testing

Updated Nov 17, 2025

nm-testing/tinysmokellama-3.2

354k • Updated 8 days ago • 95.9k

nm-testing/llama2.c-stories42M-pruned2.4

Updated Oct 29, 2025 • 340

nm-testing/tinyllama-fp8-dynamic-compressed

1B • Updated Oct 9, 2024 • 446

nm-testing/tinyllama-w4a16-compressed

1B • Updated Oct 9, 2024 • 1.38k

nm-testing/tinyllama-w8a8-compressed

1B • Updated Oct 9, 2024 • 851

nm-testing/tinyllama-w8a16-dense

1B • Updated Mar 7 • 537

nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-compressed

1B • Updated Jan 14, 2025 • 364

nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-uncompressed

1B • Updated Jan 14, 2025 • 160

nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-compressed

1B • Updated Jan 14, 2025 • 340

nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-uncompressed

1B • Updated Jan 14, 2025 • 181

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-compressed

1B • Updated Jan 14, 2025 • 331

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-uncompressed

1B • Updated Jan 14, 2025 • 180

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-compressed

1B • Updated Jan 14, 2025 • 398

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-uncompressed

1B • Updated Jan 14, 2025 • 160