	---
	language:
	- multilingual
	license: mit
	license_link: https://huggingface.co/moonshotai/Kimi-Dev-72B/blob/main/LICENSE.md
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- GPTQ
	- Int8
	- vLLM
	- code
	- swebench
	- software
	- issue-resolving
	base_model:
	- moonshotai/Kimi-Dev-72B
	base_model_relation: quantized
	---
	# Kimi-Dev-72B-GPTQ-Int8
	Base model: [moonshotai/Kimi-Dev-72B](https://huggingface.co/moonshotai/Kimi-Dev-72B)

	<i>Calibrated using the https://huggingface.co/datasets/timdettmers/openassistant-guanaco/blob/main/openassistant_best_replies_eval.jsonl dataset.</i>
	<br>
	<i>The quantization configuration is as follows:</i>

	```
	quant_config = QuantizeConfig(bits=8, group_size=128, desc_act=False)
	```
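To make `bits=8` and `group_size=128` concrete, here is a minimal sketch of group-wise round-to-nearest quantization. It is not the GPTQ algorithm itself (GPTQ additionally compensates rounding error using calibration data) and it is not the code used to produce this checkpoint. The symmetric scheme, the 16-bit scale per group, and all function names are assumptions made for illustration.

```python
import random


def quantize_group(weights, bits=8):
    """Symmetric round-to-nearest quantization of one group (illustrative only)."""
    qmax = (1 << (bits - 1)) - 1              # 127 for 8 bits
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def quantize_row(row, bits=8, group_size=128):
    """Split a weight row into groups; each group gets its own scale."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


def effective_bits(bits=8, group_size=128, scale_bits=16):
    """Storage per weight, assuming one scale_bits-wide scale per group."""
    return bits + scale_bits / group_size


random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]   # hypothetical weights
groups = quantize_row(row)
restored = [w for codes, scale in groups for w in dequantize_group(codes, scale)]
max_err = max(abs(a - b) for a, b in zip(row, restored))

print(len(groups), "groups, max abs error:", max_err)
print("effective bits per weight:", effective_bits())  # 8.125
```

A smaller `group_size` gives each scale fewer weights to cover, which lowers rounding error at the cost of slightly more metadata per weight.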

	### 【vLLM Startup Command】
	```
	vllm serve JunHowie/Kimi-Dev-72B-GPTQ-Int8
	```
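Once the server is running, it can be queried through vLLM's OpenAI-compatible HTTP API. The sketch below uses only the standard library. It assumes the server listens on vLLM's default address `http://localhost:8000` and that the model is served under the same name passed to `vllm serve`; the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"        # assumption: vLLM default host and port
MODEL = "JunHowie/Kimi-Dev-72B-GPTQ-Int8"    # same name as in the serve command


def build_chat_request(prompt, max_tokens=512):
    """Build a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Call this only after the server has finished loading the model:
# print(chat("Write a function that reverses a linked list."))
```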


	### 【Model Download】

	```python
	from huggingface_hub import snapshot_download
	snapshot_download('JunHowie/Kimi-Dev-72B-GPTQ-Int8', cache_dir="your_local_path")
	```

	### 【Overview】
	<!-- # Kimi-Dev -->

	<div align="center">
	<img src="./assets/main_logo.png" alt="Kimi Logo" width="400" />
	<h2><a href="https://moonshotai.github.io/Kimi-Dev/">
	Introducing Kimi-Dev: <br>A Strong and Open-source Coding LLM for Issue Resolution</a></h2>
	<b>Kimi-Dev Team</b>
	<br>

	</div>
	<div align="center">
	<a href="">
	<b>📄 Tech Report (Coming soon...)</b>
	</a> |
	<a href="https://github.com/MoonshotAI/Kimi-Dev">
	<b>📄 GitHub</b>
	</a>
	</div>

	<br>
	<br>

	<!-- https://github.com/MoonshotAI/Kimi-Dev -->

	We introduce Kimi-Dev-72B, our new open-source coding LLM for software engineering tasks. Kimi-Dev-72B achieves a new state-of-the-art result on SWE-bench Verified among open-source models.

	- Kimi-Dev-72B scores 60.4% on SWE-bench Verified, surpassing the runner-up and setting a new state-of-the-art result among open-source models.

	- Kimi-Dev-72B is optimized via large-scale reinforcement learning. It autonomously patches real repositories in Docker and earns a reward only when the entire test suite passes. This encourages correct and robust solutions that align with real-world development standards.

	- Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub. We welcome developers and researchers to explore its capabilities and contribute to its development.
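The all-or-nothing reward described in the list above can be sketched as follows. This is an illustration of the idea only, not the team's training code; `CaseResult` and `outcome_reward` are hypothetical names, and treating an empty test run as a failure is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of one test case after the model's patch is applied."""
    name: str
    passed: bool


def outcome_reward(results):
    """Return 1.0 only when at least one test ran and every test passed."""
    if not results:
        return 0.0  # assumption: no evidence of correctness means no reward
    return 1.0 if all(r.passed for r in results) else 0.0


full_pass = [CaseResult("test_parse", True), CaseResult("test_io", True)]
partial = [CaseResult("test_parse", True), CaseResult("test_io", False)]

print(outcome_reward(full_pass))  # 1.0
print(outcome_reward(partial))    # 0.0
```

A binary signal like this gives no partial credit for patches that fix some tests while breaking others.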


	<div align="center">
	<img src="./assets/open_performance_white.png" alt="Performance of open-source models on SWE-bench Verified" width="600" />
	<p><b>Performance of Open-source Models on SWE-bench Verified.</b></p>

	</div>



	## Quick Start
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "moonshotai/Kimi-Dev-72B"

	# Load the weights with automatic dtype selection and device placement.
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Give me a short introduction to large language models."
	messages = [
	    {"role": "system", "content": "You are a helpful assistant."},
	    {"role": "user", "content": prompt},
	]

	# Render the conversation with the model's chat template.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512,
	)

	# generate() returns prompt + completion; keep only the newly generated tokens.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
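The slicing step near the end of the Quick Start is easy to misread, so here is a tiny standalone illustration with plain lists. The token IDs are made up and `strip_prompt` is a hypothetical helper; the real code operates on tensors, but the indexing is the same.

```python
def strip_prompt(input_ids, output_ids):
    """Drop the echoed prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


prompt_batch = [[101, 7, 8]]              # hypothetical prompt token IDs
generated_batch = [[101, 7, 8, 42, 43]]   # prompt followed by two new tokens

completion = strip_prompt(prompt_batch, generated_batch)
print(completion)  # [[42, 43]]
```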

	## Citation
	```
	@misc{kimi_dev_72b_2025,
	  title  = {Introducing Kimi-Dev: A Strong and Open-source Coding LLM for Issue Resolution},
	  author = {{Kimi-Dev Team}},
	  year   = {2025},
	  month  = {June},
	  url    = {\url{https://www.moonshot.cn/Kimi-Dev}}
	}
	```