How to use with SGLang
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "T1anyu/DeepInnovator" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "T1anyu/DeepInnovator",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
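The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on localhost:30000. The helper names `build_chat_payload` and `chat` are illustrative and not part of SGLang.

```python
import json
import urllib.request

# Assumption: the SGLang server from the command above is reachable at this address.
BASE_URL = "http://localhost:30000/v1"


def build_chat_payload(prompt, model="T1anyu/DeepInnovator"):
    """Build the JSON body for an OpenAI-compatible chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, base_url=BASE_URL, timeout=120):
    """POST the prompt to /chat/completions and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` should print the model's answer. The payload builder is kept separate so the request body can be inspected without a live server.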
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "T1anyu/DeepInnovator" \
        --host 0.0.0.0 \
        --port 30000
DeepInnovator-14B

💻 Code · 📄 Paper · 🤗 Model

Model Description

DeepInnovator is a large language model trained for innovative capability: the ability to autonomously generate novel and significant research ideas. Unlike existing approaches that rely on sophisticated prompt engineering, DeepInnovator is built on a systematic training paradigm designed to trigger this capability in LLMs.

Key Features

  • 🚀 Innovative Capability: Trained specifically for generating novel research ideas
  • 📚 Knowledge-Grounded: Leverages structured research knowledge extracted from vast scientific literature
  • 🔄 Iterative Refinement: Employs a "Next Idea Prediction" paradigm for continuous idea improvement
  • 🏆 State-of-the-Art Performance: Achieves 80.53%-93.81% win rates against untrained baselines

Training Methodology

DeepInnovator comprises two core components:

1. "Standing on the Shoulders of Giants"

An automated pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature.

2. "Conjectures and Refutations"

A "Next Idea Prediction" training paradigm that models the generation of research ideas as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas.
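To make the predict, evaluate, refine cycle concrete, here is a minimal sketch of such a loop. It illustrates the general idea only and is not the authors' training procedure: `propose` and `score` are hypothetical callables standing in for a model call and an idea evaluator, and the toy stand-ins at the bottom exist only so the example runs.

```python
from typing import Callable, List, Tuple


def refine_idea(
    seed: str,
    propose: Callable[[str], str],
    score: Callable[[str], float],
    rounds: int = 3,
) -> Tuple[str, List[float]]:
    """Repeatedly predict a next idea, evaluate it, and keep it only if it scores higher."""
    best, best_score = seed, score(seed)
    history = [best_score]
    for _ in range(rounds):
        candidate = propose(best)            # predict a plausible next idea
        candidate_score = score(candidate)   # evaluate the candidate
        if candidate_score > best_score:     # refine: keep only improvements
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history


# Toy stand-ins (hypothetical): a real setup would call an LLM and a judge.
def toy_propose(idea: str) -> str:
    return idea + " + refinement"


def toy_score(idea: str) -> float:
    return float(idea.count("refinement"))


best_idea, scores = refine_idea("graph-LLM hybrid", toy_propose, toy_score, rounds=2)
print(best_idea)
print(scores)
```

The greedy "keep only improvements" rule is one simple design choice made for this sketch; the source does not specify how candidates are selected.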

Usage

Installation

pip install transformers torch accelerate

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "T1anyu/DeepInnovator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"

messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)

Using vLLM for Faster Inference

# Requires vLLM: pip install vllm
from vllm import LLM, SamplingParams

llm = LLM(model="T1anyu/DeepInnovator")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

prompt = "Based on the recent advances in graph neural networks and large language models, propose a novel research idea:"
# Use llm.chat so the model's chat template is applied, as in the Transformers example above.
messages = [{"role": "user", "content": prompt}]
outputs = llm.chat(messages, sampling_params)

print(outputs[0].outputs[0].text)

Evaluation Results

Both automatic and expert evaluations demonstrate that DeepInnovator-14B significantly outperforms untrained baselines:

Comparison                 Win Rate
vs. Untrained Baselines    80.53% - 93.81%
vs. Leading LLMs           Comparable performance
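For readers unfamiliar with the metric, a win rate is the share of pairwise comparisons in which one model's output is preferred. The sketch below shows the arithmetic under the assumption that every comparison is labelled either "win" or "loss" and that anything other than "win" counts against the model; the judgments are hypothetical and are not data from the paper.

```python
from typing import Iterable


def win_rate(outcomes: Iterable[str]) -> float:
    """Return the percentage of comparisons labelled "win"."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one comparison")
    wins = sum(1 for outcome in outcomes if outcome == "win")
    return 100.0 * wins / len(outcomes)


# Hypothetical judgments, not data from the paper.
judgments = ["win"] * 8 + ["loss"] * 2
print(f"{win_rate(judgments):.2f}%")  # prints 80.00%
```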

Citation

If you find DeepInnovator useful in your research, please cite our paper:

@article{fan2026deepinnovator,
  title={DeepInnovator: Triggering the Innovative Capabilities of LLMs},
  author={Fan, Tianyu and Zhang, Fengji and Zheng, Yuxiang and Chen, Bei and Niu, Xinyao and Huang, Chengen and Lin, Junyang and Huang, Chao},
  journal={arXiv preprint arXiv:2602.18920},
  year={2026}
}

License

This model is released under the Apache 2.0 License.

Acknowledgements

This work was developed by the HKU Data Science Lab (HKUDS).

Model details

Model size: 15B params (Safetensors, BF16)
Base model: Qwen/Qwen2.5-14B