EpistemeAI/ReasoningCore-Llama-3B-R1-aligned

legolasyiu committed on Apr 9

Commit 0a85301 · verified · 1 Parent(s): 5990dd3

Update README.md

Files changed (1)

README.md +1 -2

@@ -13,8 +13,7 @@ language:
 # ReasoningCore-Llama-3B-R1-aligned
-**ReasoningCore-Llama-3B-R1-aligned** is a multilingual, reasoning‑enhanced large language model developed by EpitemeAI. Pretrained on vast amounts of publicly available data and instruction‑tuned to excel at nuanced reasoning, dialogue management, retrieval, and summarization tasks, it often outperforms many current open source and proprietary conversational models on a range of industry benchmarks. Fine tuned with reasoning dataset.
+**ReasoningCore-Llama-3B-R1-aligned** is a multilingual, reasoning‑enhanced large language model developed by EpistemeAI. It is supervised fine tuning with alignment and safety dataset to steer to safety response.
 ### We used GRPO technique:
 To provide a comprehensive overview of Group Relative Policy Optimization (GRPO), a post-training technique for Large Language Models (LLMs), and its application in the DeepSeek-R1 model.