legolasyiu committed
Commit 0a85301 · verified · 1 Parent(s): 5990dd3

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -13,8 +13,7 @@ language:

  # ReasoningCore-Llama-3B-R1-aligned

- **ReasoningCore-Llama-3B-R1-aligned** is a multilingual, reasoning‑enhanced large language model developed by EpitemeAI. Pretrained on vast amounts of publicly available data and instruction‑tuned to excel at nuanced reasoning, dialogue management, retrieval, and summarization tasks, it often outperforms many current open source and proprietary conversational models on a range of industry benchmarks. Fine tuned with reasoning dataset.
-
+ **ReasoningCore-Llama-3B-R1-aligned** is a multilingual, reasoning‑enhanced large language model developed by EpistemeAI. It is supervised fine tuning with alignment and safety dataset to steer to safety response.
  ### We used GRPO technique:

  To provide a comprehensive overview of Group Relative Policy Optimization (GRPO), a post-training technique for Large Language Models (LLMs), and its application in the DeepSeek-R1 model.