HuggingFaceFW's Collections
📀 Dataset comparison models
Updated Jun 12, 2024
1.8B-parameter models, each trained on 350B tokens, to compare different pretraining datasets
HuggingFaceFW/ablation-model-fineweb-edu • Text Generation • 2B • Updated Jun 11, 2024 • 43 • 24
HuggingFaceFW/ablation-model-fineweb-v1 • Text Generation • 2B • Updated Apr 25, 2024 • 26 • 14
HuggingFaceFW/ablation-model-refinedweb • Text Generation • 2B • Updated Apr 25, 2024 • 6 • 3
HuggingFaceFW/ablation-model-c4 • Text Generation • 2B • Updated Apr 25, 2024 • 5 • 4
HuggingFaceFW/ablation-model-dolma-v1_6 • Text Generation • 2B • Updated Apr 25, 2024 • 5 • 2
HuggingFaceFW/ablation-model-slimpajama • Text Generation • 2B • Updated Apr 25, 2024 • 6 • 2
HuggingFaceFW/ablation-model-the-pile • Text Generation • 2B • Updated Apr 25, 2024 • 7 • 1
HuggingFaceFW/ablation-model-redpajama2 • Text Generation • 2B • Updated May 5, 2024 • 4