mena-open-data's Collections
Arabic NLP datasets
lightonai/nanobeir-multilingual • Viewer • 522k • 566 • 11
Viewer • 47.8M • 12.7k • 31
Viewer • 2.72k • 9
Viewer • 7.42k • 87 • 2
Viewer • 149 • 32
Viewer • 4.13k • 1.05k • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class • Viewer • 981k • 417 • 2
malaysia-ai/Multilingual-TTS • Viewer • 34.2M • 1.64k • 13
opendatalab/WanJuanSiLu-Multimodal-5Languages • Preview • 88 • 3
Preview • 208 • 35
Viewer • 66k • 64 • 10
LLaMAX/BenchMAX_Function_Completion • Viewer • 2.79k • 1.01k • 1
Viewer • 8.86k • 2.18k • 7
Viewer • 3.25M • 87 • 3
MLCommons/ml_spoken_words • 2.15k • 34
Twitter/HashtagPrediction • Viewer • 1.07M • 113 • 2
Viewer • 1.4M • 330 • 1
Viewer • 3.62M • 787 • 2
Viewer • 197k • 564 • 3
Viewer • 54.9k • 4.43k • 74
Viewer • 108k • 5.98k • 66
3.88k • 14
Viewer • 624 • 235 • 4
Viewer • 5.07k • 1.21k
Viewer • 13.3k • 81 • 4
Viewer • 200 • 135
Viewer • 37.4k • 408 • 4
1.39k • 4
Viewer • 130k • 211 • 2
Viewer • 3.12k • 1.2k
vg055/SemEval2025_Task11_TrackA • Viewer • 2k • 6
sarulab-speech/commonvoice22_sidon • Viewer • 15.1M • 1.39k • 13
Preview • 25
ToxicityPrompts/PolyGuardMix • Viewer • 1.91M • 309 • 4
Viewer • 481k • 86 • 15
Preview • 243 • 8
Viewer • 124M • 301 • 16
linagora/linto-dataset-audio-ar-tn • Viewer • 37.3k • 1.95k • 13
Viewer • 13.6k • 877 • 26
Viewer • 676k • 2.02k • 35
Viewer • 9.71k • 1.66k • 19
fr3on/election-questions-arabic • Viewer • 1.49k • 47
101 • 8
Viewer • 3 • 13 • 1
362 • 20
papluca/language-identification • Viewer • 90k • 3.12k • 61
vincentkoc/tiny_qa_benchmark_pp • Viewer • 662 • 463 • 2
Viewer • 70.3M • 7.44k • 17
Viewer • 88.8k • 9.46k • 1.46k
Viewer • 4.8k • 16
s-nlp/EverGreen-Multilingual • Viewer • 4.76k • 60 • 1
camel-ai/ai_society_translated • Preview • 117 • 16
LLaMAX/BenchMAX_Problem_Solving • Viewer • 12.1k • 621 • 1
alexandrainst/multi-wiki-qa • Viewer • 1.22M • 2.31k • 21
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_nobots • Viewer • 4.68k • 49 • 3
Melaraby/EvArEST-dataset-for-Arabic-scene-text-recognition • Viewer • 296k • 97
mozilla-foundation/common_voice_17_0 • 2.57k • 2
suchirsalhan/Phonemized-UD • Viewer • 1.19M • 2.02k
LLMXperts/Arabic-NLi-Triplet • Viewer • 571k • 40
1.64k • 3
adithya7/xlel_wd_dictionary • Viewer • 230k • 1.33k • 3
Viewer • 10k • 279 • 54
Viewer • 86.8M • 3.06k • 21
Viewer • 76.3k • 5.03k • 4
Viewer • 78k • 100 • 3
Viewer • 46.2k • 1.13k • 26
SaiedAlshahrani/Detect-Egyptian-Wikipedia-Articles • Viewer • 756k • 566 • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair • Viewer • 328k • 89 • 4
aida-ugent/llm-ideology-analysis • Viewer • 315k • 180 • 4
Viewer • 1.2k • 34 • 6
Viewer • 206k • 3.61k • 331
Viewer • 290k • 445 • 42
Viewer • 255k • 126 • 5
Preview • 121 • 3
tellarin-ai/ntx_llm_instructions • Viewer • 5.98k • 124
Viewer • 29.2k • 3.28k • 34
UBC-NLP/nilechat-arabizi-mor • Viewer • 1.45M • 25 • 2
Viewer • 2.14M • 48 • 5
CohereLabs/include-lite-44 • Viewer • 10.8k • 1.08k • 14
Viewer • 3.48k • 620 • 14
Viewer • 7.35k • 1.11k
Viewer • 5.16k • 237 • 5
JQL-AI/JQL-Human-Edu-Annotations • Viewer • 20.4k • 479 • 5
Viewer • 9.03B • 33.7k • 36
Viewer • 310k • 1.8k • 9
CohereLabs/fusion-pairwise-evals-finetuned • Viewer • 5.25k • 21
Viewer • 400 • 41 • 7
Viewer • 8.69k • 94 • 1
faisaltareque/XL-HeadTags • Viewer • 415k • 60 • 3
Viewer • 3.91M • 488 • 6
Viewer • 100 • 26 • 1
Viewer • 798k • 3.79k • 80
Viewer • 330 • 73 • 3
Viewer • 94.4k • 1.65k • 11
919 • 8
CohereLabs/fusion-synth-data-ufb • Viewer • 94.7k • 35 • 1
QCRI/AraDICE-ArabicMMLU-egy • Viewer • 14.5k • 1.55k • 1
Viewer • 121 • 88 • 3
Viewer • 2.97M • 1.94k • 29
ClusterlabAi/101_billion_arabic_words_dataset • Viewer • 33.1M • 1.14k • 68
omar-emad/financesecondtrial • Viewer • 30 • 7
10
Viewer • 11.4k • 25
Viewer • 695k • 647 • 8
CohereLabs/deja-vu-pairwise-evals • 30 • 3
kaust-generative-ai/fineweb-edu-ar • Viewer • 363M • 347 • 13
Preview • 57 • 1
Viewer • 893 • 21 • 1
Viewer • 135k • 540 • 1
UBC-NLP/nilechat-arabizi-egy • Viewer • 572k • 29
Viewer • 761k • 35 • 3
Viewer • 11.1k • 106 • 5
KFUPM-JRCAI/arabic-generated-abstracts • Viewer • 8.39k • 796
Viewer • 5.73k • 170 • 6
badrex/ALDi-predictions-MADIS5 • Viewer • 263 • 7
Viewer • 467k • 10 • 2
Viewer • 10.1k • 74 • 1
CohereLabs/include-base-44 • Viewer • 23k • 6.88k • 42
CohereLabs/m-ArenaHard-v2.0 • Viewer • 11.5k • 298 • 5
Viewer • 77.2M • 2.51k • 51
ToxicityPrompts/PolyGuardPrompts • Viewer • 29.3k • 179 • 2
9.22k • 2
SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101 • Viewer • 728k • 61 • 4
QCRI/AraDICE-ArabicMMLU-lev • Viewer • 14.5k • 1.48k
Viewer • 97.6k • 2.22k • 47
1.07k • 12
Viewer • 141k • 43 • 7
CohereLabsCommunity/afri-aya • Viewer • 2.47k • 163 • 11
Omar-youssef/Egyptian-text-summarization • Viewer • 3.69k • 25
jonathanmutal/Medical-Questionnaire-Multilingual-Translation • Preview • 18
39.4k • 41
CohereLabs/Global-MMLU-Lite • Viewer • 10.9k • 5.72k • 28
MBZUAI/speecht5_tts_clartts_ar • Text-to-Speech • 1.72k • 24
LLaMAX/BenchMAX_General_Translation • Viewer • 228k • 722
abdullah-alamodi/aqeedah-rag-dataset • Viewer • 5.42k • 22 • 1
Viewer • 63.8k • 410 • 1
Viewer • 127k • 1.28k • 26
Viewer • 5.1M • 1.38k • 47
sboughorbel/arabic-web-edu-seed • Viewer • 236k • 81 • 3
amphora/Open-R1-Mulitlingual-SFT • Viewer • 128k • 93 • 3
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_bots • Viewer • 5.4k • 31
brighter-dataset/BRIGHTER-emotion-intensities • Viewer • 41.2k • 572 • 3
LLaMAX/BenchMAX_Domain_Translation • Viewer • 47.3k • 686
LLaMAX/BenchMAX_Rule-based • Viewer • 7.29k • 697 • 2
ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks • Paper • 2511.10090
Viewer • 393k • 7.39k • 513
Omar-youssef/islamic-qa-egyptian-arabic • Viewer • 7.47k • 37
alconost/alconost-multilingual-speech-en-ja-ar-pl-v1 • Viewer • 280 • 41
LLaMAX/BenchMAX_Question_Answering • Viewer • 17 • 85
2A2I/Arabic-OpenHermes-2.5 • Viewer • 982k • 403 • 20
FreedomIntelligence/ApolloMoEDataset • Viewer • 293k • 165 • 5
SaiedAlshahrani/Arabic_Wikipedia_20230101_bots • Viewer • 1.09M • 71 • 1
UBC-NLP/palmx_2025_subtask1_culture • Viewer • 4.5k • 67 • 1
Viewer • 17.6M • 99 • 4
Viewer • 8.79k • 462 • 41
Viewer • 158k • 186 • 7
UBC-NLP/nilechat-fw-edu-egy • Viewer • 5.52M • 64 • 2
LLaMAX/BenchMAX_Model-based • Viewer • 8.5k • 240
Viewer • 180 • 1.32k • 1
Raniahossam33/Arabic_cultural_dataset • Viewer • 12.1k • 6 • 2
Preview • 46
Viewer • 380M • 23.3k • 39
Viewer • 7.18B • 35.1k • 565
visheratin/laion-coco-nllb • Viewer • 894k • 1.49k • 44
obadx/recitation-segmentation-augmented • Viewer • 64.6k • 440
Viewer • 159M • 10.9k • 12
Viewer • 2.56M • 27.8k • 77
Viewer • 602k • 9.61k • 144
Viewer • 13.2k • 7.56k • 2
rabah2026/Quran-Ayah-Corpus • Viewer • 263k • 965 • 1
omar-emad/FinanceTripletSecond • Viewer • 30 • 14
Viewer • 3.3k • 100 • 8
Viewer • 6.98k • 148 • 8
Viewer • 1.05M • 116 • 12
UBC-NLP/palmx_2025_subtask2_islamic • Viewer • 1.9k • 15
Viewer • 388 • 137
rubricreward/m-reward-bench • Viewer • 66k • 24
Fujitsu-FRE/MAPS_Verified • Viewer • 3.05k • 3.96k • 2
Viewer • 135k • 1.6k • 277
LLaMAX/BenchMAX_Multiple_Functions • Viewer • 5.41k • 172
Fumika/Wikinews-multilingual • Viewer • 15.2k • 44 • 7
Omartificial-Intelligence-Space/awesome_chatgpt_prompts_ar • Viewer • 201 • 29 • 1
mrlbenchmarks/global-piqa-nonparallel • Viewer • 11.6k • 2.77k • 29
NAMAA-Space/QariOCR-v0.3-markdown-mixed-dataset • Viewer • 37k • 188 • 9
Viewer • 1.49M • 55 • 2
Viewer • 23k • 585 • 1
m0pper/Small-Multilingual-Corpora • Viewer • 7.61M • 135
Viewer • 236k • 12
Preview • 13
haoranxu/X-ALMA-Preference • Viewer • 772k • 161 • 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_nobots • Viewer • 847k • 169 • 2
Viewer • 367 • 21 • 2
vgaraujov/semeval-2025-task11-track-c • Viewer • 57.3k • 205
Viewer • 935 • 1.76k • 1
Viewer • 3.94k • 1.11k
Viewer • 7.62k • 5.62k • 3
Viewer • 10.4k • 3.08k • 35
2.43k • 123
brighter-dataset/BRIGHTER-emotion-categories • Viewer • 140k • 1.49k • 13
lukasellinger/homonym-mcl-wic • Viewer • 1.61k • 22
Viewer • 160 • 35 • 3
Preview • 31
HeshamHaroon/Arabic_Function_Calling • Viewer • 50.8k • 195 • 55