Update README.md
README.md
CHANGED
@@ -1,3 +1,19 @@
# Infinity-Parser-7B

<a><img src="assets/logo.png" height="16" width="16" style="display: inline"><b> Paper </b></a> |
<a href="https://github.com/infly-ai/INF-MLLM/tree/main/Infinity-Parser"><img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" height="16" width="16" style="display: inline"><b> Github </b></a> |
<a href="https://huggingface.co/spaces/infly/Infinity-Parser-Demo">💬<b> Web Demo </b></a>

# Introduction

We develop Infinity-Parser, an end-to-end scanned document parsing model trained with reinforcement learning. By incorporating verifiable rewards based on layout and content, Infinity-Parser maintains the original document's structure and content with high fidelity. Extensive evaluations on benchmarks including OmniDocBench, olmOCR-Bench, PubTabNet, and FinTabNet show that Infinity-Parser consistently achieves state-of-the-art performance across a broad range of document types, languages, and structural complexities, substantially outperforming both specialized document parsing systems and general-purpose vision-language models.

# Architecture

Overview of the Infinity-Parser training framework. The model is optimized via reinforcement fine-tuning with edit-distance, layout, and order-based rewards.
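
To make the edit-distance reward more concrete, here is a minimal sketch of how such a signal could be computed between a predicted parse and a reference text. This is an illustration only, not the authors' implementation: the function names `levenshtein` and `edit_distance_reward`, and the normalization by the longer string's length, are assumptions. The layout and order-based rewards are not covered here.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via dynamic programming."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def edit_distance_reward(predicted: str, reference: str) -> float:
    """Hypothetical reward in [0, 1]: 1.0 for an exact match.

    Assumption: the distance is normalized by the longer string's length.
    """
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # both strings empty: treat as a perfect match
    return 1.0 - levenshtein(predicted, reference) / longest


if __name__ == "__main__":
    print(edit_distance_reward("| a | b |", "| a | c |"))
```

A reward of this shape is verifiable in the sense that it depends only on the model output and a ground-truth annotation, so no learned reward model is needed for this term.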

# License

This model is licensed under Apache-2.0.