naver-clova-ix
/
donut-base

Image-to-Text
Transformers
PyTorch
vision-encoder-decoder
image-text-to-text
donut
vision

Instructions to use naver-clova-ix/donut-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • Transformers

    How to use naver-clova-ix/donut-base with Transformers:

    # Use a pipeline as a high-level helper
    # Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
    # You must load the model directly (see below) or downgrade to v4.x with:
    # 'pip install "transformers<5.0.0"'
    from transformers import pipeline
    
    pipe = pipeline("image-to-text", model="naver-clova-ix/donut-base")
    # Load model directly
    from transformers import AutoTokenizer, AutoModelForImageTextToText
    
    tokenizer = AutoTokenizer.from_pretrained("naver-clova-ix/donut-base")
    model = AutoModelForImageTextToText.from_pretrained("naver-clova-ix/donut-base")

fine tuning Donut OCR model for higher accuracy

#16 opened 9 months ago by
MugishaTheProgrammer

Adding ONNX file of this model

#15 opened 12 months ago by
effyau

Fine tuning using LoRA

1
#14 opened about 1 year ago by
DenisMir

Error Fine Tuning due to unexpected keyword argument

1
#13 opened over 1 year ago by
reganshen

How to extract all the text from the document?

1
#12 opened almost 2 years ago by
Maz369

add _name_or_path

#11 opened almost 2 years ago by
nbroad

Why torch.compile has very small acceleration for Donut model?

👀 1
1
#10 opened about 2 years ago by
gorodnitskiy

Will this model be suitable for invoice processing?

2
#9 opened over 2 years ago by
total008

Discrepancies between DONUT / BART Tokenizer and missing characters

๐Ÿ‘ 9
1
#8 opened over 2 years ago by
DieseKartoffel

Adding `safetensors` variant of this model

#7 opened over 2 years ago by
SFconvertbot

Architecture of donut

#6 opened almost 3 years ago by
shubham05

Change image_mean and image_std to ImageNet to match original codebase

#5 opened over 3 years ago by
morgan

Failed to download Donut Processor

๐Ÿ‘ 1
5
#3 opened over 3 years ago by
paturi1710

Minimum GPU requirement to fine-tune donut model

#2 opened almost 4 years ago by
LeonardoVaz