Foundation Models for Vision - an itseffi Collection

Updated Feb 20, 2024

Grounding DINO Demo

Cutting edge open-vocabulary object detection app • Running • 110
Owlv2

State-of-the-art Zero-shot Object Detection • Running • Featured • 94
openai/clip-vit-base-patch32

Zero-Shot Image Classification • Updated Feb 29, 2024 • 14.6M • 828
openai/clip-vit-large-patch14

Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 7.9M • 1.94k
google/pix2struct-large

Image-to-Text • 1B • Updated Sep 6, 2023 • 737 • 34
google/pix2struct-ai2d-base

Visual Question Answering • 0.3B • Updated Dec 24, 2023 • 1.59k • 43
HuggingFaceM4/idefics-80b-instruct

Text Generation • 80B • Updated Oct 12, 2023 • 2.18k • 189
BLIP2 with transformers

BLIP2 (cutting edge image captioning) in 🤗 transformers • Runtime error • Featured • 41