Instructions to use adept/fuyu-8b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use adept/fuyu-8b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="adept/fuyu-8b")

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("adept/fuyu-8b")
model = AutoModelForImageTextToText.from_pretrained("adept/fuyu-8b")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use adept/fuyu-8b with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "adept/fuyu-8b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/adept/fuyu-8b
- SGLang
How to use adept/fuyu-8b with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "adept/fuyu-8b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "adept/fuyu-8b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use adept/fuyu-8b with Docker Model Runner:
docker model run hf.co/adept/fuyu-8b
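The curl calls shown for vLLM and SGLang can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical (they are not part of vLLM or SGLang), and the response shape (`choices[0]["text"]`) is assumed to follow the usual OpenAI-compatible completions format; check the server's actual output before relying on it.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, model="adept/fuyu-8b",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for /v1/completions.

    Hypothetical helper: it mirrors the JSON body of the curl examples above.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60.0, **kwargs):
    """Send the request and return the first completion's text.

    Assumes an OpenAI-style response body: {"choices": [{"text": ...}]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["text"]
```

With a server already running, `complete("http://localhost:8000", "Once upon a time,")` would target the vLLM port from the example above, and `complete("http://localhost:30000", "Once upon a time,")` the SGLang port.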
Fuyu is now supported by chatllm.cpp
#72 opened about 1 year ago
by
J22
How can I reproduce the results on OKVQA?
#71 opened almost 2 years ago
by
rayruiyang
Has anyone tried adding positional embeddings to the image patches to improve the model?
2
#70 opened about 2 years ago
by
jchiu1234
How to evaluate it on AI2D dataset?
👍 1
#69 opened about 2 years ago
by
boydcheung
Masking the image tokens during training
#68 opened about 2 years ago
by
jchiu1234
finetune fuyu-8b model
1
#67 opened over 2 years ago
by
yinincanada
Is there any way to use image embeddings as input? (similar to input_embeds param)
#66 opened over 2 years ago
by
sanchd
OCR function
#65 opened over 2 years ago
by
linxi
Does localization really work?
#64 opened over 2 years ago
by
Seungyoun
finetune fuyu8b text location with image size of 1920x1080 always got OOM even on A100*8
#63 opened over 2 years ago
by
Nooodles
Are there special tokens that are ignored during loss computation?
9
#62 opened over 2 years ago
by
Nyandwi
why does the coordinates need to be divided by two in scale_bbox_to_transformed_image?
2
#61 opened over 2 years ago
by
Nooodles
Here is a simple multimodal like training script to see model working.
👍 7
3
#60 opened over 2 years ago
by
besiktas
GPU requirements
5
#59 opened over 2 years ago
by
thightower1
I keep running out of memory. Why dont they just tell what equipment is required to run these models
#58 opened over 2 years ago
by
alquimista888
crash kernel
6
#57 opened over 2 years ago
by
simonbrbx
Tips on resolving this typing.Optional error seemingly related to PIL.Image?
3
#56 opened over 2 years ago
by
justinwickett
demo of PDF vqa
1
#55 opened over 2 years ago
by
verigle
Upload 2.jpg
#53 opened over 2 years ago
by
Aaronx
8B? Or 9B?
#51 opened over 2 years ago
by
mrfakename
Memory Spikes while Getting Model Logits
2
#49 opened over 2 years ago
by
Nyandwi
Is there a way to run it on a 8GB GPU?
1
#47 opened over 2 years ago
by
bobe94
issue with quantization on windows
#46 opened over 2 years ago
by deleted
How does the Fuyu model Get images?
3
#45 opened over 2 years ago
by
VatsaDev
For the vqav2 data set example "fish and carrot", why does the model output a sentence instead of a phrase?
8
#44 opened over 2 years ago
by
changgeli
fine-tuning using FSDP and non 80GB cards?
8
#43 opened over 2 years ago
by
besiktas
Released capabilities
6
#42 opened over 2 years ago
by
ludeksvoboda
Update README.md
1
#41 opened over 2 years ago
by
ybelkada
Colab
👍 1
1
#39 opened over 2 years ago
by
nengelmann
whether special instruction is need to trigger OCR location function?
3
#38 opened over 2 years ago
by
liupei0408
How to get Image embedding using Fuyu
2
#37 opened over 2 years ago
by
oaishi
How to get the detailed description in the fuyu-8b-demo?
1
#35 opened over 2 years ago
by
dwdxdy
The Numbers
1
#33 opened over 2 years ago
by
changgeli
Questions about the examples in the blog
2
#32 opened over 2 years ago
by
AudreyLin
ImportError for FuyuProcessor in Transformers v4.34.1
3
#30 opened over 2 years ago
by
ClaraLovesFunk
hi love it
❤️ 3
#29 opened over 2 years ago
by
boinc
The 8b model could get correct results for case showed on the offical blog
👍 1
2
#28 opened over 2 years ago
by
YuntaoChen
long response times
1
#27 opened over 2 years ago
by
FantasticMrCat42
ValueError: Unable to infer channel dimension format
7
#26 opened over 2 years ago
by
vishal1278
A working demo.py for your reference
1
#25 opened over 2 years ago
by
Colderthanice
Using this model as a QA-tool/OCR on a text heavy document?
2
#24 opened over 2 years ago
by
Techie5879
Loading the model on multi-gpu setup?
1
#23 opened over 2 years ago
by
Techie5879
issue with inference
👍 1
5
#22 opened over 2 years ago
by
zhangchaosunshine
issue with running the model
👍 2
5
#21 opened over 2 years ago
by
slay
Possible for quantization other than bitsandbytes?
👍 7
1
#20 opened over 2 years ago
by
Yhyu13
Run on MBP M1
4
#17 opened over 2 years ago
by
sagar-kris
License question
👍 3
1
#16 opened over 2 years ago
by deleted
Warning output
👍 1
4
#15 opened over 2 years ago
by
dashesy