adept/fuyu-8b · Discussions

#70 opened about 2 years ago by

jchiu1234

How to evaluate it on AI2D dataset?

#69 opened about 2 years ago by

boydcheung

Masking the image tokens during training

#68 opened about 2 years ago by

jchiu1234

finetune fuyu-8b model

#67 opened over 2 years ago by

yinincanada

Is there any way to use image embeddings as input? (similar to input_embeds param)

#66 opened over 2 years ago by

sanchd

OCR function

#65 opened over 2 years ago by

linxi

Does localization really work?

#64 opened over 2 years ago by

Seungyoun

Fine-tuning fuyu-8b on text location with 1920x1080 images always runs OOM, even on 8x A100

#63 opened over 2 years ago by

Nooodles

Are there special tokens that are ignored during loss computation?

#62 opened over 2 years ago by

Nyandwi

Why do the coordinates need to be divided by two in scale_bbox_to_transformed_image?

#61 opened over 2 years ago by

Nooodles

Here is a simple multimodal-like training script to see the model working.

#60 opened over 2 years ago by

besiktas

GPU requirements

#59 opened over 2 years ago by

thightower1

I keep running out of memory. Why don't they just tell us what equipment is required to run these models?

#58 opened over 2 years ago by

alquimista888

crash kernel

#57 opened over 2 years ago by

simonbrbx

Tips on resolving this typing.Optional error seemingly related to PIL.Image?

#56 opened over 2 years ago by

justinwickett

demo of PDF vqa

#55 opened over 2 years ago by

verigle

test

#54 opened over 2 years ago by

Aaronx

Upload 2.jpg

#53 opened over 2 years ago by

Aaronx

test

#52 opened over 2 years ago by

Aaronx

8B? Or 9B?

#51 opened over 2 years ago by

mrfakename

Memory Spikes while Getting Model Logits

#49 opened over 2 years ago by

Nyandwi

Is there a way to run it on an 8GB GPU?

#47 opened over 2 years ago by

bobe94

issue with quantization on windows

#46 opened over 2 years ago by deleted

How does the Fuyu model get images?

#45 opened over 2 years ago by

VatsaDev

For the VQAv2 dataset example "fish and carrot", why does the model output a sentence instead of a phrase?

#44 opened over 2 years ago by

changgeli

Fine-tuning using FSDP and non-80GB cards?

#43 opened over 2 years ago by

besiktas

Released capabilities

#42 opened over 2 years ago by

ludeksvoboda

Update README.md

#41 opened over 2 years ago by

ybelkada

Colab

#39 opened over 2 years ago by

nengelmann

Is a special instruction needed to trigger the OCR location function?

#38 opened over 2 years ago by

liupei0408

How to get Image embedding using Fuyu

#37 opened over 2 years ago by

oaishi

How to get the detailed description in the fuyu-8b-demo?

#35 opened over 2 years ago by

dwdxdy

The Numbers

#33 opened over 2 years ago by

changgeli

Questions about the examples in the blog

#32 opened over 2 years ago by

AudreyLin

ImportError for FuyuProcessor in Transformers v4.34.1

#30 opened over 2 years ago by

ClaraLovesFunk

hi love it

#29 opened over 2 years ago by

boinc

The 8b model could get correct results for the case shown on the official blog

#28 opened over 2 years ago by

YuntaoChen

long response times

#27 opened over 2 years ago by

FantasticMrCat42

ValueError: Unable to infer channel dimension format

#26 opened over 2 years ago by

vishal1278

A working demo.py for your reference

#25 opened over 2 years ago by

Colderthanice

Using this model as a QA-tool/OCR on a text heavy document?

#24 opened over 2 years ago by

Techie5879

Loading the model on multi-gpu setup?

#23 opened over 2 years ago by

Techie5879

issue with inference

#22 opened over 2 years ago by

zhangchaosunshine

issue with running the model

#21 opened over 2 years ago by

slay

Possible for quantization other than bitsandbytes?

#20 opened over 2 years ago by

Yhyu13

Run on MBP M1

#17 opened over 2 years ago by

sagar-kris

License question

#16 opened over 2 years ago by deleted

Warning output