Video-Text-to-Text
Transformers
Safetensors
English
video
multimodal
video-captioning
temporal-grounding
qwen
text-generation
VLM
Instructions to use NemoStation/Marlin-2B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use NemoStation/Marlin-2B with Transformers:
    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("NemoStation/Marlin-2B", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- #6 Collaboration Opportunity (opened about 19 hours ago by Phase-Technologies)
- #5 Train on own videos / labels? (opened 2 days ago by horsto; 2 comments)
- #4 Can you use this model with image and text-only inputs apart from video? (opened 5 days ago by lunahr; 3 comments)
- #3 Question about the evaluation metrics for captioning benchmarks (opened 7 days ago by ygyjrc; 1 comment)