video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality captions describing both the visual and audio content of videos.
AI & ML interests
https://www.ee.tsinghua.edu.cn/en/
Organization Card
Department of Electronic Engineering, Tsinghua University
models 18
tsinghua-ee/video-SALMONN2_plus_3B_full
5B • Updated • 207
tsinghua-ee/video-SALMONN2_plus_7B_full
9B • Updated • 68
tsinghua-ee/video_SALMONN2plus_3B_audioAlign
5B • Updated • 4
tsinghua-ee/D-ORCA-8B-0210
10B • Updated • 6 • 1
tsinghua-ee/WAVE-7B
Updated • 777 • 1
tsinghua-ee/video_SALMONN2_7B_audioAlign
Updated
tsinghua-ee/video_SALMONN2plus_72B_audioAlign
Updated • 3
tsinghua-ee/video_SALMONN2plus_7B_audioAlign
9B • Updated • 342
tsinghua-ee/SALMONN
Automatic Speech Recognition • Updated • 51
tsinghua-ee/video-SALMONN-2_plus_72B
Updated • 3 • 2
datasets 8
tsinghua-ee/ELViM
Viewer • Updated • 211 • 103
tsinghua-ee/SACRED-Bench
Viewer • Updated • 2.48k • 175
tsinghua-ee/F-16-NBA
Preview • Updated • 50
tsinghua-ee/AVUTBenchmark
Viewer • Updated • 3.28k • 7.26k • 1
tsinghua-ee/video-SALMONN_2_testset
Preview • Updated • 102
tsinghua-ee/QualiSpeech
Viewer • Updated • 14.6k • 748 • 21
tsinghua-ee/RivaBench
Viewer • Updated • 542 • 651 • 2
tsinghua-ee/SAVEBench
Preview • Updated • 94 • 3