Submitted by akhaliq · IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models · 5 authors
Submitted by akhaliq · SpeechX: Neural Codec Language Model as a Versatile Speech Transformer · 10 authors
Submitted by akhaliq · RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs · 5 authors
Submitted by akhaliq · Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation · 4 authors
Submitted by akhaliq · The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation · 10 authors
Submitted by akhaliq · VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use · 9 authors