Chao-Chun (Joe) Hsu

joe32140

https://chaochunhsu.github.io

AI & ML interests

Hi, I am Joe!

Recent Activity

liked a model 4 days ago

perplexity-ai/pplx-embed-v1-late-0.6b

upvoted a paper 10 days ago

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

Paper • 2605.08384 • Published 15 days ago • 10

upvoted a collection 10 days ago

jina-embeddings-v5-omni

Multimodal (text + image + video + audio) embedding models aligned with jina-embeddings-v5-text-*. Two sizes, four task variants each. • 27 items • Updated 10 days ago • 36

upvoted an article about 1 month ago

Article

DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models

lightonai • Apr 21 • 38

upvoted a collection about 1 month ago

DenseOn & LateOn

A collection of open state-of-the-art single and multi-vector models • 7 items • Updated about 1 month ago • 10

upvoted a collection 2 months ago

CodeScout

RL-trained code search agents (1.7B, 4B, 14B) that outperform 2–18× larger models using only a Unix terminal. 📄 arxiv.org/abs/2603.17829 • 12 items • Updated Mar 19 • 8

upvoted a collection 3 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.64k

upvoted a paper 3 months ago

ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models

Paper • 2602.16609 • Published Feb 18 • 7

upvoted a collection 3 months ago

artificial-hivemind

This collection contains datasets for the Artificial Hiveminds paper. • 4 items • Updated May 16, 2025 • 16

upvoted 2 collections 4 months ago

LightOnOCR-2 🦉

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family • 12 items • Updated Apr 7 • 24

Qwen3-VL-Embedding

2 items • Updated Jan 8 • 68

upvoted a collection 6 months ago

Sarashina2.2

Large Language Models developed by SB Intuitions. Pretrained and instruction-tuned models are available in three sizes: 0.5B, 1B, and 3B. • 6 items • Updated Mar 5, 2025 • 10

upvoted an article 8 months ago

Article

Introducing RTEB: A New Standard for Retrieval Evaluation

fzliu, KennethEnevoldsen, Samoed, isaacchung, tomaarsen, fzoll • Oct 1, 2025 • 144

upvoted a collection 9 months ago

EmbeddingGemma

3 items • Updated Mar 12 • 119

upvoted an article 9 months ago

Article

Welcome EmbeddingGemma, Google's new efficient embedding model

tomaarsen, Xenova, alvarobartt, ariG23498, pcuenq, sergiopaniego • Sep 4, 2025 • 274

upvoted an article 11 months ago

Article

Training and Finetuning Sparse Embedding Models with Sentence Transformers

tomaarsen, arthurbresnu • Jul 1, 2025 • 138

upvoted a collection 12 months ago

Qwen3-Embedding

6 items • Updated Dec 31, 2025 • 164

upvoted a collection about 1 year ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.79k

upvoted a paper about 1 year ago

FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents

Paper • 2504.13128 • Published Apr 17, 2025 • 7

upvoted 2 collections about 1 year ago

AceMath

We are releasing math instruction models, math reward models, general instruction models, all training datasets, and a math reward benchmark. • 11 items • Updated 3 days ago • 18

reranking series v2

V2 crispy rerank series • 3 items • Updated 1 day ago • 25