Llama baseline checkpoints (0.6B, 1.3B)
Chunyuan Deng
CharlesDDDD
AI & ML interests
Architecture, Interpretability.
Recent Activity
submitted a paper about 4 hours ago
ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer
upvoted a paper about 1 month ago
SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers
updated a collection about 1 month ago
looped_transformer