Hybrid-Distillation (Collection): Model weights for "Distilling to Hybrid Attention Models via KL-Guided Layer Selection" (https://arxiv.org/abs/2512.20569) • 113 items • Updated Jan 25