Building on HF

Ujjwal Tyagi

Ujjwal-Tyagi

AI & ML interests

Chief Scientist at Shirova AI, focused on advancing open-source AI. Experienced in LLM fine-tuning, model architecture, and research, with a strong interest in building scalable and efficient models.

Recent Activity

liked a model about 11 hours ago

llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic

upvoted a paper about 11 hours ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

liked a model about 22 hours ago

abhinand/sarvam-105b-bf16

replied to DedeProGames's post 8 days ago

Oh wow, good

posted an update 14 days ago

Post

233

6 Open-Source Libraries to Fine-Tune LLMs
1. Unsloth
GitHub: https://github.com/unslothai/unsloth
→ Fastest way to fine-tune LLMs locally
→ Optimized for low VRAM (even laptops)
→ Plug-and-play with Hugging Face models

2. Axolotl
GitHub: https://github.com/OpenAccess-AI-Collective/axolotl
→ Flexible LLM fine-tuning configs
→ Supports LoRA, QLoRA, multi-GPU
→ Great for custom training pipelines

3. TRL (Transformer Reinforcement Learning)
GitHub: https://github.com/huggingface/trl
→ RLHF, DPO, PPO for LLM alignment
→ Built on Hugging Face ecosystem
→ Essential for post-training optimization

4. DeepSpeed
GitHub: https://github.com/microsoft/DeepSpeed
→ Train massive models efficiently
→ Memory + speed optimization
→ Industry standard for scaling

5. LLaMA-Factory
GitHub: https://github.com/hiyouga/LLaMA-Factory
→ All-in-one fine-tuning UI + CLI
→ Supports multiple models (LLaMA, Qwen, etc.)
→ Beginner-friendly + powerful

6. PEFT
GitHub: https://github.com/huggingface/peft
→ Fine-tune with minimal compute
→ LoRA, adapters, prefix tuning
→ Best for cost-efficient training
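
To make the "minimal compute" point about LoRA concrete, here is a small back-of-the-envelope sketch. It does not use the PEFT API; it only counts parameters, assuming the standard LoRA setup where a frozen weight of shape d_out x d_in gets two trainable low-rank factors of rank r. The function names and the 4096 x 4096 layer size are illustrative assumptions, not taken from the post.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in one dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by LoRA for that matrix.

    LoRA keeps the original weight frozen and trains two factors:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in = d_out = 4096
    rank = 8
    full = full_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (r={rank}):  {lora:,} trainable parameters")
    print(f"fraction trained: {lora / full:.4%}")
```

For this one matrix, the low-rank factors hold well under 1% of the parameters that full fine-tuning would update, which is where the memory and cost savings come from.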

1 reply

reacted to SeaWolf-AI's post with ❤️🔥 22 days ago

Post

8735

🧬 Introducing Darwin-9B-NEG — the first model with Native Entropy Gating (NEG)

🔗 Try it now: FINAL-Bench/Darwin-9B-NEG
🔗 4-bit quantized: FINAL-Bench/Darwin-9B-MFP4

We're thrilled to release Darwin-9B-NEG, a 9B-parameter reasoning model
that embeds an architecturally-internalised sense of self-confidence directly
into the transformer — our proprietary Native Entropy Gating (NEG) technology.

📊 GPQA Diamond (198 PhD-level questions):

▸ Baseline Darwin-9B (no NEG) → 51.01 %
▸ Pure NEG (greedy · 1× cost) → 63.64 % 🔥 +12.63 %p
▸ + Permutation (4× cost) → 76.26 %
▸ + Ensemble Refinement (~20×) → 84.34 % 🏆

With only 9 billion parameters and 1× inference cost, Pure NEG jumps
+12.63 %p over the same model without NEG. Going all-in with ensemble
refinement pushes it to 84.34 % — surpassing the published Qwen3.5-9B
leaderboard score (81.7 %) by +2.64 %p.
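
The "%p" figures above are percentage-point differences, not relative improvements. The short sketch below recomputes them from the scores quoted in the post; the helper names are hypothetical and only the quoted percentages are used as input.

```python
def point_gain(new_pct: float, old_pct: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return round(new_pct - old_pct, 2)


def relative_gain(new_pct: float, old_pct: float) -> float:
    """Improvement relative to the old accuracy, in percent."""
    return (new_pct - old_pct) / old_pct * 100


if __name__ == "__main__":
    baseline, pure_neg = 51.01, 63.64   # scores quoted in the post
    ensemble, reference = 84.34, 81.7   # scores quoted in the post
    print("Pure NEG vs baseline:", point_gain(pure_neg, baseline), "points")
    print("Ensemble vs reference:", point_gain(ensemble, reference), "points")
    print(f"Pure NEG relative gain: {relative_gain(pure_neg, baseline):.1f} %")
```

The two point gains match the +12.63 and +2.64 stated in the post; the relative gain is a separate, larger number and should not be confused with them.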

🔬 What makes NEG different from Multi-Turn Iteration (MTI)?

Classical MTI needs 3-8× extra inference passes. NEG instead lives
INSIDE the single decoding loop. Two tiny modules ride with the
transformer: NEG-Head predicts per-token entropy from the last hidden
state, and NEG-Gate conditionally restricts the top-k choice when
confidence is low. The gate activates in only 4.36 % of tokens —
essentially free at inference time.
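
As a rough intuition for entropy gating, the toy sketch below restricts sampling to the top-k tokens only when the next-token distribution is uncertain. This is not Darwin's implementation: the post describes a learned NEG-Head that predicts entropy from the hidden state, whereas this sketch computes entropy directly from the logits, and the threshold and k values are made-up assumptions.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def gate_top_k(logits, entropy_threshold, k):
    """Return (probs, fired). The gate fires only when entropy is high.

    When it fires, all but the k most likely tokens are zeroed out and
    the remaining probabilities are renormalised.
    """
    probs = softmax(logits)
    if entropy(probs) <= entropy_threshold:
        return probs, False  # confident: leave the distribution alone
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    mass = sum(probs[i] for i in keep)
    gated = [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]
    return gated, True


if __name__ == "__main__":
    # Hypothetical threshold and k, for illustration only.
    print(gate_top_k([10.0, 0.0, 0.0, 0.0], entropy_threshold=1.0, k=2))
    print(gate_top_k([0.0, 0.0, 0.0, 0.0], entropy_threshold=1.0, k=2))
```

Because the check is a single comparison per token and the restriction runs only on uncertain tokens, a gate of this kind adds little work to the decoding loop.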

✨ Key differentiators
• Architecturally internalised — model file *is* the feature
• 1× inference cost (vs. 3-8× for MTI)
• Drop-in with vLLM / SGLang / TGI / transformers — no extra engine
• +12.63 %p reasoning at zero latency overhead
• Single-file deployment, Apache 2.0 licensed

🧬 Lineage
Qwen/Qwen3.5-9B → Darwin-9B-Opus (V7 evolutionary merge) → Darwin-9B-NEG (V8 + NEG training)

#Darwin #NEG #NativeEntropyGating #GPQA #Reasoning #LLM #OpenSource #Apache2

reacted to Benedictat's post with ❤️🔥👍🚀 24 days ago

Post

3569

Built a WeChat Mini Program in 20 minutes flat with Hy3 Preview + WorkBuddy…

and I didn’t type a single line of code. Not even a semicolon.

This Coding Agent is on steroids. Its comprehension in long back-and-forths is night and day better, and that 256K context window swallows the entire project structure whole.

Tell it what you want, and it actually gets the full picture, with no confused blank stares from the AI.

And we’re not messing around with dinky little code snippets here. It spits out a fully functional project:

app.json, every page’s wxml/wxss/js/json, even mock data pre-packed. Import it into WeChat Dev Tools and it runs on the first try.

Only one tiny visual nitpick, zero logic bugs. Point out the flaw, and it fixes it instantly: no new bugs, no passive-aggressive code breaks, no headaches.

The entire vibe: Tell it your idea → Get a complete working project → Mention a tiny flaw → AI polishes it.

No coding, no endless edits, no soul-crushing debugging that makes you want to throw your laptop. Absolute game-changer.

posted an update 24 days ago

Post

209

This is the best set of AI and ML books and a full guide to learning machine learning from the ground up. This is the study material I used myself, so I thought it would be helpful to share it with others. Like, share, and add it to your collection at Ujjwal-Tyagi/ai-ml-foundations-book-collection.

replied to SeanLee97's post 25 days ago

Oh very wonderful work! Nice work guys

replied to anakin87's post 27 days ago

I love your diagrams, they're very good for beginners. Nice work!

reacted to anakin87's post with ❤️ 27 days ago

Post

10395

How does LLM training with RL environments work?

It all starts with 𝗥𝗲𝗶𝗻𝗳𝗼𝗿𝗰𝗲𝗺𝗲𝗻𝘁 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗩𝗲𝗿𝗶𝗳𝗶𝗮𝗯𝗹𝗲 𝗥𝗲𝘄𝗮𝗿𝗱𝘀
- question asked
- model generates reasoning + answer
- answer checked against ground truth
- reward drives RL training

In this setup, the environment is simple: fixed questions and answers, rollout logic, reward(s)

Consider a more complex tic-tac-toe env ❌⭕
It adds:
- dynamic game generation/handling
- tunable opponent skill
- multi-turn interactions

(envs can also include tools)

---

What happens at training?

We use 𝗚𝗿𝗼𝘂𝗽 𝗥𝗲𝗹𝗮𝘁𝗶𝘃𝗲 𝗣𝗼𝗹𝗶𝗰𝘆 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻 with a tic-tac-toe env

No critic model needed, the group is the baseline
Simpler than PPO

1️⃣ Rollout generation: from the same board, model plays N games via sampling
2️⃣ Each game scored with deterministic rewards (win, format, ...)
3️⃣ Mean score computed across the group
4️⃣ Each rollout's advantage = its score minus the group mean
5️⃣ Model updated to favor trajectories above baseline

🔁 Repeat
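
Steps 3 and 4 above can be sketched in a few lines. This is a minimal illustration of the group-mean baseline exactly as the post describes it, with made-up scores for four rollouts; the function name is hypothetical. Some GRPO implementations also divide by the group's standard deviation, which is left out here because the post does not mention it.

```python
from statistics import mean


def group_relative_advantages(scores):
    """Advantage of each rollout = its score minus the group mean.

    The group itself acts as the baseline, so no critic model is needed.
    """
    baseline = mean(scores)
    return [s - baseline for s in scores]


if __name__ == "__main__":
    # Hypothetical total rewards for N = 4 games played from the same board.
    scores = [1.0, 0.0, 0.5, 0.5]
    for score, adv in zip(scores, group_relative_advantages(scores)):
        print(f"score={score:.1f}  advantage={adv:+.1f}")
```

Rollouts with a positive advantage are the ones the update in step 5 pushes the model toward; those with a negative advantage are pushed away from.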

For a deep dive, check out
🌱 https://github.com/anakin87/llm-rl-environments-lil-course
a free hands-on course on RL environments for LLMs

2 replies

posted an update 27 days ago

Post

3943

We are hiring at Shirova AI. We need AI researchers and engineers to work in our research lab. Shirova AI is a research lab in India, so we can help our researchers move to nearby workspaces or let them work from home without ever coming to the lab. We're building our founding team, so the pay will be good and there will be plenty of room to learn. Don't hesitate to email us at careers@shirova.com

replied to intrect's post 27 days ago

Insightful paper. As you are a researcher, you can apply for a researcher role by emailing careers@shirova.com. We are building our founding team of researchers. Shirova AI is a research lab based in India.

replied to DedeProGames's post about 1 month ago

Glad to hear, nice work!

replied to kanaria007's post about 2 months ago

Oh, I can understand. Your research is interesting, nice work! Keep going 😀 🤗

replied to reaperdoesntknow's post about 2 months ago

Oh nice! Good work

replied to their post about 2 months ago

You're welcome. If you haven't already, you can review my master notes in the dataset repo card, https://huggingface.co/datasets/Ujjwal-Tyagi/ai-ml-foundations-book-collection#my-master-notes-and-main-concept-understanding-after-i-read-those-books

replied to kanaria007's post about 2 months ago

It looks interesting, but is there an implementation plan, or any results from implementing it? Could you please explain in simple terms what it is for and how we can implement it?

reacted to their post with ❤️ about 2 months ago

Post

2821

I am sharing my study material for AI & ML. These books are really a "bible" and give a very strong foundation. I have also included guidance, an introduction, and my master notes in the dataset repo card! I hope you find them helpful. If you have any queries, just start a discussion and I will be there to help you out!
Ujjwal-Tyagi/ai-ml-foundations-book-collection

4 replies
