Models: 73,427

Active filters: reinforcement-learning

Simplified-Reasoning/SU-01

Text Generation • 31B • Updated about 8 hours ago • 712 • 19

zghhui/OmniNFT

Any-to-Any • Updated 1 day ago • 59 • 21

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 536

twnlp/ChineseErrorCorrector4-4B

Text Generation • 4B • Updated 1 day ago • 157 • 4

JohnRoger/SU-01-Q4_K_M-GGUF

Reinforcement Learning • 31B • Updated 5 days ago • 238 • 3

Mercury7353/MetaAgent-X

Reinforcement Learning • 8B • Updated 6 days ago • 87 • 3

Jincenzi/SocialR1-8B

Text Generation • 4B • Updated 8 days ago • 44 • 2

YuvrajSingh9886/LFM2.5-350M-grpo-summarization-quality-bleu

Summarization • 0.4B • Updated 6 days ago • 258 • 2

6kplus/PhyMotion-CausalForcing-1.3B

Text-to-Video • Updated 4 days ago • 2

leorc/Simulus

Reinforcement Learning • Updated Feb 21, 2025 • 1

farama-minari/Swimmer-v5-PPO-medium

Reinforcement Learning • Updated Jan 29, 2025 • 2 • 1

NousResearch/DeepHermes-ToolCalling-Specialist-Atropos

Reinforcement Learning • 8B • Updated Apr 28, 2025 • 72 • 18

ValueFX9507/Tifa-DeepsexV3-14b-GGUF-Q6

Reinforcement Learning • 15B • Updated Jul 1, 2025 • 13.8k • 44

Arc-Intelligence/ATLAS-8B-Thinking

Text Generation • 8B • Updated Sep 12, 2025 • 11 • 6

exla-ai/openpie-0.6

Robotics • Updated Feb 4 • 41 • 23

AQ-MedAI/PulseMind-72B

Image-Text-to-Text • 73B • Updated Jan 30 • 30 • 2

nvidia/GEAR-SONIC

Reinforcement Learning • Updated Apr 11 • 43

nvidia/EGM-8B

Image-Text-to-Text • 9B • Updated Apr 10 • 504 • 9

XunmeiLiu/VFIG-4B

Reinforcement Learning • 4B • Updated Mar 27 • 131 • 6

bue0912/ToolOmni-Qwen3-4B

Text Generation • 4B • Updated Apr 16 • 25 • 3

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 17 days ago • 220 • 3

intcomp/sub-jepa

Reinforcement Learning • Updated 8 days ago • 2

mradermacher/SocialR1-8B-GGUF

Reinforcement Learning • 4B • Updated 8 days ago • 770 • 1

mradermacher/SocialR1-8B-i1-GGUF

Reinforcement Learning • 4B • Updated 8 days ago • 3.48k • 1

Mouhamedamar/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated 6 days ago • 60 • 1

JosedelaPepe/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated 5 days ago • 47 • 1

axi0mX/SU-01-GGUF

Text Generation • 31B • Updated 2 days ago • 1.9k • 1

Alopezcordero/ppo-LunaLander-v3

Reinforcement Learning • Updated 4 days ago • 146 • 1

mradermacher/AgentHijack-Agent-GGUF

Reinforcement Learning • 8B • Updated 2 days ago • 680 • 1

ccnets/causal-gpt-rl

Reinforcement Learning • Updated 1 day ago • 70 • 2