Models (28)

Active filters: gpjt-llm-from-scratch

gpjt/8xa100m40

Text Generation • 0.2B • Updated Mar 24 • 7

gpjt/8xa100m80

Text Generation • 0.2B • Updated Mar 24 • 14

gpjt/8xb200m160

Text Generation • 0.2B • Updated Mar 24 • 17

gpjt/8xh100m80-best

Text Generation • 0.2B • Updated Mar 24 • 21

gpjt/8xh100m80-latest

Text Generation • 0.2B • Updated Mar 24 • 8

gpjt/1xrtx3090m24-fineweb

Text Generation • 0.2B • Updated Mar 24 • 8

gpjt/1xrtx3090m24-fineweb-edu

Text Generation • 0.2B • Updated Mar 24 • 9

gpjt/1xrtx3090m24-fineweb-edu-2x

Text Generation • 0.2B • Updated Mar 24 • 8

gpjt/8xa100m40-baseline

Text Generation • 0.2B • Updated Mar 24 • 10

gpjt/8xa100m40-gradient-clipping

Text Generation • 0.2B • Updated Mar 24 • 9

gpjt/8xa100m40-remove-dropout

Text Generation • 0.2B • Updated Mar 24 • 18

gpjt/8xa100m40-qkv-bias

Text Generation • 0.2B • Updated Mar 24 • 7

gpjt/8xa100m40-schedule-learning-rate

Text Generation • 0.2B • Updated Mar 24 • 10

gpjt/8xa100m40-weight-decay-gpt2

Text Generation • 0.2B • Updated Mar 24 • 14

gpjt/8xa100m40-weight-decay-cerebras

Text Generation • 0.2B • Updated Mar 24 • 9

gpjt/8xa100m80-no-amp

Text Generation • 0.2B • Updated Apr 3 • 9

gpjt/8xa100m40-baseline-2

Text Generation • 0.2B • Updated Apr 8 • 16

gpjt/8xa100m40-baseline-3

Text Generation • 0.2B • Updated Apr 8 • 9

gpjt/8xa100m40-baseline-4

Text Generation • 0.2B • Updated Apr 8 • 9

gpjt/8xa100m40-baseline-5

Text Generation • 0.2B • Updated Apr 8 • 11

gpjt/8xa100m40-baseline-6

Text Generation • 0.2B • Updated Apr 8 • 8

gpjt/8xa100m40-baseline-7

Text Generation • 0.2B • Updated Apr 8 • 12

gpjt/8xa100m40-baseline-8

Text Generation • 0.2B • Updated Apr 8 • 12

gpjt/8xa100m40-stacked-interventions-1

Text Generation • 0.2B • Updated Apr 8 • 8

gpjt/8xa100m40-stacked-interventions-2

Text Generation • 0.2B • Updated Apr 8 • 9

gpjt/8xa100m40-stacked-interventions-3

Text Generation • 0.2B • Updated Apr 9 • 8

gpjt/1xrtx3090-baseline

Text Generation • 0.2B • Updated Apr 14 • 13

gpjt/1xrtx3090-stacked-interventions

Text Generation • 0.2B • Updated Apr 15 • 8