🧠 Text Detector Model v2 — Fine-Tuned AI vs Human Text Classifier

This model (silentone0725/text-detector-model-v2) is a fine-tuned text classifier that distinguishes human-written from AI-generated English text.
It was trained on a large combined dataset spanning diverse genres and writing styles, with the goal of generalizing to outputs from modern large language models (LLMs).

🧩 Model Lineage

Stage	Model	Description
v2	`silentone0725/text-detector-model-v2`	Fine-tuned with stronger regularization, early stopping, and an expanded dataset.
Base	`silentone0725/text-detector-model`	Earlier model fine-tuned on a GPT-4 and human text dataset.
Backbone	`distilbert-base-uncased`	Original pretrained transformer from Hugging Face.

📊 Model Details

Property	Description
Task	Binary Classification — Human (0) vs AI (1)
Languages	English
Dataset	`silentone0725/ai-human-text-detection-v1`
Split Ratio	70% Train / 15% Validation / 15% Test
Regularization	Dropout = 0.3, Weight Decay = 0.2, Early Stopping Patience = 2
Precision	Mixed FP16
Optimizer	AdamW
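
The 70% / 15% / 15% split listed above can be sketched in plain Python. This is only an illustration: the function name `split_indices`, the seed, and the use of a simple shuffled (non-stratified) split are assumptions, since the card does not say how the split was actually produced.

```python
import random


def split_indices(n, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    Hypothetical helper: the seed and the non-stratified shuffle are
    assumptions, not the author's documented procedure.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # deterministic shuffle for a given seed
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

For a classifier like this one, a stratified split that preserves the Human/AI label ratio in each subset is a common alternative.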

🧪 Evaluation Metrics

Metric	Validation	Test
Accuracy	99.67%	99.67%
F1-Score	0.9967	0.9967
Eval Loss	0.0156	0.0156
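
For reference, accuracy and F1 for a binary task can be computed from true and predicted labels as in the sketch below. The function name `binary_metrics` and the toy labels are illustrative assumptions. The card does not state which F1 averaging was used, so this shows the plain binary F1 with AI (1) as the positive class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (0 = Human, 1 = AI); not the model's data.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(binary_metrics(y_true, y_pred))
```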

🧠 Training Configuration

Hyperparameter	Value
Learning Rate	2e-5
Batch Size	8
Epochs	6
Weight Decay	0.2
Warmup Ratio	0.1
Dropout	0.3
Max Grad Norm	1.0
Gradient Accumulation	2
Early Stopping Patience	2
Mixed Precision	FP16
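
A few of these values interact. With a batch size of 8 and 2 gradient accumulation steps, each optimizer update sees 16 examples, assuming a single device and that the batch size is per device. The sketch below works through that arithmetic. The class `TrainConfig` and the training-set size are hypothetical, and the step counts are approximate because a training framework may round differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the table above; the class itself is illustrative."""
    learning_rate: float = 2e-5
    batch_size: int = 8          # assumed to be the per-device batch size
    epochs: int = 6              # upper bound; early stopping may end sooner
    weight_decay: float = 0.2
    warmup_ratio: float = 0.1
    grad_accum_steps: int = 2

    def effective_batch_size(self, num_devices=1):
        # Examples contributing to each optimizer update.
        return self.batch_size * self.grad_accum_steps * num_devices

    def total_optimizer_steps(self, num_train_examples, num_devices=1):
        per_epoch = math.ceil(num_train_examples / self.effective_batch_size(num_devices))
        return per_epoch * self.epochs

    def warmup_steps(self, num_train_examples, num_devices=1):
        total = self.total_optimizer_steps(num_train_examples, num_devices)
        return math.ceil(total * self.warmup_ratio)


cfg = TrainConfig()
n_examples = 10_000  # hypothetical training-set size, not taken from the card
print(cfg.effective_batch_size())
print(cfg.total_optimizer_steps(n_examples))
print(cfg.warmup_steps(n_examples))
```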

🚀 Usage Example

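from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "silentone0725/text-detector-model-v2"

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "This paragraph was likely written by a machine learning model."
# Truncate inputs that exceed the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)
pred = torch.argmax(outputs.logits, dim=1).item()

# Label 0 = Human, label 1 = AI
print("🧍 Human" if pred == 0 else "🤖 AI")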
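
The `argmax` above returns only the winning label. To also report a confidence, the two logits can be passed through a softmax. The sketch below does this in plain Python so it runs without downloading the model. The logit values and the helper names `softmax` and `describe` are hypothetical; with PyTorch the equivalent call would be `torch.softmax(outputs.logits, dim=1)`.

```python
import math

LABELS = {0: "Human", 1: "AI"}  # label mapping from the Model Details table


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (label, probability) for the most likely class."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[pred], probs[pred]


# Hypothetical logits in [Human, AI] order, not real model output
label, confidence = describe([-1.2, 2.3])
print(f"{label} ({confidence:.1%})")
```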

📈 W&B Experiment Tracking

Training metrics were logged using Weights & Biases (W&B).

📚 Citation

If you use this model, please cite it as:

@misc{silentone0725_text_detector_v2_2025,
  author = {Thakuria, Daksh},
  title = {Text Detector Model v2 — Fine-Tuned DistilBERT for AI vs Human Text Detection},
  year = {2025},
  howpublished = {\url{https://huggingface.co/silentone0725/text-detector-model-v2}},
}

⚠️ Limitations

Trained only on English data.
May overestimate AI probability on mixed or partially edited text.
Should not be used for moderation or legal decisions without human verification.
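
One way to act on these limitations is to treat the model's output as advisory and to mark mid-range scores as uncertain rather than forcing a label. The sketch below is a minimal illustration. The function name `triage` and the 0.10 / 0.90 thresholds are hypothetical and not calibrated for this model, and every outcome still needs human verification before any consequential decision.

```python
def triage(p_ai, low=0.10, high=0.90):
    """Map an AI-probability to an advisory bucket.

    Thresholds are illustrative assumptions, not values tuned for this model.
    """
    if not 0.0 <= p_ai <= 1.0:
        raise ValueError("p_ai must be a probability in [0, 1]")
    if p_ai >= high:
        return "likely_ai"
    if p_ai <= low:
        return "likely_human"
    return "uncertain"  # e.g. mixed or partially edited text


for score in (0.03, 0.55, 0.97):
    print(score, triage(score))
```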

❤️ Credits

Developer: Daksh Thakuria (@silentone0725)
Base Model: silentone0725/text-detector-model
Backbone: distilbert-base-uncased
Frameworks: 🤗 Transformers, PyTorch, W&B

📦 Last updated: November 2025
🚀 Developed and fine-tuned in Google Colab with W&B tracking

Model size: 67M params (Safetensors, tensor type F32)
