How to use monsterapi/Llama-3_1-8B-Instruct-orca-ORPO with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "monsterapi/Llama-3_1-8B-Instruct-orca-ORPO")

Model Used: meta-llama/Meta-Llama-3.1-8B-Instruct
Dataset: Intel/orca_dpo_pairs
The Intel Orca DPO pairs dataset is a specialized version of the OpenOrca dataset; OpenOrca itself includes ~1M GPT-4 completions and ~3.2M GPT-3.5 completions. The dataset is tabularized to align with the distributions in the Orca paper and targets preference optimization: each example indicates which response is good and which is bad. It is primarily used for training and evaluating language models.
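To make the good/bad response pairing concrete, here is a minimal sketch of how one preference record might be shaped into a prompt/chosen/rejected triple. The field names `system`, `question`, `chosen`, and `rejected`, the sample record, and the helper `to_preference_example` are assumptions for illustration and are not taken from this card; check them against the dataset viewer before relying on them.

```python
# Hypothetical record: field names and contents are assumed for illustration.
record = {
    "system": "You are a helpful assistant.",
    "question": "What is the capital of France?",
    "chosen": "The capital of France is Paris.",
    "rejected": "France is a country in Europe.",
}


def to_preference_example(rec):
    """Build a prompt/chosen/rejected triple from one preference record."""
    # Treat a missing or None system message as empty.
    system = (rec.get("system") or "").strip()
    question = rec["question"].strip()
    # Prepend the system message only when one is present.
    prompt = f"{system}\n\n{question}" if system else question
    return {
        "prompt": prompt,
        "chosen": rec["chosen"].strip(),      # the preferred ("good") response
        "rejected": rec["rejected"].strip(),  # the dispreferred ("bad") response
    }


example = to_preference_example(record)
print(example["prompt"])
```

A real training pipeline would normally apply the base model's chat template instead of plain string concatenation; the concatenation here only keeps the sketch self-contained.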
This finetuning run was performed using MonsterAPI's LLM finetuner with ORPO (Odds Ratio Preference Optimization), which combines supervised finetuning and preference alignment in a single training objective.
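As a rough illustration of the ORPO objective as described in the ORPO paper (not MonsterAPI's implementation, which this card does not show), the sketch below computes the loss for a single example from per-token log-probabilities. The function names, the toy log-probabilities, and the weight `lam=0.1` are assumptions chosen for the example.

```python
import math


def avg_logp(token_logps):
    """Length-normalized log-probability of a response."""
    return sum(token_logps) / len(token_logps)


def log_odds(logp):
    """log(p / (1 - p)) computed from log p; requires logp < 0."""
    return logp - math.log1p(-math.exp(logp))


def orpo_loss(chosen_token_logps, rejected_token_logps, lam=0.1):
    """Return (total, nll, odds_ratio_term) for one preference pair.

    lam is a hypothetical weight on the odds-ratio term.
    """
    logp_w = avg_logp(chosen_token_logps)    # chosen response
    logp_l = avg_logp(rejected_token_logps)  # rejected response

    # Supervised term: negative log-likelihood of the chosen response.
    nll = -logp_w

    # Odds-ratio term: -log(sigmoid(log-odds ratio)), written as a
    # numerically stable softplus of the negated ratio.
    ratio = log_odds(logp_w) - log_odds(logp_l)
    l_or = max(-ratio, 0.0) + math.log1p(math.exp(-abs(ratio)))

    return nll + lam * l_or, nll, l_or


# Toy per-token log-probabilities (made up for illustration).
total, nll, l_or = orpo_loss([-0.2, -0.3], [-1.0, -1.2])
print(total, nll, l_or)
```

The odds-ratio term shrinks as the model assigns higher odds to the chosen response than to the rejected one, and equals log 2 when both are equally likely. No separate reference model appears in the computation.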
Cost: $2.69 for the entire finetuning process.