How to use mpasila/yi-super-9B-exl2-4bpw with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="mpasila/yi-super-9B-exl2-4bpw")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("mpasila/yi-super-9B-exl2-4bpw")
model = AutoModelForCausalLM.from_pretrained("mpasila/yi-super-9B-exl2-4bpw")
How to use mpasila/yi-super-9B-exl2-4bpw with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "mpasila/yi-super-9B-exl2-4bpw"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "mpasila/yi-super-9B-exl2-4bpw",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or run the model with Docker Model Runner:
docker model run hf.co/mpasila/yi-super-9B-exl2-4bpw
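The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes a server is already listening at the given base URL and returns the usual OpenAI-style `choices[0].text` field.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    # Same JSON fields as the curl example.
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the server answers in the OpenAI completions format.
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", "mpasila/yi-super-9B-exl2-4bpw", "Once upon a time,")` would send the same request as the curl command. Keeping request construction separate from sending makes the payload easy to inspect without a live server.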
How to use mpasila/yi-super-9B-exl2-4bpw with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "mpasila/yi-super-9B-exl2-4bpw" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "mpasila/yi-super-9B-exl2-4bpw",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "mpasila/yi-super-9B-exl2-4bpw" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "mpasila/yi-super-9B-exl2-4bpw",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use mpasila/yi-super-9B-exl2-4bpw with Docker Model Runner:
docker model run hf.co/mpasila/yi-super-9B-exl2-4bpw
This is a 4bpw (4 bits per weight) ExLlamaV2 quantization of feeltheAGI/yi-super-9B, made with the default calibration dataset.
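To give a rough sense of what 4bpw means for storage, the sketch below converts a parameter count and a bits-per-weight figure into an approximate weight size. The 9 billion parameter count is an assumption read off the "9B" in the model name, not a figure stated in this card, and the helper name is hypothetical. The estimate covers weights only; it ignores the KV cache, activations, and any tensors stored at a different precision.

```python
def estimated_weight_gib(num_params, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1024**3


# Assumption: "9B" is taken to mean roughly 9 billion weights.
ASSUMED_PARAMS = 9e9

# Compare the 4bpw quantization against 16-bit weights.
for bpw in (4.0, 16.0):
    size = estimated_weight_gib(ASSUMED_PARAMS, bpw)
    print(f"{bpw:>4} bpw -> about {size:.2f} GiB")
```

Under these assumptions the 4bpw weights take about a quarter of the space of 16-bit weights, since size scales linearly with bits per weight.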
YI-9B-Super
YI-9B-Super is a YI-9B model that has been further fine-tuned on the OpenHermes-2.5 dataset.
Results on some benchmarks:
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| truthfulqa | N/A | none | 0 | rouge1_max | 47.1011 | ± | 0.8016 |
| hellaswag | 1 | none | None | acc | 0.5758 | ± | 0.0049 |
| | | none | None | acc_norm | 0.7639 | ± | 0.0042 |
| gsm8k_cot | 3 | strict-match | 8 | exact_match | 0.5262 | ± | 0.0138 |
| | | flexible-extract | 8 | exact_match | 0.6027 | ± | 0.0135 |
| gsm8k | 3 | strict-match | 5 | exact_match | 0.6073 | ± | 0.0135 |
| | | flexible-extract | 5 | exact_match | 0.6126 | ± | 0.0134 |

| Groups | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| truthfulqa | N/A | none | 0 | rouge1_max | 47.1011 | ± | 0.8016 |
| | | none | 0 | bleu_max | 21.9476 | ± | 0.7162 |
| | | none | 0 | rouge2_acc | 0.3293 | ± | 0.0165 |
| | | none | 0 | bleu_acc | 0.3635 | ± | 0.0168 |
| | | none | 0 | rouge1_acc | 0.3892 | ± | 0.0171 |
| | | none | 0 | rougeL_acc | 0.3782 | ± | 0.0170 |
| | | none | 0 | bleu_diff | -2.3953 | ± | 0.6292 |
| | | none | 0 | rouge2_diff | -4.6929 | ± | 0.9130 |
| | | none | 0 | rougeL_diff | -4.2677 | ± | 0.8034 |
| | | none | 0 | acc | 0.4040 | ± | 0.0113 |
| | | none | 0 | rouge1_diff | -3.8975 | ± | 0.7966 |
| | | none | 0 | rougeL_max | 43.7954 | ± | 0.8145 |
| | | none | 0 | rouge2_max | 32.3573 | ± | 0.9094 |
| mmlu | N/A | none | 0 | acc | 0.6726 | ± | 0.0037 |
| - humanities | N/A | none | None | acc | 0.6043 | ± | 0.0067 |
| - other | N/A | none | None | acc | 0.7306 | ± | 0.0077 |
| - social_sciences | N/A | none | None | acc | 0.7741 | ± | 0.0074 |
| - stem | N/A | none | None | acc | 0.6181 | ± | 0.0083 |
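
The value and standard error columns above can be turned into approximate confidence intervals when comparing scores. The sketch below assumes a normal approximation (value ± 1.96 × stderr for roughly 95% coverage), which is a common convention rather than something this card specifies; the helper names are hypothetical, and the numbers are copied from the gsm8k and gsm8k_cot strict-match rows.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate confidence interval under a normal approximation."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Strict-match exact_match scores taken from the table above.
gsm8k_strict = confidence_interval(0.6073, 0.0135)
gsm8k_cot_strict = confidence_interval(0.5262, 0.0138)

print("gsm8k (5-shot):    ", gsm8k_strict)
print("gsm8k_cot (8-shot):", gsm8k_cot_strict)
print("intervals overlap: ", intervals_overlap(gsm8k_strict, gsm8k_cot_strict))
```

Non-overlapping intervals suggest the gap between two scores is larger than sampling noise alone would explain, though overlap by itself is only a coarse check and not a formal significance test.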