Instructions for using p208p2002/llama-3-zhtw-8B with libraries and local apps.

Libraries

How to use p208p2002/llama-3-zhtw-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="p208p2002/llama-3-zhtw-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("p208p2002/llama-3-zhtw-8B")
model = AutoModelForCausalLM.from_pretrained("p208p2002/llama-3-zhtw-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use p208p2002/llama-3-zhtw-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "p208p2002/llama-3-zhtw-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "p208p2002/llama-3-zhtw-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
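
The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library and assumes the server was started as shown above; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM. The response is parsed according to the OpenAI chat completions format.

```python
import json
import urllib.request


def build_chat_request(prompt, model="p208p2002/llama-3-zhtw-8B",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the text of the first returned choice."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Needs a running server, so the call is left commented out:
# print(ask("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` would point the same helper at a server listening on port 30000, such as the SGLang setup described further down.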

Use Docker

docker model run hf.co/p208p2002/llama-3-zhtw-8B

SGLang

How to use p208p2002/llama-3-zhtw-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "p208p2002/llama-3-zhtw-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "p208p2002/llama-3-zhtw-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "p208p2002/llama-3-zhtw-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "p208p2002/llama-3-zhtw-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'


Llama 3 zhtw

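An experiment in Chinese continued pretraining (CP) on Llama 3, trained on 800M tokens in total.

Because the quality of the Chinese pretraining corpus still has room for improvement, performance after CP did not surpass the original Llama 3. We compared several Chinese Llama 3 models trained by the open-source community and observed a similar pattern.

For English, LLaMA 3 zhtw uses FineWeb, which gives it a higher MMLU score than the other Chinese CP models and keeps its ability on par with the original LLaMA 3.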

Benchmarks

| Models | Size | ↑ TMMLU+ (ACC) | CMMLU (ACC) | MMLU (ACC) |
|---|---|---|---|---|
| | | TC, Knowledge | CN, Knowledge | EN, Knowledge |
| | | 5 shot | 5 shot | 5 shot |
| Yi-6B | 6B | 49.63 | 75.53 | 65.35 |
| Qwen-7B | 7B | 42.84 | 73.1 | 61.00 |
| Meta-Llama-3-8B | 8B | 41.97 | 50.8 | 65.17 |
| p208p2002/llama-3-zhtw-8B | 8B | 41.84 | 50.6 | 65.31 |
| Breeze-7B-Base-v0_1 | 7B | 40.35 | 44.05 | 61.63 |
| hfl/llama-3-chinese-8b | 8B | 39.64 | 50.9 | 61.1 |
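
The gap to the original Llama 3 can be read directly off the table. The short sketch below makes the arithmetic explicit, using scores copied from the rows above; the `SCORES` dictionary and the `delta_vs_base` helper are illustrative names introduced here.

```python
# 5-shot accuracy, copied from the benchmark table.
SCORES = {
    "Meta-Llama-3-8B": {"TMMLU+": 41.97, "CMMLU": 50.8, "MMLU": 65.17},
    "p208p2002/llama-3-zhtw-8B": {"TMMLU+": 41.84, "CMMLU": 50.6, "MMLU": 65.31},
    "hfl/llama-3-chinese-8b": {"TMMLU+": 39.64, "CMMLU": 50.9, "MMLU": 61.1},
}


def delta_vs_base(model, base="Meta-Llama-3-8B"):
    """Return per-benchmark score differences (model minus base), in points."""
    return {
        benchmark: round(SCORES[model][benchmark] - SCORES[base][benchmark], 2)
        for benchmark in SCORES[base]
    }


deltas = delta_vs_base("p208p2002/llama-3-zhtw-8B")
for benchmark, diff in deltas.items():
    print(f"{benchmark}: {diff:+.2f}")
# TMMLU+: -0.13
# CMMLU: -0.20
# MMLU: +0.14
```

All three differences are within a fraction of a point, while `hfl/llama-3-chinese-8b` trails the base model by about 4 points on MMLU, which matches the description above.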

Recipe

Datasets

| Dataset | Lang | Weight |
|---|---|---|
| FineWeb | en | 0.35 |
| Wudao | zh-cn | 0.1 |
| C4Tw | zh-tw | 0.1 |
| WikiZhTw | zh-tw | 0.15 |
| NdltdT10 | zh-tw | 0.1 |
| GitHubMarkDown | code | 0.1 |
| GitHubPython | code | 0.1 |
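
To give a feel for what the mixture means in tokens, the sketch below converts the weights into approximate per-dataset budgets. It assumes the weights are sampling proportions over the 800M-token total stated in the description; the model card does not say how the weights were applied, so the resulting numbers are estimates. The names `MIXTURE`, `token_budget`, and `weight_by_language` are illustrative.

```python
import math

TOTAL_TOKENS = 800_000_000  # "800M tokens" from the model description

# (dataset, language, weight), copied from the table above.
MIXTURE = [
    ("FineWeb", "en", 0.35),
    ("Wudao", "zh-cn", 0.1),
    ("C4Tw", "zh-tw", 0.1),
    ("WikiZhTw", "zh-tw", 0.15),
    ("NdltdT10", "zh-tw", 0.1),
    ("GitHubMarkDown", "code", 0.1),
    ("GitHubPython", "code", 0.1),
]

# The weights should form a complete mixture.
assert math.isclose(sum(weight for _, _, weight in MIXTURE), 1.0)


def token_budget(mixture, total_tokens):
    """Approximate tokens per dataset, assuming weights are token proportions."""
    return {name: round(weight * total_tokens) for name, _, weight in mixture}


def weight_by_language(mixture):
    """Sum the mixture weights per language tag."""
    totals = {}
    for _, lang, weight in mixture:
        totals[lang] = totals.get(lang, 0.0) + weight
    return {lang: round(weight, 2) for lang, weight in totals.items()}


budget = token_budget(MIXTURE, TOTAL_TOKENS)
by_language = weight_by_language(MIXTURE)
print(budget["FineWeb"])   # 280000000
print(by_language)         # {'en': 0.35, 'zh-cn': 0.1, 'zh-tw': 0.35, 'code': 0.2}
```

Under this assumption, English and Traditional Chinese each receive 35% of the mixture, with the remainder split between Simplified Chinese (10%) and code (20%).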

Hyper Parameters

Learning Rate: 1e-7
Global Batch Size: 60
Sequence Length: 8192
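
These settings imply a rough number of optimizer steps. The sketch below is a back-of-the-envelope estimate that assumes the global batch size counts sequences and that every sequence is packed to the full 8192 tokens; neither detail is stated in the model card, so the step count is only an approximation.

```python
import math

GLOBAL_BATCH_SIZE = 60       # sequences per optimizer step (assumed unit)
SEQUENCE_LENGTH = 8192       # tokens per sequence
TOTAL_TOKENS = 800_000_000   # "800M tokens" from the model description

# Tokens consumed by one optimizer step under the assumptions above.
tokens_per_step = GLOBAL_BATCH_SIZE * SEQUENCE_LENGTH

# Steps needed to see the full token budget once, rounding up the last partial step.
steps = math.ceil(TOTAL_TOKENS / tokens_per_step)

print(tokens_per_step)  # 491520
print(steps)            # 1628
```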


Safetensors

Model size: 8B params
Tensor type: BF16
