Instructions for using facebook/opt-iml-max-1.3b with libraries and local apps.

Libraries

How to use facebook/opt-iml-max-1.3b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="facebook/opt-iml-max-1.3b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-iml-max-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-iml-max-1.3b")

Local Apps

vLLM

How to use facebook/opt-iml-max-1.3b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "facebook/opt-iml-max-1.3b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "facebook/opt-iml-max-1.3b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
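
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the server started above is reachable at `http://localhost:8000` and returns the usual OpenAI-style completion body with a `choices` list. The helper names `build_completion_request`, `extract_text`, and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request
from typing import Any, Dict

MODEL_ID = "facebook/opt-iml-max-1.3b"


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",  # assumption: default address used in the curl example
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build the POST request that mirrors the curl call above."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_body: Dict[str, Any]) -> str:
    """Return the text of the first choice in an OpenAI-style completion body."""
    return response_body["choices"][0]["text"]


def complete(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send the request to a running server and return the generated text."""
    request = build_completion_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        return extract_text(json.load(response))
```

With a server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.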

Use Docker

docker model run hf.co/facebook/opt-iml-max-1.3b

SGLang

How to use facebook/opt-iml-max-1.3b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "facebook/opt-iml-max-1.3b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "facebook/opt-iml-max-1.3b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "facebook/opt-iml-max-1.3b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "facebook/opt-iml-max-1.3b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use facebook/opt-iml-max-1.3b with Docker Model Runner:
```
docker model run hf.co/facebook/opt-iml-max-1.3b
```

OPT-IML

Model Description

OPT-IML (OPT + Instruction Meta-Learning) is a set of instruction-tuned versions of OPT, fine-tuned on OPT-IML Bench, a collection of ~2000 NLP tasks gathered from 8 NLP benchmarks.

We provide two model versions:

OPT-IML, trained on 1500 tasks, with several tasks held out for downstream evaluation, and
OPT-IML-Max, trained on all ~2000 tasks.

How to use

You can use this model directly with a pipeline for text generation.

>>> from transformers import pipeline

>>> generator = pipeline('text-generation', model="facebook/opt-iml-max-1.3b")

>>> generator("What is the capital of USA?")
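
The pipeline call above can be wrapped so that generation length and decoding are explicit. This is a minimal sketch, not a recommended configuration: `max_new_tokens=32` and greedy decoding (`do_sample=False`) are illustrative choices, and the helper names `extract_texts` and `answer` are hypothetical. It assumes the text-generation pipeline returns a list of dicts with a `generated_text` key, which by default includes the prompt.

```python
from typing import Any, Dict, List

MODEL_ID = "facebook/opt-iml-max-1.3b"


def extract_texts(outputs: List[Dict[str, Any]]) -> List[str]:
    """Pull the generated strings out of a text-generation pipeline result."""
    return [item["generated_text"] for item in outputs]


def answer(prompt: str, max_new_tokens: int = 32) -> List[str]:
    """Run the prompt through the model; downloads the weights on first use."""
    # Imported lazily so the helpers above can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    # Assumed settings for illustration: short output, greedy decoding.
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return extract_texts(outputs)
```

Calling `answer("What is the capital of USA?")` returns a one-element list containing the prompt followed by the model's continuation.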

Limitations and bias

Although OPT-IML models outperform baseline OPT on an extensive set of evaluations, they remain susceptible to the risks associated with large language models, including factual errors, generation of toxic language, and reinforcement of stereotypes. We release our OPT-IML models to encourage future work on instruction tuning and to improve the availability of large instruction-tuned causal LMs, but their use should be accompanied by responsible best practices.

Training data

OPT-IML models are trained on OPT-IML Bench, a large benchmark for Instruction Meta-Learning (IML) comprising ~2000 NLP tasks, consolidated into task categories from 8 existing benchmarks, including Super-NaturalInstructions, FLAN, and PromptSource.

Training procedure

The texts are tokenized using the GPT2 byte-level version of Byte Pair Encoding (BPE), which handles arbitrary Unicode characters, with a vocabulary size of 50272. The inputs are sequences of 2048 consecutive tokens.
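
To make the "sequences of 2048 consecutive tokens" concrete, here is a minimal sketch of splitting a stream of token ids into fixed-length blocks. It is not the authors' data pipeline, which this card does not describe. The function name `chunk_token_ids` is hypothetical, the ids in the usage note are placeholders rather than real tokenizer output, and dropping the trailing partial block is an assumption.

```python
from typing import Iterator, List

SEQ_LEN = 2048  # sequence length stated in this model card


def chunk_token_ids(token_ids: List[int], seq_len: int = SEQ_LEN) -> Iterator[List[int]]:
    """Yield consecutive, non-overlapping blocks of exactly seq_len token ids.

    Assumption: a trailing block shorter than seq_len is dropped.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    for start in range(0, len(token_ids) - seq_len + 1, seq_len):
        yield token_ids[start:start + seq_len]
```

For example, `list(chunk_token_ids(list(range(5000))))` yields two blocks of 2048 ids each, and the remaining 904 ids are discarded under the assumption above.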

The 30B model was fine-tuned on 64 A100 GPUs with 40GB of memory each. During fine-tuning, models saw approximately 2 billion tokens, which is only 0.6% of the pre-training budget of OPT.

BibTeX entry and citation info

@misc{iyer2022opt,
      title={OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization}, 
      author={Iyer, Srinivasan and Lin, Xi Victoria and Pasunuru, Ramakanth and Mihaylov, Todor and Simig, D{\'a}niel and Yu, Ping and Shuster, Kurt and Wang, Tianlu and Liu, Qing and Koura, Punit Singh and others},
      year={2022},
      eprint={2212.12017},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}


Paper for facebook/opt-iml-max-1.3b: OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization (arXiv 2212.12017, published Dec 22, 2022)