Model architectures ['DeepseekOCR2ForCausalLM'] are not supported with the latest vLLM

by avishekjana - opened Jan 27

Jan 27

Getting the following error:

validation error for ModelConfig
vllm-server-1 | (APIServer pid=1) Value error, Model architectures ['DeepseekOCR2ForCausalLM'] are not supported.

GPU: 2x 16GB NVIDIA RTX 5070Ti

My compose.yml:

  vllm-server:
    image: vllm/vllm-openai:latest
    command: >
      --model deepseek-ai/DeepSeek-OCR-2
      --trust-remote-code
      --enable-prefix-caching
      --gpu-memory-utilization 0.9
      --port 8005
      --max-num-seqs 2
      --tensor-parallel-size 2
    environment:
      - VLLM_ATTENTION_BACKEND=FLASH_ATTN
    volumes:
      - ~/.cache:/root/.cache

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    network_mode: host
    ipc: host

sinhduc01

Jan 27

Same error when deploying to a vLLM server using Kubernetes. Maybe vLLM doesn't support this model yet?

DuyTa

Jan 28

The current vLLM doesn't support the DeepSeek OCR 2 architecture, which means we need to register a new model architecture under the exact name `DeepseekOCR2ForCausalLM` (or you can just clone the original GitHub repo and install the vLLM wheel from there).
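For anyone who wants to confirm this before changing their deployment, here is a rough sketch of the check vLLM is effectively doing: it compares the `architectures` field of the model's `config.json` against the names it has registered. The helper names, the inline config fragment, and the short "supported" list below are all made up for illustration. `vllm_supported_architectures` assumes that `ModelRegistry.get_supported_archs()` is available in your installed vLLM version, so treat that part as an assumption and check it against your install.

```python
import json


def unsupported_architectures(config, supported):
    """Return the architectures named in a model config that are not in `supported`."""
    supported_set = set(supported)
    return [arch for arch in config.get("architectures", []) if arch not in supported_set]


def vllm_supported_architectures():
    """Ask the installed vLLM for its registered architectures.

    Returns None when vLLM is not installed. Assumes ModelRegistry exposes
    get_supported_archs() in the installed version.
    """
    try:
        from vllm import ModelRegistry
    except ImportError:
        return None
    return list(ModelRegistry.get_supported_archs())


# Illustrative stand-in for the model's config.json (only the field that matters here).
config = json.loads('{"architectures": ["DeepseekOCR2ForCausalLM"]}')

# Illustrative stand-in for vLLM's registry; replace with
# vllm_supported_architectures() on a machine that has vLLM installed.
example_supported = ["LlamaForCausalLM"]

missing = unsupported_architectures(config, example_supported)
print(missing)
```

If the name shows up as missing against the real registry, the options are the two mentioned above: install a vLLM build that ships the model, or register the class yourself, which as far as I know is done with `ModelRegistry.register_model("DeepseekOCR2ForCausalLM", "your_module:YourClass")`, where the module path is a placeholder for wherever the model implementation lives.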
