Instructions to use convaiinnovations/medgemma-4b-ecginstruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use convaiinnovations/medgemma-4b-ecginstruct with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="convaiinnovations/medgemma-4b-ecginstruct") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] pipe(text=messages)# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("convaiinnovations/medgemma-4b-ecginstruct") model = AutoModelForImageTextToText.from_pretrained("convaiinnovations/medgemma-4b-ecginstruct") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] inputs = processor.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use convaiinnovations/medgemma-4b-ecginstruct with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "convaiinnovations/medgemma-4b-ecginstruct" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "convaiinnovations/medgemma-4b-ecginstruct", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker
docker model run hf.co/convaiinnovations/medgemma-4b-ecginstruct
- SGLang
How to use convaiinnovations/medgemma-4b-ecginstruct with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "convaiinnovations/medgemma-4b-ecginstruct" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "convaiinnovations/medgemma-4b-ecginstruct", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "convaiinnovations/medgemma-4b-ecginstruct" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "convaiinnovations/medgemma-4b-ecginstruct", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }' - Docker Model Runner
How to use convaiinnovations/medgemma-4b-ecginstruct with Docker Model Runner:
docker model run hf.co/convaiinnovations/medgemma-4b-ecginstruct
MedGemma-4B ECGInstruct
Fine-tuned version of Google's MedGemma-4B-it model on the ECGInstruct dataset for automated ECG interpretation.
Model Description
This is a fully merged fine-tuned version of google/medgemma-4b-it trained on the PULSE-ECG/ECGInstruct dataset containing 1.15M ECG instruction-following examples. The LoRA adapter has been merged into the base model for easier deployment.
Developed by: ConvAI Innovations
Base Model: google/medgemma-4b-it
Training Infrastructure: AIRAWAT (C-DAC) - 8x NVIDIA A100 40GB GPUs
Training Duration: 72 hours (3 days)
Final Token Accuracy: 86.83%
Final Training Loss: 0.6188
GPU-Hours: 576
Model Size: ~8.5 GB
Training Details
Training Data
- Dataset: PULSE-ECG/ECGInstruct (1.15M samples)
- Samples: 1,156,110 ECG image-text pairs
- Image Sources: MIMIC-IV-ECG (~800K), PTB-XL (22K), CODE-15% (346K), ChapmanShaoxing
- Task: Vision-language instruction following for ECG interpretation
- Demographics: Age range 0-95 years, 52% male / 48% female
- Disease Classes: 5 superclasses (NORM, MI, STTC, CD, HYP), 24 subclasses
Training Procedure
Hardware:
- 8x NVIDIA A100 40GB GPUs (AIRAWAT supercomputer)
- Distributed training with PyTorch DDP
Training Configuration:
- Fine-tuning method: LoRA (r=32, alpha=64, dropout=0.05)
- Target modules: all-linear (including vision encoder)
- Learning rate: 1.2e-5 with cosine decay
- Batch size: 192 effective (4 per GPU × 8 GPUs × 6 gradient accumulation)
- Optimizer: AdamW (fused)
- Precision: bfloat16
- Gradient checkpointing: Enabled
- Max sequence length: 2048 tokens
- Max new tokens: 512
Training Metrics:
- Final training loss: 0.6188
- Mean token accuracy: 86.83%
- Training throughput: ~9.67 samples/sec
- Total tokens processed: 103M+
Usage
Installation
pip install transformers pillow torch
Loading the Model
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image
# Load model and processor
model_id = "convaiinnovations/medgemma-4b-ecginstruct"
model = AutoModelForImageTextToText.from_pretrained(
model_id,
torch_dtype="auto",
device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)
Inference Example
# Load ECG image
image = Image.open("ecg_image.png").convert("RGB")
# Prepare prompt using chat template
messages = [
{
"role": "user",
"content": [
{"type": "image"},
{"type": "text", "text": "Analyze this ECG and provide a detailed interpretation."}
]
}
]
# Process input
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=[text], images=[[image]], return_tensors="pt", padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
# Generate interpretation
outputs = model.generate(
**inputs,
max_new_tokens=512,
do_sample=False
)
# Decode and print
response = processor.decode(outputs[0], skip_special_tokens=True)
print(response)
Example Prompts
# Detailed interpretation
"Analyze this ECG and provide a detailed interpretation."
# Specific abnormality detection
"What abnormalities are present in this ECG?"
# Diagnosis suggestion
"Based on this ECG, what is the most likely diagnosis?"
# Question answering
"Does this ECG show signs of atrial fibrillation?"
# Rate and rhythm
"What is the heart rate and rhythm in this ECG?"
Model Capabilities
This model can:
- ✅ Interpret 12-lead ECG images
- ✅ Identify cardiac abnormalities (arrhythmias, ischemia, hypertrophy, conduction blocks, etc.)
- ✅ Generate detailed clinical reports
- ✅ Answer specific questions about ECG findings
- ✅ Provide diagnostic suggestions
- ✅ Assess heart rate, rhythm, and axis
- ✅ Detect ST-segment changes and T-wave abnormalities
Performance
Training Metrics:
| Metric | Value |
|---|---|
| Token Accuracy | 86.83% |
| Final Loss | 0.6188 |
| Training Time | 72 hours |
| GPU-Hours | 576 |
Inference Metrics (A100 GPU):
| Metric | Value |
|---|---|
| TTFT (Time to First Token) | ~150ms |
| ISL (Input Sequence Length) | 2048 tokens |
| OSL (Output Sequence Length) | 512 tokens |
| End-to-End Latency | 2-3 seconds |
| Throughput | ~45 tokens/sec |
Limitations
- Trained primarily on adult ECG data
- Performance may vary on pediatric ECGs
- Should not replace professional medical diagnosis
- Requires high-quality ECG images for optimal results
- May struggle with very rare or unusual ECG patterns
- Limited to English language outputs
Ethical Considerations
⚠️ MEDICAL DISCLAIMER
This model is for RESEARCH AND EDUCATIONAL PURPOSES ONLY.
- ❌ NOT validated for clinical use
- ❌ NOT FDA/CE approved
- ❌ NOT a substitute for professional medical diagnosis
- ❌ Should NOT be used for patient care decisions
Always consult qualified healthcare professionals for medical decisions.
Important Notes:
- This is an AI model and can make mistakes
- ECG interpretation requires clinical context
- Model outputs should be verified by trained clinicians
- Not approved for clinical use or diagnostic purposes
- Use responsibly and within appropriate medical oversight
- Has not been tested on external clinical datasets
Intended Use
Appropriate Uses:
- Research in medical AI and computer vision
- Educational demonstrations of ECG interpretation
- Development of clinical decision support prototypes
- Benchmarking ECG analysis algorithms
Inappropriate Uses:
- Direct patient diagnosis without physician review
- Replacement of trained medical professionals
- Use in emergency or critical care settings without oversight
- Commercial deployment without proper validation
Citation
If you use this model in your research, please cite:
@misc{medgemma-ecginstruct,
author = {convaiinnovations},
title = {MedGemma-4B ECGInstruct: Fine-tuned ECG Interpretation Model},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/convaiinnovations/medgemma-4b-ecginstruct}}
}
Acknowledgments
- Base Model: Google's MedGemma team for the foundation model
- Dataset: PULSE-ECG team for the ECGInstruct dataset
- Infrastructure: AIRAWAT AI Innovation Challenge by C-DAC (Centre for Development of Advanced Computing)
- Frameworks: HuggingFace Transformers, PEFT, TRL, PyTorch
Related Resources
- LoRA Adapter Version: convaiinnovations/medgemma-4b-ecginstruct-lora
- Base Model: google/medgemma-4b-it
- Training Dataset: PULSE-ECG/ECGInstruct
License
Apache 2.0 (following base model license)
Contact
For questions or issues, please open an issue on the model repository or contact the maintainers.
- Downloads last month
- 21