HuBERT-ECG as a self-supervised foundation model for broad and scalable cardiac applications

Original code at https://github.com/Edoar-do/HuBERT-ECG

License: CC BY-NC 4.0

Abstract

The electrocardiogram (ECG) is a widely accessible tool for cardiovascular assessment, and the growing availability of ECG datasets has enabled the emergence of ECG foundation models. However, such foundation models often lack extensive evaluation across clinically heterogeneous downstream tasks extending beyond conventional rhythm and conduction analysis. We present HuBERT-ECG, a self-supervised foundation ECG model pre-trained on 9.1 million 12-lead ECGs from four countries and diverse patient populations, and evaluated through fine-tuning on 21 independent datasets spanning more than 1.6k diagnostic and prognostic targets across adults and paediatric cohorts, including single-lead settings. These tasks cover conditions for which the ECG is the primary diagnostic modality, provides supportive but non-definitive diagnostic information, or enables acute-care prediction and prognostic modelling. Available in three model sizes to characterise scaling behaviour and support diverse computational constraints, HuBERT-ECG achieves AUROC ranging from 84% to 99% on ECG-primary diagnostic tasks, 76% to 97% on supportive diagnostic tasks, 74% to 91% on prognostic prediction tasks, and 88% to 92% on single-lead ECG benchmarks. Moreover, a large-scale multitask fine-tuning across 2.4 million subjects and 164 tasks simultaneously shows that AUROC further increases for clinically relevant tasks without extra task-specific supervision. We release pretrained models and code as building baselines.

Models

This repository contains the self-supervised pre-trained hubert-ecg-base model.

Code

Visit the GitHub repository for more details and information on how to use HuBERT-ECG on your own data.

import hubert_ecg  # registers custom model types with AutoModel
from transformers import AutoModel

model = AutoModel.from_pretrained("Edoardo-BS/hubert-ecg-base")

Alternatively, to load a legacy .pt checkpoint:

from hubert_ecg import HuBERTECG
model = HuBERTECG.from_pretrained_legacy("path/to/old_checkpoint.pt")

IMPORTANT NOTE

Don't forget to pre-process your data! Read the paper to learn how.
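As a rough illustration of what a pre-processing step can look like, the sketch below resamples each lead, crops it to a fixed window, and rescales it to [-1, 1]. Everything in it is an assumption made for illustration: the helper names (`resample_linear`, `scale_to_unit_range`, `preprocess`), the target sampling rate, the window length, and the scaling range are hypothetical and are not taken from the paper or from the `hubert_ecg` package. Use the paper and the GitHub repository for the actual procedure and values.

```python
import math
from typing import List, Sequence

# Hypothetical settings for illustration only; take the real values from the paper.
TARGET_HZ = 100       # assumed target sampling rate (Hz)
WINDOW_SECONDS = 5    # assumed window length (s)


def resample_linear(signal: Sequence[float], src_hz: int, dst_hz: int) -> List[float]:
    """Resample one lead by linear interpolation (no anti-aliasing filter)."""
    if len(signal) < 2 or src_hz == dst_hz:
        return list(signal)
    n_out = len(signal) * dst_hz // src_hz
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz               # position in source samples
        lo = min(int(pos), len(signal) - 1)     # left neighbour
        hi = min(lo + 1, len(signal) - 1)       # right neighbour
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[hi] * frac)
    return out


def scale_to_unit_range(signal: Sequence[float]) -> List[float]:
    """Min-max scale one lead to [-1, 1]; a flat lead maps to all zeros."""
    if not signal:
        return []
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0.0] * len(signal)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]


def preprocess(ecg: Sequence[Sequence[float]], src_hz: int) -> List[List[float]]:
    """Resample, crop, and scale every lead of a multi-lead recording."""
    n_keep = TARGET_HZ * WINDOW_SECONDS
    processed = []
    for lead in ecg:
        resampled = resample_linear(lead, src_hz, TARGET_HZ)[:n_keep]
        processed.append(scale_to_unit_range(resampled))
    return processed


# Synthetic 12-lead recording: 10 s of a 1 Hz sine sampled at 500 Hz.
synthetic = [
    [(k + 1) * math.sin(2 * math.pi * t / 500) for t in range(5000)]
    for k in range(12)
]
demo = preprocess(synthetic, src_hz=500)
print(len(demo), len(demo[0]))
```

The sketch stays dependency-free so it can be read on its own. On real recordings a filtered resampler (for example from SciPy) would be the more usual choice, since plain linear interpolation does not suppress aliasing.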

📚 Citation

If you use our models or find our work useful, please consider citing us:

https://doi.org/10.1101/2024.11.14.24317328

Safetensors

Model size: 93.1M params

Tensor type: F32