MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering

MedSteer is a training-free framework for steering a fine-tuned Diffusion Transformer (DiT) at inference time, enabling controllable counterfactual synthesis of endoscopic images. By intercepting cross-attention activations inside the transformer blocks and shifting them along concept directions, MedSteer can generate counterfactual pairs (e.g., removing or adding pathological features like polyps) while preserving the underlying anatomy and texture.
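
To make the idea of shifting activations along a concept direction concrete, here is a minimal pure-Python sketch on a toy vector. It is not MedSteer's implementation: the function names (`normalize`, `steer`, `suppress`), the `strength` parameter, and the choice of projection removal for suppression are all assumptions made for illustration. In the real pipeline the same kind of operation would be applied to cross-attention activations inside the DiT blocks.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]

def steer(activation, direction, strength):
    """Shift one activation vector along a unit concept direction.

    strength > 0 pushes toward the concept, strength < 0 pushes away.
    (Hypothetical helper, not part of the medsteer package.)
    """
    d = normalize(direction)
    return [a + strength * di for a, di in zip(activation, d)]

def suppress(activation, direction):
    """Remove the component of the activation that lies along the direction.

    Projection removal is one common way to suppress a concept; the
    model card does not state that MedSteer uses exactly this rule.
    """
    d = normalize(direction)
    coeff = sum(a * di for a, di in zip(activation, d))
    return [a - coeff * di for a, di in zip(activation, d)]

# Toy 3-dimensional "token activation" and a hypothetical polyp direction.
h = [2.0, 1.0, 0.0]
polyp_direction = [1.0, 0.0, 0.0]

h_added = steer(h, polyp_direction, strength=0.5)   # [2.5, 1.0, 0.0]
h_removed = suppress(h, polyp_direction)            # [0.0, 1.0, 0.0]
```

Components orthogonal to the direction are left unchanged in both cases, which is the intuition behind preserving anatomy and texture while a single concept is edited.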

Installation

MedSteer depends on a vendored fork of the diffusers library, which ships inside the repository and is installed before the MedSteer package itself.

# Clone the repository
git clone https://github.com/phamtrongthang123/medsteer
cd medsteer

# Install the vendored diffusers fork
pip install -e diffusers/

# Install MedSteer and dependencies
pip install -e .

Sample Usage

The following example loads the model and generates a baseline image. Counterfactual generation with the "suppress" mode additionally requires precomputed direction vectors.

import torch
import transformers.utils as _tu
from huggingface_hub import snapshot_download
from medsteer import MedSteerPipeline

# Compatibility shim: newer transformers removed FLAX_WEIGHTS_NAME.
if not hasattr(_tu, "FLAX_WEIGHTS_NAME"):
    _tu.FLAX_WEIGHTS_NAME = "diffusion_flax_model.msgpack"

# 1. Download the LoRA checkpoint from the Hub
lora_path = snapshot_download(
    repo_id="phamtrongthang/medsteer",
    local_dir="medsteer_ckpt",
)

# 2. Load the model
pipe = MedSteerPipeline.from_pretrained(
    model_id="PixArt-alpha/PixArt-XL-2-512x512",
    lora_path=lora_path,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# 3. Baseline generation
image = pipe.generate(
    prompt="An endoscopic image of dyed lifted polyps",
    seed=42,
    num_steps=20,
    mode="baseline",
)
image.save("baseline.png")
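
The "suppress" mode relies on precomputed direction vectors, but this page does not describe how they are obtained. As a rough illustration only, the sketch below estimates a direction as the normalized difference between the mean activation of samples that contain a concept and the mean of samples that do not. This difference-of-means recipe is a common choice in activation steering and is an assumption here, not MedSteer's documented procedure. The names `mean_vector` and `concept_direction` and the toy activations are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def concept_direction(with_concept, without_concept):
    """Unit vector pointing from the 'without' mean to the 'with' mean.

    Assumed recipe (difference of means); the real direction vectors
    may be computed differently.
    """
    mu_pos = mean_vector(with_concept)
    mu_neg = mean_vector(without_concept)
    diff = [p - q for p, q in zip(mu_pos, mu_neg)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("the two groups have identical means")
    return [x / norm for x in diff]

# Toy 2-dimensional activations standing in for cached model activations.
polyp_acts = [[3.0, 1.0], [5.0, 1.0]]     # prompts that mention polyps
normal_acts = [[1.0, 1.0], [1.0, 1.0]]    # prompts that do not

direction = concept_direction(polyp_acts, normal_acts)  # [1.0, 0.0]
```

In this toy case the two groups differ only in the first coordinate, so the estimated direction isolates that coordinate.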

Citation

@article{pham2026medsteer,
  title={MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering},
  author={Pham, Trong-Thang and Nguyen, Loc and Nguyen, Anh and Nguyen, Hien and Le, Ngan},
  journal={arXiv preprint arXiv:2603.07066},
  year={2026}
}