Vision-language encoder for chest CT scans and reports.

Project description

license: mit
language: en
pipeline_tag: zero-shot-image-classification

COLIPRI

COLIPRI is a 3D vision–language transformer model trained to encode chest CT scans and reports.

Model description

COLIPRI was trained using tens of thousands of chest CT scans and reports, without any annotations, using multiple objectives to learn strong joint representations of 3D images and text. The procedure is described in detail in our manuscript, Comprehensive language-image pre-training for 3D medical image understanding (Wald et al. 2026).

The weights shared here correspond to our best-performing model, COLIPRI-CRM.

Developed by: Microsoft Health Futures
Model type: 3D vision–language encoder
License: MIT

Uses

COLIPRI is shared for research purposes only. It is not meant to be used for clinical practice.

The encoders can be plugged into other models, or used independently or jointly for many downstream tasks, such as:

Image classification with text prompts
Image clustering
Text clustering
Text-to-image retrieval
Image-to-image retrieval
Image-to-text retrieval
Text-to-text retrieval
Image classification with a classifier
Text classification with a classifier
Image segmentation with a decoder
Report generation with a language decoder
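
The retrieval tasks in the list above usually reduce to ranking embeddings by similarity. Below is a minimal sketch of that idea, assuming each image or report has already been encoded into a single vector. The helper names (`cosine_similarity`, `rank_by_similarity`) and the toy 3-D vectors are illustrative and are not part of the `colipri` package.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 3-D vectors stand in for real embeddings (hypothetical values).
text_query = [1.0, 0.0, 0.0]
image_embeddings = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]
ranking = rank_by_similarity(text_query, image_embeddings)
print(ranking)  # [1, 2, 0]
```

The same ranking applies to any pairing in the list (text-to-image, image-to-text, image-to-image, text-to-text), as long as the query and the candidates live in the same embedding space.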

Fine-tuning COLIPRI is typically not necessary to obtain good performance in downstream tasks.

Getting started

Installation

pip install colipri
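
To check which version ended up in the current environment, a small sketch using only the standard library is shown below. The helper `installed_version` is a hypothetical name introduced here; it prints whatever version is installed, or `None` if the package is absent.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("colipri"))
```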

Usage examples

Below we share some usage snippets to get started with COLIPRI. A more complete Jupyter notebook is also available.

First, let's get a 3D chest CT we can use for demonstration. The plotted slices intersect a lung nodule near the heart.

>>> from colipri import load_sample_ct
>>> image = load_sample_ct()
>>> image
ScalarImage(shape: (1, 512, 512, 139); spacing: (0.76, 0.76, 2.50); orientation: LPS+; dtype: torch.IntTensor; memory: 139.0 MiB)

The image looks like this:

[Figure: Input CT]

Now, let's instantiate the model and processor.

>>> from colipri import get_model
>>> from colipri import get_processor
>>> model = get_model().cuda()
>>> processor = get_processor()

Zero-shot classification

>>> from colipri import ZeroShotImageClassificationPipeline
>>> pipeline = ZeroShotImageClassificationPipeline(model, processor)
>>> pipeline(image, ["No lung nodules", "Lung nodules"])
[
    {'score': 0.005, 'label': 'No lung nodules'},
    {'score': 0.995, 'label': 'Lung nodules'}
]
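
For intuition, scores like the ones above are commonly obtained in CLIP-style models by applying a softmax to temperature-scaled similarities between the image embedding and each prompt embedding. The sketch below illustrates that general recipe only; whether `ZeroShotImageClassificationPipeline` works exactly this way is an assumption, and the similarity values, the `temperature` default, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_scores(similarities, labels, temperature=0.07):
    """Turn per-prompt similarities into scores that sum to 1."""
    probs = softmax([s / temperature for s in similarities])
    return [{"score": p, "label": label} for p, label in zip(probs, labels)]


# Hypothetical cosine similarities between one image and two prompts.
results = zero_shot_scores([0.10, 0.47], ["No lung nodules", "Lung nodules"])
for r in results:
    print(f"{r['label']}: {r['score']:.3f}")
```

A lower temperature sharpens the distribution, so small differences in similarity can translate into very confident scores.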

Feature extraction

>>> import torch
>>> preprocessed_images = processor.process_images(image)
>>> preprocessed_images[0]
ScalarImage(shape: (1, 192, 192, 192); spacing: (2.00, 2.00, 2.00); orientation: SAR+; dtype: torch.FloatTensor; memory: 27.0 MiB)
>>> images_batch = processor.to_images_batch(preprocessed_images)
>>> images_batch.shape
torch.Size([1, 1, 192, 192, 192])
>>> with torch.no_grad():
...     patch_embeddings = model.encode_image(images_batch)
>>> patch_embeddings.shape
torch.Size([1, 768, 24, 24, 24])
>>> with torch.no_grad():
...     pooled_embeddings = model.encode_image(images_batch, pool=True, project=True)
>>> pooled_embeddings.shape
torch.Size([1, 768])
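
The shapes and spacings printed above can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from those outputs and assumes 4 bytes per voxel (32-bit integers and floats). The patch size it derives is inferred from the shapes and is not a documented property of the model; the helper names are illustrative.

```python
import math


def memory_mib(shape, bytes_per_voxel=4):
    """Memory of a dense array in MiB, given its shape and element size."""
    return math.prod(shape) * bytes_per_voxel / 2**20


def field_of_view_mm(spatial_shape, spacing):
    """Physical extent along each axis: number of voxels times spacing."""
    return tuple(n * s for n, s in zip(spatial_shape, spacing))


# Values copied from the printed outputs above.
raw_shape, raw_spacing = (1, 512, 512, 139), (0.76, 0.76, 2.50)
pre_shape, pre_spacing = (1, 192, 192, 192), (2.00, 2.00, 2.00)
grid = (24, 24, 24)  # spatial size of the patch embeddings

print(memory_mib(raw_shape))  # 139.0
print(memory_mib(pre_shape))  # 27.0
print(field_of_view_mm(raw_shape[1:], raw_spacing))
print(field_of_view_mm(pre_shape[1:], pre_spacing))  # (384.0, 384.0, 384.0)

# Voxels covered by each patch embedding along one axis, and its size in mm.
voxels_per_patch = pre_shape[1] // grid[0]
print(voxels_per_patch, voxels_per_patch * pre_spacing[0])  # 8 16.0
```

Under these assumptions, each of the 24 × 24 × 24 patch embeddings summarizes a cube of roughly 16 mm per side in the preprocessed volume.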

Biases, risks, and limitations

COLIPRI was trained with data from Turkey and the USA only, so it might be biased towards the populations represented in the training data. Underlying biases of the training datasets may not be well characterized.

Environmental impact

Hardware type: NVIDIA A100 GPUs
Hours used: 72 hours × 4 GPUs = 288 GPU-hours
Cloud provider: Azure
Compute region: West US 2
Carbon emitted: 21.6 kg CO₂ eq.
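
As a quick sanity check on the figures above, the sketch below recomputes the GPU-hours and derives the emission rate they imply. The per-GPU-hour rate is only what the two reported numbers imply together; it is not a value stated elsewhere in this card.

```python
hours_per_gpu = 72
num_gpus = 4
carbon_kg = 21.6  # kg CO2 eq., as reported above

gpu_hours = hours_per_gpu * num_gpus
kg_per_gpu_hour = carbon_kg / gpu_hours

print(gpu_hours)  # 288
print(round(kg_per_gpu_hour, 3))  # 0.075
```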

Compute infrastructure

COLIPRI was trained on Azure Machine Learning.

Hardware

Stage	Node type	Num. nodes	GPU type	GPUs per node
Pre-training	`Standard_NC96ads_A100_v4`	1	NVIDIA A100 (80 GB)	4
Evaluation	`Standard_NC24ads_A100_v4`	1	NVIDIA A100 (80 GB)	1

Software

The main software libraries used in this work were nnSSL for training, TorchIO for preprocessing and augmentation, nifti-zarr-py for data loading, and nnU-Net for segmentation evaluation.

Citation

BibTeX

@misc{
    wald2026_colipri,
    title={Comprehensive language-image pre-training for 3D medical image understanding},
    author={Tassilo Wald and Ibrahim Ethem Hamamci and Yuan Gao and Sam Bond-Taylor and Harshita Sharma and Maximilian Ilse and Cynthia Lo and Olesya Melnichenko and Anton Schwaighofer and Noel C. F. Codella and Maria Teodora Wetscherek and Klaus H. Maier-Hein and Panagiotis Korfiatis and Valentina Salvatelli and Javier Alvarez-Valle and Fernando P{\'e}rez-Garc{\'i}a},
    year={2026},
    eprint={2510.15042},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2510.15042},
}

APA

Wald, T., Hamamci, I. E., Gao, Y., Bond-Taylor, S., Sharma, H., Ilse, M., Lo, C., Melnichenko, O., Schwaighofer, A., Codella, N. C. F., Wetscherek, M. T., Maier-Hein, K. H., Korfiatis, P., Salvatelli, V., Alvarez-Valle, J., & Pérez-García, F. (2026). Comprehensive language-image pre-training for 3D medical image understanding. arXiv. https://doi.org/10.48550/ARXIV.2510.15042

Model card contact

Fernando Pérez-García (fperezgarcia@microsoft.com).

Project details

Release history

This version

0.1.0

Jan 25, 2026

Download files

Source Distribution

colipri-0.1.0.tar.gz (11.5 kB)

Uploaded Jan 25, 2026 Source

Built Distribution

colipri-0.1.0-py3-none-any.whl (20.1 kB)

Uploaded Jan 25, 2026 Python 3

File details

Details for the file colipri-0.1.0.tar.gz.

File metadata

Download URL: colipri-0.1.0.tar.gz
Upload date: Jan 25, 2026
Size: 11.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.26

File hashes

Hashes for colipri-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`b5ed04b704022369653e03b1eae694fee4e94b48d6ad81ee3c35042d16216ff4`
MD5	`c1b6679853fbec39796ab3dd485a4c08`
BLAKE2b-256	`91a69025fc5f37b05f57615d42d5d1d9267b9de21f7be0b364963bfebec10045`

File details

Details for the file colipri-0.1.0-py3-none-any.whl.

File metadata

Download URL: colipri-0.1.0-py3-none-any.whl
Upload date: Jan 25, 2026
Size: 20.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.26

File hashes

Hashes for colipri-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`65eecd2287eae4aefcadde4415644bdaceb9692593ba8bb2e7635331eacc0854`
MD5	`dd7a68649be5e7a17da05d5a688fe7cd`
BLAKE2b-256	`1d61b820d620a72bfb522049f96390028f4a2139e41d409447ee6e4eef7488ba`
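
A downloaded file can be compared against the SHA256 digests listed above with the standard library. This is a minimal sketch; the function names are illustrative, and the local file path in the commented usage line is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (hypothetical local path; digest copied from the table above):
# matches_expected(
#     "colipri-0.1.0-py3-none-any.whl",
#     "65eecd2287eae4aefcadde4415644bdaceb9692593ba8bb2e7635331eacc0854",
# )
```

Reading in chunks keeps memory use constant regardless of file size.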
