CAJAL: a Python package for the analysis of single-cell morphological data

CAJAL is a Python package for exploring and analyzing the morphology of cells, and the relationship between morphology and other single-cell data, using the Gromov-Wasserstein (GW) distance. This distance quantifies how much stretching or bending is needed, at a minimum, to transform the shape of one cell into that of another. A key benefit of the GW distance is that it requires no prior knowledge or model of cell morphology. This makes CAJAL suitable for studying arbitrarily heterogeneous mixtures of cells with highly complex and diverse morphologies that may defy straightforward classification.
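For readers who want the underlying definition, the GW distance between two cells can be written as follows. This is the standard textbook formulation for finite metric measure spaces, given here as a sketch; the symbols are assumptions introduced for illustration, and the exact normalization and choice of intracellular distance used by CAJAL may differ. Here X = {x_i} and Y = {y_j} are points sampled from the two cells, d_X and d_Y are the pairwise distances between points within each cell, and \mu ranges over couplings whose marginals are the sampling weights \mu_X and \mu_Y (for example, uniform).

```latex
d_{GW}(X, Y) = \frac{1}{2} \min_{\mu \in \Pi(\mu_X, \mu_Y)}
  \left( \sum_{i,k} \sum_{j,l}
    \left| d_X(x_i, x_k) - d_Y(y_j, y_l) \right|^{2} \mu_{ij}\, \mu_{kl}
  \right)^{1/2}
```

Because only the intracellular distances d_X and d_Y enter the formula, the result does not depend on how either cell is positioned or oriented in space.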

The morphological distance produced by CAJAL is a bona fide mathematical distance in a latent space of cell morphologies. In this latent space, each cell is represented by a point, and the distance between two cells indicates the amount of physical deformation needed to change the morphology of one into that of the other. Formulating the problem in this way lets CAJAL use standard statistical and machine learning approaches to define cell populations based on their morphology; reduce the dimensionality of cell morphology spaces and visualize them; and integrate cell morphology spaces across tissues, technologies, and other single-cell data modalities, among other analyses.
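To illustrate how a matrix of pairwise morphological distances can feed a standard clustering step, here is a minimal, dependency-free sketch. It does not use CAJAL's API: the function name `cluster_by_threshold` is hypothetical, the distance values are made-up toy numbers rather than CAJAL output, and single-linkage clustering with a fixed threshold is just one simple choice among many.

```python
def cluster_by_threshold(dist, threshold):
    """Single-linkage clustering on a precomputed distance matrix.

    Two cells end up in the same cluster if they are connected by a chain
    of cells whose consecutive distances are all <= threshold.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one node per cell

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two clusters

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy symmetric distance matrix for four cells (illustrative values only).
gw_dist = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]

labels = cluster_by_threshold(gw_dist, threshold=0.3)
print(labels)  # [0, 0, 1, 1]
```

In practice the same distance matrix could be passed to any method that accepts precomputed distances, such as hierarchical clustering or multidimensional scaling from an established library.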
