Processing Obj Meshes

CAJAL supports cell morphology data in the form of Wavefront *.obj files.

A *.obj file should consist of a series of lines, each of which is one of the following:

  • a comment, starting with “#” (discarded)

  • a vertex line, starting with “v” and followed by three floating-point xyz coordinates

  • a face line, starting with “f” and followed by three integers, which are indices of the vertices

All other lines will be ignored or discarded.
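
The line format described above can be illustrated with a short parser. This is a minimal sketch using only the Python standard library, not CAJAL's own reader; the name parse_obj_text is hypothetical. It assumes triangular faces and follows the Wavefront convention that face indices start at 1, so it subtracts 1 to obtain 0-based row indices.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]


def parse_obj_text(text: str) -> Tuple[List[Vertex], List[Face]]:
    """Parse vertex and triangular face lines from the text of an *.obj file.

    Hypothetical helper for illustration; not part of CAJAL.
    """
    vertices: List[Vertex] = []
    faces: List[Face] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        if fields[0] == "v" and len(fields) >= 4:
            x, y, z = (float(s) for s in fields[1:4])
            vertices.append((x, y, z))
        elif fields[0] == "f" and len(fields) == 4:
            # Wavefront indices start at 1; for "7/1/2" style entries keep
            # only the vertex index before the first slash.
            i, j, k = (int(s.split("/")[0]) - 1 for s in fields[1:4])
            faces.append((i, j, k))
        # Any other line type (vn, vt, g, ...) is ignored.
    return vertices, faces


sample = """# a single triangle
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
vn 0.0 0.0 1.0
f 1 2 3
"""
vertices, faces = parse_obj_text(sample)
```

Here the comment and the “vn” line are skipped, leaving three vertices and one face.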

For examples of compatible mesh files see the folder /CAJAL/data/obj_files in the CAJAL Git repository.

The sample_mesh.py file contains functions to help the user sample points from an *.obj file and compute the geodesic distances between points.

class cajal.sample_mesh.VertexArray

A sample_mesh.VertexArray is a numpy array of shape (n, 3), where n is the number of vertices in the mesh.

Each row of a sample_mesh.VertexArray is an XYZ coordinate triple for a point in the mesh.

Value

numpy.typing.NDArray[numpy.float_]

class cajal.sample_mesh.FaceArray

A FaceArray is a numpy array of shape (m, 3), where m is the number of faces in the mesh. Each row of a FaceArray consists of three natural numbers, which are indices into the associated VertexArray and together represent the triangular face joining those three points.

Value

numpy.typing.NDArray[numpy.int_]
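
To make the two array types concrete, the sketch below builds them by hand for a tetrahedron. The variable names are illustrative only, and 0-based face indices are assumed. Indexing the vertex array with the face array gathers the corner coordinates of every triangle.

```python
import numpy as np

# Four vertices of a tetrahedron: shape (n, 3) with n = 4.
vertices = np.array(
    [
        [0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ],
    dtype=float,
)

# Four triangular faces: shape (m, 3) with m = 4.
# Assumption: each entry is a 0-based row index into `vertices`.
faces = np.array(
    [
        [0, 1, 2],
        [0, 1, 3],
        [0, 2, 3],
        [1, 2, 3],
    ],
    dtype=int,
)

# Integer-array indexing gathers corner coordinates: shape (m, 3, 3).
triangles = vertices[faces]

# Area of each triangle from the cross product of two edge vectors.
edge_a = triangles[:, 1] - triangles[:, 0]
edge_b = triangles[:, 2] - triangles[:, 0]
areas = 0.5 * np.linalg.norm(np.cross(edge_a, edge_b), axis=1)
```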

read_obj(file_path: str) → Tuple[sample_mesh.VertexArray, sample_mesh.FaceArray]

Reads in the vertices and triangular faces of a .obj file.

Parameters

file_path (str) – Path to .obj file

Returns

Ordered pair (vertices, faces), where:

  • vertices is an array of 3D floating-point coordinates of shape (n,3), where n is the number of vertices in the mesh

  • faces is an array of shape (m,3), where m is the number of faces; the k-th row gives the indices for the vertices in the k-th face.

Return type

Tuple[sample_mesh.VertexArray, sample_mesh.FaceArray]
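
A typical use of the returned pair is sketched below. The call to read_obj follows the signature documented above; the helpers check_mesh and load_checked are hypothetical, and the range check assumes that the face indices are 0-based row indices into vertices.

```python
import numpy as np


def check_mesh(vertices: np.ndarray, faces: np.ndarray) -> None:
    """Raise ValueError if the arrays do not have the documented shapes.

    Hypothetical helper, not part of CAJAL.
    """
    if vertices.ndim != 2 or vertices.shape[1] != 3:
        raise ValueError(f"expected vertices of shape (n, 3), got {vertices.shape}")
    if faces.ndim != 2 or faces.shape[1] != 3:
        raise ValueError(f"expected faces of shape (m, 3), got {faces.shape}")
    # Assumption: faces hold 0-based row indices into vertices.
    if faces.size and (faces.min() < 0 or faces.max() >= vertices.shape[0]):
        raise ValueError("face index out of range")


def load_checked(file_path: str):
    """Read an *.obj file with CAJAL and validate the result."""
    from cajal.sample_mesh import read_obj  # requires CAJAL to be installed

    vertices, faces = read_obj(file_path)
    check_mesh(vertices, faces)
    return vertices, faces
```

Validating the shapes immediately after loading makes a malformed file fail early, rather than during distance computation.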

compute_icdm_all(infolder: str, out_csv: str, metric: Union[Literal['euclidean'], Literal['geodesic']], n_sample: int = 50, num_processes: int = 8, segment: bool = True, method: Union[Literal['networkx'], Literal['heat']] = 'heat') → List[str]

Go through every Wavefront *.obj file in the given input directory infolder and compute intracell distances according to the given metric. Write the results to an output *.csv file named out_csv.

Parameters
  • infolder (str) – Folder full of *.obj files.

  • out_csv (str) – Output will be written to a *.csv file titled out_csv.

  • metric (Union[Literal['euclidean'], Literal['geodesic']]) – How to compute the distance between points.

  • n_sample (int) – How many points to sample from each cell.

  • num_processes (int) – Number of independent processes which will be created. It is recommended to set this equal to the number of cores on your machine.

  • method (Union[Literal['networkx'], Literal['heat']]) – How to compute geodesic distance. The “networkx” method is more precise, and takes between 5 and 15 seconds for a cell with 50 sample points. The “heat” method is a faster but rougher approximation, and takes between 0.05 and 0.15 seconds for a cell with 50 sample points. This flag is not relevant if the user is sampling Euclidean distances.

  • segment (bool) – If segment is True, each *.obj file is segmented into its connected components before sampling, so an *.obj file with multiple connected components is understood to contain multiple distinct cells. If segment is False, each *.obj file is understood to contain a single cell, and points are sampled accordingly. If segment is False and metric is “geodesic”, then whenever an *.obj file contains multiple connected components, the function attempts to “repair” the mesh by adjoining new faces to the complex so that a sensible notion of geodesic distance can be computed between any two points. The user is warned that this imputation carries the same consequences for the scientific interpretation of the results as any other kind of data imputation for incomplete data sets.

Returns

Names of cells for which sampling failed because the cells have fewer than n_sample points.

Return type

List[str]
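
A call might look like the sketch below, which follows the signature documented above. The folder and file names are placeholders, and choose_num_processes and sample_geodesic are hypothetical helpers. The first applies the recommendation to match the number of cores by using os.cpu_count(), which reports logical CPUs and may return None.

```python
import os
from typing import List


def choose_num_processes(default: int = 8) -> int:
    """Return the logical CPU count, or `default` if it cannot be determined.

    Hypothetical helper, not part of CAJAL.
    """
    return os.cpu_count() or default


def sample_geodesic(infolder: str, out_csv: str) -> List[str]:
    """Sample 50 points per cell and write geodesic distances to out_csv."""
    from cajal.sample_mesh import compute_icdm_all  # requires CAJAL

    failed = compute_icdm_all(
        infolder=infolder,
        out_csv=out_csv,
        metric="geodesic",
        n_sample=50,
        num_processes=choose_num_processes(),
        segment=True,
        method="heat",  # faster, rougher approximation; see `method` above
    )
    return failed


# Example call (placeholder paths):
# failed = sample_geodesic("path/to/obj_files", "icdm_geodesic.csv")
# print(f"{len(failed)} cells had fewer than 50 points")
```

The returned list can be inspected to see which cells were skipped, for example to rerun them with a smaller n_sample.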