{ "cells": [ { "cell_type": "markdown", "id": "e7b5a7ab-99b6-4b04-a18d-2e95101802b0", "metadata": {}, "source": [ "Tutorial 1: Predicting the Molecular Type of Neurons\n", "====================================================\n", "To demonstrate some of the main functionalities of CAJAL,\n", "here we perform some basic analysis on a set of neuron\n", "morphological reconstructions obtained from the\n", "[Allen Brain Atlas](https://celltypes.brain-map.org/). To facilitate\n", "the analysis, we provide a compressed \\*.tar.gz file containing the \\*.SWC\n", "files of 509 neurons used in this example, which can be downloaded directly from this\n", "[link](https://www.dropbox.com/s/aq0ovetjtqihf4f/allen_brain_atlas_509_SWCs_mouse_full_or_dendrite_only.tar.gz). In this tutorial we assume that the SWC files are located in the folder `/home/jovyan/swc`. More information about this dataset can be found at:\n", "\n", "\\- Gouwens, N. W. et al. [Classification of electrophysiological and morphological neuron types in the mouse visual cortex.](https://www.nature.com/articles/s41593-019-0417-0) Nat Neurosci 22, 1182-1195 (2019).\n", "\n", "For this analysis, we focus on the morphology of the dendrites and exclude the\n", "axons of the neurons. To achieve this, we set `structure_ids = [1,3,4]`,\n", "which tells CAJAL to only sample points from the soma and the basal and apical\n", "dendrites. 
We sample 100 points from each neuron and compute the Euclidean distance\n", "between each pair of points in that neuron using the following code:" ] }, { "cell_type": "code", "execution_count": 1, "id": "e901e5a9-ff30-4396-8c5a-e7aa4738dbcc", "metadata": {}, "outputs": [], "source": [ "import cajal.sample_swc\n", "import cajal.swc\n", "from os.path import join" ] }, { "cell_type": "code", "execution_count": 2, "id": "767f67d7-84e5-4a62-b2ed-5b3b273e4ee6", "metadata": {}, "outputs": [], "source": [ "bd = \"/home/jovyan/\" # Base directory" ] }, { "cell_type": "code", "execution_count": 3, "id": "1683a8ff-28a5-408c-a3d6-a010a4db1629", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|█████████▉| 508/509 [00:23<00:00, 21.23it/s]\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cajal.sample_swc.compute_icdm_all_euclidean(\n", " infolder=join(bd, 'swc'),\n", " out_csv=join(bd, 'swc_bdad_100pts_euclidean_icdm.csv'),\n", " out_node_types=join(bd, 'swc_bdad_100pts_euclidean_node_types.npy'),\n", " preprocess=cajal.swc.preprocessor_eu(\n", " structure_ids=[1,3,4],\n", " soma_component_only=False),\n", " n_sample=100)" ] }, { "cell_type": "markdown", "id": "01806ceb-596a-4b46-9a56-1aa895e002a6", "metadata": {}, "source": [ "Once the sampling is complete, we compute the Gromov-Wasserstein distance\n", "between each pair of neurons and collect the results in a distance matrix,\n", "using the following code:" ] }, { "cell_type": "code", "execution_count": 5, "id": "94bcc1c3-2c56-4edb-8d4c-61461f2330e1", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d9547a464ba84444b914d23c9c2ce6bb", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/129286 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "(array([[ 0. 
, 48.00771782, 23.044745 , ..., 31.6661131 ,\n", " 28.23034998, 15.52264516],\n", " [48.00771782, 0. , 58.39010605, ..., 68.06746526,\n", " 66.14810561, 55.62932524],\n", " [23.044745 , 58.39010605, 0. , ..., 19.22607515,\n", " 19.66919727, 23.99896955],\n", " ...,\n", " [31.6661131 , 68.06746526, 19.22607515, ..., 0. ,\n", " 10.71261723, 30.50133611],\n", " [28.23034998, 66.14810561, 19.66919727, ..., 10.71261723,\n", " 0. , 29.15049728],\n", " [15.52264516, 55.62932524, 23.99896955, ..., 30.50133611,\n", " 29.15049728, 0. ]]),\n", " None)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import cajal.run_gw\n", "\n", "cajal.run_gw.compute_gw_distance_matrix(\n", " join(bd, 'swc_bdad_100pts_euclidean_icdm.csv'),\n", " join(bd, 'swc_bdad_100pts_euclidean_GW_dmat.csv'),\n", " gw_coupling_mat_npz_loc=join(bd, 'swc_bdad_100pts_euclidean_GW_couplings.npz'),\n", " num_processes=15)" ] }, { "cell_type": "markdown", "id": "1e91d043-ad59-42db-8da8-34d2312525b9", "metadata": {}, "source": [ "We can visualize the resulting space of cell morphologies using UMAP:" ] }, { "cell_type": "code", "execution_count": 4, "id": "a0edadb2-3e25-48b4-9b55-3ce5d2061202", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/patn/PGC012_Gromov_Wasserstein/venv/lib/python3.12/site-packages/umap/umap_.py:1780: UserWarning:\n", "\n", "using precomputed metric; inverse_transform will be unavailable\n", "\n" ] }, { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "