Skeletonization using the pychunkedgraph
Save time by generating robust neuronal skeletons directly from a ChunkedGraph dynamic segmentations!
What you need
For skeletons you just need a dynamic segmentation running on a PyChunkedGraph server.
To also use annotations, you also need a database backend running the MaterializationEngine.
Ids in the PCG combine information about chunk level, spatial location, and unique object id. This package uses the highest-resolution chunking, level 2, to derive neuronal topology and approximate spatial extent. For clarity, I'll define a few terms:
L2 object: Collection of attributes (supervoxels, mesh, etc) associated with a level 2 id.
L2 mesh: The fragment of mesh associated with a level 2 id.
Chunk index/chunk index space: The integer 3-d index of a chunk in the the grid that covers the volume, i.e. the chunk index space.
Euclidean position/Euclidean space: A position in nanometers / biological space.
Chunk position: The center of a specific chunk index in Euclidean space
Refinement: Mapping the position of an L2 object from its chunk position to a more representitive point on the mesh.
How to use
There are a few potentially useful functions with progressively more features:
This function returns a meshparty skeleton with vertices given in chunk index space.
def chunk_index_skeleton(root_id, client=None, datastack_name=None, cv=None, root_point=None, invalidation_d=3, return_mesh=False, return_l2dict=False, return_mesh_l2dict=False, root_point_resolution=[4, 4, 40], root_point_search_radius=300, n_parallel=1):
datastack_namemust be provided, and a cloudvolume object is suggested for performance reasons, especially if running in bulk.
root_pointis set, the function looks within
root_point_search_radiusfor the closest point in the root_id segmentation and sets the associated level 2 ID as the root vertex.
root_point_search_radiusis in nanometers.
root_pointis in units of
root_point_resolution, such that root_point * root_point_resolution should be correct in Euclidean space.
return_meshwill also return the full graph of level 2 objects as a
meshparty.Meshthat has no faces and uses only link edges. Vertices are in chunk index space.
return_l2dictwill also return a tuple of two dictionaries. The first,
l2dict, is a dict mapping level 2 id to skeleton vertex index. This is one to many, since it includes passing through multiple level 2 ids that belong to the mesh and are collapsed into a common skeleton vertex. The second,
l2dict_reversed, is a dict mapping skeleton vertex to level 2 id. This is one to one.
return_mesh_l2dictwill return an additional tuple of two dictionaries. The first maps level 2 ids to mesh vertices, the second mesh vertex index to level 2 id.
n_parallelis passed directly to cloudvolume and can speed up mesh fragment downloading.
Map a skeleton in chunk index space into Euclidean space.
def refine_chunk_index_skeleton( sk_ch, l2dict_reversed, cv, refine_inds='all', scale_chunk_index=True, root_location=None, nan_rounds=20, return_missing_ids=False, ):
sk_chis assumed to have positions in chunk index space.
l2dict_reversedis the second dict that comes back in the tuple you get by setting
refine_indscan be either
"all"or a specific array of vertex indices. This parameter sets which vertices to refine. A handful of vertices can be processed very quickly, while a more accurate skeleton is generated by refining all vertices.
scale_chunk_index, if True, will still map unrefined vertex indices from their chunk index into the chunk position in eucliden space.
nan_roundsgives how many times to try to smoothly interpolate any vertices that did not get a position due to a missing L2 mesh fragment.
return_missing_idswill also return a list of the ids of any missing L2 mesh fragment, for example if you want to re-run meshing on them.
Directly build a skeleton in Euclidean space, plus optionally handle soma points.
def pcg_skeleton(root_id, client=None, datastack_name=None, cv=None, refine='all', root_point=None, root_point_resolution=[4, 4, 40], root_point_search_radius=300, collapse_soma=False, collapse_radius=10_000.0, invalidation_d=3, return_mesh=False, return_l2dict=False, return_l2dict_mesh=False, return_missing_ids=False, nan_rounds=20, n_parallel=1):
refinecan take five values to determine which vertices to refine.
"all": Refine all vertices
"ep": Refine only end points.
"bp": Refine only branch points.
"epbp": Refine both branch and end points.
None: Don't refine any vertex, but still map chunk positions coarsely.
Options which download few mesh fragments are much faster than
collapse_soma, if set to True, collapses all mesh vertices within
collapse_radiusinto the root point. Note that the root point is not a new point, but rather the closest level 2 object to the root point location.
Build a meshwork file with skeleton out from the level 2 graph. Optionally, attach pre- and/or post-synaptic synapses.
def pcg_meshwork(root_id, datastack_name=None, client=None, cv=None, refine='all', root_point=None, root_point_resolution=[4, 4, 40], root_point_search_radius=300, collapse_soma=False, collapse_radius=DEFAULT_COLLAPSE_RADIUS, synapses=None, synapse_table=None, remove_self_synapse=True, invalidation_d=3, n_parallel=4, ):
The resulting meshwork file comes back with a "mesh" made out of the level 2 graph with vertices mapped to their chunk positions, a skeleton with
collapse_somaoptions as above, and one or more annotations.
All resulting meshworks have the annotation
lvl2_ids, which is based on a dataframe with column
lvl2_idthat has level 2 ids and
mesh_indthat has the associated mesh index. One can use the MeshIndex/SkeletonIndex properties like
nrn.anno.mesh_index.to_skel_indexto see the associated skeleton indices.
synapsesproperty is set to
"all", there is also an attempt to look up the root id's presynaptic synapses, postsynaptic synapses, or both (respectively). Presynaptic synapses are in an annotation
"pre_syn"and postsynaptic synapses are in an annotation called
If returning synapses, you must set the synapse table. By default, synapses whose pre- and post-synaptic ids are both the same root id are excluded, but this can be turned off by setting
A minimal example to get the skeleton of an arbitrary neuron with root id
864691135761488438 and soma at the voxel-space location
253870, 236989, 20517 in the Minnie dataset:
from caveclient import CAVEclient import pcg_skel client = CAVEclient('minnie65_phase3_v1') oid = 864691135761488438 # Root id root_point = [253870, 236989, 20517] # root point in vertex coordinates sk_l2 = pcg_skel.pcg_skeleton(oid, client=client, refine='all', root_point=root_point, root_point_resolution=[4,4,40], collapse_soma=True, n_parallel=8)
To generate a skeleton with 1310 vertices took about 80 seconds on a 2019 MacBook Pro, all refined through the mesh fragments.
Most of the time is spent in refinement.
If you just select the
refine="epbp" argument, it only refines the 79 branch and end points and accordingly takes a mere 12.5 seconds.
It is worth exploring sparse refinement options and interpolation if processing time is extremely important.
A common use case is to have a set of neurons that you are analyzing while proofreading is still ongoing. However, because proofreading events leave most of the neuron unchanged we shouldn't need to do more than update new locations. Conveniently, the level 2 ids let us follow this intuition. While a neuron's root id changes for each proofreading event, the level 2 id only changes if an edit occurs exactly within that region. Therefore, once we've looked up a level 2 id once, we can cache it and save time for future iterations on the same neuron.
The current implementation of caching uses SQLiteDict, a simple means of using a sqlite file as a persistent key-value store.
The cache is used with the functions
pcg_meshwork if you specify the filename of the sqlite database.
sk_l2 = pcg_skel.pcg_skeleton(oid, ..., cache='l2_cache.sqlite', save_to_cache=False, )
Note that if the database is not yet present, it will be created automatically.
In order to avoid unintentional changes, new locations are not saved to the database by default.
If you want to save new ids to the database as you are skeletonizing files, set the additional parameter
For adding data to the database post-hoc, please use the function
Localizing level 2 ids with mesh fragments is much faster than downloading the segmentation for a whole chunk just to get a few supervoxels, but sometimes mesh fragments don't exist. This can occur for both good reasons (the L2 object is too small to get a mesh) and due to bugs in the meshing backend. However, if you really want to get such locations correct, falling back to the segmentation is available.
You can use this by setting the
segmentation_fallback option on any of the functions that localize vertices.
sk_l2 = pcg_skel.pcg_skeleton(oid, ..., segmentation_fallback=True, fallback_mip=2, )
segmentation_fallback=True activates the capability, and setting
fallback_mip will choose the MIP-level to use (2 by default).
If the chosen MIP does not have voxels for the L2 id in it, it will try 0 next.
Segmentation-based points will be cached if both features are used.
Note that this approach is much slower and more memory intensive than using the mesh fragments. I have found it to take 4-8 seconds per vertex with the current implementation, although parallelation helps.
- Improve/document/test tooling for additional annotations.
- Alternative key-value stores for caching
This work is by Casey Schneider-Mizell (firstname.lastname@example.org) with suggestions from Sven Dorkenwald and Forrest Collman.
Release history Release notifications | RSS feed
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.