torch_em.data.datasets.electron_microscopy.zebrafinch
Zebrafinch Area X datasets for neuron and organelle segmentation in 3DEM.
Two FIB-SEM volumes of adult male zebra finch (Taeniopygia guttata) area X are available, both from the Kornfeld lab:
- j0251: 10 x 10 x 25 nm native resolution, full extent ~256 x 256 x 384 µm. Labels: neuron instance segmentation (~4.26 M neurons) and endoplasmic reticulum. Cell-type labels (17 types: MSN, GPe, GPi, HVC axons, interneurons, etc.) and synapse coordinates are available via the REST API at https://syconn.esc.mpcdf.mpg.de.
- j0126: 10 x 10 x 20 nm native resolution, full extent ~107 x 109 x 114 µm. Labels: neuron instance segmentation only.
Data is streamed from the Kornfeld lab public server via cloud-volume and cached locally as zarr v3 stores in (z, y, x) axis order.
This dataset is from the publication https://doi.org/10.1101/2025.10.25.684569. Please cite it if you use this dataset in your research.
The dataset is publicly available at https://syconn.esc.mpcdf.mpg.de. Requires cloud-volume: pip install cloud-volume.
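To get a feel for how much data a region contains before streaming it, the sketch below converts a bounding box in nm to voxel bounds and a (z, y, x) shape. It rounds lower bounds down and upper bounds up, in the same way as the module's `_zebrafinch_bbox_voxels` helper. The names `bbox_to_voxels` and `uncompressed_bytes` are illustrative and not part of torch_em, the resolutions are the native (mip=0) values quoted above, and the byte counts are uncompressed upper bounds because the cached store is compressed.

```python
import math

# Native (mip=0) voxel size in nm as (x, y, z), taken from the dataset description.
NATIVE_RESOLUTION_NM = {"j0251": (10, 10, 25), "j0126": (10, 10, 20)}


def bbox_to_voxels(bbox_nm, resolution_nm):
    """Convert (x_min, x_max, y_min, y_max, z_min, z_max) in nm to voxel bounds.

    Returns the voxel bounds in the same x/y/z order and the shape in (z, y, x) order.
    """
    x_min, x_max, y_min, y_max, z_min, z_max = bbox_nm
    rx, ry, rz = resolution_nm
    x0, x1 = math.floor(x_min / rx), math.ceil(x_max / rx)
    y0, y1 = math.floor(y_min / ry), math.ceil(y_max / ry)
    z0, z1 = math.floor(z_min / rz), math.ceil(z_max / rz)
    return (x0, x1, y0, y1, z0, z1), (z1 - z0, y1 - y0, x1 - x0)


def uncompressed_bytes(shape, bytes_per_voxel):
    """Size of a dense array with the given shape, ignoring compression."""
    n_voxels = 1
    for extent in shape:
        n_voxels *= extent
    return n_voxels * bytes_per_voxel


# A 10 µm cube at the origin of j0251.
bounds, shape = bbox_to_voxels((0, 10_000, 0, 10_000, 0, 10_000), NATIVE_RESOLUTION_NM["j0251"])
raw_bytes = uncompressed_bytes(shape, 1)    # raw is stored as uint8
label_bytes = uncompressed_bytes(shape, 8)  # labels are stored as uint64
print(shape, raw_bytes / 1e9, label_bytes / 1e9)
```

For this cube the shape is (400, 1000, 1000), which corresponds to 0.4 GB of raw data and 3.2 GB of labels before compression. The full extent of either volume is several orders of magnitude larger, so passing an explicit bounding box is usually the practical choice.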
"""Zebrafinch Area X datasets for neuron and organelle segmentation in 3DEM.

Two FIB-SEM volumes of adult male zebra finch (Taeniopygia guttata) area X are
available, both from the Kornfeld lab:

- j0251: 10 x 10 x 25 nm native resolution, full extent ~256 x 256 x 384 µm.
  Labels: neuron instance segmentation (~4.26 M neurons) and endoplasmic reticulum.
  Cell-type labels (17 types: MSN, GPe, GPi, HVC axons, interneurons, etc.) and
  synapse coordinates are available via the REST API at https://syconn.esc.mpcdf.mpg.de.
- j0126: 10 x 10 x 20 nm native resolution, full extent ~107 x 109 x 114 µm.
  Labels: neuron instance segmentation only.

Data is streamed from the Kornfeld lab public server via cloud-volume and cached
locally as zarr v3 stores in (z, y, x) axis order.

This dataset is from the publication https://doi.org/10.1101/2025.10.25.684569.
Please cite it if you use this dataset in your research.

The dataset is publicly available at https://syconn.esc.mpcdf.mpg.de.
Requires cloud-volume: pip install cloud-volume.
"""

import hashlib
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, Literal, Optional, Tuple, Union

import numpy as np
from tqdm import tqdm
from torch.utils.data import Dataset, DataLoader

import torch_em
from .. import util


J0251_BASE_URL = (
    "precomputed://https://syconn.esc.mpcdf.mpg.de"
    "/j0251_72_seg_20210127_agglo2_syn_20220811_celltypes_20230822"
)
J0126_BASE_URL = "precomputed://https://syconn.esc.mpcdf.mpg.de"

ZEBRAFINCH_DATASETS = {
    "j0251": {
        "em_url": f"{J0251_BASE_URL}/image",
        "seg_url": f"{J0251_BASE_URL}/segmentation",
        "er_url": f"{J0251_BASE_URL}/er",
        # Full extent ~256 x 256 x 384 µm at 10 x 10 x 25 nm native resolution.
        "bbox_nm": (0, 271190, 0, 273500, 0, 387350),
    },
    "j0126": {
        "em_url": f"{J0126_BASE_URL}/j0126/volume/image",
        "seg_url": f"{J0126_BASE_URL}/volume/segmentation",
        "er_url": None,
        # Full extent ~107 x 109 x 114 µm at 10 x 10 x 20 nm native resolution.
        "bbox_nm": (0, 106640, 0, 109130, 0, 114000),
    },
}

ZEBRAFINCH_CHUNK_SHAPE = (64, 128, 128)
ZEBRAFINCH_SHARD_SHAPE = (128, 512, 512)


def _zebrafinch_bbox_to_str(bbox):
    return hashlib.md5("_".join(str(v) for v in bbox).encode()).hexdigest()[:12]


def _zebrafinch_create_array(root, name, shape, dtype, is_label):
    from zarr.codecs import BloscCodec
    shuffle = "bitshuffle" if (np.issubdtype(dtype, np.integer) and is_label) else "shuffle"
    return root.create_array(
        name,
        shape=shape,
        chunks=ZEBRAFINCH_CHUNK_SHAPE,
        shards=ZEBRAFINCH_SHARD_SHAPE,
        dtype=dtype,
        compressors=BloscCodec(cname="zstd", clevel=6, shuffle=shuffle),
    )


def _zebrafinch_bbox_voxels(cv, x_min_nm, x_max_nm, y_min_nm, y_max_nm, z_min_nm, z_max_nm):
    scale = np.array(cv.resolution)
    x0 = int(np.floor(x_min_nm / scale[0]))
    x1 = int(np.ceil(x_max_nm / scale[0]))
    y0 = int(np.floor(y_min_nm / scale[1]))
    y1 = int(np.ceil(y_max_nm / scale[1]))
    z0 = int(np.floor(z_min_nm / scale[2]))
    z1 = int(np.ceil(z_max_nm / scale[2]))
    return x0, x1, y0, y1, z0, z1, (z1 - z0, y1 - y0, x1 - x0)


def _zebrafinch_download_to_zarr(cv, ds, x0g, y0g, z0g, name):
    shape = ds.shape  # (z, y, x)
    sz, sy, sx = ZEBRAFINCH_SHARD_SHAPE

    tasks = []
    for z0_ in range(0, shape[0], sz):
        for y0_ in range(0, shape[1], sy):
            for x0_ in range(0, shape[2], sx):
                z1_ = min(z0_ + sz, shape[0])
                y1_ = min(y0_ + sy, shape[1])
                x1_ = min(x0_ + sx, shape[2])
                tasks.append((
                    (z0_, z1_), (y0_, y1_), (x0_, x1_),
                    (x0g + x0_, x0g + x1_, y0g + y0_, y0g + y1_, z0g + z0_, z0g + z1_),
                ))

    target_dtype = np.dtype(ds.dtype)

    def worker(item):
        (z0_, z1_), (y0_, y1_), (x0_, x1_), (gx0, gx1, gy0, gy1, gz0, gz1) = item
        block = np.asarray(cv[gx0:gx1, gy0:gy1, gz0:gz1])
        if block.ndim == 4:
            block = block[..., 0]
        ds[z0_:z1_, y0_:y1_, x0_:x1_] = block.transpose(2, 1, 0).astype(target_dtype)

    with ThreadPoolExecutor(max_workers=8) as ex:
        futures = [ex.submit(worker, t) for t in tasks]
        for fut in tqdm(as_completed(futures), total=len(futures), desc=f"Downloading '{name}'", smoothing=0.05):
            fut.result()


def get_zebrafinch_data(
    path: Union[os.PathLike, str],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    download: bool = False,
) -> str:
    """Stream and cache a region of a zebrafinch dataset as a zarr v3 store.

    The zarr store contains:
    - raw: EM grayscale (uint8, z/y/x)
    - labels: neuron instance segmentation (uint64, z/y/x)
    - er: endoplasmic reticulum instance segmentation (uint64, z/y/x) - j0251 only.

    Args:
        path: Filepath to a folder where the cached zarr store will be saved.
        bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max).
            Defaults to the full volume extent for the chosen dataset.
        mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution
            (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
        dataset: Which specimen to use, either "j0251" or "j0126".
        download: Whether to stream and cache the data if not present.

    Returns:
        Filepath to the cached zarr store.
    """
    import zarr

    ds_info = ZEBRAFINCH_DATASETS[dataset]
    os.makedirs(str(path), exist_ok=True)
    bbox = bounding_box if bounding_box is not None else ds_info["bbox_nm"]
    bbox_hash = _zebrafinch_bbox_to_str(bbox)
    zarr_path = os.path.join(str(path), f"{dataset}_mip{mip}_{bbox_hash}.zarr")

    arrays_needed = ["raw", "labels"] + (["er"] if ds_info["er_url"] is not None else [])
    root = zarr.open_group(zarr_path, mode="a")
    missing = [k for k in arrays_needed if k not in root]
    if not missing:
        return zarr_path
    if not download:
        raise RuntimeError(
            f"No cached data at '{zarr_path}'. Set download=True to stream from the Kornfeld lab server."
        )

    try:
        from cloudvolume import CloudVolume
    except ImportError:
        raise ImportError("The 'cloud-volume' package is required: pip install cloud-volume")

    x_min_nm, x_max_nm, y_min_nm, y_max_nm, z_min_nm, z_max_nm = bbox
    print(f"Streaming zebrafinch {dataset} at mip={mip} ...")

    cv_kwargs = dict(use_https=True, mip=mip, progress=False, fill_missing=True, provenance={})
    em_cv = CloudVolume(ds_info["em_url"], **cv_kwargs)
    seg_cv = CloudVolume(ds_info["seg_url"], **cv_kwargs)

    ex0, ex1, ey0, ey1, ez0, ez1, em_shape = _zebrafinch_bbox_voxels(
        em_cv, x_min_nm, x_max_nm, y_min_nm, y_max_nm, z_min_nm, z_max_nm
    )
    sx0, sx1, sy0, sy1, sz0, sz1, seg_shape = _zebrafinch_bbox_voxels(
        seg_cv, x_min_nm, x_max_nm, y_min_nm, y_max_nm, z_min_nm, z_max_nm
    )
    shape = tuple(min(e, s) for e, s in zip(em_shape, seg_shape))

    root.attrs["bounding_box_nm"] = list(bbox)
    root.attrs["mip"] = mip

    if "raw" not in root:
        ds_raw = _zebrafinch_create_array(root, "raw", shape, np.dtype("uint8"), is_label=False)
        _zebrafinch_download_to_zarr(em_cv, ds_raw, ex0, ey0, ez0, name="raw")

    if "labels" not in root:
        ds_lbl = _zebrafinch_create_array(root, "labels", shape, np.dtype("uint64"), is_label=True)
        _zebrafinch_download_to_zarr(seg_cv, ds_lbl, sx0, sy0, sz0, name="labels")

    if "er" not in root and ds_info["er_url"] is not None:
        er_cv = CloudVolume(ds_info["er_url"], **cv_kwargs)
        rx0, rx1, ry0, ry1, rz0, rz1, er_shape = _zebrafinch_bbox_voxels(
            er_cv, x_min_nm, x_max_nm, y_min_nm, y_max_nm, z_min_nm, z_max_nm
        )
        shape_er = tuple(min(e, r) for e, r in zip(shape, er_shape))
        ds_er = _zebrafinch_create_array(root, "er", shape_er, np.dtype("uint64"), is_label=True)
        _zebrafinch_download_to_zarr(er_cv, ds_er, rx0, ry0, rz0, name="er")

    print(f"Cached to {zarr_path} (shape {shape})")
    return zarr_path


def get_zebrafinch_dataset(
    path: Union[os.PathLike, str],
    patch_shape: Tuple[int, int, int],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    label_choice: Literal["neurons", "er"] = "neurons",
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> Dataset:
    """Get a zebrafinch dataset for neuron or organelle segmentation.

    Args:
        path: Filepath to a folder where the cached zarr store will be saved.
        patch_shape: The patch shape (z, y, x) to use for training.
        bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max).
            Defaults to the full volume extent for the chosen dataset.
        mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution
            (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
        dataset: Which specimen to use, either "j0251" or "j0126".
        label_choice: Which segmentation to use as target. Either "neurons" or "er".
            "er" is only available for j0251.
        download: Whether to stream and cache data if not already present.
        offsets: Offset values for affinity computation used as target.
        boundaries: Whether to compute boundaries as the target.
        kwargs: Additional keyword arguments for `torch_em.default_segmentation_dataset`.

    Returns:
        The segmentation dataset.
    """
    assert len(patch_shape) == 3
    if label_choice == "er" and ZEBRAFINCH_DATASETS[dataset]["er_url"] is None:
        raise ValueError(f"label_choice='er' is not available for dataset='{dataset}'")
    zarr_path = get_zebrafinch_data(path, bounding_box, mip, dataset, download)

    label_key = "labels" if label_choice == "neurons" else "er"

    kwargs = util.update_kwargs(kwargs, "is_seg_dataset", True)
    kwargs, _ = util.add_instance_label_transform(
        kwargs, add_binary_target=False, boundaries=boundaries, offsets=offsets
    )

    return torch_em.default_segmentation_dataset(
        raw_paths=zarr_path,
        raw_key="raw",
        label_paths=zarr_path,
        label_key=label_key,
        patch_shape=patch_shape,
        **kwargs,
    )


def get_zebrafinch_loader(
    path: Union[os.PathLike, str],
    batch_size: int,
    patch_shape: Tuple[int, int, int],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    label_choice: Literal["neurons", "er"] = "neurons",
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> DataLoader:
    """Get the DataLoader for neuron or organelle segmentation in a zebrafinch dataset.

    Args:
        path: Filepath to a folder where the cached zarr store will be saved.
        batch_size: The batch size for training.
        patch_shape: The patch shape (z, y, x) to use for training.
        bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max).
            Defaults to the full volume extent for the chosen dataset.
        mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution
            (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
        dataset: Which specimen to use, either "j0251" or "j0126".
        label_choice: Which segmentation to use as target. Either "neurons" or "er".
            "er" is only available for j0251.
        download: Whether to stream and cache data if not already present.
        offsets: Offset values for affinity computation used as target.
        boundaries: Whether to compute boundaries as the target.
        kwargs: Additional keyword arguments for `torch_em.default_segmentation_dataset`
            or for the PyTorch DataLoader.

    Returns:
        The DataLoader.
    """
    ds_kwargs, loader_kwargs = util.split_kwargs(torch_em.default_segmentation_dataset, **kwargs)
    ds = get_zebrafinch_dataset(
        path=path,
        patch_shape=patch_shape,
        bounding_box=bounding_box,
        mip=mip,
        dataset=dataset,
        label_choice=label_choice,
        download=download,
        offsets=offsets,
        boundaries=boundaries,
        **ds_kwargs,
    )
    return torch_em.get_data_loader(ds, batch_size=batch_size, **loader_kwargs)
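The download helper in the source above splits the target array into shard-sized blocks and fetches each block in its own task. The following is a stand-alone sketch of just that tiling step in pure Python. `iter_shard_blocks` is a hypothetical name that does not exist in torch_em, and the shard shape is copied from `ZEBRAFINCH_SHARD_SHAPE`.

```python
from itertools import product

SHARD_SHAPE = (128, 512, 512)  # (z, y, x), same value as ZEBRAFINCH_SHARD_SHAPE


def iter_shard_blocks(shape, shard_shape=SHARD_SHAPE):
    """Yield ((z0, z1), (y0, y1), (x0, x1)) blocks that tile `shape`.

    Blocks start on shard boundaries and are clipped at the array border,
    so every block lies within a single shard.
    """
    starts = [range(0, extent, step) for extent, step in zip(shape, shard_shape)]
    for block_start in product(*starts):
        yield tuple(
            (start, min(start + step, extent))
            for start, step, extent in zip(block_start, shard_shape, shape)
        )


blocks = list(iter_shard_blocks((300, 1000, 1000)))
print(len(blocks))  # number of download tasks
print(blocks[0])    # first block
print(blocks[-1])   # last block, clipped at the array border
```

Because each task covers exactly one shard of the output array, two worker threads never write into the same shard.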
def get_zebrafinch_data(
    path: Union[os.PathLike, str],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    download: bool = False,
) -> str
Stream and cache a region of a zebrafinch dataset as a zarr v3 store.
The zarr store contains:
- raw: EM grayscale (uint8, z/y/x)
- labels: neuron instance segmentation (uint64, z/y/x)
- er: endoplasmic reticulum instance segmentation (uint64, z/y/x) - j0251 only.
Arguments:
- path: Filepath to a folder where the cached zarr store will be saved.
- bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max). Defaults to the full volume extent for the chosen dataset.
- mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
- dataset: Which specimen to use, either "j0251" or "j0126".
- download: Whether to stream and cache the data if not present.
Returns:
Filepath to the cached zarr store.
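The cached store is named after the dataset, the mip level and a short hash of the bounding box. The sketch below reproduces that naming scheme from the module source, so that you can predict which store a call will use. `cache_name` is a hypothetical helper, not a torch_em function.

```python
import hashlib


def cache_name(dataset, mip, bbox_nm):
    """Return the zarr store name for a (dataset, mip, bounding box) combination."""
    # Join the string form of each bound and keep the first 12 hex digits of the MD5 digest.
    bbox_hash = hashlib.md5("_".join(str(v) for v in bbox_nm).encode()).hexdigest()[:12]
    return f"{dataset}_mip{mip}_{bbox_hash}.zarr"


int_name = cache_name("j0251", 0, (0, 10_000, 0, 10_000, 0, 10_000))
float_name = cache_name("j0251", 0, (0.0, 10_000.0, 0.0, 10_000.0, 0.0, 10_000.0))
print(int_name)
print(float_name)
print(int_name == float_name)  # False: "10000" and "10000.0" hash differently
```

Since the hash is computed from the string form of each value, the same region given once as integers and once as floats maps to two different stores. Passing the bounding box with consistent numeric types avoids downloading a region twice.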
def get_zebrafinch_dataset(
    path: Union[os.PathLike, str],
    patch_shape: Tuple[int, int, int],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    label_choice: Literal["neurons", "er"] = "neurons",
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> Dataset
Get a zebrafinch dataset for neuron or organelle segmentation.
Arguments:
- path: Filepath to a folder where the cached zarr store will be saved.
- patch_shape: The patch shape (z, y, x) to use for training.
- bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max). Defaults to the full volume extent for the chosen dataset.
- mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
- dataset: Which specimen to use, either "j0251" or "j0126".
- label_choice: Which segmentation to use as target. Either "neurons" or "er". "er" is only available for j0251.
- download: Whether to stream and cache data if not already present.
- offsets: Offset values for affinity computation used as target.
- boundaries: Whether to compute boundaries as the target.
- kwargs: Additional keyword arguments for torch_em.default_segmentation_dataset.
Returns:
The segmentation dataset.
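As a usage sketch, the snippet below checks the label choice up front and then builds a dataset for a small region. `LABELS_PER_DATASET`, `check_label_choice` and `make_er_dataset` are illustrative names. The data folder and the bounding box are arbitrary example values that have not been checked for containing labelled structures. The torch_em import is kept inside the function so that the check itself runs without torch_em installed.

```python
# Which targets exist for which specimen, as stated in the documentation above.
LABELS_PER_DATASET = {"j0251": ("neurons", "er"), "j0126": ("neurons",)}


def check_label_choice(dataset, label_choice):
    """Raise a ValueError if the requested target is not available for the specimen."""
    if label_choice not in LABELS_PER_DATASET[dataset]:
        raise ValueError(f"label_choice='{label_choice}' is not available for dataset='{dataset}'")


def make_er_dataset(path="./data/zebrafinch"):
    """Build an ER segmentation dataset for a small j0251 region (streams data on first use)."""
    from torch_em.data.datasets.electron_microscopy.zebrafinch import get_zebrafinch_dataset

    check_label_choice("j0251", "er")
    return get_zebrafinch_dataset(
        path=path,
        patch_shape=(32, 256, 256),  # (z, y, x)
        # 5120 x 5120 x 3200 nm, i.e. 512 x 512 x 128 voxels at 10 x 10 x 25 nm.
        bounding_box=(0, 5_120, 0, 5_120, 0, 3_200),
        dataset="j0251",
        label_choice="er",
        boundaries=True,  # use boundary targets derived from the instance labels
        download=True,
    )


check_label_choice("j0251", "er")  # passes silently
try:
    check_label_choice("j0126", "er")
except ValueError as err:
    print(err)
```

According to the source, the first call for a j0251 region caches the raw data, the neuron labels and the ER labels together, regardless of `label_choice`.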
def get_zebrafinch_loader(
    path: Union[os.PathLike, str],
    batch_size: int,
    patch_shape: Tuple[int, int, int],
    bounding_box: Optional[Tuple[float, ...]] = None,
    mip: int = 0,
    dataset: Literal["j0251", "j0126"] = "j0251",
    label_choice: Literal["neurons", "er"] = "neurons",
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> DataLoader
Get the DataLoader for neuron or organelle segmentation in a zebrafinch dataset.
Arguments:
- path: Filepath to a folder where the cached zarr store will be saved.
- batch_size: The batch size for training.
- patch_shape: The patch shape (z, y, x) to use for training.
- bounding_box: Region in nm as (x_min, x_max, y_min, y_max, z_min, z_max). Defaults to the full volume extent for the chosen dataset.
- mip: MIP level for both EM and segmentation. Default mip=0 gives native resolution (10 x 10 x 25 nm for j0251, 10 x 10 x 20 nm for j0126).
- dataset: Which specimen to use, either "j0251" or "j0126".
- label_choice: Which segmentation to use as target. Either "neurons" or "er". "er" is only available for j0251.
- download: Whether to stream and cache data if not already present.
- offsets: Offset values for affinity computation used as target.
- boundaries: Whether to compute boundaries as the target.
- kwargs: Additional keyword arguments for torch_em.default_segmentation_dataset or for the PyTorch DataLoader.
Returns:
The DataLoader.
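A possible way to put the pieces together is sketched below. `centered_bbox_nm` and `make_loader` are hypothetical helpers, and the center, region size, batch size and patch shape are arbitrary example values. The comments on `num_workers` and `shuffle` rely on the keyword splitting shown in the source, where arguments that `torch_em.default_segmentation_dataset` does not accept are forwarded to the DataLoader.

```python
def centered_bbox_nm(center_nm, size_nm):
    """Return (x_min, x_max, y_min, y_max, z_min, z_max) for a box around a center.

    Both arguments are (x, y, z) tuples in nm; the sizes are assumed to be even.
    """
    bbox = []
    for center, size in zip(center_nm, size_nm):
        bbox.extend((center - size // 2, center + size // 2))
    return tuple(bbox)


def make_loader(path="./data/zebrafinch"):
    """Create a training loader for neuron boundaries in a small j0126 region."""
    from torch_em.data.datasets.electron_microscopy.zebrafinch import get_zebrafinch_loader

    return get_zebrafinch_loader(
        path=path,
        batch_size=2,
        patch_shape=(32, 256, 256),  # (z, y, x)
        # 1024 x 1024 x 256 voxels at the native 10 x 10 x 20 nm resolution of j0126.
        bounding_box=centered_bbox_nm((50_000, 50_000, 50_000), (10_240, 10_240, 5_120)),
        dataset="j0126",
        label_choice="neurons",
        boundaries=True,
        download=True,
        num_workers=4,  # not a dataset argument, so it is forwarded to the DataLoader
        shuffle=True,   # likewise forwarded to the DataLoader
    )


print(centered_bbox_nm((50_000, 50_000, 50_000), (10_240, 10_240, 5_120)))
```

Calling `make_loader()` streams the region on first use and reads from the cached zarr store afterwards.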