torch_em.data.datasets.electron_microscopy.humanneurons

The Human Neurons (H01) dataset contains a petascale serial-section EM volume of human cerebral cortex with dense automated neuron instance segmentation (C3 release).

The volume covers ~1 mm³ of human temporal cortex at 4 x 4 x 33 nm resolution (~1.4 PB raw uncompressed). The C3 automated segmentation is provided at 8 x 8 x 33 nm resolution, covering the same physical region.

The data is hosted on Google Cloud Storage and described in: Shapson-Coe et al. (2021), https://www.biorxiv.org/content/10.1101/2021.05.29.446289v4. Please cite this publication if you use the dataset in your research.

NOTE: Accessing this dataset requires the cloud-volume package (pip install cloud-volume).

NOTE (on data size): the full volume is 515,892 x 356,400 x 5,293 voxels at 8 x 8 x 33 nm (~350 TB raw, ~1.4 PB at 4 nm). Downloading the entire volume is not feasible. Data is instead streamed and cached locally as HDF5 files by specifying bounding boxes (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates.
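Bounding boxes are given in 8 nm voxel coordinates, so regions known in physical units need a conversion first. A minimal sketch of that conversion (the helper name `um_to_bbox` is hypothetical, not part of this module):

```python
def um_to_bbox(x0_um, x1_um, y0_um, y1_um, z0_um, z1_um):
    """Convert a physical region in micrometers to an (x_min, x_max, y_min, y_max,
    z_min, z_max) bounding box in 8 nm (xy) / 33 nm (z) voxel coordinates."""
    to_xy = lambda v: round(v / 0.008)  # 8 nm = 0.008 um per in-plane voxel
    to_z = lambda v: round(v / 0.033)   # 33 nm = 0.033 um per z slice
    return (to_xy(x0_um), to_xy(x1_um), to_xy(y0_um), to_xy(y1_um), to_z(z0_um), to_z(z1_um))

# A ~16.4 x 16.4 x 2.1 um region maps to a 2048 x 2048 x 64 voxel crop:
print(um_to_bbox(0, 16.384, 0, 16.384, 0, 2.112))  # (0, 2048, 0, 2048, 0, 64)
```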

The volume is highly anisotropic: 8 nm in-plane (xy) and 33 nm in z. Patch shapes should account for this — e.g. patch_shape=(8, 512, 512) corresponds to a ~264 nm x 4 µm x 4 µm volume. The full z-extent is only 5,293 slices (~175 µm), so bounding boxes spanning the complete z range are feasible.
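A patch shape can be sanity-checked against the anisotropic voxel grid by computing its physical extent directly (the helper below is a hypothetical convenience, not part of the module):

```python
def patch_extent_nm(patch_shape, spacing_nm=(33, 8, 8)):
    """Physical extent in nm of a (z, y, x) patch at the 8 x 8 x 33 nm grid."""
    return tuple(s * r for s, r in zip(patch_shape, spacing_nm))

# The example patch from above: ~264 nm in z, ~4.1 um in-plane.
print(patch_extent_nm((8, 512, 512)))  # (264, 4096, 4096)
```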

"""The Human Neurons (H01) dataset contains a petascale serial-section EM volume of human
cerebral cortex with dense automated neuron instance segmentation (C3 release).

The volume covers ~1 mm³ of human temporal cortex at 4 x 4 x 33 nm resolution
(~1.4 PB raw uncompressed). The C3 automated segmentation is provided at 8 x 8 x 33 nm
resolution, covering the same physical region.

The data is hosted on Google Cloud Storage and described in:
Shapson-Coe et al. (2021), https://www.biorxiv.org/content/10.1101/2021.05.29.446289v4.
Please cite this publication if you use the dataset in your research.

NOTE: Accessing this dataset requires the `cloud-volume` package (pip install cloud-volume).

NOTE (on data size): the full volume is 515,892 x 356,400 x 5,293 voxels at 8 x 8 x 33 nm
(~350 TB raw, ~1.4 PB at 4 nm). Downloading the entire volume is not feasible.
Data is instead streamed and cached locally as HDF5 files by specifying bounding boxes
(x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates.

The volume is highly anisotropic: 8 nm in-plane (xy) and 33 nm in z. Patch shapes should
account for this — e.g. patch_shape=(8, 512, 512) corresponds to a ~264 nm x 4 µm x 4 µm
volume. The full z-extent is only 5,293 slices (~175 µm), so bounding boxes spanning the
complete z range are feasible.
"""

import hashlib
import os
from typing import List, Optional, Tuple, Union

import numpy as np

from torch.utils.data import DataLoader, Dataset

import torch_em

from .. import util


EM_URL = "gs://h01-release/data/20210601/4nm_raw"
SEG_URL = "gs://h01-release/data/20210601/c3"

# A 2048 × 2048 × 64 subvolume (8 nm xy, 33 nm z) in a neuron-dense region of the cortex.
# Physical size: ~16 µm × 16 µm × 2.1 µm. Units: 8 nm voxels in (x, y, z) order.
DEFAULT_BOUNDING_BOX = (271360, 273408, 201728, 203776, 2614, 2678)

def _bbox_to_str(bbox):
    """Create a short unique filename stem from a bounding box tuple."""
    key = "_".join(str(v) for v in bbox)
    return hashlib.md5(key.encode()).hexdigest()[:12]


def _fetch(cv, x_min, x_max, y_min, y_max, z_min, z_max):
    """Fetch a subvolume and return it as a (z, y, x) array."""
    # cloud-volume returns (x, y, z, channels); drop the channel axis and transpose.
    arr = np.array(cv[x_min:x_max, y_min:y_max, z_min:z_max])[..., 0]
    return arr.transpose(2, 1, 0)

def get_humanneurons_data(
    path: Union[os.PathLike, str],
    bounding_box: Tuple[int, int, int, int, int, int] = DEFAULT_BOUNDING_BOX,
    download: bool = False,
) -> str:
    """Stream a subvolume from the H01 Human Neurons dataset and cache it as an HDF5 file.

    The HDF5 file contains:
      - raw:    EM grayscale (uint8, 8 nm xy / 33 nm z, z/y/x axis order)
      - labels: neuron instance segmentation (uint32, relabeled to consecutive ids, z/y/x axis order)

    Both layers are stored at the same 8 x 8 x 33 nm resolution. The raw image is
    fetched from the 4 nm source at mip=1 (its native 8 nm downsampled scale).

    Args:
        path: Filepath to a folder where the cached HDF5 file will be saved.
        bounding_box: The region to fetch as (x_min, x_max, y_min, y_max, z_min, z_max)
            in 8 nm voxel coordinates. Defaults to a 2048 x 2048 x 64 training region.
        download: Whether to stream and cache the data if it is not present.

    Returns:
        The filepath to the cached HDF5 file.
    """
    import h5py

    os.makedirs(path, exist_ok=True)

    stem = _bbox_to_str(bounding_box)
    h5_path = os.path.join(path, f"{stem}.h5")

    if os.path.exists(h5_path):
        return h5_path

    if not download:
        raise RuntimeError(
            f"No cached data found at '{h5_path}'. Set download=True to stream it from GCS."
        )

    try:
        import cloudvolume
    except ImportError as e:
        raise ImportError(
            "The 'cloud-volume' package is required to access the Human Neurons dataset. "
            "Install it with: 'pip install cloud-volume'."
        ) from e

    x_min, x_max, y_min, y_max, z_min, z_max = bounding_box

    print(f"Streaming H01 Human Neurons EM + segmentation for bbox {bounding_box} ...")

    # EM at mip=1 gives 8×8×33 nm — the same resolution as the C3 segmentation at mip=0.
    em_vol = cloudvolume.CloudVolume(EM_URL, use_https=True, mip=1, progress=True)
    seg_vol = cloudvolume.CloudVolume(SEG_URL, use_https=True, mip=0, progress=True, fill_missing=True)

    raw = _fetch(em_vol, x_min, x_max, y_min, y_max, z_min, z_max)
    labels = _fetch(seg_vol, x_min, x_max, y_min, y_max, z_min, z_max)

    # Relabel to consecutive integers so the ids fit into uint32
    # (the raw C3 ids are large uint64 values that lose precision in float32 training).
    from skimage.segmentation import relabel_sequential
    labels, _, _ = relabel_sequential(labels)

    resolution_nm = em_vol.mip_resolution(1).tolist()  # [8, 8, 33] in (x, y, z) order

    with h5py.File(h5_path, "w", locking=False) as f:
        f.attrs["bounding_box"] = bounding_box
        f.attrs["crop_size"] = raw.shape  # (z, y, x)
        f.attrs["resolution_nm"] = resolution_nm  # [x, y, z] in nm
        f.create_dataset("raw", data=raw.astype("uint8"), compression="gzip", chunks=True)
        f.create_dataset("labels", data=labels.astype("uint32"), compression="gzip", chunks=True)

    print(f"Cached to {h5_path} (raw {raw.shape}, labels {labels.shape})")
    return h5_path

def get_humanneurons_paths(
    path: Union[os.PathLike, str],
    bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None,
    download: bool = False,
) -> List[str]:
    """Get paths to the Human Neurons HDF5 cache files.

    Args:
        path: Filepath to a folder where the cached HDF5 files will be saved.
        bounding_boxes: List of regions to fetch, each as
            (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates.
            Defaults to [DEFAULT_BOUNDING_BOX].
        download: Whether to stream and cache the data if it is not present.

    Returns:
        List of filepaths to the cached HDF5 files.
    """
    if bounding_boxes is None:
        bounding_boxes = [DEFAULT_BOUNDING_BOX]

    return [get_humanneurons_data(path, bbox, download) for bbox in bounding_boxes]

def get_humanneurons_dataset(
    path: Union[os.PathLike, str],
    patch_shape: Tuple[int, int, int],
    bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None,
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> Dataset:
    """Get the Human Neurons (H01) dataset for neuron instance segmentation.

    Args:
        path: Filepath to a folder where the cached HDF5 files will be saved.
        patch_shape: The patch shape (z, y, x) to use for training.
            The volume is anisotropic (8 nm xy, 33 nm z), so small z values are typical,
            e.g. patch_shape=(8, 512, 512).
        bounding_boxes: List of subvolumes to use, each as
            (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates.
            Defaults to [DEFAULT_BOUNDING_BOX] — a 2048 x 2048 x 64 cortex region.
        download: Whether to stream and cache data if not already present.
        offsets: Offset values for affinity computation used as target.
        boundaries: Whether to compute boundaries as the target.
        kwargs: Additional keyword arguments for `torch_em.default_segmentation_dataset`.

    Returns:
        The segmentation dataset.
    """
    assert len(patch_shape) == 3

    paths = get_humanneurons_paths(path, bounding_boxes, download)

    kwargs = util.update_kwargs(kwargs, "is_seg_dataset", True)
    kwargs, _ = util.add_instance_label_transform(
        kwargs, add_binary_target=False, boundaries=boundaries, offsets=offsets
    )

    return torch_em.default_segmentation_dataset(
        raw_paths=paths,
        raw_key="raw",
        label_paths=paths,
        label_key="labels",
        patch_shape=patch_shape,
        **kwargs,
    )

def get_humanneurons_loader(
    path: Union[os.PathLike, str],
    patch_shape: Tuple[int, int, int],
    batch_size: int,
    bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None,
    download: bool = False,
    offsets: Optional[List[List[int]]] = None,
    boundaries: bool = False,
    **kwargs,
) -> DataLoader:
    """Get the DataLoader for neuron instance segmentation in the H01 Human Neurons dataset.

    Args:
        path: Filepath to a folder where the cached HDF5 files will be saved.
        patch_shape: The patch shape (z, y, x) to use for training.
            The volume is anisotropic (8 nm xy, 33 nm z), so small z values are typical,
            e.g. patch_shape=(8, 512, 512).
        batch_size: The batch size for training.
        bounding_boxes: List of subvolumes to use, each as
            (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates.
            Defaults to [DEFAULT_BOUNDING_BOX] — a 2048 x 2048 x 64 cortex region.
        download: Whether to stream and cache data if not already present.
        offsets: Offset values for affinity computation used as target.
        boundaries: Whether to compute boundaries as the target.
        kwargs: Additional keyword arguments for `torch_em.default_segmentation_dataset`
            or for the PyTorch DataLoader.

    Returns:
        The DataLoader.
    """
    ds_kwargs, loader_kwargs = util.split_kwargs(torch_em.default_segmentation_dataset, **kwargs)
    dataset = get_humanneurons_dataset(
        path, patch_shape, bounding_boxes, download, offsets, boundaries, **ds_kwargs
    )
    return torch_em.get_data_loader(dataset, batch_size, **loader_kwargs)
EM_URL = 'gs://h01-release/data/20210601/4nm_raw'
SEG_URL = 'gs://h01-release/data/20210601/c3'
DEFAULT_BOUNDING_BOX = (271360, 273408, 201728, 203776, 2614, 2678)
def get_humanneurons_data( path: Union[os.PathLike, str], bounding_box: Tuple[int, int, int, int, int, int] = (271360, 273408, 201728, 203776, 2614, 2678), download: bool = False) -> str:

Stream a subvolume from the H01 Human Neurons dataset and cache it as an HDF5 file.

The HDF5 file contains:

  • raw: EM grayscale (uint8, 8 nm xy / 33 nm z, z/y/x)
  • labels: neuron instance segmentation (uint32, relabeled to consecutive ids, 8 nm xy / 33 nm z, z/y/x)

Both layers are stored at the same 8 x 8 x 33 nm resolution. The raw image is fetched from the 4 nm source at mip=1 (native 8 nm downsampled scale).

Arguments:
  • path: Filepath to a folder where the cached HDF5 file will be saved.
  • bounding_box: The region to fetch as (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates. Defaults to a 2048 x 2048 x 64 training region.
  • download: Whether to stream and cache the data if it is not present.
Returns:

The filepath to the cached HDF5 file.
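Since the cache filename is derived from the bounding box (the md5-based stem computed by `_bbox_to_str` in the source), the file a given region maps to can be predicted without downloading anything. A sketch mirroring that logic:

```python
import hashlib

# The default bounding box from the module, in 8 nm voxel coordinates.
bbox = (271360, 273408, 201728, 203776, 2614, 2678)

# Filename stem: first 12 hex characters of the md5 of the underscore-joined values.
key = "_".join(str(v) for v in bbox)
stem = hashlib.md5(key.encode()).hexdigest()[:12]
filename = f"{stem}.h5"  # the file get_humanneurons_data would create inside `path`
```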

def get_humanneurons_paths( path: Union[os.PathLike, str], bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None, download: bool = False) -> List[str]:

Get paths to the Human Neurons HDF5 cache files.

Arguments:
  • path: Filepath to a folder where the cached HDF5 files will be saved.
  • bounding_boxes: List of regions to fetch, each as (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates. Defaults to [DEFAULT_BOUNDING_BOX].
  • download: Whether to stream and cache the data if it is not present.
Returns:

List of filepaths to the cached HDF5 files.

def get_humanneurons_dataset( path: Union[os.PathLike, str], patch_shape: Tuple[int, int, int], bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None, download: bool = False, offsets: Optional[List[List[int]]] = None, boundaries: bool = False, **kwargs) -> torch.utils.data.dataset.Dataset:

Get the Human Neurons (H01) dataset for neuron instance segmentation.

Arguments:
  • path: Filepath to a folder where the cached HDF5 files will be saved.
  • patch_shape: The patch shape (z, y, x) to use for training. The volume is anisotropic (8 nm xy, 33 nm z), so small z values are typical, e.g. patch_shape=(8, 512, 512).
  • bounding_boxes: List of subvolumes to use, each as (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates. Defaults to [DEFAULT_BOUNDING_BOX] — a 2048 x 2048 x 64 cortex region.
  • download: Whether to stream and cache data if not already present.
  • offsets: Offset values for affinity computation used as target.
  • boundaries: Whether to compute boundaries as the target.
  • kwargs: Additional keyword arguments for torch_em.default_segmentation_dataset.
Returns:

The segmentation dataset.
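The `boundaries` option derives a boundary-map target from the instance labels via torch_em's label transforms. A simplified stand-in (not torch_em's actual implementation) illustrating the idea of marking voxels where the instance id changes along any axis:

```python
import numpy as np

def simple_boundaries(seg):
    """Mark voxels adjacent (along any axis) to a voxel with a different instance id."""
    bnd = np.zeros(seg.shape, dtype=bool)
    for axis in range(seg.ndim):
        diff = np.diff(seg, axis=axis) != 0  # True where neighboring ids differ
        lo = [slice(None)] * seg.ndim
        hi = [slice(None)] * seg.ndim
        lo[axis], hi[axis] = slice(0, -1), slice(1, None)
        bnd[tuple(lo)] |= diff  # mark both sides of every label change
        bnd[tuple(hi)] |= diff
    return bnd

seg = np.array([[1, 1, 2], [1, 1, 2]])
print(simple_boundaries(seg))  # boundary between the two instances is marked
```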

def get_humanneurons_loader( path: Union[os.PathLike, str], patch_shape: Tuple[int, int, int], batch_size: int, bounding_boxes: Optional[List[Tuple[int, int, int, int, int, int]]] = None, download: bool = False, offsets: Optional[List[List[int]]] = None, boundaries: bool = False, **kwargs) -> torch.utils.data.dataloader.DataLoader:

Get the DataLoader for neuron instance segmentation in the H01 Human Neurons dataset.

Arguments:
  • path: Filepath to a folder where the cached HDF5 files will be saved.
  • patch_shape: The patch shape (z, y, x) to use for training. The volume is anisotropic (8 nm xy, 33 nm z), so small z values are typical, e.g. patch_shape=(8, 512, 512).
  • batch_size: The batch size for training.
  • bounding_boxes: List of subvolumes to use, each as (x_min, x_max, y_min, y_max, z_min, z_max) in 8 nm voxel coordinates. Defaults to [DEFAULT_BOUNDING_BOX] — a 2048 x 2048 x 64 cortex region.
  • download: Whether to stream and cache data if not already present.
  • offsets: Offset values for affinity computation used as target.
  • boundaries: Whether to compute boundaries as the target.
  • kwargs: Additional keyword arguments for torch_em.default_segmentation_dataset or for the PyTorch DataLoader.
Returns:

The DataLoader.
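When choosing batch_size and patch_shape, a rough memory estimate for the raw tensor helps, assuming float32 inputs (this back-of-envelope helper is hypothetical, not part of the module):

```python
def batch_bytes(batch_size, patch_shape, bytes_per_voxel=4):
    """Approximate size in bytes of one raw batch tensor (float32 by default)."""
    n = batch_size
    for s in patch_shape:
        n *= s
    return n * bytes_per_voxel

# Two (8, 512, 512) patches: 2 * 8 * 512 * 512 * 4 bytes = 16 MiB for the raw channel.
print(batch_bytes(2, (8, 512, 512)) / 2**20)  # 16.0
```

Targets (boundaries or affinities) add further tensors of the same spatial size, so the true footprint is a small multiple of this estimate.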