multiscale-spatial-image
Generate a multiscale, chunked, multi-dimensional spatial image data structure that can be serialized to OME-NGFF.
Each scale is a scientific Python Xarray spatial-image Dataset, organized into nodes of an Xarray Datatree.
Installation
pip install multiscale_spatial_image
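The package also declares optional extras in its metadata (dask-image, imagej, itk, notebooks, test). These can be installed with pip's extras syntax, for example:
pip install "multiscale-spatial-image[itk]"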
Usage
import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import to_multiscale
import zarr
# Image pixels
array = np.random.randint(0, 256, size=(128,128), dtype=np.uint8)
image = to_spatial_image(array)
print(image)
An Xarray spatial-image DataArray. Spatial metadata can also be passed during construction.
<xarray.DataArray 'image' (y: 128, x: 128)> Size: 16kB
array([[170, 79, 215, ..., 31, 151, 150],
[ 77, 181, 1, ..., 217, 176, 228],
[193, 91, 240, ..., 132, 152, 41],
...,
[ 50, 140, 231, ..., 80, 236, 28],
[ 89, 46, 180, ..., 84, 42, 140],
[ 96, 148, 240, ..., 61, 43, 255]], dtype=uint8)
Coordinates:
* y (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
* x (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
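For example, pixel spacing and an origin can be supplied when the image is constructed. A minimal sketch, reusing the array from the snippet above and assuming the dims, scale, and translation keyword arguments of spatial-image's to_spatial_image; the values are illustrative:
# Hypothetical example: attach physical pixel spacing and an origin at construction
image_with_metadata = to_spatial_image(
    array,
    dims=("y", "x"),
    scale={"y": 2.0, "x": 2.0},          # physical units per pixel
    translation={"y": 10.0, "x": 5.0},   # origin offset in physical units
)
# Coordinates are derived from the scale and translation, e.g. y: 10.0, 12.0, 14.0, ...
print(image_with_metadata.coords["y"].values[:4])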
# Create multiscale pyramid, downscaling by a factor of 2, then 4
multiscale = to_multiscale(image, [2, 4])
print(multiscale)
A MultiscaleSpatialImage: an Xarray DataTree whose data variables are chunked Dask arrays.
<xarray.DataTree>
Group: /
├── Group: /scale0
│ Dimensions: (y: 128, x: 128)
│ Coordinates:
│ * y (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│ * x (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│ Data variables:
│ image (y, x) uint8 16kB dask.array<chunksize=(128, 128), meta=np.ndarray>
├── Group: /scale1
│ Dimensions: (y: 64, x: 64)
│ Coordinates:
│ * y (y) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│ * x (x) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│ Data variables:
│ image (y, x) uint8 4kB dask.array<chunksize=(64, 64), meta=np.ndarray>
└── Group: /scale2
Dimensions: (y: 16, x: 16)
Coordinates:
* y (y) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
* x (x) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
Data variables:
image (y, x) uint8 256B dask.array<chunksize=(16, 16), meta=np.ndarray>
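The downsampling behavior can be selected with the method argument. A sketch, assuming the Methods enum exported by the package; Methods.XARRAY_COARSEN is the same method used in the testing section below:
from multiscale_spatial_image import Methods

# Downscale by block reduction with xarray's coarsen-based method
multiscale_coarsen = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
print(multiscale_coarsen)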
Map a function over datasets while skipping nodes that do not contain dimensions
import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import skip_non_dimension_nodes, to_multiscale
data = np.zeros((2, 200, 200))
dims = ("c", "y", "x")
scale_factors = [2, 2]
image = to_spatial_image(array_like=data, dims=dims)
multiscale = to_multiscale(image, scale_factors=scale_factors)
@skip_non_dimension_nodes
def transpose(ds, *args, **kwargs):
return ds.transpose(*args, **kwargs)
multiscale = multiscale.map_over_datasets(transpose, "y", "x", "c")
print(multiscale)
A transposed MultiscaleSpatialImage.
<xarray.DataTree>
Group: /
├── Group: /scale0
│ Dimensions: (c: 2, y: 200, x: 200)
│ Coordinates:
│ * c (c) int32 8B 0 1
│ * y (y) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│ * x (x) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│ Data variables:
│ image (y, x, c) float64 640kB dask.array<chunksize=(200, 200, 2), meta=np.ndarray>
├── Group: /scale1
│ Dimensions: (c: 2, y: 100, x: 100)
│ Coordinates:
│ * c (c) int32 8B 0 1
│ * y (y) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│ * x (x) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│ Data variables:
│ image (y, x, c) float64 160kB dask.array<chunksize=(100, 100, 2), meta=np.ndarray>
└── Group: /scale2
Dimensions: (c: 2, y: 50, x: 50)
Coordinates:
* c (c) int32 8B 0 1
* y (y) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
* x (x) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
Data variables:
image (y, x, c) float64 40kB dask.array<chunksize=(50, 50, 2), meta=np.ndarray>
While the decorator allows you to define your own methods to map over the datasets in the DataTree while ignoring datasets that have no dimensions, this library also provides a few convenience methods. For example, the transpose method shown above can also be applied as follows:
multiscale = multiscale.msi.transpose("y", "x", "c")
Other methods implemented this way are reindex, equivalent to the xr.DataArray reindex method, and assign_coords, equivalent to the xr.Dataset assign_coords method.
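For instance, channel labels could be assigned across every scale at once. A sketch; the keyword arguments follow xarray's assign_coords, and the label values are illustrative:
# Assign illustrative channel labels on all scales through the accessor
multiscale = multiscale.msi.assign_coords(c=["r", "g"])
print(multiscale["scale0"].coords["c"].values)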
Store as an Open Microscopy Environment Next-Generation File Format (OME-NGFF) / netCDF Zarr store.
It is highly recommended to use dimension_separator='/' when constructing the Zarr store.
store = zarr.storage.DirectoryStore('multiscale.zarr', dimension_separator='/')
multiscale.to_zarr(store)
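The store can be read back into a DataTree. A sketch, assuming xarray's open_datatree and a Zarr engine are available in the environment:
import xarray as xr

# Re-open the OME-NGFF / Zarr store as an Xarray DataTree
reloaded = xr.open_datatree("multiscale.zarr", engine="zarr")
print(reloaded)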
Note: The API is under development, and it may change until 1.0.0 is released. We mean it :-).
Examples
- Hello MultiscaleSpatialImage World!
- Convert itk.Image
- Convert imageio ImageResource
- Convert pyimagej Dataset
Development
Contributions are welcome and appreciated.
Get the source code
git clone https://github.com/spatial-image/multiscale-spatial-image
cd multiscale-spatial-image
Install dependencies
First install pixi. Then, install project dependencies:
pixi install -a
pixi run pre-commit-install
Run the test suite
The unit tests:
pixi run -e test test
The notebooks tests:
pixi run test-notebooks
Update test data
To add new testing data or update existing data, such as a new baseline for this block:
dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
verify_against_baseline(test_data_dir, dataset_name, baseline_name, multiscale)
Add a store_new_image call in your test block:
dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
store_new_image(dataset_name, baseline_name, multiscale)
verify_against_baseline(dataset_name, baseline_name, multiscale)
Run the tests to generate the output. Then remove the store_new_image call.
Then, create a tarball of the current testing data:
cd test/data
tar cvf ../data.tar *
gzip -9 ../data.tar
python3 -c 'import pooch; print(pooch.file_hash("../data.tar.gz"))'
Update the test_data_sha256 variable in the test/_data.py file. Upload the data to web3.storage, and update the test_data_ipfs_cid Content Identifier (CID) variable with the CID shown in the web3.storage web interface.
Submit the patch
We use the standard GitHub flow.
Create a release
This section is relevant only for maintainers.
- Pull the git main branch. Then run:
pixi install -a
pixi run pre-commit-install
pixi run -e test test
pixi shell
hatch version <new-version>
git add .
git commit -m "ENH: Bump version to <version>"
hatch build
hatch publish
git push upstream main
- Create a new tag and Release via the GitHub UI. Auto-generate release notes and add additional notes as needed.