Quick start#

import xbitinfo as xb

import xarray as xr
ds = xr.tutorial.load_dataset("eraint_uvz").astype("float32")

xb.plot_distribution(ds)
ds
<xarray.Dataset> Size: 8MB
Dimensions:    (month: 2, level: 3, latitude: 241, longitude: 480)
Coordinates:
  * longitude  (longitude) float32 2kB -180.0 -179.2 -178.5 ... 178.5 179.2
  * latitude   (latitude) float32 964B 90.0 89.25 88.5 ... -88.5 -89.25 -90.0
  * level      (level) int32 12B 200 500 850
  * month      (month) int32 8B 1 7
Data variables:
    z          (month, level, latitude, longitude) float32 3MB 1.068e+05 ... ...
    u          (month, level, latitude, longitude) float32 3MB 1.282 ... 3.539
    v          (month, level, latitude, longitude) float32 3MB -0.04676 ... 3...
Attributes:
    Conventions:  CF-1.0
    Info:         Monthly ERA-Interim data. Downloaded and edited by fabien.m...
_images/ccb2f0aa306f78a105dd638cd7be8a6084466bc91e5b419e3dfdaaf984a926df.png

NOTE: The example datasets provided by xarray require the pooch package, which is an optional dependency and must be installed separately:

pip install pooch

Without pooch, xarray cannot download the example datasets, so the call to xr.tutorial.load_dataset() above fails with an error instead of returning data.
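To check up front whether pooch is available, a small guard like the following can help. This is only a sketch: the helper name `has_module` is hypothetical and not part of xarray or xbitinfo.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# Warn before the tutorial dataset is requested rather than failing halfway.
if not has_module("pooch"):
    print("pooch is missing; install it with: pip install pooch")
```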

Get information content per bit#

using xbitinfo.xbitinfo.get_bitinformation()

info_per_bit = xb.get_bitinformation(ds, dim="longitude", implementation="python")

info_per_bit
<xarray.Dataset> Size: 1kB
Dimensions:     (bitfloat32: 32)
Coordinates:
  * bitfloat32  (bitfloat32) <U3 384B '±' 'e1' 'e2' 'e3' ... 'm21' 'm22' 'm23'
    dim         <U9 36B 'longitude'
Data variables:
    z           (bitfloat32) float64 256B 0.0 0.0 0.0 ... 0.005199 0.007699
    u           (bitfloat32) float64 256B 0.7816 0.4274 0.0 ... 0.01148 0.1475
    v           (bitfloat32) float64 256B 0.8752 0.7756 0.0 ... 0.06165 0.05304
Attributes:
    xbitinfo_description:       bitinformation calculated by xbitinfo.get_bit...
    python_repository:          https://github.com/observingClouds/xbitinfo
    julia_repository:           https://github.com/milankl/BitInformation.jl
    reference_paper:            http://www.nature.com/articles/s43588-021-001...
    xbitinfo_version:           0.1.dev199+ge3cffa5.d20240904
    BitInformation.jl_version:  implementation='python'
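
To give a feeling for what the numbers above mean, the sketch below estimates, for each of the 32 bit positions of a float32 series, the mutual information between that bit in one element and the same bit in the next element. It is a simplified illustration in pure Python, not the code path used by xb.get_bitinformation, which may differ in details such as how statistically insignificant information is handled. The function names are hypothetical.

```python
import math
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit pattern of x after conversion to float32."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bitwise_mutual_information(values):
    """Mutual information (in bits) between adjacent elements, per bit position.

    Position 0 is the sign bit, 1-8 are exponent bits, 9-31 are mantissa bits.
    """
    words = [float32_bits(v) for v in values]
    n_pairs = len(words) - 1
    info = []
    for pos in range(32):
        shift = 31 - pos
        # Joint histogram of (bit in element j, bit in element j + 1).
        counts = [[0, 0], [0, 0]]
        for a, b in zip(words, words[1:]):
            counts[(a >> shift) & 1][(b >> shift) & 1] += 1
        mi = 0.0
        for i in (0, 1):
            for j in (0, 1):
                if counts[i][j] == 0:
                    continue  # empty cells contribute nothing
                p_ij = counts[i][j] / n_pairs
                p_i = (counts[i][0] + counts[i][1]) / n_pairs
                p_j = (counts[0][j] + counts[1][j]) / n_pairs
                mi += p_ij * math.log2(p_ij / (p_i * p_j))
        info.append(mi)
    return info


# A smooth synthetic signal: leading bits are predictable from the neighbour,
# trailing mantissa bits behave like noise.
signal = [math.sin(i / 50) for i in range(2000)]
info = bitwise_mutual_information(signal)
print([round(x, 3) for x in info])
```

Bits whose value can be predicted from the neighbouring element carry information close to 1, while bits that look random carry information close to 0.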

Visualize information content#

using xbitinfo.graphics.plot_bitinformation()

fig = xb.plot_bitinformation(info_per_bit)
_images/3e5a1b7889d0a77df0f0c1648111bf842a1d5c42dc406126fb491c8e51f44bae.png

Get keepbits#

using xbitinfo.xbitinfo.get_keepbits()

keepbits = xb.get_keepbits(info_per_bit, 0.99)
keepbits
<xarray.Dataset> Size: 68B
Dimensions:   (inflevel: 1)
Coordinates:
    dim       <U9 36B 'longitude'
  * inflevel  (inflevel) float64 8B 0.99
Data variables:
    z         (inflevel) int64 8B 10
    u         (inflevel) int64 8B 3
    v         (inflevel) int64 8B 2
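
The idea behind the keepbits values can be sketched as follows: accumulate the information per bit from the sign bit onwards, find the first bit at which the chosen fraction (here 0.99) of the total information is reached, and count how many mantissa bits that includes. This is a simplified stand-in under stated assumptions, not the implementation of xb.get_keepbits, which may for example treat noise in the trailing bits differently. The function name and the float32 layout constant are introduced for illustration only.

```python
from itertools import accumulate

NON_MANTISSA_BITS = 9  # float32: 1 sign bit + 8 exponent bits


def keepbits_from_info(info, inflevel=0.99):
    """Number of mantissa bits needed to retain `inflevel` of the total information.

    `info` holds one value per bit, ordered sign, exponent, mantissa.
    """
    total = sum(info)
    if total <= 0:
        return 0  # no information at all, no mantissa bits needed
    for index, cumulative in enumerate(accumulate(info)):
        if cumulative / total >= inflevel:
            # index + 1 bits are kept in total; subtract sign and exponent bits.
            return max(index + 1 - NON_MANTISSA_BITS, 0)
    return len(info) - NON_MANTISSA_BITS


# Toy example: information in the sign bit and the first three mantissa bits.
toy_info = [1.0] + [0.0] * 8 + [0.5, 0.25, 0.125] + [0.0] * 20
print(keepbits_from_info(toy_info, 0.99))
```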

Apply bitrounding#

using xbitinfo.bitround.xr_bitround() or xbitinfo.bitround.jl_bitround() (does not work for chunked data)

ds_bitrounded = xb.xr_bitround(ds, keepbits)
xr.concat([ds, ds_bitrounded], "bitround").isel(level=0)["v"].plot(
    col="bitround", row="month"
)
_images/be97ea885242ed5dd31ca6954938e7cca7fabbc26a498b353a0783f494076c62.png
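
What bitrounding does to a single float32 value can be illustrated with the standard library alone. The sketch below rounds the mantissa to a given number of bits using round-to-nearest with ties to even and sets the remaining bits to zero. It is an assumption-laden illustration for finite values, not the code behind xb.xr_bitround, and the function name is hypothetical.

```python
import struct


def bitround_float32(x: float, keepbits: int) -> float:
    """Round x (as float32) to `keepbits` mantissa bits, ties to even."""
    if not 0 <= keepbits <= 23:
        raise ValueError("keepbits must be between 0 and 23 for float32")
    drop = 23 - keepbits  # number of trailing mantissa bits to zero out
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    if drop > 0:
        # Add just under half of the last kept bit, plus that bit itself,
        # so that exact ties round towards an even mantissa.
        bits += ((bits >> drop) & 1) + (1 << (drop - 1)) - 1
        # Clear the dropped bits and stay within 32 bits.
        bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# With 2 mantissa bits (the keepbits found for v above), values in [1, 2)
# can only take the steps 1.0, 1.25, 1.5 and 1.75.
for value in (1.1, 1.2, 1.375, -1.2):
    print(value, "->", bitround_float32(value, 2))
```

The rounded values differ only slightly from the originals, but their trailing mantissa bits are all zero, which is what makes the data compress well afterwards.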

Save compressed#

using xbitinfo.save_compressed.ToCompressed_Netcdf or xbitinfo.save_compressed.ToCompressed_Zarr

NetCDF#

ds_bitrounded.to_compressed_netcdf("bitrounded_compressed.nc")
ds.to_compressed_netcdf("compressed.nc")
ds.to_netcdf("original.nc")
!du -hs *.nc
7.5M	0.air_original.nc
532K	bitrounded_compressed.nc
4.1M	compressed.nc
8.0M	original.nc
!rm *.nc

Zarr#

ds_bitrounded.to_compressed_zarr("bitrounded_compressed.zarr", mode="w")
ds.to_compressed_zarr("compressed.zarr", mode="w")
ds.to_zarr(
    "original.zarr", mode="w", encoding={v: {"compressor": None} for v in ds.data_vars}
);
!du -hs *.zarr
812K	air_bitrounded.zarr
1.1M	air_bitrounded_by_chunks.zarr
7.8M	air_compressed.zarr
916K	bitrounded_compressed.zarr
4.8M	compressed.zarr
11M	original.zarr
!rm -r *.zarr
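
The size differences in the listings above come from the zeroed trailing bits, which a lossless compressor can encode very cheaply. The following sketch reproduces the effect on synthetic data using only the standard library. zlib stands in for whichever compressor the NetCDF and Zarr writers actually use, and the helper names and the synthetic field are assumptions made for illustration.

```python
import math
import struct
import zlib


def round_mantissa(x: float, keepbits: int) -> float:
    """Round x (as float32) to `keepbits` mantissa bits, ties to even."""
    drop = 23 - keepbits
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    if drop > 0:
        bits += ((bits >> drop) & 1) + (1 << (drop - 1)) - 1
        bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def compressed_size(values) -> int:
    """Size in bytes of the values stored as float32 and compressed with zlib."""
    raw = struct.pack(f">{len(values)}f", *values)
    return len(zlib.compress(raw, 6))


# Synthetic smooth field standing in for a real variable.
field = [10.0 * math.sin(i / 40.0) + math.cos(i / 7.0) for i in range(20000)]
rounded = [round_mantissa(v, 3) for v in field]

size_original = compressed_size(field)
size_rounded = compressed_size(rounded)
print("original:", size_original, "bytes; bitrounded:", size_rounded, "bytes")
```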