Quick start

import xbitinfo as xb

import xarray as xr
ds = xr.tutorial.load_dataset("eraint_uvz").astype("float32")

xb.plot_distribution(ds)
ds
<xarray.Dataset>
Dimensions:    (month: 2, level: 3, latitude: 241, longitude: 480)
Coordinates:
  * longitude  (longitude) float32 -180.0 -179.2 -178.5 ... 177.8 178.5 179.2
  * latitude   (latitude) float32 90.0 89.25 88.5 87.75 ... -88.5 -89.25 -90.0
  * level      (level) int32 200 500 850
  * month      (month) int32 1 7
Data variables:
    z          (month, level, latitude, longitude) float32 1.068e+05 ... 1.17...
    u          (month, level, latitude, longitude) float32 1.282 1.282 ... 3.539
    v          (month, level, latitude, longitude) float32 -0.04676 ... 3.383
Attributes:
    Conventions:  CF-1.0
    Info:         Monthly ERA-Interim data. Downloaded and edited by fabien.m...
[Figure: output of xb.plot_distribution(ds)]

NOTE: To use the example datasets provided by xarray, install the pooch package separately with the following command:

pip install pooch

Without pooch, xarray cannot download the example datasets, so the xr.tutorial.load_dataset() call above fails with an error.
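
One way to fail early with a clear message is to check for pooch before requesting a tutorial dataset. The sketch below uses only the standard library; the helper name is_installed is hypothetical and is not part of xarray or xbitinfo.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in this environment.

    Hypothetical helper for illustration only.
    """
    return importlib.util.find_spec(package_name) is not None


if not is_installed("pooch"):
    print("pooch is missing; run `pip install pooch` before loading tutorial data")
```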

Get information content per bit

Use xbitinfo.xbitinfo.get_bitinformation() to compute the information content of each bit along a dimension:

info_per_bit = xb.get_bitinformation(ds, dim="longitude", implementation="python")

info_per_bit

<xarray.Dataset>
Dimensions:     (bitfloat32: 32)
Coordinates:
  * bitfloat32  (bitfloat32) <U3 '±' 'e1' 'e2' 'e3' ... 'm20' 'm21' 'm22' 'm23'
    dim         <U9 'longitude'
Data variables:
    z           (bitfloat32) float64 0.0 0.0 0.0 ... 0.008876 0.005199 0.007699
    u           (bitfloat32) float64 0.7816 0.4274 0.0 ... 0.01148 0.1475
    v           (bitfloat32) float64 0.8752 0.7756 0.0 ... 0.06165 0.05304
Attributes:
    xbitinfo_description:       bitinformation calculated by xbitinfo.get_bit...
    python_repository:          https://github.com/observingClouds/xbitinfo
    julia_repository:           https://github.com/milankl/BitInformation.jl
    reference_paper:            http://www.nature.com/articles/s43588-021-001...
    xbitinfo_version:           0.1.dev201+g6fd035b.d20240206
    BitInformation.jl_version:  implementation='python'
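
As a rough illustration of what these numbers measure: the information content of a bit position can be estimated as the mutual information between that bit in neighbouring elements along the chosen dimension. The pure-Python sketch below is a simplified stand-in under that assumption, not xbitinfo's implementation. The helper names are hypothetical, and any filtering of statistically insignificant information is omitted.

```python
import math
import struct


def float32_bits(value):
    """Reinterpret a float (rounded to float32) as a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bitwise_mutual_information(values):
    """Mutual information (in bits) between adjacent elements, per bit position.

    Returns 32 values ordered like the bitfloat32 coordinate:
    sign bit first, then 8 exponent bits, then 23 mantissa bits.
    """
    words = [float32_bits(v) for v in values]
    pairs = list(zip(words[:-1], words[1:]))
    n = len(pairs)
    info = []
    for pos in range(31, -1, -1):
        # Joint histogram of (bit in element i, bit in element i + 1).
        counts = [[0, 0], [0, 0]]
        for a, b in pairs:
            counts[(a >> pos) & 1][(b >> pos) & 1] += 1
        mi = 0.0
        for i in (0, 1):
            for j in (0, 1):
                if counts[i][j] == 0:
                    continue
                p_joint = counts[i][j] / n
                p_first = (counts[i][0] + counts[i][1]) / n
                p_second = (counts[0][j] + counts[1][j]) / n
                mi += p_joint * math.log2(p_joint / (p_first * p_second))
        info.append(mi)
    return info


# Made-up data: a long run of +1 followed by a long run of -1.
# Only the sign bit varies, and it is strongly correlated between neighbours.
example = [1.0] * 50 + [-1.0] * 50
info = bitwise_mutual_information(example)
print(round(info[0], 3))  # information in the sign bit
```

Bits that never change, or that change independently of their neighbour, contribute no mutual information in this sketch.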

Visualize information content

Use xbitinfo.graphics.plot_bitinformation() to plot the information content per bit:

fig = xb.plot_bitinformation(info_per_bit)
[Figure: output of xb.plot_bitinformation(info_per_bit)]

Get keepbits

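Use xbitinfo.xbitinfo.get_keepbits() to find the number of mantissa bits needed to preserve a given fraction of the information (here 0.99):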

keepbits = xb.get_keepbits(info_per_bit, 0.99)
keepbits
<xarray.Dataset>
Dimensions:   (inflevel: 1)
Coordinates:
    dim       <U9 'longitude'
  * inflevel  (inflevel) float64 0.99
Data variables:
    z         (inflevel) int64 10
    u         (inflevel) int64 3
    v         (inflevel) int64 2
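
The idea behind keepbits can be sketched as follows: accumulate the information over the bits, find the first bit at which the cumulative share reaches the requested level, and count only the mantissa bits up to that point. This is a simplified sketch with made-up numbers and a hypothetical function name. It assumes float32 (1 sign bit and 8 exponent bits) and skips any cleaning of noisy trailing bits that xbitinfo may perform, so it will not necessarily reproduce the values above.

```python
def keepbits_from_information(info_per_bit, inflevel=0.99, non_mantissa_bits=9):
    """Number of mantissa bits needed to retain `inflevel` of the information.

    Hypothetical helper: `info_per_bit` is ordered sign, exponent, mantissa.
    """
    cumulative = []
    running = 0.0
    for info in info_per_bit:
        running += info
        cumulative.append(running)
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("no information content to preserve")
    for index, partial in enumerate(cumulative):
        if partial / total >= inflevel:
            # index + 1 bits are kept in total; drop sign and exponent bits.
            return max(index + 1 - non_mantissa_bits, 0)
    return len(info_per_bit) - non_mantissa_bits


# Made-up information content: all of it sits in the first five mantissa bits.
example_info = [0.0] * 9 + [0.9, 0.8, 0.5, 0.1, 0.01] + [0.0] * 18
print(keepbits_from_information(example_info, 0.99))  # 4
print(keepbits_from_information(example_info, 0.95))  # 3
```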

Apply bitrounding

Use xbitinfo.bitround.xr_bitround() or xbitinfo.bitround.jl_bitround() (does not work for chunked data) to round the data to the computed keepbits:

ds_bitrounded = xb.xr_bitround(ds, keepbits)
xr.concat([ds, ds_bitrounded], "bitround").isel(level=0)["v"].plot(
    col="bitround", row="month"
)
[Figure: v at the first level, original vs. bitrounded, one row per month]
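
To make the operation concrete, the sketch below rounds a single float32 value to a given number of mantissa bits using round-to-nearest with ties to even. It is a standard-library illustration of the general bitrounding technique, not the code path that xr_bitround() or jl_bitround() takes; the function name is hypothetical, and NaN and infinity are not handled specially.

```python
import struct


def bitround_float32(value, keepbits):
    """Round `value` (as float32) to `keepbits` mantissa bits, ties to even."""
    if not 0 <= keepbits <= 23:
        raise ValueError("keepbits must be between 0 and 23 for float32")
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - keepbits
    if drop == 0:
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    mask = (0xFFFFFFFF >> drop) << drop
    half = 1 << (drop - 1)
    # Add just under half of the last kept bit, plus that bit itself,
    # so that exact ties round towards an even result.
    bits += (half - 1) + ((bits >> drop) & 1)
    # Zero the dropped bits and stay within 32 bits.
    bits &= mask
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(bitround_float32(3.14159, 3))   # 3.25
print(bitround_float32(-3.14159, 3))  # -3.25
```

With 3 mantissa bits, values between 2 and 4 can only land on multiples of 0.25, which is why 3.14159 becomes 3.25.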

Save compressed

Use xbitinfo.save_compressed.ToCompressed_Netcdf or xbitinfo.save_compressed.ToCompressed_Zarr, called below as to_compressed_netcdf() and to_compressed_zarr():

NetCDF

ds_bitrounded.to_compressed_netcdf("bitrounded_compressed.nc")
ds.to_compressed_netcdf("compressed.nc")
ds.to_netcdf("original.nc")
!du -hs *.nc
7.5M	0.air_original.nc
532K	bitrounded_compressed.nc
4.1M	compressed.nc
8.0M	original.nc
!rm *.nc

Zarr

ds_bitrounded.to_compressed_zarr("bitrounded_compressed.zarr", mode="w")
ds.to_compressed_zarr("compressed.zarr", mode="w")
ds.to_zarr(
    "original.zarr", mode="w", encoding={v: {"compressor": None} for v in ds.data_vars}
);
!du -hs *.zarr
1.2M	air_bitrounded.zarr
1.1M	air_bitrounded_by_chunks.zarr
7.0M	air_compressed.zarr
912K	bitrounded_compressed.zarr
4.8M	compressed.zarr
11M	original.zarr
!rm -r *.zarr
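
Why bitrounded files end up smaller can be illustrated without any file format: rounding sets the trailing mantissa bits to zero, and a lossless compressor can then encode those runs of zeros cheaply. The sketch below uses made-up data and zlib from the standard library purely for illustration; the compressors actually used by to_compressed_netcdf() and to_compressed_zarr() may differ, and the helper names are hypothetical.

```python
import math
import struct
import zlib


def bitround_float32(value, keepbits):
    """Round `value` (as float32) to `keepbits` mantissa bits, ties to even."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - keepbits
    if drop > 0:
        mask = (0xFFFFFFFF >> drop) << drop
        half = 1 << (drop - 1)
        bits += (half - 1) + ((bits >> drop) & 1)
        bits &= mask
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def compressed_size(floats):
    """Size in bytes of the float32 array after zlib compression."""
    raw = struct.pack(f"<{len(floats)}f", *floats)
    return len(zlib.compress(raw))


# Made-up smooth signal standing in for a geophysical field.
values = [math.sin(i / 50) for i in range(5000)]
rounded = [bitround_float32(v, 7) for v in values]

original_size = compressed_size(values)
rounded_size = compressed_size(rounded)
print(original_size, rounded_size)
```

The rounded array compresses to fewer bytes than the original, which mirrors the gap between compressed.nc and bitrounded_compressed.nc in the du listing above.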