Quick start#
import xbitinfo as xb
import xarray as xr
ds = xr.tutorial.load_dataset("eraint_uvz").astype("float32")
xb.plot_distribution(ds)
ds
<xarray.Dataset> Size: 8MB
Dimensions:    (month: 2, level: 3, latitude: 241, longitude: 480)
Coordinates:
  * longitude  (longitude) float32 2kB -180.0 -179.2 -178.5 ... 178.5 179.2
  * latitude   (latitude) float32 964B 90.0 89.25 88.5 ... -88.5 -89.25 -90.0
  * level      (level) int32 12B 200 500 850
  * month      (month) int32 8B 1 7
Data variables:
    z          (month, level, latitude, longitude) float32 3MB 1.068e+05 ... ...
    u          (month, level, latitude, longitude) float32 3MB 1.282 ... 3.539
    v          (month, level, latitude, longitude) float32 3MB -0.04676 ... 3...
Attributes:
    Conventions:  CF-1.0
    Info:         Monthly ERA-Interim data. Downloaded and edited by fabien.m...
NOTE: If you plan to use the example datasets provided by xarray, you will need to install the pooch package separately using the following command:
pip install pooch
Without pooch, xarray cannot download the example datasets, so the xr.tutorial.load_dataset call above will fail.
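To check up front whether pooch is available, a small guard like the following can run before loading the tutorial data. This is a minimal sketch using only the standard library; the helper name `has_module` is made up for illustration and is not part of xarray or xbitinfo.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be imported."""
    # find_spec returns None when the module cannot be located.
    return importlib.util.find_spec(name) is not None


if not has_module("pooch"):
    # Hypothetical message; adapt it to your own setup.
    print("pooch is missing; install it with `pip install pooch` first.")
```

The check only tells you that the package is installed. Downloading the dataset still needs network access.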
Get information content per bit#
using xbitinfo.xbitinfo.get_bitinformation()
info_per_bit = xb.get_bitinformation(ds, dim="longitude", implementation="python")
info_per_bit
<xarray.Dataset> Size: 1kB
Dimensions:     (bitfloat32: 32)
Coordinates:
  * bitfloat32  (bitfloat32) <U3 384B '±' 'e1' 'e2' 'e3' ... 'm21' 'm22' 'm23'
    dim         <U9 36B 'longitude'
Data variables:
    z           (bitfloat32) float64 256B 0.0 0.0 0.0 ... 0.005199 0.007699
    u           (bitfloat32) float64 256B 0.7816 0.4274 0.0 ... 0.01148 0.1475
    v           (bitfloat32) float64 256B 0.8752 0.7756 0.0 ... 0.06165 0.05304
Attributes:
    xbitinfo_description:       bitinformation calculated by xbitinfo.get_bit...
    python_repository:          https://github.com/observingClouds/xbitinfo
    julia_repository:           https://github.com/milankl/BitInformation.jl
    reference_paper:            http://www.nature.com/articles/s43588-021-001...
    xbitinfo_version:           0.1.dev193+g27c240d.d20240417
    BitInformation.jl_version:  implementation='python'
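The 32 entries of the bitfloat32 coordinate follow the IEEE 754 single-precision layout: one sign bit, eight exponent bits, and 23 mantissa bits. The sketch below rebuilds those labels and prints the bit pattern of one value next to them. The helper names `float32_bit_labels` and `float32_bits` are illustrative and are not xbitinfo functions.

```python
import struct


def float32_bit_labels() -> list[str]:
    """Labels for the 32 bits of a float32: sign, 8 exponent, 23 mantissa."""
    exponent = [f"e{i}" for i in range(1, 9)]
    mantissa = [f"m{i}" for i in range(1, 24)]
    return ["±"] + exponent + mantissa


def float32_bits(x: float) -> str:
    """Return the 32-bit pattern of x stored as float32, most significant bit first."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{as_int:032b}"


# Pair every label with the corresponding bit of the value 1.0.
for label, bit in zip(float32_bit_labels(), float32_bits(1.0)):
    print(f"{label:>3} {bit}")
```

Read this way, each value in info_per_bit is the information attributed to one of these bit positions.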
Visualize information content#
using xbitinfo.graphics.plot_bitinformation()
fig = xb.plot_bitinformation(info_per_bit)
Get keepbits#
using xbitinfo.xbitinfo.get_keepbits()
keepbits = xb.get_keepbits(info_per_bit, 0.99)
keepbits
<xarray.Dataset> Size: 68B
Dimensions:   (inflevel: 1)
Coordinates:
    dim       <U9 36B 'longitude'
  * inflevel  (inflevel) float64 8B 0.99
Data variables:
    z         (inflevel) int64 8B 10
    u         (inflevel) int64 8B 3
    v         (inflevel) int64 8B 2
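The idea behind keepbits can be sketched as follows: accumulate the information per bit from the most significant bit downward, and keep mantissa bits until the running total reaches the requested fraction (here 0.99) of the overall information. The code below is a simplified illustration with made-up numbers. It is not xbitinfo's implementation, which may apply additional filtering of bits it treats as noise, so its results can differ.

```python
def keepbits_sketch(info, inflevel=0.99, n_nonmantissa=9):
    """Simplified estimate of the mantissa bits needed to retain `inflevel`.

    info          -- information per bit, most significant bit first
    inflevel      -- fraction of the total information to preserve
    n_nonmantissa -- sign + exponent bits (1 + 8 for float32)
    """
    total = sum(info)
    if total <= 0:
        return 0
    running = 0.0
    for position, value in enumerate(info):
        running += value
        if running / total >= inflevel:
            # Bits up to and including `position` are needed;
            # report only the mantissa part, never a negative count.
            return max(position + 1 - n_nonmantissa, 0)
    return len(info) - n_nonmantissa


# Toy input (assumed values, not taken from the dataset above):
# no information in the sign and first exponent bits, some in the last
# four exponent bits, and a decaying amount in the first three mantissa bits.
toy_info = [0.0] * 5 + [0.5] * 4 + [0.5, 0.25, 0.125] + [0.0] * 20

print(keepbits_sketch(toy_info, inflevel=0.99))
print(keepbits_sketch(toy_info, inflevel=0.95))
```

In this toy input, lowering the threshold from 0.99 to 0.95 reduces the number of mantissa bits to keep by one.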
Apply bitrounding#
using xbitinfo.bitround.xr_bitround()
or xbitinfo.bitround.jl_bitround() (does not work for chunked data)
ds_bitrounded = xb.xr_bitround(ds, keepbits)
xr.concat([ds, ds_bitrounded], "bitround").isel(level=0)["v"].plot(
    col="bitround", row="month"
)
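Bitrounding sets the trailing mantissa bits of each value to zero after rounding to the nearest representable value. The sketch below shows it on single float32 numbers using only the standard library. It is a generic round-to-nearest, ties-to-even illustration, not the code xbitinfo runs, and `bitround_float32` is a hypothetical helper name. NaN and infinity are not handled specially.

```python
import struct


def bitround_float32(x: float, keepbits: int) -> float:
    """Round x (stored as float32) to `keepbits` mantissa bits."""
    if not 0 <= keepbits <= 23:
        raise ValueError("keepbits must be between 0 and 23 for float32")
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    drop = 23 - keepbits  # number of trailing mantissa bits to zero out
    if drop > 0:
        mask = (0xFFFFFFFF >> drop) << drop
        # Add just under half of the last kept bit, plus that bit itself,
        # so that exact ties round to the even neighbour.
        last_kept = (bits >> drop) & 1
        bits = (bits + (1 << (drop - 1)) - 1 + last_kept) & mask
    (rounded,) = struct.unpack("<f", struct.pack("<I", bits))
    return rounded


print(bitround_float32(1.1, 3))   # spacing of 0.125 between 1 and 2
print(bitround_float32(1.1, 2))   # spacing of 0.25 between 1 and 2
```

With three mantissa bits, 1.1 becomes 1.125. With two, it becomes 1.0. The rounding error grows as fewer bits are kept.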
Save compressed#
using xbitinfo.save_compressed.ToCompressed_Netcdf
or xbitinfo.save_compressed.ToCompressed_Zarr
NetCDF#
ds_bitrounded.to_compressed_netcdf("bitrounded_compressed.nc")
ds.to_compressed_netcdf("compressed.nc")
ds.to_netcdf("original.nc")
!du -hs *.nc
7.5M 0.air_original.nc
532K bitrounded_compressed.nc
4.1M compressed.nc
8.0M original.nc
!rm *.nc
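Why zeroed mantissa bits shrink the file can be shown without NetCDF: a general-purpose compressor finds far more repetition once the trailing bits are identical. The sketch below uses zlib from the standard library as a stand-in for the compressor used when writing files. The synthetic data, the choice of 7 kept bits, and the use of plain truncation instead of rounding are all assumptions made for illustration.

```python
import math
import struct
import zlib


def truncate_mantissa(x: float, keepbits: int) -> int:
    """Return the float32 bit pattern of x with trailing mantissa bits zeroed."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    drop = 23 - keepbits
    mask = (0xFFFFFFFF >> drop) << drop
    return bits & mask


# Smooth synthetic signal (assumed data, not the ERA-Interim fields).
values = [math.sin(0.001 * i) + 2.0 for i in range(20000)]

# Original float32 bytes versus bytes with only 7 mantissa bits kept.
raw = struct.pack(f"<{len(values)}f", *values)
reduced = struct.pack(
    f"<{len(values)}I", *(truncate_mantissa(v, 7) for v in values)
)

size_raw = len(zlib.compress(raw))
size_reduced = len(zlib.compress(reduced))
print(f"compressed original: {size_raw} bytes")
print(f"compressed reduced:  {size_reduced} bytes")
```

The exact ratio depends on the data and the compressor, so the sizes printed here are not comparable to the file sizes listed above.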
Zarr#
ds_bitrounded.to_compressed_zarr("bitrounded_compressed.zarr", mode="w")
ds.to_compressed_zarr("compressed.zarr", mode="w")
ds.to_zarr(
    "original.zarr", mode="w", encoding={v: {"compressor": None} for v in ds.data_vars}
);
!du -hs *.zarr
812K air_bitrounded.zarr
1.1M air_bitrounded_by_chunks.zarr
7.7M air_compressed.zarr
912K bitrounded_compressed.zarr
4.8M compressed.zarr
11M original.zarr
!rm -r *.zarr
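Since a Zarr store is a directory of many small files, its size is the sum over all files it contains. Where the `du` shell command is not available, a portable alternative can be sketched with pathlib. The helper names `tree_size` and `human` are illustrative. The result is the apparent size in bytes, which can differ slightly from the disk usage that `du` reports.

```python
import tempfile
from pathlib import Path


def tree_size(path) -> int:
    """Total size in bytes of a file, or of all files below a directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())


def human(nbytes: int) -> str:
    """Format a byte count with a binary unit suffix, similar to `du -h`."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024 or unit == "G":
            return f"{size:.1f}{unit}"
        size /= 1024


# Demonstration on a throwaway directory that mimics a tiny store.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "example.zarr"
    (store / "v").mkdir(parents=True)
    (store / "v" / "0.0").write_bytes(b"\x00" * 2048)
    (store / ".zattrs").write_bytes(b"{}")
    demo_size = tree_size(store)

print(human(demo_size))
```

To compare real stores, call `tree_size` on each `.zarr` path before removing them.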