Getting Started
Installation
The only dependencies are Dask and boost-histogram.
Install dask-histogram with pip:
pip install dask-histogram
Or with conda via the conda-forge channel:
conda install dask-histogram -c conda-forge
We test dask-histogram on GNU/Linux, macOS, and Windows.
Overview
Dask-histogram provides a new collection type for lazily constructing histogram objects. It uses the boost-histogram API to calculate histograms on chunked/partitioned data from the core Dask Array and DataFrame collections.
The main component is the dask_histogram.AggHistogram class.
Users typically create AggHistogram objects with the
dask_histogram.factory() function or with the
NumPy/dask.array-like functions in the
dask_histogram.routines module. A second histogram class,
dask_histogram.boost.Histogram, lives in the
dask_histogram.boost module. It inherits from
boost_histogram.Histogram and overrides the fill method
so that it is aware of chunked/partitioned Dask collections. This
class is backed by dask_histogram.AggHistogram.