Getting Started
Installation
The only dependencies are Dask and boost-histogram.
Install dask-histogram with pip:
pip install dask-histogram
Or with conda via the conda-forge channel:
conda install dask-histogram -c conda-forge
We test dask-histogram on GNU/Linux, macOS, and Windows.
Overview
Dask-histogram provides a new collection type for lazily constructing histogram objects. It uses the boost-histogram API to calculate histograms on chunked/partitioned data from the core Dask Array and DataFrame collections.
The main component is the dask_histogram.AggHistogram class. Users will typically create AggHistogram objects via the dask_histogram.factory() function, or the NumPy/dask.array-like functions in the dask_histogram.routines module. Another histogram class exists in the dask_histogram.boost module (dask_histogram.boost.Histogram), which inherits from boost_histogram.Histogram and overrides the fill function so that it is aware of chunked/partitioned Dask collections. This class is backed by dask_histogram.AggHistogram.