Getting Started

Installation

The only dependencies are Dask and boost-histogram.

Install dask-histogram with pip:

pip install dask-histogram

Or with conda via the conda-forge channel:

conda install dask-histogram -c conda-forge

We test dask-histogram on GNU/Linux, macOS, and Windows.

Overview

Dask-histogram provides a new collection type for lazily constructing histogram objects. It uses the boost-histogram API to compute histograms over chunked or partitioned data from the core Dask Array and DataFrame collections.

The main component is the dask_histogram.AggHistogram class. Users will typically create AggHistogram objects via the dask_histogram.factory() function or via the NumPy/dask.array-like functions in the dask_histogram.routines module. A second histogram class, dask_histogram.boost.Histogram, lives in the dask_histogram.boost module. It inherits from boost_histogram.Histogram and overrides the fill function so that it is aware of chunked/partitioned Dask collections. This class is backed by dask_histogram.AggHistogram.