dask_histogram.boost.Histogram

class dask_histogram.boost.Histogram(*axes, storage=Double(), metadata=None, split_every=None)[source]

Histogram object capable of lazy computation.

Parameters

Examples

Note that, for convenience, the boost_histogram.axis namespace is mirrored as dask_histogram.axis and the boost_histogram.storage namespace is mirrored as dask_histogram.storage.

A two-dimensional histogram with one fixed-bin-width axis and one variable-bin-width axis:

>>> import dask.array as da
>>> import dask_histogram.boost as dhb
>>> x = da.random.standard_normal(size=(1000,), chunks=200)
>>> y = da.random.standard_normal(size=(1000,), chunks=200)
>>> w = da.random.uniform(0.2, 0.8, size=(1000,), chunks=200)
>>> h = dhb.Histogram(
...     dhb.axis.Regular(10, -3, 3),
...     dhb.axis.Variable([-3, -2, -1, 0, 1.1, 2.2, 3.3]),
...     storage=dhb.storage.Weight()
... ).fill(x, y, weight=w).compute()
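
As a rough cross-check of what the example above accumulates, the following NumPy-only sketch bins equivalent data eagerly. It assumes that a lazy fill amounts to histogramming each chunk and summing the partial results, and that Weight storage tracks the sum of weights and the sum of squared weights per bin. The names sum_w, sum_w2 and chunked are illustrative and not part of dask_histogram. Unlike the Histogram above, np.histogram2d has no underflow/overflow bins, so out-of-range entries are simply dropped.

```python
import numpy as np

# Eager stand-ins for the dask arrays x, y and w used above.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = rng.standard_normal(1000)
w = rng.uniform(0.2, 0.8, size=1000)

# Bin edges matching axis.Regular(10, -3, 3) and the axis.Variable edges.
x_edges = np.linspace(-3, 3, 11)
y_edges = np.array([-3, -2, -1, 0, 1.1, 2.2, 3.3])
bins = [x_edges, y_edges]

# Per-bin sum of weights and sum of squared weights.
sum_w, _, _ = np.histogram2d(x, y, bins=bins, weights=w)
sum_w2, _, _ = np.histogram2d(x, y, bins=bins, weights=w**2)

# Histogram each block of 200 entries separately, then add the partial results
# (assumed to mirror how per-chunk histograms are aggregated).
chunked = sum(
    np.histogram2d(x[i:i + 200], y[i:i + 200], bins=bins, weights=w[i:i + 200])[0]
    for i in range(0, 1000, 200)
)

print(sum_w.shape)                  # (10, 6)
print(np.allclose(chunked, sum_w))  # True
```

Because the per-chunk results add up to the full histogram, the chunking chosen for the input arrays should not change the final bin contents, only how the work is split.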

__init__(*axes, storage=Double(), metadata=None, split_every=None)[source]

Construct a Histogram object.

Methods

__init__(*axes[, storage, metadata, split_every])

Construct a Histogram object.

agg_histogram()

compute(**kwargs)

Compute this dask collection.

copy(*[, deep])

Make a copy of the histogram.

counts([flow])

Returns the number of entries in each bin for an unweighted histogram or profile, and an effective number of entries, defined as sum_of_weights**2 / sum_of_weights_squared, for a weighted histogram or profile.

empty([flow])

Check to see if the histogram has any non-default values.

fill(*args[, weight, sample, threads])

Stage a fill call using a Dask collection as input.

persist(**kwargs)

Persist this dask collection into memory.

project(*args)

Project to a single axis or several axes on a multidimensional histogram.

reset()

Clear the bin counters.

staged_fills()

Check if histogram has staged fills.

sum([flow])

Compute the sum over the histogram bins (optionally including the flow bins).

to_dask_array([flow, dd])

Convert to dask.array style of return arrays.

to_delayed()

Histogram as a delayed object.

to_numpy([flow, dd, view])

Convert to a NumPy style tuple of return arrays.

values([flow])

Returns the accumulated values.

variances([flow])

Returns the estimated variance of the accumulated values.

view([flow])

Return a view into the data, optionally with overflow turned on.

visualize([filename, format, optimize_graph])

Render the computation of this object's task graph using graphviz.
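
To make the relationship between values(), variances() and counts() concrete for weighted fills, here is a small NumPy sketch for a single bin. It assumes weighted storage in which the accumulated value is the sum of weights and the estimated variance is the sum of squared weights, with the effective number of entries taken as sum_of_weights**2 / sum_of_weights_squared. The helper name effective_entries is hypothetical and not part of the library.

```python
import numpy as np

def effective_entries(weights):
    """Return (value, variance, effective count) for weights landing in one bin."""
    weights = np.asarray(weights, dtype=float)
    value = weights.sum()            # sum of weights
    variance = (weights ** 2).sum()  # sum of squared weights
    return value, variance, value ** 2 / variance

# Equal weights: the effective count equals the number of entries.
print(effective_entries([0.5, 0.5, 0.5, 0.5]))

# Unequal weights: the effective count is smaller than the number of entries.
print(effective_entries([1.0, 3.0]))
```

With equal weights the effective count reduces to the plain entry count, which is why the unweighted and weighted cases can share one counts() interface.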

Attributes

axes

dask

dask_name

kind

Returns Kind.COUNT if this is a normal summing histogram, and Kind.MEAN if this is a mean histogram.

ndim

Number of axes (dimensions) of the histogram.

shape

Tuple of axis sizes (not including underflow/overflow).

size

Total number of bins in the histogram (including underflow/overflow).

storage_type
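
The difference between shape and size can be checked by hand. The sketch below assumes the two axes from the example above (10 regular bins and 6 variable-width bins), each with one underflow and one overflow bin, which is the boost_histogram default for these axis types. The variable names are illustrative only.

```python
import math

bins_per_axis = (10, 6)  # Regular(10, -3, 3) and six variable-width bins
flow_bins = 2            # assumed: one underflow and one overflow bin per axis

ndim = len(bins_per_axis)                                # number of axes
shape = bins_per_axis                                    # flow bins excluded
size = math.prod(n + flow_bins for n in bins_per_axis)   # flow bins included

print(ndim, shape, size)  # 2 (10, 6) 96
```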