popmon.hist package

Submodules

popmon.hist.hist_splitter module

class popmon.hist.hist_splitter.HistSplitter(read_key, store_key, features=None, ignore_features=None, feature_begins_with='', project_on_axes=True, flatten_output=False, short_keys=True, var_timestamp=None, index_col='date', hist_col='histogram', filter_empty_split_hists=True)

Bases: Module

Module that divides a histogram along the first axis encountered, e.g. time.

For example, split the histogram time:x:y along the time axis. This produces a data frame summarizing the split information, where time is the index and each row holds an x:y histogram.

__init__(read_key, store_key, features=None, ignore_features=None, feature_begins_with='', project_on_axes=True, flatten_output=False, short_keys=True, var_timestamp=None, index_col='date', hist_col='histogram', filter_empty_split_hists=True)

Initialize an instance.

Parameters
  • read_key (str) – key of input histogram-dict to read from data store

  • store_key (str) – key of output data to store in data store

  • features (list) – features of histograms to pick up from input data (optional)

  • ignore_features (list) – list of features to ignore, if present (optional)

  • feature_begins_with (str) – require feature to begin with a given string (optional)

  • project_on_axes (bool) – histogram time:x:y will also be divided along x and y. default is true.

  • flatten_output (bool) – if true, flatten the output instead of adding a histogram-dict.

  • short_keys (bool) – if true, use short descriptive dict keys in storage dict.

  • var_timestamp (list) – list of variables that are converted timestamps (in ns since 1970).

  • index_col (str) – key for index in split dictionary. default is ‘date’

  • hist_col (str) – key in output dict that contains the histogram. default is ‘histogram’

  • filter_empty_split_hists (bool) – filter out empty sub-histograms after splitting. default is True.
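The var_timestamp parameter refers to values stored as nanoseconds since 1970. As a minimal sketch of what such a conversion involves, the helper below (ns_to_datetime is a hypothetical name, not part of popmon) turns a nanosecond count into a timezone-aware datetime using only the standard library. popmon itself may rely on pandas timestamps instead.

```python
from datetime import datetime, timezone


def ns_to_datetime(ns_since_epoch: int) -> datetime:
    """Convert nanoseconds since 1970-01-01 UTC to an aware datetime.

    Hypothetical helper for illustration; sub-microsecond precision is dropped
    because datetime only stores microseconds.
    """
    seconds, remainder_ns = divmod(ns_since_epoch, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=remainder_ns // 1000)


example = ns_to_datetime(1_600_000_000_000_000_000)
print(example.isoformat())
```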

transform(data)

Central function of the module.

Typically transform() takes something from the datastore, does something to it, and puts the results back into the datastore again, to be passed on to the next module in the pipeline.

Parameters

datastore (dict) – input datastore

Returns

updated output datastore

Return type

dict
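The read, process, store pattern described for transform() can be sketched with a plain dict as the datastore. The class below is a hypothetical stand-in, not a popmon Module subclass; only the read_key and store_key names are taken from the documentation above.

```python
class DoubleValues:
    """Hypothetical pipeline step: reads a list, doubles it, stores the result."""

    def __init__(self, read_key: str, store_key: str):
        self.read_key = read_key
        self.store_key = store_key

    def transform(self, datastore: dict) -> dict:
        # Take the input from the datastore ...
        values = datastore[self.read_key]
        # ... process it, and put the result back under a new key.
        datastore[self.store_key] = [2 * v for v in values]
        # Return the datastore so the next step in the pipeline can use it.
        return datastore


store = {"numbers": [1, 2, 3]}
store = DoubleValues(read_key="numbers", store_key="doubled").transform(store)
print(store["doubled"])
```

Keeping the input key untouched and writing to a separate output key lets later steps read either the original or the processed data.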

popmon.hist.hist_utils module

popmon.hist.hist_utils.get_bin_centers(hist)

Get bin centers or labels of histogram

popmon.hist.hist_utils.get_histogram(hist_obj)

Parse input and convert to histogrammar object

Parameters

hist_obj – input histogrammar object. Can also be a corresponding json object or str.

Returns

histogrammar histogram

popmon.hist.hist_utils.project_on_x(hist)

Project n-dim histogram onto x-axis

Parameters

hist – input histogrammar histogram

Returns

histogram projected onto the x-axis (1d)
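To illustrate what projecting onto the x-axis means, the sketch below represents an n-dimensional histogram as a plain dict that maps bin tuples to counts. This representation and the function name project_on_first_axis are assumptions for illustration; the library function operates on histogrammar objects.

```python
from collections import Counter


def project_on_first_axis(counts: dict) -> dict:
    """Sum counts over all axes except the first one.

    `counts` maps bin tuples such as (x_bin, y_bin) to a number of entries.
    """
    projected = Counter()
    for key, n in counts.items():
        projected[key[0]] += n
    return dict(projected)


hist_2d = {(0, "a"): 2, (0, "b"): 3, (1, "a"): 4}
hist_x = project_on_first_axis(hist_2d)
print(hist_x)
```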

popmon.hist.hist_utils.project_split2dhist_on_axis(splitdict, axis='x')

Project a split 2d-histogram onto one axis

Project a 2d histogram that has been split with the function split_hist_along_first_dimension onto the x or y axis.

Parameters
  • splitdict (dict) – input split histogram to be projected.

  • axis (str) – name of axis to project on, should be x or y. default is x.

Returns

sorted dictionary of sub-histograms, keyed by x-axis name and bin number

Return type

SortedDict

popmon.hist.hist_utils.sparse_bin_centers_x(hist)

Get x-axis bin centers of sparse histogram

popmon.hist.hist_utils.split_hist_along_first_dimension(hist, xname='x', yname='y', short_keys=True, convert_time_index=True, filter_empty_split_hists=True)

Split (multi-dimensional) hist into sub-hists along x-axis

Function to split a (multi-dimensional) histogram into sub-histograms along the first dimension encountered.

Parameters
  • hist – input histogrammar histogram to split

  • xname (str) – name of x-axis. default is x.

  • yname (str) – name of y-axis. default is y.

  • short_keys (bool) – if false, use long descriptive dict keys.

  • convert_time_index (bool) – if first dimension is a datetime, convert to pandas timestamp. default is true.

  • filter_empty_split_hists (bool) – filter out empty sub-histograms after splitting. default is True.

Returns

sorted dictionary of sub-histograms, keyed by x-axis name and bin number

Return type

SortedDict
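The splitting step can be sketched on a simplified representation. Here a time:x:y histogram is assumed to be a plain dict mapping (time, x, y) bin tuples to counts, and split_along_first_axis is a hypothetical helper. The library function works on histogrammar objects and returns a SortedDict, so treat this only as an illustration of the idea, including the optional removal of empty sub-histograms.

```python
from collections import defaultdict


def split_along_first_axis(counts: dict, filter_empty: bool = True) -> dict:
    """Group bins by their first coordinate into sub-histograms.

    Returns a dict ordered by first-axis key; each value maps the remaining
    coordinates to counts. Empty sub-histograms are dropped if requested.
    """
    sub_hists = defaultdict(dict)
    for (first, *rest), n in counts.items():
        sub_hists[first][tuple(rest)] = n
    return {
        first: sub
        for first, sub in sorted(sub_hists.items())
        if not filter_empty or sum(sub.values()) > 0
    }


counts = {
    ("2021-01", "a", 0): 3,
    ("2021-01", "b", 1): 1,
    ("2021-02", "a", 0): 0,
    ("2021-03", "b", 1): 2,
}
split = split_along_first_axis(counts)
split_with_empty = split_along_first_axis(counts, filter_empty=False)
print(list(split))
```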

popmon.hist.hist_utils.sum_entries(hist, default=True)

Recursively get sum of entries of histogram

In some cases hist.entries returns zero. This function sums the entries recursively instead and does not depend on that attribute.

Parameters
  • hist – input histogrammar histogram

  • default (bool) – if false, do not use the default histogrammar method for evaluating entries, but exclude NaNs, overflow and underflow.

Returns

total sum of entries of histogram

Return type

int
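A recursive sum of this kind can be sketched on nested dicts. The layout below (a "bins" mapping plus optional "nanflow", "overflow" and "underflow" counters) and the name total_entries are assumptions made for illustration. They loosely mirror the roles those counters play in a binned histogram, not the actual histogrammar data structures.

```python
FLOW_KEYS = ("nanflow", "overflow", "underflow")


def total_entries(node, include_flows: bool = True):
    """Recursively sum the entries of a nested histogram-like dict.

    A leaf is a plain number. An inner node is a dict with a "bins" mapping
    and optional flow counters, which are skipped if include_flows is False.
    """
    if isinstance(node, (int, float)):
        return node
    total = sum(total_entries(child, include_flows) for child in node["bins"].values())
    if include_flows:
        total += sum(node.get(key, 0) for key in FLOW_KEYS)
    return total


hist = {
    "bins": {
        "a": {"bins": {0: 2, 1: 3}, "overflow": 1},
        "b": {"bins": {0: 4}, "nanflow": 2},
    },
    "underflow": 5,
}
print(total_entries(hist), total_entries(hist, include_flows=False))
```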

popmon.hist.hist_utils.sum_over_x(hist)

Integrate histogram over first dimension

Parameters

hist – input histogrammar histogram

Returns

integrated histogram
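Integrating over the first dimension removes that axis and adds up the counts of all bins that share the remaining coordinates. The sketch below again assumes a plain dict of bin tuples to counts, and integrate_over_first_axis is a hypothetical name; the library function takes and returns histogrammar histograms.

```python
from collections import Counter


def integrate_over_first_axis(counts: dict) -> dict:
    """Drop the first coordinate of every bin and sum the matching counts."""
    integrated = Counter()
    for key, n in counts.items():
        integrated[key[1:]] += n
    return dict(integrated)


hist_xy = {(0, "a"): 2, (0, "b"): 3, (1, "a"): 4}
hist_y = integrate_over_first_axis(hist_xy)
print(hist_y)
```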