popmon.hist package
Subpackages
Submodules
popmon.hist.hist_splitter module
- class popmon.hist.hist_splitter.HistSplitter(read_key, store_key, features=None, ignore_features=None, feature_begins_with='', project_on_axes=True, flatten_output=False, short_keys=True, var_timestamp=None, index_col='date', hist_col='histogram', filter_empty_split_hists=True)
Bases: Module
Divides a histogram along the first axis encountered, e.g. time.
For example, splitting histogram time:x:y along the time axis produces a data frame summarizing the split information, where time is the index and each row holds an x:y histogram.
- __init__(read_key, store_key, features=None, ignore_features=None, feature_begins_with='', project_on_axes=True, flatten_output=False, short_keys=True, var_timestamp=None, index_col='date', hist_col='histogram', filter_empty_split_hists=True)
Initialize an instance.
- Parameters
read_key (str) – key of input histogram-dict to read from data store
store_key (str) – key of output data to store in data store
features (list) – features of histograms to pick up from input data (optional)
ignore_features (list) – list of features to ignore, if present (optional)
feature_begins_with (str) – require feature to begin with a given string (optional)
project_on_axes (bool) – histogram time:x:y will also be divided along x and y. default is true.
flatten_output (bool) – if true, flatten the output instead of adding a histogram-dict.
short_keys (bool) – if true, use short descriptive dict keys in storage dict.
var_timestamp (list) – list of variables that are converted timestamps (in ns since 1970).
index_col (str) – key for index in split dictionary. default is ‘date’
hist_col (str) – key in output dict that contains the histogram. default is ‘histogram’
filter_empty_split_hists (bool) – filter out empty sub-histograms after splitting. default is True.
- transform(data)
Central function of the module.
Typically, transform() reads an object from the datastore, processes it, and writes the result back into the datastore, where it is passed on to the next module in the pipeline.
- Parameters
data (dict) – input datastore
- Returns
updated output datastore
- Return type
dict
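The read, process, store pattern that transform() follows can be sketched in plain Python. The class below is a hypothetical stand-in and not popmon's Module or HistSplitter; only the read_key and store_key names are taken from the signature above, and the "processing" step is a placeholder.

```python
class ToyModule:
    """Hypothetical stand-in for a pipeline module (not popmon's Module)."""

    def __init__(self, read_key, store_key):
        self.read_key = read_key    # where to find the input in the datastore
        self.store_key = store_key  # where to put the result

    def transform(self, datastore):
        # Read the input from the datastore ...
        hists = datastore[self.read_key]
        # ... process it (placeholder: count the bins per feature) ...
        summary = {feature: len(bins) for feature, bins in hists.items()}
        # ... and write the result back for the next module in the pipeline.
        datastore[self.store_key] = summary
        return datastore


datastore = {"hists": {"date:age": {"2024-01": 3, "2024-02": 5}}}
datastore = ToyModule(read_key="hists", store_key="summary").transform(datastore)
```

The input entry stays in the datastore, so later modules can still read it under its original key.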
popmon.hist.hist_utils module
- popmon.hist.hist_utils.get_bin_centers(hist)
Get bin centers or labels of histogram
- popmon.hist.hist_utils.get_histogram(hist_obj)
Parse input and convert to histogrammar object
- Parameters
hist_obj – input histogrammar object. Can also be a corresponding json object or str.
- Returns
histogrammar histogram
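Accepting either a JSON object or its string form amounts to normalizing the input before use. The helper below is a hypothetical, dependency-free sketch of that first step only: it stops at the parsed dictionary, whereas get_histogram returns a histogrammar object. The dictionary layout in the example is made up for illustration.

```python
import json


def parse_hist_input(hist_obj):
    """Hypothetical helper: normalize a histogram given as dict or JSON string."""
    if isinstance(hist_obj, str):
        # A string is assumed to hold the JSON serialization.
        hist_obj = json.loads(hist_obj)
    if not isinstance(hist_obj, dict):
        raise TypeError(f"unsupported histogram input: {type(hist_obj).__name__}")
    return hist_obj


# Illustrative payload; the keys are assumptions, not a documented schema.
as_dict = {"type": "Bin", "data": {"low": 0.0, "high": 10.0, "values": [1, 2, 3]}}
as_str = json.dumps(as_dict)
```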
- popmon.hist.hist_utils.project_on_x(hist)
Project n-dim histogram onto x-axis
- Parameters
hist – input histogrammar histogram
- Returns
histogram projected onto the x-axis (1d)
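Projecting onto the x-axis means summing out every other dimension for each x bin. As a minimal sketch, assume a 2d histogram stored as a nested dictionary {x_bin: {y_bin: count}}; this toy layout and the function name are assumptions, and real input would be a histogrammar histogram.

```python
def project_on_x_toy(hist2d):
    """Hypothetical sketch: sum a nested {x_bin: {y_bin: count}} histogram over y."""
    return {x_bin: sum(y_counts.values()) for x_bin, y_counts in hist2d.items()}


hist2d = {0: {"a": 2, "b": 1}, 1: {"a": 4}, 2: {}}
projected = project_on_x_toy(hist2d)  # one count per x bin
```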
- popmon.hist.hist_utils.project_split2dhist_on_axis(splitdict, axis='x')
Project a split 2d-histogram onto one axis
Projects a 2d histogram that has been split with split_hist_along_first_dimension onto the x or y axis.
- Parameters
splitdict (dict) – input split histogram to be projected.
axis (str) – name of axis to project on, should be x or y. default is x.
- Returns
sorted dictionary of sub-histograms, keyed by x-axis name and bin number
- Return type
SortedDict
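The projection of a split histogram can be sketched as applying a 1d projection to each sub-histogram in key order. The example assumes each sub-histogram is a nested dictionary {x_bin: {y_bin: count}} and returns a plain dict built in sorted key order rather than a SortedDict; the function name and data layout are hypothetical.

```python
def project_split_toy(splitdict, axis="x"):
    """Hypothetical sketch: project every 2d sub-histogram onto one axis."""
    if axis not in ("x", "y"):
        raise ValueError("axis should be 'x' or 'y'")
    projected = {}
    for key in sorted(splitdict):  # keep the output ordered by split key
        out = {}
        for x_bin, y_counts in splitdict[key].items():
            for y_bin, count in y_counts.items():
                target = x_bin if axis == "x" else y_bin
                out[target] = out.get(target, 0) + count
        projected[key] = out
    return projected


splitdict = {"t1": {0: {"a": 2, "b": 1}, 1: {"a": 4}}, "t0": {0: {"b": 5}}}
on_x = project_split_toy(splitdict, axis="x")
on_y = project_split_toy(splitdict, axis="y")
```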
- popmon.hist.hist_utils.sparse_bin_centers_x(hist)
Get x-axis bin centers of sparse histogram
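For a sparse histogram only the filled bins are stored, so their centers follow from the bin index, bin width, and origin. The sketch below assumes bin i covers the interval from origin + i * bin_width to origin + (i + 1) * bin_width; the function and its parameters are hypothetical and do not reflect the signature of sparse_bin_centers_x.

```python
def sparse_bin_centers_toy(bin_width, origin, filled_indices):
    """Hypothetical sketch: center of sparse bin i is origin + (i + 0.5) * bin_width."""
    return [origin + (i + 0.5) * bin_width for i in sorted(filled_indices)]


centers = sparse_bin_centers_toy(bin_width=2.0, origin=0.0, filled_indices={3, -1, 0})
```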
- popmon.hist.hist_utils.split_hist_along_first_dimension(hist, xname='x', yname='y', short_keys=True, convert_time_index=True, filter_empty_split_hists=True)
Split (multi-dimensional) hist into sub-hists along x-axis
Function to split a (multi-dimensional) histogram into sub-histograms along the first dimension encountered.
- Parameters
hist – input histogrammar histogram
xname (str) – name of x-axis. default is x.
yname (str) – name of y-axis. default is y.
short_keys (bool) – if false, use long descriptive dict keys.
convert_time_index (bool) – if first dimension is a datetime, convert to pandas timestamp. default is true.
filter_empty_split_hists (bool) – filter out empty sub-histograms after splitting. default is True.
- Returns
sorted dictionary of sub-histograms, keyed by x-axis name and bin number
- Return type
SortedDict
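The splitting step can be sketched on a toy histogram stored as {x_bin: {y_bin: count}}: each first-dimension bin becomes its own entry, and entries without counts are optionally dropped. The key format "name=bin" and the function name are assumptions for illustration; popmon's actual keys and return type (SortedDict) differ from this plain dict.

```python
def split_along_first_dim_toy(hist, xname="x", filter_empty=True):
    """Hypothetical sketch: {x_bin: sub_hist} -> ordered {"<xname>=<x_bin>": sub_hist}."""
    split = {}
    for x_bin in sorted(hist):  # iterate bins in order of the first dimension
        sub_hist = hist[x_bin]
        if filter_empty and sum(sub_hist.values()) == 0:
            continue  # drop sub-histograms without entries
        split[f"{xname}={x_bin}"] = sub_hist
    return split


hist = {2: {"a": 1}, 0: {"a": 3, "b": 2}, 1: {"a": 0}}
filtered = split_along_first_dim_toy(hist, xname="time")
unfiltered = split_along_first_dim_toy(hist, xname="time", filter_empty=False)
```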
- popmon.hist.hist_utils.sum_entries(hist, default=True)
Recursively get sum of entries of histogram
hist.entries sometimes returns zero; this function is intended to return the correct sum in those cases.
- Parameters
hist – input histogrammar histogram
default (bool) – if false, do not use the default histogrammar method for evaluating entries, but exclude NaNs, overflow, and underflow.
- Returns
total sum of entries of histogram
- Return type
int
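A recursive sum of entries can be sketched on a nested dictionary whose leaves are counts. The special-bin key names below are hypothetical, and the include_special flag only loosely mirrors the default parameter: setting it to False skips NaN, overflow, and underflow bins.

```python
SPECIAL_BINS = ("nanflow", "overflow", "underflow")  # hypothetical key names


def sum_entries_toy(hist, include_special=True):
    """Hypothetical sketch: recursively add up the counts of a nested histogram."""
    if not isinstance(hist, dict):
        return hist  # a leaf holds a plain count
    total = 0
    for key, sub in hist.items():
        if not include_special and key in SPECIAL_BINS:
            continue  # leave out NaN, overflow, and underflow bins
        total += sum_entries_toy(sub, include_special)
    return total


hist = {0: {"a": 2, "b": 1, "nanflow": 1}, 1: {"a": 4}, "overflow": 3}
all_entries = sum_entries_toy(hist)
regular_entries = sum_entries_toy(hist, include_special=False)
```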
- popmon.hist.hist_utils.sum_over_x(hist)
Integrate histogram over first dimension
- Parameters
hist – input histogrammar histogram
- Returns
integrated histogram
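Integrating over the first dimension is the counterpart of projecting onto it: the x bins are summed away and the remaining dimensions are kept. A minimal sketch, again assuming a toy nested dictionary {x_bin: {y_bin: count}} in place of a histogrammar histogram:

```python
from collections import Counter


def sum_over_x_toy(hist2d):
    """Hypothetical sketch: add up the y-histograms of all x bins."""
    total = Counter()
    for y_counts in hist2d.values():
        total.update(y_counts)  # Counter.update adds counts bin by bin
    return dict(total)


hist2d = {0: {"a": 2, "b": 1}, 1: {"a": 4}}
integrated = sum_over_x_toy(hist2d)
```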