PyUnfold API

Iterative unfolding

pyunfold.iterative_unfold(data=None, data_err=None, response=None, response_err=None, efficiencies=None, efficiencies_err=None, prior=None, ts='ks', ts_stopping=0.01, max_iter=100, cov_type='multinomial', return_iterations=False, callbacks=None)

Performs iterative unfolding

Parameters:
data : array_like

Input observed data distribution.

data_err : array_like

Uncertainties of the input observed data distribution. Must be the same shape as data.

response : array_like

Response matrix.

response_err : array_like

Uncertainties of response matrix. Must be the same shape as response.

efficiencies : array_like

Detection efficiencies for the cause distribution.

efficiencies_err : array_like

Uncertainties of detection efficiencies. Must be the same shape as efficiencies.

prior : array_like, optional

Prior distribution to use in unfolding. If None, then a uniform (or flat) prior will be used. If array_like, then must have the same shape as efficiencies (default is None).

ts : {‘ks’, ‘chi2’, ‘bf’, ‘rmd’}

Test statistic to use for stopping condition (default is ‘ks’). For more information about the available test statistics, see the Test Statistics API documentation.

ts_stopping : float, optional

Test statistic stopping condition. At each unfolding iteration, the test statistic is computed between the current and previous iteration. Once the test statistic drops below ts_stopping, the unfolding procedure is stopped (default is 0.01).

max_iter : int, optional

Maximum number of iterations to allow (default is 100).

cov_type : {‘multinomial’, ‘poisson’}

Whether to use the Multinomial or Poisson form for the covariance matrix (default is ‘multinomial’).

return_iterations : bool, optional

Whether to return unfolded distributions for each iteration (default is False).

callbacks : list, optional

List of pyunfold.callbacks.Callback instances to be applied during unfolding (default is None, which means no Callbacks are applied).

Returns:
unfolded_result : dict

Returned if return_iterations is False (default). Dictionary containing the final unfolded distribution, associated uncertainties, and test statistic information.

The returned dict has the following keys:

unfolded

Final unfolded cause distribution

stat_err

Statistical uncertainties on the unfolded cause distribution

sys_err

Systematic uncertainties on the unfolded cause distribution associated with limited statistics in the response matrix

ts_iter

Final test statistic value

ts_stopping

Test statistic stopping criterion

num_iterations

Number of unfolding iterations

unfolding_matrix

Unfolding matrix

unfolding_iters : pandas.DataFrame

Returned if return_iterations is True. DataFrame containing the unfolded distribution, associated uncertainties, test statistic information, etc. at each iteration.

Examples

>>> from pyunfold import iterative_unfold
>>> data = [100, 150]
>>> data_err = [10, 12.2]
>>> response = [[0.9, 0.1],
...             [0.1, 0.9]]
>>> response_err = [[0.01, 0.01],
...                 [0.01, 0.01]]
>>> efficiencies = [1, 1]
>>> efficiencies_err = [0.01, 0.01]
>>> unfolded = iterative_unfold(data=data,
...                             data_err=data_err,
...                             response=response,
...                             response_err=response_err,
...                             efficiencies=efficiencies,
...                             efficiencies_err=efficiencies_err)
>>> unfolded
{'num_iterations': 4,
 'stat_err': array([11.16853268, 13.65488168]),
 'sys_err': array([0.65570621, 0.65570621]),
 'ts_iter': 0.0038300087456445975,
 'ts_stopping': 0.01,
 'unfolded': array([ 94.32086967, 155.67913033]),
 'unfolding_matrix': array([[0.8471473 , 0.1528527 ],
                            [0.06404093, 0.93595907]])}
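
To make the iteration and the stopping condition concrete, here is a minimal pure-Python sketch of an iterative Bayesian (D'Agostini-style) update loop. It is not PyUnfold's implementation: the name `unfold_sketch` is hypothetical, uncertainties are ignored, the stopping statistic is a simple maximum relative difference chosen for illustration, and `response[i][j]` is assumed to mean the probability of observing effect bin i given cause bin j.

```python
def unfold_sketch(data, response, efficiencies, ts_stopping=0.01, max_iter=100):
    """Illustrative iterative Bayesian unfolding; not PyUnfold's implementation."""
    num_effects = len(data)
    num_causes = len(efficiencies)
    total = float(sum(data))
    # Start from a uniform prior over the cause bins.
    prior = [1.0 / num_causes] * num_causes
    previous = [p * total for p in prior]
    unfolded = previous
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Probability of each effect bin under the current prior.
        f = [sum(response[i][j] * prior[j] for j in range(num_causes))
             for i in range(num_effects)]
        # Bayes' theorem: weight each observed count by P(C_j | E_i) / efficiency_j.
        unfolded = [
            sum(response[i][j] * prior[j] / (efficiencies[j] * f[i]) * data[i]
                for i in range(num_effects))
            for j in range(num_causes)
        ]
        # Assumed stopping statistic: largest relative change between iterations.
        ts = max(abs(u - p) / (0.5 * (u + p)) for u, p in zip(unfolded, previous))
        # The normalized unfolded distribution becomes the next prior.
        norm = sum(unfolded)
        prior = [u / norm for u in unfolded]
        previous = unfolded
        if ts < ts_stopping:
            break
    return unfolded, iterations


unfolded, num_iterations = unfold_sketch(
    data=[100, 150],
    response=[[0.9, 0.1], [0.1, 0.9]],
    efficiencies=[1, 1],
)
print(unfolded, num_iterations)
```

With unit efficiencies the sketch preserves the total number of observed counts and moves counts from the first cause bin to the second. Because the stopping statistic here is not the library's `'ks'` default, the number of iterations need not match the documented output.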

Callbacks

class pyunfold.callbacks.Logger

Logger callback

Writes test statistic information for each iteration to sys.stdout.

Methods

on_iteration_end(iteration, status) Writes to sys.stdout
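
The callback interface can be illustrated with a small stand-alone class. This is a sketch rather than PyUnfold's `Logger`: the class name `PrintTS` is hypothetical, it does not subclass `pyunfold.callbacks.Callback`, and it assumes that `status` is a dict holding the current test statistic under the key `'ts_iter'`.

```python
import sys


class PrintTS:
    """Hypothetical callback exposing on_iteration_end(iteration, status)."""

    def __init__(self, stream=None):
        # Default to sys.stdout; any object with a write() method works.
        self.stream = stream if stream is not None else sys.stdout

    def on_iteration_end(self, iteration, status):
        # Assumption: status is a dict with the test statistic in 'ts_iter'.
        self.stream.write(
            'Iteration {}: ts = {:.4f}\n'.format(iteration, status['ts_iter'])
        )


callback = PrintTS()
callback.on_iteration_end(1, {'ts_iter': 0.1234})
```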
class pyunfold.callbacks.SplineRegularizer(degree=3, smooth=None, groups=None)

Spline regularization callback

Smooths the unfolded distribution at each iteration using UnivariateSpline from scipy.interpolate. For more information about UnivariateSpline, see the UnivariateSpline API documentation.

Parameters:
degree : int, optional

Degree of the smoothing spline. Must be <= 5 (default is 3, a cubic spline).

smooth : float or None, optional

Positive smoothing factor used to choose the number of knots. If 0, spline will interpolate through all data points (default is None).

groups : array_like, optional

Group labels for each cause bin. If groups are specified, then each cause group will be regularized independently (default is None).

Notes

The number of causes must be larger than the spline degree.

Examples

Specify the spline degree and smoothing factor:

>>> from pyunfold.callbacks import SplineRegularizer
>>> reg = SplineRegularizer(degree=3, smooth=1.25)

Different cause groups are also supported. For instance, consider a problem with seven cause bins in which the first three bins form one group, the next two form a second group, and the last two form a third. An array can then identify the group each cause bin belongs to:

>>> groups = [0, 0, 0, 1, 1, 2, 2]
>>> reg = SplineRegularizer(degree=3, smooth=1.25, groups=groups)

If provided with a groups parameter, SplineRegularizer will regularize the unfolded distribution for each group independently.
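
To see what the `groups` labels encode, the following sketch collects the cause-bin indices that share a label; each resulting index list is a set of bins that would be regularized together. The helper `bins_by_group` is hypothetical and not part of PyUnfold.

```python
def bins_by_group(groups):
    """Map each group label to the indices of the cause bins carrying it."""
    grouped = {}
    for index, label in enumerate(groups):
        grouped.setdefault(label, []).append(index)
    return grouped


for label, indices in bins_by_group([0, 0, 0, 1, 1, 2, 2]).items():
    print(label, indices)
# 0 [0, 1, 2]
# 1 [3, 4]
# 2 [5, 6]
```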

Methods

on_iteration_end  

Priors

pyunfold.priors.uniform_prior(num_causes)

Convenience function to calculate uniform prior distribution

Parameters:
num_causes : int

Number of cause bins.

Returns:
prior : numpy.ndarray

Normalized uniform prior distribution.

Examples

>>> from pyunfold.priors import uniform_prior
>>> uniform_prior(num_causes=4)
array([0.25, 0.25, 0.25, 0.25])
pyunfold.priors.jeffreys_prior(causes)

Convenience function to calculate Jeffreys prior distribution

Parameters:
causes : array_like

Midpoint value of cause bins. For instance if cause bin edges are given by [0, 2, 4], then causes is [1, 3].

Returns:
prior : numpy.ndarray

Normalized Jeffreys prior distribution.

Notes

The Jeffreys prior is defined as

\[P(C_{\mu})^{\text{Jeffreys}} = \frac{1}{\log(C_{\text{max}}/C_\text{min})C_{\mu}}\]

for cause bin values \(C_{\mu}\) and maximum/minimum cause values \(C_{\text{max}}\)/\(C_{\text{min}}\). For more details regarding Jeffreys prior see [1].

References

[1] Jeffreys, H. “An Invariant Form for the Prior Probability in Estimation Problems”. Proc. of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 186 (1007). London, England:453-61. https://doi.org/10.1098/rspa.1946.0056.

Examples

>>> from pyunfold.priors import jeffreys_prior
>>> causes = [1, 2, 3, 4]
>>> jeffreys_prior(causes=causes)
array([0.48, 0.24, 0.16, 0.12])
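
The output above can be checked by evaluating the formula from the Notes directly. The sketch below is a plain-Python illustration, assuming the prior is normalized to sum to one; `jeffreys_sketch` is a hypothetical name, not the library function.

```python
import math


def jeffreys_sketch(causes):
    """Evaluate 1 / (log(C_max / C_min) * C_mu) per bin, then normalize."""
    log_range = math.log(max(causes) / min(causes))
    unnormalized = [1.0 / (log_range * c) for c in causes]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]


print([round(p, 2) for p in jeffreys_sketch([1, 2, 3, 4])])
# [0.48, 0.24, 0.16, 0.12]
```

The logarithmic factor is the same for every bin, so it cancels on normalization and the result is simply proportional to 1/C for each cause value.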

Test Statistics

pyunfold.teststat.get_ts(name='ks')

Convenience function for retrieving test statistic calculators

Parameters:
name : {‘ks’, ‘chi2’, ‘bf’, ‘rmd’}

Name of test statistic.

Returns:
ts : TestStat

Test statistics calculator

class pyunfold.teststat.KS(tol=None, num_causes=None, test_range=None, **kwargs)

Kolmogorov-Smirnov (KS) two-sided test statistic

Methods

calc(dist1, dist2) Calculate the test statistic between two input distributions
calc(dist1, dist2)

Calculate the test statistic between two input distributions

Parameters:
dist1 : array_like

Input distribution.

dist2 : array_like

Input distribution.

Returns:
stat : float

Test statistic
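
As a rough illustration of a two-sided KS statistic for binned data, the sketch below takes the largest absolute difference between the normalized cumulative sums of two distributions. This is the textbook definition applied to histograms; the name `ks_sketch` is hypothetical, and the library's `KS.calc` may differ in detail.

```python
from itertools import accumulate


def ks_sketch(dist1, dist2):
    """Largest absolute gap between two normalized cumulative distributions."""
    n1, n2 = float(sum(dist1)), float(sum(dist2))
    cdf1 = [c / n1 for c in accumulate(dist1)]
    cdf2 = [c / n2 for c in accumulate(dist2)]
    return max(abs(a - b) for a, b in zip(cdf1, cdf2))


print(ks_sketch([105, 145], [125, 125]))
```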

class pyunfold.teststat.Chi2(tol=None, num_causes=None, test_range=None, **kwargs)

Reduced chi-squared test statistic

Methods

calc(dist1, dist2) Calculate the test statistic between two input distributions
calc(dist1, dist2)

Calculate the test statistic between two input distributions

Parameters:
dist1 : array_like

Input distribution.

dist2 : array_like

Input distribution.

Returns:
stat : float

Test statistic

class pyunfold.teststat.RMD(tol=None, num_causes=None, test_range=None, **kwargs)

Maximum relative difference test statistic

Methods

calc(dist1, dist2) Calculate the test statistic between two input distributions
calc(dist1, dist2)

Calculate the test statistic between two input distributions

Parameters:
dist1 : array_like

Input distribution.

dist2 : array_like

Input distribution.

Returns:
stat : float

Test statistic
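
A maximum relative difference can be sketched as follows. The assumption here is that each bin's difference is taken relative to the bin-wise mean of the two distributions, with empty bins contributing zero; the name `rmd_sketch` is hypothetical, and the library's exact normalization may differ.

```python
def rmd_sketch(dist1, dist2):
    """Largest bin-wise |a - b| relative to the mean of a and b."""
    ratios = []
    for a, b in zip(dist1, dist2):
        mean = 0.5 * (a + b)
        # Assumption: bins that are empty in both distributions contribute zero.
        ratios.append(abs(a - b) / mean if mean != 0 else 0.0)
    return max(ratios)


print(rmd_sketch([105, 145], [125, 125]))
```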

class pyunfold.teststat.BF(tol=None, num_causes=None, test_range=None, **kwargs)

Bayes factor test statistic

Notes

For details related to the Bayes factor see [1].

References

[1] BenZvi, S. Y., Connolly, B. M., Pfendner, C. G., and Westerhoff, S. “A Bayesian Approach to Comparing Cosmic Ray Energy Spectra”. The Astrophysical Journal 738 (1):82. https://doi.org/10.1088/0004-637X/738/1/82.

Methods

calc(dist1, dist2) Calculate the test statistic between two input distributions
calc(dist1, dist2)

Calculate the test statistic between two input distributions

Parameters:
dist1 : array_like

Input distribution.

dist2 : array_like

Input distribution.

Returns:
stat : float

Test statistic