PyUnfold API

Iterative unfolding
pyunfold.iterative_unfold(data=None, data_err=None, response=None, response_err=None, efficiencies=None, efficiencies_err=None, prior=None, ts='ks', ts_stopping=0.01, max_iter=100, cov_type='multinomial', return_iterations=False, callbacks=None)

Performs iterative unfolding.
Parameters:
- data : array_like
  Input observed data distribution.
- data_err : array_like
  Uncertainties of the input observed data distribution. Must be the same shape as data.
- response : array_like
  Response matrix.
- response_err : array_like
  Uncertainties of the response matrix. Must be the same shape as response.
- efficiencies : array_like
  Detection efficiencies for the cause distribution.
- efficiencies_err : array_like
  Uncertainties of the detection efficiencies. Must be the same shape as efficiencies.
- prior : array_like, optional
  Prior distribution to use in unfolding. If None, a uniform (flat) prior is used. If array_like, must have the same shape as efficiencies (default is None).
- ts : {'ks', 'chi2', 'bf', 'rmd'}
  Test statistic to use for the stopping condition (default is 'ks'). For more information about the available test statistics, see the Test Statistics API documentation.
- ts_stopping : float, optional
  Test statistic stopping condition. At each unfolding iteration, the test statistic is computed between the current and previous iterations. Once the test statistic drops below ts_stopping, the unfolding procedure stops (default is 0.01).
- max_iter : int, optional
  Maximum number of iterations to allow (default is 100).
- cov_type : {'multinomial', 'poisson'}
  Whether to use the multinomial or Poisson form of the covariance matrix (default is 'multinomial').
- return_iterations : bool, optional
  Whether to return the unfolded distribution for each iteration (default is False).
- callbacks : list, optional
  List of pyunfold.callbacks.Callback instances to apply during unfolding (default is None, meaning no callbacks are applied).
Returns:
- unfolded_result : dict
  Returned if return_iterations is False (default). Dictionary containing the final unfolded distribution, associated uncertainties, and test statistic information. The returned dict has the following keys:
  - unfolded : Final unfolded cause distribution
  - stat_err : Statistical uncertainties on the unfolded cause distribution
  - sys_err : Systematic uncertainties on the unfolded cause distribution associated with limited statistics in the response matrix
  - ts_iter : Final test statistic value
  - ts_stopping : Test statistic stopping criterion
  - num_iterations : Number of unfolding iterations
  - unfolding_matrix : Unfolding matrix
- unfolding_iters : pandas.DataFrame
  Returned if return_iterations is True. DataFrame containing the unfolded distribution, associated uncertainties, test statistic information, etc. at each iteration.
Examples
>>> from pyunfold import iterative_unfold
>>> data = [100, 150]
>>> data_err = [10, 12.2]
>>> response = [[0.9, 0.1],
...             [0.1, 0.9]]
>>> response_err = [[0.01, 0.01],
...                 [0.01, 0.01]]
>>> efficiencies = [1, 1]
>>> efficiencies_err = [0.01, 0.01]
>>> unfolded = iterative_unfold(data=data,
...                             data_err=data_err,
...                             response=response,
...                             response_err=response_err,
...                             efficiencies=efficiencies,
...                             efficiencies_err=efficiencies_err)
>>> unfolded
{'num_iterations': 4,
 'stat_err': array([11.16853268, 13.65488168]),
 'sys_err': array([0.65570621, 0.65570621]),
 'ts_iter': 0.0038300087456445975,
 'ts_stopping': 0.01,
 'unfolded': array([ 94.32086967, 155.67913033]),
 'unfolding_matrix': array([[0.8471473 , 0.1528527 ],
        [0.06404093, 0.93595907]])}
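Setting return_iterations=True returns the per-iteration results instead. A minimal sketch reusing the inputs above (output omitted; per the Returns section it is a pandas.DataFrame with one row per iteration):

>>> unfolding_iters = iterative_unfold(data=data,
...                                    data_err=data_err,
...                                    response=response,
...                                    response_err=response_err,
...                                    efficiencies=efficiencies,
...                                    efficiencies_err=efficiencies_err,
...                                    return_iterations=True)
>>> num_iterations = len(unfolding_iters)  # one row per unfolding iteration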
Callbacks
class pyunfold.callbacks.Logger

Logger callback

Writes test statistic information for each iteration to sys.stdout.

Methods

on_iteration_end(iteration, status)
  Writes to sys.stdout
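To log progress during unfolding, a Logger instance can be passed to iterative_unfold through its callbacks parameter. A minimal sketch, assuming the input arrays from the iterative_unfold example above:

>>> from pyunfold import iterative_unfold
>>> from pyunfold.callbacks import Logger
>>> unfolded = iterative_unfold(data=data,
...                             data_err=data_err,
...                             response=response,
...                             response_err=response_err,
...                             efficiencies=efficiencies,
...                             efficiencies_err=efficiencies_err,
...                             callbacks=[Logger()])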
class pyunfold.callbacks.SplineRegularizer(degree=3, smooth=None, groups=None)

Spline regularization callback

Smooths the unfolded distribution at each iteration using UnivariateSpline from scipy.interpolate. For more information about UnivariateSpline, see the UnivariateSpline API documentation.

Parameters:
- degree : int, optional
  Degree of the smoothing spline. Must be <= 5 (default is 3, a cubic spline).
- smooth : float or None, optional
  Positive smoothing factor used to choose the number of knots. If 0, the spline will interpolate through all data points (default is None).
- groups : array_like, optional
  Group labels for each cause bin. If groups are specified, each cause group is regularized independently (default is None).
Notes

The number of causes must be larger than the spline degree.

Examples
Specify the spline degree and smoothing factor:
>>> from pyunfold.callbacks import SplineRegularizer
>>> reg = SplineRegularizer(degree=3, smooth=1.25)
Different cause groups are also supported. For instance, in a problem with seven cause bins where the first three bins form one group, the next two form a second group, and the last two form a third group, an array of group labels identifies the group each cause bin belongs to:
>>> groups = [0, 0, 0, 1, 1, 2, 2]
>>> reg = SplineRegularizer(degree=3, smooth=1.25, groups=groups)
If provided with a groups parameter, SplineRegularizer will regularize the unfolded distribution for each group independently.

Methods

on_iteration_end
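As with any callback, the regularizer takes effect when passed to iterative_unfold. A minimal sketch, assuming the input arrays from the iterative_unfold example above:

>>> unfolded = iterative_unfold(data=data,
...                             data_err=data_err,
...                             response=response,
...                             response_err=response_err,
...                             efficiencies=efficiencies,
...                             efficiencies_err=efficiencies_err,
...                             callbacks=[reg])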
Priors
pyunfold.priors.uniform_prior(num_causes)

Convenience function to calculate a uniform prior distribution

Parameters:
- num_causes : int
  Number of cause bins.

Returns:
- prior : numpy.ndarray
  Normalized uniform prior distribution.
Examples
>>> from pyunfold.priors import uniform_prior
>>> uniform_prior(num_causes=4)
array([0.25, 0.25, 0.25, 0.25])
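The resulting array can be passed as the prior argument of iterative_unfold (a uniform prior is also what is used when prior is left as None). A minimal sketch, assuming the input arrays from the iterative_unfold example above:

>>> from pyunfold import iterative_unfold
>>> prior = uniform_prior(num_causes=2)
>>> unfolded = iterative_unfold(data=data,
...                             data_err=data_err,
...                             response=response,
...                             response_err=response_err,
...                             efficiencies=efficiencies,
...                             efficiencies_err=efficiencies_err,
...                             prior=prior)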
pyunfold.priors.jeffreys_prior(causes)

Convenience function to calculate the Jeffreys prior distribution

Parameters:
- causes : array_like
  Midpoint value of cause bins. For instance, if cause bin edges are given by [0, 2, 4], then causes is [1, 3].

Returns:
- prior : numpy.ndarray
  Normalized Jeffreys prior distribution.
Notes
The Jeffreys prior is defined as
\[P(C_{\mu})^{\text{Jeffreys}} = \frac{1}{\log(C_{\text{max}}/C_{\text{min}})\,C_{\mu}}\]

for cause bin values \(C_{\mu}\) and maximum/minimum cause values \(C_{\text{max}}\)/\(C_{\text{min}}\). For more details regarding the Jeffreys prior see [1].
References
[1] Jeffreys, H. “An Invariant Form for the Prior Probability in Estimation Problems”. Proc. of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 186 (1007). London, England: 453-61. https://doi.org/10.1098/rspa.1946.0056.

Examples
>>> from pyunfold.priors import jeffreys_prior
>>> causes = [1, 2, 3, 4]
>>> jeffreys_prior(causes=causes)
array([0.48, 0.24, 0.16, 0.12])
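Since causes are bin midpoints, they can be computed from bin edges with numpy. A minimal sketch (the bin edges here are illustrative):

>>> import numpy as np
>>> bin_edges = np.array([0, 2, 4, 6, 8])
>>> causes = 0.5 * (bin_edges[1:] + bin_edges[:-1])  # midpoints: [1., 3., 5., 7.]
>>> prior = jeffreys_prior(causes=causes)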
Test Statistics
pyunfold.teststat.get_ts(name='ks')

Convenience function for retrieving test statistic calculators

Parameters:
- name : {'ks', 'chi2', 'bf', 'rmd'}
  Name of test statistic.

Returns:
- ts : TestStat
  Test statistic calculator
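For example, each name maps to one of the calculator classes documented below. A minimal sketch:

>>> from pyunfold.teststat import get_ts
>>> ks = get_ts(name='ks')      # Kolmogorov-Smirnov calculator
>>> chi2 = get_ts(name='chi2')  # reduced chi-squared calculator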
class pyunfold.teststat.KS(tol=None, num_causes=None, test_range=None, **kwargs)

Kolmogorov-Smirnov (KS) two-sided test statistic

Methods

calc(dist1, dist2)
  Calculate the test statistic between two input distributions
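The calc method compares two cause distributions directly. A minimal sketch, assuming calc returns the computed statistic and that num_causes matches the length of the input distributions:

>>> from pyunfold.teststat import KS
>>> ks = KS(tol=0.01, num_causes=4)
>>> dist1 = [100, 150, 200, 250]
>>> dist2 = [105, 145, 190, 260]
>>> ts_value = ks.calc(dist1, dist2)  # assumed to return the KS statistic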
class pyunfold.teststat.Chi2(tol=None, num_causes=None, test_range=None, **kwargs)

Reduced chi-squared test statistic

Methods

calc(dist1, dist2)
  Calculate the test statistic between two input distributions
class pyunfold.teststat.RMD(tol=None, num_causes=None, test_range=None, **kwargs)

Maximum relative difference test statistic

Methods

calc(dist1, dist2)
  Calculate the test statistic between two input distributions
class pyunfold.teststat.BF(tol=None, num_causes=None, test_range=None, **kwargs)

Bayes factor test statistic

Notes

For details related to the Bayes factor see [1].

References

[1] S. Y. BenZvi, B. M. Connolly, C. G. Pfendner, and S. Westerhoff. “A Bayesian Approach to Comparing Cosmic Ray Energy Spectra”. The Astrophysical Journal 738 (1): 82. https://doi.org/10.1088/0004-637X/738/1/82.

Methods

calc(dist1, dist2)
  Calculate the test statistic between two input distributions