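This article collects typical usage examples of the Python function skbio.diversity._util._validate_counts_vector. If you are wondering how _validate_counts_vector is used in practice, or are looking for examples of calling it, the curated code examples here may help.
A total of 15 code examples of _validate_counts_vector are shown below, sorted by popularity by default. The snippets are excerpts from larger modules, so they rely on names defined elsewhere, such as np (NumPy) and scipy.optimize.minimize_scalar.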
Example 1: osd
def osd(counts):
    """Calculate observed OTUs, singles, and doubles.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    osd : tuple
        Observed OTUs, singles, and doubles.

    See Also
    --------
    observed_otus
    singles
    doubles

    Notes
    -----
    This is a convenience function used by many of the other measures that rely
    on these three measures.

    """
    counts = _validate_counts_vector(counts)
    return observed_otus(counts), singles(counts), doubles(counts)
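To illustrate what the three returned values mean, here is a minimal pure-Python sketch that does not call scikit-bio. The name osd_sketch is hypothetical, and the input is assumed to be an already validated list of non-negative integers.

```python
def osd_sketch(counts):
    """Return (observed, singles, doubles) for a list of non-negative ints.

    Hypothetical stand-in for illustration; not part of scikit-bio.
    """
    observed = sum(1 for c in counts if c != 0)  # OTUs seen at least once
    singles = sum(1 for c in counts if c == 1)   # OTUs seen exactly once
    doubles = sum(1 for c in counts if c == 2)   # OTUs seen exactly twice
    return observed, singles, doubles


example = [0, 1, 1, 4, 2, 5, 2, 4, 1, 2]
print(osd_sketch(example))  # (9, 3, 3)
```

The single zero entry is not counted as observed, which is why ten entries give nine observed OTUs.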
Example 2: robbins
def robbins(counts):
    r"""Calculate Robbins' estimator for the probability of unobserved outcomes.

    Robbins' estimator is defined as:

    .. math::

       \frac{F_1}{n+1}

    where :math:`F_1` is the number of singleton OTUs.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    double
        Robbins' estimate.

    Notes
    -----
    Robbins' estimator is defined in [1]_. The estimate computed here is for
    :math:`n-1` counts, i.e. the x-axis is off by 1.

    References
    ----------
    .. [1] Robbins, H. E (1968). Ann. of Stats. Vol 36, pp. 256-257.

    """
    counts = _validate_counts_vector(counts)
    return singles(counts) / counts.sum()
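One way to read the note about the x-axis being off by 1 is to compare the returned expression with the documented formula. In the block below, N denotes counts.sum(); the symbol is introduced here for illustration and does not appear in the docstring.

```latex
\text{returned value} = \frac{F_1}{N}, \qquad N = \sum_i \text{counts}_i
\quad\Longrightarrow\quad
\frac{F_1}{N} = \frac{F_1}{n + 1} \quad \text{with } n = N - 1 .
```

For example, the counts 1, 1, 2, 4, 8 have F_1 = 2 and N = 16, so the function returns 2/16 = 0.125, which is the documented formula evaluated at n = 15.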
Example 3: goods_coverage
def goods_coverage(counts):
    r"""Calculate Good's coverage of counts.

    Good's coverage estimator is defined as

    .. math::

       1-\frac{F_1}{N}

    where :math:`F_1` is the number of singleton OTUs and :math:`N` is the
    total number of individuals (sum of abundances for all OTUs).

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    double
        Good's coverage estimator.

    """
    counts = _validate_counts_vector(counts)
    f1 = singles(counts)
    N = counts.sum()
    return 1 - (f1 / N)
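The formula can be checked by hand with a small pure-Python sketch. The helper names singles_fraction and goods_coverage_sketch are hypothetical, and the input is assumed to be a list of non-negative integers with a non-zero sum.

```python
def singles_fraction(counts):
    """F1 / N: the number of singleton OTUs divided by the total count."""
    f1 = sum(1 for c in counts if c == 1)
    return f1 / sum(counts)


def goods_coverage_sketch(counts):
    """1 - F1 / N, mirroring the formula in the docstring above."""
    return 1 - singles_fraction(counts)


example = [1, 1, 2, 4, 8]              # F1 = 2, N = 16
print(singles_fraction(example))       # 0.125
print(goods_coverage_sketch(example))  # 0.875
```

Note that singles_fraction is the same expression that the robbins function above returns, so as written the two results always sum to 1.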
Example 4: fisher_alpha
def fisher_alpha(counts):
    r"""Calculate Fisher's alpha, a metric of diversity.

    Fisher's alpha is estimated by solving the following equation for
    :math:`\alpha`:

    .. math::

       S=\alpha\ln(1+\frac{N}{\alpha})

    where :math:`S` is the number of OTUs and :math:`N` is the
    total number of individuals in the sample.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    double
        Fisher's alpha.

    Raises
    ------
    RuntimeError
        If the optimizer fails to converge (error > 1.0).

    Notes
    -----
    The implementation here is based on the description given in the SDR-IV
    online manual [1]_. Uses ``scipy.optimize.minimize_scalar`` to find
    Fisher's alpha.

    References
    ----------
    .. [1] http://www.pisces-conservation.com/sdrhelp/index.html

    """
    counts = _validate_counts_vector(counts)
    n = counts.sum()
    s = observed_otus(counts)

    def f(alpha):
        return (alpha * np.log(1 + (n / alpha)) - s) ** 2

    # Temporarily silence RuntimeWarnings (invalid and division by zero) during
    # optimization in case invalid input is provided to the objective function
    # (e.g. alpha=0).
    orig_settings = np.seterr(divide='ignore', invalid='ignore')
    try:
        alpha = minimize_scalar(f).x
    finally:
        np.seterr(**orig_settings)

    if f(alpha) > 1.0:
        raise RuntimeError("Optimizer failed to converge (error > 1.0), so "
                           "could not compute Fisher's alpha.")
    return alpha
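The equation S = alpha * ln(1 + N / alpha) can also be solved without SciPy. The sketch below uses plain bisection instead of the minimize_scalar call shown above, so it is a different technique and not the library's method. The name fisher_alpha_bisect and the tol parameter are hypothetical, and the sketch assumes N > S > 0, which guarantees a positive root.

```python
import math


def fisher_alpha_bisect(n, s, tol=1e-10):
    """Solve s = alpha * ln(1 + n / alpha) for alpha by bisection.

    Hypothetical helper for illustration; assumes n > s > 0.
    """
    if not n > s > 0:
        raise ValueError("bisection sketch requires n > s > 0")

    def g(alpha):
        # Signed residual; it increases with alpha, from -s towards n - s.
        return alpha * math.log(1 + n / alpha) - s

    lo, hi = 1e-12, 1.0
    while g(hi) < 0:  # widen the bracket until the residual changes sign
        hi *= 2
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# 14 individuals spread over 5 observed OTUs, e.g. counts [4, 3, 4, 0, 1, 0, 2]
alpha = fisher_alpha_bisect(14, 5)
print(alpha)
```

For this input the root is roughly 2.78. Working with the signed residual rather than its square lets bisection bracket the root directly.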
Example 5: chao1
def chao1(counts, bias_corrected=True):
    r"""Calculate chao1 richness estimator.

    Uses the bias-corrected version unless `bias_corrected` is ``False`` *and*
    there are both singletons and doubletons.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.
    bias_corrected : bool, optional
        Indicates whether or not to use the bias-corrected version of the
        equation. If ``False`` *and* there are both singletons and doubletons,
        the uncorrected version will be used. The bias-corrected version will
        be used otherwise.

    Returns
    -------
    double
        Computed chao1 richness estimator.

    See Also
    --------
    chao1_ci

    Notes
    -----
    The uncorrected version is based on Equation 6 in [1]_:

    .. math::

       chao1=S_{obs}+\frac{F_1^2}{2F_2}

    where :math:`F_1` and :math:`F_2` are the count of singletons and
    doubletons, respectively.

    The bias-corrected version is defined as

    .. math::

       chao1=S_{obs}+\frac{F_1(F_1-1)}{2(F_2+1)}

    References
    ----------
    .. [1] Chao, A. 1984. Non-parametric estimation of the number of classes in
       a population. Scandinavian Journal of Statistics 11, 265-270.

    """
    counts = _validate_counts_vector(counts)
    o, s, d = osd(counts)

    if not bias_corrected and s and d:
        return o + s ** 2 / (d * 2)
    else:
        return o + s * (s - 1) / (2 * (d + 1))
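The branch logic is easiest to follow with concrete numbers. The sketch below takes the three tallies directly instead of a counts vector. The name chao1_sketch and its signature are hypothetical, chosen only to make the arithmetic visible.

```python
def chao1_sketch(observed, f1, f2, bias_corrected=True):
    """Apply the two chao1 formulas to precomputed tallies.

    Hypothetical helper: observed = S_obs, f1 = singletons, f2 = doubletons.
    """
    if not bias_corrected and f1 and f2:
        # Uncorrected form; only usable when f2 is non-zero.
        return observed + f1 ** 2 / (2 * f2)
    # Bias-corrected form; also the fallback when f1 or f2 is zero.
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


print(chao1_sketch(9, 3, 3))                        # 9 + 6/8 = 9.75
print(chao1_sketch(9, 3, 3, bias_corrected=False))  # 9 + 9/6 = 10.5
print(chao1_sketch(5, 3, 0, bias_corrected=False))  # falls back: 5 + 6/2 = 8.0
```

The last call shows why the fallback exists: with no doubletons, the uncorrected formula would divide by zero.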
Example 6: chao1_ci
def chao1_ci(counts, bias_corrected=True, zscore=1.96):
    """Calculate chao1 confidence interval.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.
    bias_corrected : bool, optional
        Indicates whether or not to use the bias-corrected version of the
        equation. If ``False`` *and* there are both singletons and doubletons,
        the uncorrected version will be used. The bias-corrected version will
        be used otherwise.
    zscore : scalar, optional
        Score to use for confidence. Default of 1.96 is for a 95% confidence
        interval.

    Returns
    -------
    tuple
        chao1 confidence interval as ``(lower_bound, upper_bound)``.

    See Also
    --------
    chao1

    Notes
    -----
    The implementation here is based on the equations in the EstimateS manual
    [1]_. Different equations are employed to calculate the chao1 variance and
    confidence interval depending on `bias_corrected` and the presence/absence
    of singletons and/or doubletons.

    Specifically, the following EstimateS equations are used:

    1. No singletons, Equation 14.
    2. Singletons but no doubletons, Equations 7, 13.
    3. Singletons and doubletons, ``bias_corrected=True``, Equations 6, 13.
    4. Singletons and doubletons, ``bias_corrected=False``, Equations 5, 13.

    References
    ----------
    .. [1] http://viceroy.eeb.uconn.edu/estimates/

    """
    counts = _validate_counts_vector(counts)
    o, s, d = osd(counts)
    if s:
        chao = chao1(counts, bias_corrected)
        chaovar = _chao1_var(counts, bias_corrected)
        return _chao_confidence_with_singletons(chao, o, chaovar, zscore)
    else:
        n = counts.sum()
        return _chao_confidence_no_singletons(n, o, zscore)
Example 7: test_validate_counts_vector
def test_validate_counts_vector(self):
    # python list
    obs = _validate_counts_vector([0, 2, 1, 3])
    npt.assert_array_equal(obs, np.array([0, 2, 1, 3]))
    self.assertEqual(obs.dtype, int)

    # numpy array (no copy made)
    data = np.array([0, 2, 1, 3])
    obs = _validate_counts_vector(data)
    npt.assert_array_equal(obs, data)
    self.assertEqual(obs.dtype, int)
    self.assertTrue(obs is data)

    # single element
    obs = _validate_counts_vector([42])
    npt.assert_array_equal(obs, np.array([42]))
    self.assertEqual(obs.dtype, int)
    self.assertEqual(obs.shape, (1,))

    # suppress casting to int
    obs = _validate_counts_vector([42.2, 42.1, 0], suppress_cast=True)
    npt.assert_array_equal(obs, np.array([42.2, 42.1, 0]))
    self.assertEqual(obs.dtype, float)

    # all zeros
    obs = _validate_counts_vector([0, 0, 0])
    npt.assert_array_equal(obs, np.array([0, 0, 0]))
    self.assertEqual(obs.dtype, int)

    # all zeros (single value)
    obs = _validate_counts_vector([0])
    npt.assert_array_equal(obs, np.array([0]))
    self.assertEqual(obs.dtype, int)
def esty_ci(counts):
    r"""Calculate Esty's CI.

    Esty's CI is defined as

    .. math::

       F_1/N \pm z\sqrt{W}

    where :math:`F_1` is the number of singleton OTUs, :math:`N` is the total
    number of individuals (sum of abundances for all OTUs), and :math:`z` is a
    constant that depends on the targeted confidence and is based on the normal
    distribution.

    :math:`W` is defined as

    .. math::

       \frac{F_1(N-F_1)+2NF_2}{N^3}

    where :math:`F_2` is the number of doubleton OTUs.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    tuple
        Esty's confidence interval as ``(lower_bound, upper_bound)``.

    Notes
    -----
    Esty's CI is defined in [1]_. :math:`z` is hardcoded for a 95% confidence
    interval.

    References
    ----------
    .. [1] Esty, W. W. (1983). "A normal limit law for a nonparametric
       estimator of the coverage of a random sample". Ann Statist 11: 905-912.

    """
    counts = _validate_counts_vector(counts)

    f1 = singles(counts)
    f2 = doubles(counts)
    n = counts.sum()
    z = 1.959963985
    W = (f1 * (n - f1) + 2 * n * f2) / (n ** 3)

    return f1 / n - z * np.sqrt(W), f1 / n + z * np.sqrt(W)
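A pure-Python sketch of the same interval makes the two terms easy to inspect. The name esty_ci_sketch is hypothetical, and exposing z as a keyword argument is an assumption made for illustration, since the function above hardcodes it.

```python
import math


def esty_ci_sketch(counts, z=1.959963985):
    """Return (lower, upper) = F1/N -/+ z * sqrt(W) for a list of counts.

    Hypothetical helper; the default z matches the hardcoded 95% value above.
    """
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    n = sum(counts)
    w = (f1 * (n - f1) + 2 * n * f2) / n ** 3
    half_width = z * math.sqrt(w)
    return f1 / n - half_width, f1 / n + half_width


# F1 = 2, F2 = 1, N = 16, so W = (2 * 14 + 2 * 16 * 1) / 16**3 = 60 / 4096
lower, upper = esty_ci_sketch([1, 1, 2, 4, 8])
print(lower, upper)
```

For this small sample the lower bound is negative: the formula is symmetric around F1/N and, as written, is not clipped to the range [0, 1].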
Example 9: _setup_faith_pd
def _setup_faith_pd(counts, otu_ids, tree, validate, single_sample):
    if validate:
        if single_sample:
            # only validate count if operating in single sample mode, they
            # will have already been validated otherwise
            counts = _validate_counts_vector(counts)
            _validate_otu_ids_and_tree(counts, otu_ids, tree)
        else:
            _validate_otu_ids_and_tree(counts[0], otu_ids, tree)

    counts_by_node, tree_index, branch_lengths = \
        _vectorize_counts_and_tree(counts, otu_ids, tree)

    return counts_by_node, branch_lengths
Example 10: lladser_ci
def lladser_ci(counts, r, alpha=0.95, f=10, ci_type='ULCL'):
    """Calculate single CI of the conditional uncovered probability.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.
    r : int
        Number of new colors that are required for the next prediction.
    alpha : float, optional
        Desired confidence level.
    f : float, optional
        Ratio between upper and lower bound.
    ci_type : {'ULCL', 'ULCU', 'U', 'L'}
        Type of confidence interval. If ``'ULCL'``, upper and lower bounds with
        conservative lower bound. If ``'ULCU'``, upper and lower bounds with
        conservative upper bound. If ``'U'``, upper bound only, lower bound
        fixed to 0.0. If ``'L'``, lower bound only, upper bound fixed to 1.0.

    Returns
    -------
    tuple
        Confidence interval as ``(lower_bound, upper_bound)``.

    See Also
    --------
    lladser_pe

    Notes
    -----
    This function is just a wrapper around the full CI estimator described
    in Theorem 2 (iii) in [1]_, intended to be called for a single best CI
    estimate on a complete sample.

    References
    ----------
    .. [1] Lladser, Gouet, and Reeder, "Extrapolation of Urn Models via
       Poissonization: Accurate Measurements of the Microbial Unknown" PLoS
       2011.

    """
    counts = _validate_counts_vector(counts)
    sample = _expand_counts(counts)
    np.random.shuffle(sample)

    try:
        ci = list(_lladser_ci_series(sample, r, alpha, f, ci_type))[-1]
    except IndexError:
        ci = (np.nan, np.nan)

    return ci
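The first two steps of the function body turn the counts vector into a randomly ordered sample of individual observations. The sketch below assumes that _expand_counts repeats each OTU index once per observed individual. That helper is not shown on this page, so this is an assumption, and expand_counts_sketch is a hypothetical stand-in.

```python
import random


def expand_counts_sketch(counts):
    """Turn [2, 0, 3] into [0, 0, 2, 2, 2]: one OTU index per individual.

    Hypothetical stand-in for the unseen _expand_counts helper.
    """
    return [otu for otu, c in enumerate(counts) for _ in range(c)]


sample = expand_counts_sketch([2, 0, 3])
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
rng.shuffle(sample)     # the order now varies, the composition does not
print(sample)
```

Because lladser_ci shuffles with np.random.shuffle and no fixed seed appears in the snippet, repeated calls on the same counts may return different intervals.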
Example 11: test_validate_counts_vector_invalid_input
def test_validate_counts_vector_invalid_input(self):
    # wrong dtype
    with self.assertRaises(TypeError):
        _validate_counts_vector([0, 2, 1.2, 3])

    # wrong number of dimensions (2-D)
    with self.assertRaises(ValueError):
        _validate_counts_vector([[0, 2, 1, 3], [4, 5, 6, 7]])

    # wrong number of dimensions (scalar)
    with self.assertRaises(ValueError):
        _validate_counts_vector(1)

    # negative values
    with self.assertRaises(ValueError):
        _validate_counts_vector([0, 0, 2, -1, 3])
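Taken together, the two test functions (Examples 7 and 11) describe the validator's contract: 1-D input only, integer entries unless casting is suppressed, and no negative values. The sketch below restates that contract on plain lists. It is inferred from the tests and is not scikit-bio's implementation, which works on NumPy arrays. The name validate_counts_sketch and the error messages are hypothetical.

```python
def validate_counts_sketch(counts, suppress_cast=False):
    """Apply the checks exercised by the tests, on plain Python sequences.

    Hypothetical stand-in; behaviour inferred from the tests, not copied
    from the library.
    """
    # A bare scalar or a nested (2-D) sequence is not a 1-D vector.
    if not isinstance(counts, (list, tuple)):
        raise ValueError("Only 1-D vectors are supported.")
    if any(isinstance(c, (list, tuple)) for c in counts):
        raise ValueError("Only 1-D vectors are supported.")
    # Non-integer entries are rejected unless the caller opts out of casting.
    if not suppress_cast and any(not isinstance(c, int) for c in counts):
        raise TypeError("Counts must be integers.")
    if any(c < 0 for c in counts):
        raise ValueError("Counts vector cannot contain negative values.")
    return list(counts)


print(validate_counts_sketch([0, 2, 1, 3]))
print(validate_counts_sketch([42.2, 42.1, 0], suppress_cast=True))
```

Unlike the real function, which the first test shows returning the same array object for NumPy input, this sketch always returns a new list.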
Example 12: mcintosh_d
def mcintosh_d(counts):
    r"""Calculate McIntosh dominance index D.

    McIntosh dominance index D is defined as:

    .. math::

       D = \frac{N - U}{N - \sqrt{N}}

    where :math:`N` is the total number of individuals in the sample and
    :math:`U` is defined as:

    .. math::

       U = \sqrt{\sum{{n_i}^2}}

    where :math:`n_i` is the number of individuals in the :math:`i^{\text{th}}`
    OTU.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    double
        McIntosh dominance index D.

    See Also
    --------
    mcintosh_e

    Notes
    -----
    The index was proposed in [1]_. The implementation here is based on the
    description given in the SDR-IV online manual [2]_.

    References
    ----------
    .. [1] McIntosh, R. P. 1967 An index of diversity and the relation of
       certain concepts to diversity. Ecology 48, 1115-1126.
    .. [2] http://www.pisces-conservation.com/sdrhelp/index.html

    """
    counts = _validate_counts_vector(counts)
    u = np.sqrt((counts * counts).sum())
    n = counts.sum()
    return (n - u) / (n - np.sqrt(n))
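Two extreme inputs show how the formula behaves. The sketch below is a pure-Python restatement under the hypothetical name mcintosh_d_sketch and assumes a list of non-negative integers whose sum is greater than 1.

```python
import math


def mcintosh_d_sketch(counts):
    """D = (N - U) / (N - sqrt(N)) with U = sqrt(sum of squared counts).

    Hypothetical helper for illustration; not part of scikit-bio.
    """
    n = sum(counts)
    u = math.sqrt(sum(c * c for c in counts))
    return (n - u) / (n - math.sqrt(n))


print(mcintosh_d_sketch([9]))           # one OTU holds everything: U = N, D = 0.0
print(mcintosh_d_sketch([1, 1, 1, 1]))  # all singletons: U = sqrt(N), D = 1.0
print(mcintosh_d_sketch([3, 4]))        # N = 7, U = 5
```

When N is 1 the denominator is zero, so this sketch raises ZeroDivisionError for a single observation.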
Example 13: kempton_taylor_q
def kempton_taylor_q(counts, lower_quantile=0.25, upper_quantile=0.75):
    """Calculate Kempton-Taylor Q index of alpha diversity.

    Estimates the slope of the cumulative abundance curve in the interquantile
    range. By default, uses lower and upper quartiles, rounding inwards.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.
    lower_quantile : float, optional
        Lower bound of the interquantile range. Defaults to lower quartile.
    upper_quantile : float, optional
        Upper bound of the interquantile range. Defaults to upper quartile.

    Returns
    -------
    double
        Kempton-Taylor Q index of alpha diversity.

    Notes
    -----
    The index is defined in [1]_. The implementation here is based on the
    description given in the SDR-IV online manual [2]_.

    The implementation provided here differs slightly from the results given in
    Magurran 1998. Specifically, we have 14 in the numerator rather than 15.
    Magurran recommends counting half of the OTUs with the same # counts as the
    point where the UQ falls and the point where the LQ falls, but the
    justification for this is unclear (e.g. if there were a very large # OTUs
    that just overlapped one of the quantiles, the results would be
    considerably off). Leaving the calculation as-is for now, but consider
    changing.

    References
    ----------
    .. [1] Kempton, R. A. and Taylor, L. R. (1976) Models and statistics for
       species diversity. Nature, 262, 818-820.
    .. [2] http://www.pisces-conservation.com/sdrhelp/index.html

    """
    counts = _validate_counts_vector(counts)
    n = len(counts)
    lower = int(np.ceil(n * lower_quantile))
    upper = int(n * upper_quantile)
    sorted_counts = np.sort(counts)
    return (upper - lower) / np.log(sorted_counts[upper] /
                                    sorted_counts[lower])
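The phrase "rounding inwards" is clearer with concrete indices. The pure-Python sketch below mirrors the index arithmetic of the function body. The name kempton_taylor_q_sketch is hypothetical, and the input is assumed to be a list of positive integers with distinct values at the two selected positions.

```python
import math


def kempton_taylor_q_sketch(counts, lower_quantile=0.25, upper_quantile=0.75):
    """(upper - lower) / ln(sorted[upper] / sorted[lower]).

    Hypothetical helper mirroring the index arithmetic shown above.
    """
    n = len(counts)
    lower = int(math.ceil(n * lower_quantile))  # round the lower index up
    upper = int(n * upper_quantile)             # round the upper index down
    ordered = sorted(counts)
    return (upper - lower) / math.log(ordered[upper] / ordered[lower])


# Eight OTUs: lower = ceil(2.0) = 2, upper = int(6.0) = 6,
# so the index is 4 / ln(7 / 3).
print(kempton_taylor_q_sketch([1, 2, 3, 4, 5, 6, 7, 8]))
```

Note that n is the length of the vector, so zero entries count towards the quantile positions in this sketch, just as len(counts) does in the function above.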
Example 14: observed_otus
def observed_otus(counts):
    """Calculate the number of distinct OTUs.

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    int
        Distinct OTU count.

    """
    counts = _validate_counts_vector(counts)
    return (counts != 0).sum()
Example 15: singles
def singles(counts):
    """Calculate number of single occurrences (singletons).

    Parameters
    ----------
    counts : 1-D array_like, int
        Vector of counts.

    Returns
    -------
    int
        Singleton count.

    """
    counts = _validate_counts_vector(counts)
    return (counts == 1).sum()