U Gv fú<ã @sdZddddddddd d g ZddlZdd lmZmZmZddlmZddlm Z ddl mZddl mZmZmZmZedddgƒddfdd„Zd$dd„Zedddgƒdfdd„Zd%dd „Zdddgdfdd„Zdddgddfdd„Zd&d d„Zd'd!d„Zd(d"d„Zd)d#d „ZdS)*zB Additional statistics functions with support for masked arrays. Úcompare_medians_msÚhdquantilesÚhdmedianÚhdquantiles_sdÚidealfourthsÚmedian_cihsÚmjciÚmquantiles_cimjÚrshÚtrimmed_mean_ciéN)Úfloat_Úint_Úndarray)ÚMaskedArrayé)Ú _mstats_basic)ÚnormÚbetaÚtÚbinomgÐ?çà?gè?FcCs€dd„}tj|dtd}tj|ddd}|dks:|jdkrH||||ƒ}n*|jdkr`td |jƒ‚t |||||¡}tj|dd S)a Computes quantile estimates with the Harrell-Davis method. The quantile estimates are calculated as a weighted linear combination of order statistics. Parameters ---------- data : array_like Data array. prob : sequence, optional Sequence of quantiles to compute. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. var : bool, optional Whether to return the variance of the estimate. Returns ------- hdquantiles : MaskedArray A (p,) array of quantiles (if `var` is False), or a (2,p) array of quantiles and variances (if `var` is True), where ``p`` is the number of quantiles. See Also -------- hdquantiles_sd c SsJt t | ¡ t¡¡¡}|j}t dt|ƒft ¡}|dkrTtj |_|rL|S|dSt |d¡t |ƒ}tj}t|ƒD]t\}} |||d| |dd| ƒ} | dd…| dd…}t ||¡}||d|f<t |||d¡|d|f<qx|d|d|dkf<|d|d|dkf<|rBtj |d|dkf<|d|dkf<|S|dS)zGComputes the HD quantiles for a 1D array. Returns nan for invalid data.érrNéÿÿÿÿ)ÚnpÚsqueezeÚsortÚ compressedÚviewrÚsizeÚemptyÚlenrÚnanÚflatÚarangeÚfloatrÚcdfÚ enumerateÚdot) ÚdataÚprobÚvarÚxsortedÚnZhdÚvÚbetacdfÚiÚpÚ_wÚwZhd_mean©r3ú>/tmp/pip-unpacked-wheel-96ln3f52/scipy/stats/_mstats_extras.pyÚ_hd_1D;s, "zhdquantiles.._hd_1DF©ÚcopyÚdtyper©r7ZndminNrúDArray 'data' must be at most two dimensional, but got data.ndim = %d©r7)ÚmaÚarrayrrÚndimÚ ValueErrorÚapply_along_axisÚfix_invalid)r(r)Úaxisr*r5r0Úresultr3r3r4rs ÿrcCst|dg||d}| ¡S)a9 Returns the Harrell-Davis estimate of the median along the given axis. Parameters ---------- data : ndarray Data array. axis : int, optional Axis along which to compute the quantiles. If None, use a flattened array. var : bool, optional Whether to return the variance of the estimate. Returns ------- hdmedian : MaskedArray The median values. If ``var=True``, the variance is returned inside the masked array. E.g. for a 1-D array the shape change from (1,) to (2,). r)rBr*)rr)r(rBr*rCr3r3r4rgscCsvdd„}tj|dtd}tj|ddd}|dkr<|||ƒ}n(|jdkrTtd |jƒ‚t ||||¡}tj|dd ¡S)aý The standard error of the Harrell-Davis quantile estimates by jackknife. Parameters ---------- data : array_like Data array. prob : sequence, optional Sequence of quantiles to compute. axis : int, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- hdquantiles_sd : MaskedArray Standard error of the Harrell-Davis quantile estimates. See Also -------- hdquantiles cSst | ¡¡}t|ƒ}t t|ƒt¡}|dkr6tj|_t |¡t |dƒ}t j}t|ƒD]¶\}}|||||d|ƒ} | dd…| dd…} t |¡}t | |dd…¡|dd…<|dd…t | ddd…|ddd…¡ddd…7<t | ¡|d¡||<qZ|S)z%Computes the std error for 1D arrays.rrNrr)rrrr rrr!r"r#r$rr%r&Z zeros_likeZcumsumÚsqrtr*)r(r)r+r,ZhdsdÚvvr.r/r0r1r2Zmx_r3r3r4Ú_hdsd_1D™s <z hdquantiles_sd.._hdsd_1DFr6rr9Nrr:r;) r<r=rrr>r?r@rAZravel)r(r)rBrFr0rCr3r3r4rs ÿ©çš™™™™™É?rH©TTçš™™™™™©?c Cs|tj|dd}tj||||d}| |¡}tj||||d}| |¡d}t d|d|¡} t || ||| |f¡S)a³ Selected confidence interval of the trimmed mean along the given axis. Parameters ---------- data : array_like Input data. limits : {None, tuple}, optional None or a two item tuple. Tuple of the percentages to cut on each side of the array, with respect to the number of unmasked data, as floats between 0. and 1. If ``n`` is the number of unmasked data before trimming, then (``n * limits[0]``)th smallest data and (``n * limits[1]``)th largest data are masked. The total number of unmasked data after trimming is ``n * (1. - sum(limits))``. The value of one limit can be set to None to indicate an open interval. Defaults to (0.2, 0.2). inclusive : (2,) tuple of boolean, optional If relative==False, tuple indicating whether values exactly equal to the absolute limits are allowed. If relative==True, tuple indicating whether the number of data being masked on each side should be rounded (True) or truncated (False). Defaults to (True, True). alpha : float, optional Confidence level of the intervals. Defaults to 0.05. axis : int, optional Axis along which to cut. If None, uses a flattened version of `data`. Defaults to None. Returns ------- trimmed_mean_ci : (2,) ndarray The lower and upper confidence intervals of the trimmed data. Fr;)ÚlimitsÚ inclusiverBrç@) r<r=ÚmstatsZtrimrZmeanZtrimmed_stdeÚcountrÚppfr) r(rKrLÚalpharBZtrimmedZtmeanZtstdeZdfZtppfr3r3r4r Às* cCsddd„}tj|dd}|jdkr.td|jƒ‚tj|ddd}|d krP|||ƒSt ||||¡Sd S) a„ Returns the Maritz-Jarrett estimators of the standard error of selected experimental quantiles of the data. Parameters ---------- data : ndarray Data array. prob : sequence, optional Sequence of quantiles to compute. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. c SsÖt | ¡¡}|j}t |¡|d t¡}tj}t t |ƒt¡}tjd|dtd|}|d|}t |ƒD]b\}} ||| d|| ƒ||| d|| ƒ} t | |¡}t | |d¡}t ||d¡||<qn|S)Nrr)r8gð?r)rrrrr=Zastyper rr%rr rr#r&r'rD) r(r0r,r)r.ZmjÚxÚyr/ÚmÚWZC1ZC2r3r3r4Ú_mjci_1Ds(zmjci.._mjci_1DFr;rr:rr9N)r<r=r>r?rr@)r(r)rBrVr0r3r3r4rós ÿ cCsZt|d|ƒ}t d|d¡}tj||dd|d}t|||d}||||||fS)aÕ Computes the alpha confidence interval for the selected quantiles of the data, with Maritz-Jarrett estimators. Parameters ---------- data : ndarray Data array. prob : sequence, optional Sequence of quantiles to compute. alpha : float, optional Confidence level of the intervals. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- ci_lower : ndarray The lower boundaries of the confidence interval. Of the same length as `prob`. ci_upper : ndarray The upper boundaries of the confidence interval. Of the same length as `prob`. rrMr)ZalphapZbetaprB©rB)ÚminrrPrNZ mquantilesr)r(r)rQrBÚzZxqZsmjr3r3r4r s cCsVdd„}tj|dd}|dkr*|||ƒ}n(|jdkrBtd|jƒ‚t ||||¡}|S)aA Computes the alpha-level confidence interval for the median of the data. Uses the Hettmasperger-Sheather method. Parameters ---------- data : array_like Input data. Masked values are discarded. The input should be 1D only, or `axis` should be set to None. alpha : float, optional Confidence level of the intervals. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- median_cihs Alpha level confidence interval. c Ss>t | ¡¡}t|ƒ}t|d|ƒ}tt |d|d¡ƒ}t |||d¡t |d|d¡}|d|kr–|d8}t |||d¡t |d|d¡}t ||d|d¡t ||d¡}|d|||}|||t ||d||ƒ}|||d|||d||||dd||||f}|S)NrrMrr) rrrr rXÚintrZ_ppfr%r$) r(rQr,ÚkZgkZgkkÚIÚlambdZlimsr3r3r4Ú_cihs_1DYs$$$$&ÿzmedian_cihs.._cihs_1DFr;Nrr:)r<r=r>r?r@)r(rQrBr^rCr3r3r4rBs ÿcCsntj||dtj||d}}tj||dtj||d}}t ||¡t |d|d¡}dt |¡S)a" Compares the medians from two independent groups along the given axis. The comparison is performed using the McKean-Schrader estimate of the standard error of the medians. Parameters ---------- group_1 : array_like First dataset. Has to be of size >=7. group_2 : array_like Second dataset. Has to be of size >=7. axis : int, optional Axis along which the medians are estimated. If None, the arrays are flattened. If `axis` is not None, then `group_1` and `group_2` should have the same shape. Returns ------- compare_medians_ms : {float, ndarray} If `axis` is None, then returns a float, otherwise returns a 1-D ndarray of floats with a length equal to the length of `group_1` along `axis`. Examples -------- >>> from scipy import stats >>> a = [1, 2, 3, 4, 5, 6, 7] >>> b = [8, 9, 10, 11, 12, 13, 14] >>> stats.mstats.compare_medians_ms(a, b, axis=None) 1.0693225866553746e-05 The function is vectorized to compute along a given axis. >>> import numpy as np >>> rng = np.random.default_rng() >>> x = rng.random(size=(3, 7)) >>> y = rng.random(size=(3, 8)) >>> stats.mstats.compare_medians_ms(x, y, axis=1) array([0.36908985, 0.36092538, 0.2765313 ]) References ---------- .. [1] McKean, Joseph W., and Ronald M. Schrader. "A comparison of methods for studentizing the sample median." Communications in Statistics-Simulation and Computation 13.6 (1984): 751-773. rWrr) r<ZmedianrNZstde_medianrÚabsrDrr%)Zgroup_1Zgroup_2rBZmed_1Zmed_2Zstd_1Zstd_2rUr3r3r4rus2ÿ$cCs>dd„}tj||d t¡}|dkr,||ƒSt |||¡SdS)aC Returns an estimate of the lower and upper quartiles. Uses the ideal fourths algorithm. Parameters ---------- data : array_like Input array. axis : int, optional Axis along which the quartiles are estimated. If None, the arrays are flattened. Returns ------- idealfourths : {list of floats, masked array} Returns the two internal values that divide `data` into four parts using the ideal fourths algorithm either along the flattened array (if `axis` is None) or along `axis` of `data`. cSs’| ¡}t|ƒ}|dkr$tjtjgSt|dddƒ\}}t|ƒ}d|||d|||}||}d||||||d}||gS)Nég@g«ªªªªªÚ?r)rr rr!ÚdivmodrZ)r(rRr,ÚjÚhZqlor[Zqupr3r3r4Ú_idfÄs zidealfourths.._idfrWN)r<rrrr@)r(rBrdr3r3r4r®s cCsÖtj|dd}|dkr|}ntj|ddd}|jdkr>tdƒ‚| ¡}t|dd}d|d |d |d}|dd…df|ddd…f|k d ¡}|dd…df|ddd…f|k d ¡}||d||S) aé Evaluates Rosenblatt's shifted histogram estimators for each data point. Rosenblatt's estimator is a centered finite-difference approximation to the derivative of the empirical cumulative distribution function. Parameters ---------- data : sequence Input data, should be 1-D. Masked values are ignored. points : sequence or None, optional Sequence of points where to evaluate Rosenblatt shifted histogram. If None, use the data. Fr;Nrr9z#The input array should be 1D only !rWg333333ó?rrrHrM)r<r=rr>ÚAttributeErrorrOrÚsum)r(Zpointsr,ÚrrcZnhiZnlor3r3r4r Ös **)rF)rGrIrJN)rJN)N)N)N)Ú__doc__Ú__all__Znumpyrrr rZnumpy.mar<rÚrrNZscipy.stats.distributionsrrrrÚlistrrrr rrrrrr r3r3r3r4Ús<ûK ?ÿ 3-" 3 9 (