statsmodels.nonparametric.kde.KDEUnivariate.fit

KDEUnivariate.fit(kernel='gau', bw='normal_reference', fft=True, weights=None, gridsize=None, adjust=1, cut=3, clip=(-inf, inf))

Fit the density estimate and attach it to the KDEUnivariate instance.

Parameters:
  • kernel (str) –

    The kernel to use. Choices are:

    • "biw" for biweight

    • "cos" for cosine

    • "epa" for Epanechnikov

    • "gau" for Gaussian

    • "tri" for triangular

    • "triw" for triweight

    • "uni" for uniform

  • bw (str, float, callable) –

    The bandwidth to use. Choices are:

    • "scott" - 1.059 * A * nobs ** (-1/5.), where A is min(std(x), IQR/1.34)

    • "silverman" - 0.9 * A * nobs ** (-1/5.), where A is min(std(x), IQR/1.34)

    • "normal_reference" - C * A * nobs ** (-1/5.), where C is calculated from the kernel. Equivalent (up to 2 dp) to the "scott" bandwidth for Gaussian kernels. See bandwidths.py.

    • If a float is given, its value is used as the bandwidth.

    • If a callable is given, its return value is used. The callable should take exactly two parameters, i.e., fn(x, kern), and return a float, where:

      • x - the clipped input data

      • kern - the kernel instance used

  • fft (bool) – Whether to use the FFT. The FFT implementation is more computationally efficient, but it supports only the Gaussian kernel. If fft is False, a nobs x gridsize intermediate array is created.

  • gridsize (int) – The number of grid points. If gridsize is None, max(len(x), 50) is used.

  • cut (float) – Defines the length of the grid past the lowest and highest values of x so that the kernel goes to zero. The end points are min(x) - cut * adjust * bw and max(x) + cut * adjust * bw.

  • adjust (float) – An adjustment factor for the bw. Bandwidth becomes bw * adjust.
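
The "scott" and "silverman" formulas and the grid end points described above can be reproduced by hand. The following is a minimal sketch in plain NumPy, not the library's implementation: the helper names rule_of_thumb_bandwidths and grid_endpoints are hypothetical, and the use of the sample standard deviation (ddof=1) and of np.percentile for the IQR are assumptions that may differ slightly from what bandwidths.py does.

```python
import numpy as np


def rule_of_thumb_bandwidths(x):
    """Return (scott, silverman) using the formulas listed for bw."""
    x = np.asarray(x, dtype=float)
    nobs = x.size
    q75, q25 = np.percentile(x, [75, 25])
    # A = min(std(x), IQR / 1.34); ddof=1 is an assumption of this sketch.
    a = min(np.std(x, ddof=1), (q75 - q25) / 1.34)
    scott = 1.059 * a * nobs ** (-1 / 5)
    silverman = 0.9 * a * nobs ** (-1 / 5)
    return scott, silverman


def grid_endpoints(x, bw, cut=3, adjust=1):
    """Return the grid end points min(x) - pad and max(x) + pad."""
    x = np.asarray(x, dtype=float)
    # The effective bandwidth is bw * adjust; the grid extends cut
    # effective bandwidths past the smallest and largest observations.
    pad = cut * adjust * bw
    return x.min() - pad, x.max() + pad


rng = np.random.default_rng(0)
sample = rng.normal(size=200)

scott, silverman = rule_of_thumb_bandwidths(sample)
lo, hi = grid_endpoints(sample, scott)

print(f"scott={scott:.4f} silverman={silverman:.4f}")
print(f"grid spans [{lo:.4f}, {hi:.4f}]")
```

Because both rules share the same A and nobs terms, the "silverman" bandwidth is always 0.9 / 1.059 times the "scott" bandwidth for a given sample.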

Returns:

The fitted instance.

Return type:

KDEUnivariate
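
A short usage sketch may help tie the parameters together. It assumes a statsmodels version in which fit returns the instance, as documented above; the data are synthetic and the particular kernel, bandwidth, and gridsize values are arbitrary choices for illustration.

```python
import numpy as np
from statsmodels.nonparametric.kde import KDEUnivariate

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=500)

# Default fit: Gaussian kernel, "normal_reference" bandwidth, FFT enabled.
kde_default = KDEUnivariate(data)
kde_default.fit()

# A non-Gaussian kernel requires fft=False, because the FFT
# implementation supports only the Gaussian kernel.
kde_epa = KDEUnivariate(data)
kde_epa.fit(kernel="epa", bw="silverman", fft=False, gridsize=256)

# A float bandwidth is used directly (and scaled by adjust).
# fit returns the instance, so the call can be chained.
kde_float = KDEUnivariate(data).fit(bw=0.3, adjust=2)

# After fitting, the estimate is available on the instance:
# support is the evaluation grid, density the estimated values on it.
print(kde_default.bw, kde_epa.bw, kde_float.bw)
print(kde_epa.support.shape, kde_epa.density.shape)
```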