# Kolmogorov-Smirnov Statistic

## Definition of Kolmogorov-Smirnov Statistic

The Kolmogorov-Smirnov (K-S) statistic measures the difference between two distributions as the largest vertical gap between their cumulative distribution functions. It is used to test whether the two distributions differ significantly from each other.

## How is the Kolmogorov-Smirnov Statistic used?

The Kolmogorov-Smirnov statistic is the basis of the K-S Test, a nonparametric test used to compare the distributions of two samples. The test assesses whether the two samples are drawn from the same distribution or from different distributions. It works by comparing the samples' empirical cumulative distribution functions (CDFs): the statistic is the maximum vertical distance between the two CDFs, and the larger this distance, the stronger the evidence that the two samples are drawn from different distributions.
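Written out, the maximum-distance idea looks as follows. The symbols are notation assumed here for illustration: $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ are the two samples, $F_{1,n}$ and $F_{2,m}$ are their empirical CDFs, and $D_{n,m}$ is the two-sample statistic.

$$
F_{1,n}(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}, \qquad
F_{2,m}(x) = \frac{1}{m}\sum_{j=1}^{m} \mathbf{1}\{Y_j \le x\}, \qquad
D_{n,m} = \sup_x \left| F_{1,n}(x) - F_{2,m}(x) \right|
$$

Because both empirical CDFs are step functions that change only at observed values, the supremum is attained at one of the data points, so $D_{n,m}$ always lies between 0 (identical empirical CDFs) and 1 (samples that do not overlap at all).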

The K-S Test has a variety of applications in data science and machine learning. It can be used to check whether samples of continuous data, such as bootstrap samples, are consistent with their parent populations. It can also be used to assess goodness of fit, by comparing the empirical distribution of observed data with the distribution predicted by a given model. In addition, the test may be used in anomaly detection, where it can help flag a batch of data whose distribution departs from that of a reference set.
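As a rough illustration of the goodness-of-fit use, the sketch below computes a one-sample K-S statistic between observed data and a model CDF. The function names `normal_cdf` and `ks_one_sample` are hypothetical helpers chosen for this example, and the normal model is only an assumption; any fully specified continuous CDF could be passed instead.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution (the model assumed for this example)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_one_sample(data, cdf):
    """Largest gap between the empirical CDF of `data` and a model `cdf`."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so check the gap on both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


if __name__ == "__main__":
    observed = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]  # made-up illustrative values
    print(ks_one_sample(observed, normal_cdf))
```

Note that the usual critical values for this statistic assume the model's parameters were fixed in advance rather than estimated from the same data.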

The procedure for the K-S Test starts by computing the empirical CDF of each sample. Next, these two estimates are compared to obtain the Kolmogorov-Smirnov statistic, the largest absolute difference between the two CDFs. Finally, the statistic is compared with a threshold (critical value) that depends on the chosen significance level and the sample sizes. If the statistic exceeds the threshold, there is evidence for rejecting the null hypothesis that both samples come from the same population. If it does not, we fail to reject the null hypothesis: the data are consistent with a common distribution, although this does not prove that the two distributions are the same.
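The procedure can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: `ks_statistic` and `ks_critical_value` are hypothetical names, and the threshold uses the common large-sample approximation $c(\alpha)\sqrt{(n+m)/(nm)}$ with $c(\alpha) = \sqrt{-\tfrac{1}{2}\ln(\alpha/2)}$, which is an assumption that may be inaccurate for small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |F_a(x) - F_b(x)| over observed x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        # Empirical CDF at x = fraction of observations <= x.
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Approximate large-sample threshold for sample sizes n and m."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


if __name__ == "__main__":
    first = [1, 2, 3, 4]
    second = [3, 4, 5, 6]
    d = ks_statistic(first, second)
    threshold = ks_critical_value(len(first), len(second))
    # Reject the null hypothesis only when d exceeds the threshold.
    print(d, threshold, d > threshold)
```

In practice a statistics library would usually be preferred, since it can also report a p-value and handle small samples more carefully.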