Kolmogorov - Smirnov test (idea) by Gorgonzola

This is probably the most basic and widely used of the non-parametric statistical tests. Developed in the 1930s by Andrei Nikolaevich Kolmogorov and Nikolai Vasilyevich Smirnov, the test allows the comparison of a sample's frequency distribution with some other known (continuous, one-dimensional) distribution, such as the Gaussian (normal) distribution.

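Some people use the K-S test as an alternative to Student's t-test. Unfortunately, with the one-sample test described here, this use violates a basic constraint of the test (namely, that one distribution must be known beforehand), and is meaningless. (However, this is OK if you're comparing a sample to an entire population.) Comparing two samples with each other calls for the two-sample form of the test, the subject of Smirnov's 1939 paper cited below.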

The test works by comparing cumulative frequency values for the data points against the cumulative frequencies expected for the distribution in question.

Although it's called a 'goodness of fit' test, it's really a 'badness of fit' test, as all it can tell you is whether the data deviate significantly from the distribution. Because of this, we have to formulate our hypotheses in a counterintuitive way: the null hypothesis H₀ is that the data do not significantly deviate from the distribution.

Before you can perform the test, you must have a cumulative distribution function describing the expected cumulative frequencies. It is labelled ECDF here, although in most texts "ECDF" stands for the empirical cumulative distribution function, the one built from the data, which is called F here. After deciding on a significance level, you can then order the data and calculate the actual cumulative frequencies for the points. You then determine a statistic

D = max |ECDF(i) - F(i)|

from these values. If this D value is less than the critical value for the number of points and the significance level you chose, you cannot reject your null hypothesis, and must proceed as if the data fit the distribution.
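The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ks_statistic` and `cannot_reject` are invented for this sketch, the reference distribution is passed in as a function, the cumulative frequency of the i-th ordered point is taken to be i/n as in the description above, and the critical value used at the end is a made-up number rather than one read from a table.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ks_statistic(sample, expected_cdf):
    """Return D, the largest gap between actual and expected cumulative frequencies."""
    ordered = sorted(sample)
    n = len(ordered)
    # Assumption: the actual cumulative frequency of the i-th ordered point
    # is i/n, following the simple recipe described in the text.
    return max(abs(i / n - expected_cdf(x)) for i, x in enumerate(ordered, start=1))

def cannot_reject(d, critical_value):
    """True when D is below the critical value, i.e. H0 cannot be rejected."""
    return d < critical_value

# Tiny made-up sample, compared against the standard normal distribution.
d = ks_statistic([-1.0, 0.0, 1.0], normal_cdf)
# 0.5 is a hypothetical critical value, for illustration only.
print(round(d, 4), cannot_reject(d, 0.5))
```

In real use the critical value would come from a published table (or an approximation) for the chosen significance level and sample size.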

Obviously, if you want the data to fit the distribution, it's tempting to jump to the conclusion that they do. Unfortunately, the test doesn't establish this; it only fails to show otherwise. This is a rather unsatisfying conclusion, and the only way to achieve a higher comfort level is to use more data points (which lowers the critical values).

For an example, let's test a sample of random numbers. I generated 41 random numbers (in the Data column of the table below) to represent sample data points. We'll pick the significance level of 0.05 used commonly in the social sciences. For 41 points, the usual large-sample approximation 1.36/√n puts the critical value for D at about 0.21.
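Where no table is at hand, a critical value can be approximated for larger samples. The following is a minimal sketch, assuming the common asymptotic formula sqrt(-ln(α/2)/2)/sqrt(n) for the two-sided test; neither the formula nor the function name `approx_critical_d` comes from this writeup, and tabulated small-sample values differ somewhat from the approximation. It also shows how the critical value shrinks as points are added.

```python
import math

def approx_critical_d(n, alpha):
    """Large-sample approximation to the two-sided critical value of D."""
    # Coefficient c(alpha); roughly 1.36 for alpha = 0.05.
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c / math.sqrt(n)

# Quadrupling the number of points halves the approximate critical value.
for n in (41, 164):
    print(n, round(approx_critical_d(n, 0.05), 4))
```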

To test the sample distribution against a normal distribution, we first calculate a Z score for each sample point. We then generate the expected cumulative frequency ECDF(Z) for each Z from a formula which estimates this value for the normal distribution¹. F(Z) is the actual cumulative frequency encountered (the point index divided by 41). Finally, we calculate the difference between ECDF(Z) and F(Z) for each Z:

 
Point    Data        Z ECDF(z)    F(z)  |F(z)-ECDF(z)|
----- -------  -------  ------  ------  --------------
1      0.7807  -1.5248  0.0608  0.0244  0.0364
2      2.5164  -1.4670  0.0690  0.0488  0.0202
3      7.9011  -1.2875  0.0981  0.0732  0.0249
4      8.0303  -1.2832  0.0989  0.0976  0.0013
5     10.6726  -1.1951  0.1155  0.1220  0.0064
6     12.8190  -1.1235  0.1303  0.1463  0.0161
7     14.6225  -1.0634  0.1436  0.1707  0.0271
8     17.1340  -0.9797  0.1635  0.1951  0.0316
9     20.6383  -0.8629  0.1941  0.2195  0.0254
10    21.2010  -0.8441  0.1993  0.2439  0.0446
11    21.7543  -0.8257  0.2045  0.2683  0.0638
12    23.5697  -0.7652  0.2221  0.2927  0.0706
13    25.2114  -0.7104  0.2387  0.3171  0.0783
14    29.7567  -0.5589  0.2881  0.3415  0.0533
15    30.1014  -0.5474  0.2921  0.3659  0.0738
16    33.3612  -0.4388  0.3304  0.3902  0.0598
17    33.5060  -0.4340  0.3322  0.4146  0.0825
18    35.2678  -0.3752  0.3538  0.4390  0.0853
19    38.5917  -0.2644  0.3957  0.4634  0.0677
20    39.3931  -0.2377  0.4061  0.4878  0.0818
21    39.5253  -0.2333  0.4078  0.5122  0.1044
22    41.9588  -0.1522  0.4395  0.5366  0.0971
23    44.3276  -0.0732  0.4708  0.5610  0.0902
24    45.0259  -0.0500  0.4801  0.5854  0.1053
25    48.1024   0.0526  0.5210  0.6098  0.0888
26    49.3623   0.0946  0.5377  0.6341  0.0965
27    49.7351   0.1070  0.5426  0.6585  0.1159
28    50.6010   0.1359  0.5540  0.6829  0.1289
29    62.2574   0.5244  0.7000  0.7073  0.0073
30    68.9057   0.7460  0.7722  0.7317  0.0405
31    76.8052   1.0094  0.8436  0.7561  0.0875
32    78.6105   1.0695  0.8576  0.7805  0.0771
33    79.5384   1.1005  0.8644  0.8049  0.0596
34    84.1996   1.2559  0.8954  0.8293  0.0661
35    87.4173   1.3631  0.9136  0.8537  0.0599
36    91.1650   1.4880  0.9316  0.8780  0.0536
37    94.2047   1.5894  0.9440  0.9024  0.0416
38    95.7160   1.6397  0.9495  0.9268  0.0226
39    96.4987   1.6658  0.9521  0.9512  0.0009
40    96.8248   1.6767  0.9532  0.9756  0.0224
41    99.8979   1.7791  0.9624  1.0000  0.0376

The maximum value in the last column occurs at point 28, and so our D statistic is 0.1289. This is well below our critical value. We cannot reject our null hypothesis, and must proceed as if the points were normally distributed.
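As a cross-check, the last three columns of the table can be recomputed from its Z column. This is a sketch under assumptions: it swaps in Python's `math.erf` for the handbook approximation cited in the footnote, so individual expected values may differ slightly from the table in the last digits, and the variable names are made up for the example.

```python
import math

# Z column of the table above (already in ascending order).
Z = [
    -1.5248, -1.4670, -1.2875, -1.2832, -1.1951, -1.1235, -1.0634,
    -0.9797, -0.8629, -0.8441, -0.8257, -0.7652, -0.7104, -0.5589,
    -0.5474, -0.4388, -0.4340, -0.3752, -0.2644, -0.2377, -0.2333,
    -0.1522, -0.0732, -0.0500, 0.0526, 0.0946, 0.1070, 0.1359,
    0.5244, 0.7460, 1.0094, 1.0695, 1.1005, 1.2559, 1.3631,
    1.4880, 1.5894, 1.6397, 1.6658, 1.6767, 1.7791,
]

def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = len(Z)
rows = []
for i, z in enumerate(Z, start=1):
    expected = normal_cdf(z)  # the "ECDF(z)" column
    actual = i / n            # the "F(z)" column: point index divided by n
    rows.append((i, z, expected, actual, abs(actual - expected)))

# D is the largest entry in the last column.
point, _, _, _, d = max(rows, key=lambda row: row[4])
print(f"D = {d:.4f} at point {point}")
```

Run as written, this should land on the same point and roughly the same value as the table (point 28, D of about 0.1289).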

The test can also be performed by picking Z values at regular intervals, and generating F(Z) by counting the number of points whose Z values are less than each chosen Z:

   Z ECDF(z)  pts<z    F(z)  |F(z)-ECDF(z)|
----  ------  -----  ------  --------------
-2.0  0.0065      0  0.0000  0.0065
-1.8  0.0276      0  0.0000  0.0276
-1.6  0.0509      0  0.0000  0.0509
-1.4  0.0792      2  0.0488  0.0304
-1.2  0.1145      4  0.0976  0.0170
-1.0  0.1585      7  0.1707  0.0122
-0.8  0.2119     11  0.2683  0.0564
-0.6  0.2743     13  0.3171  0.0428
-0.4  0.3446     17  0.4146  0.0700
-0.2  0.4207     21  0.5122  0.0915
 0.0  0.5000     24  0.5854  0.0854
 0.2  0.5793     28  0.6829  0.1037
 0.4  0.6554     28  0.6829  0.0275
 0.6  0.7257     29  0.7073  0.0184
 0.8  0.7881     30  0.7317  0.0564
 1.0  0.8413     30  0.7317  0.1096
 1.2  0.8849     33  0.8049  0.0801
 1.4  0.9192     35  0.8537  0.0656
 1.6  0.9452     37  0.9024  0.0428
 1.8  0.9641     41  1.0000  0.0359
 2.0  0.9772     41  1.0000  0.0228
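The regular-interval version can be sketched the same way, starting from the counts in the table. Again this is only an illustration: `math.erf` stands in for the handbook approximation, so expected values may differ slightly from the table, and the names are invented for the example. Note that a coarse grid can miss the largest gap; here it finds a maximum of roughly 0.11, somewhat less than the 0.1289 found point by point.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# "pts<z" column of the table above: how many of the 41 points have a
# Z score below each grid value.
counts = [0, 0, 0, 2, 4, 7, 11, 13, 17, 21, 24,
          28, 28, 29, 30, 30, 33, 35, 37, 41, 41]
n = 41

# Grid from -2.0 to 2.0 in steps of 0.2, rounded to avoid float drift.
grid = [round(-2.0 + 0.2 * k, 1) for k in range(len(counts))]

# Gap between actual (count / n) and expected cumulative frequency at each Z.
gaps = [(abs(c / n - normal_cdf(z)), z) for z, c in zip(grid, counts)]
d_grid, z_at_max = max(gaps)
print(f"largest gap {d_grid:.4f} at Z = {z_at_max}")
```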

(US) National Institute of Standards and Technology, Engineering Statistics Handbook. 1.3.5.16. Kolmogorov-Smirnov Goodness-of-Fit Test http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm

¹US Department of Commerce, Handbook of Mathematical Functions, June 1964, section 26.2 (especially 26.2.18), Normal or Gaussian Probability Function, pp. 931-933

The original references appear to be

Kolmogorov, A. N. (1933), "On the empirical determination of a distribution function." (Italian) Giornale dell’Istituto Italiano degli Attuari, 4, 83-91.

Smirnov, N. V. (1939), "On the estimation of the discrepancy between empirical curves of distribution for two independent samples." (Russian) Bulletin of Moscow University, 2, 3-16.
