Nonparametric Tests in CoStat
(Including Percentiles, Rank Correlation, Runs Tests, and ANOVAs)

Most statistical procedures in CoStat (including Statistics : Correlation, Statistics : Descriptive, and parts of Statistics : Frequency Analysis and Statistics : Miscellaneous) assume that the data is normally distributed. Sometimes there are other assumptions; for example, standard ANOVAs assume that the variances of the subgroups are homogeneous. These assumptions allow the tests to make powerful inferences about the data.

For some data files, these assumptions are not valid. For such cases, "nonparametric" tests have been devised, which make no assumptions about the distribution of the data. Most of these tests rank the data and then perform statistical tests on the ranks. They are generally less powerful than the traditional tests (that is, less likely to reject a null hypothesis that is in fact false), but they are very useful when you can't use the traditional tests.

Unfortunately, there aren't replacement nonparametric tests for all of the traditional tests. CoStat has these options (on the Statistics : Nonparametric menu):

  • Percentiles - calculates nonparametric descriptive statistics: mode and percentiles.
  • Rank Correlation - Kendall's and Spearman's tests, which are analogous to the Pearson product moment correlation coefficient.
  • Runs Tests - two runs tests: Up and Down, and Above and Below the Median.
  • Tied Ranks - This ranks the values in a column, replaces ties with the average rank, then inserts a new column with the tied rank values.
  • 1 Way, Completely Randomized ANOVA - the Kruskal-Wallis Test.
  • 1 Way, 2 Treatment, Completely Randomized ANOVA - Mann-Whitney U-test and Wilcoxon Two Sample Test.
  • 1 Way, Randomized Blocks ANOVA - Friedman's Method for Randomized Blocks.
  • 1 Way, 2 Treatment, Randomized Blocks ANOVA - Wilcoxon's Signed-Ranks Test for Two Groups.
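
The average-rank calculation described under Tied Ranks, which most of these procedures rely on, can be sketched in a few lines of Python. This is only an illustration of the idea, not CoStat's own code; the function name `tied_ranks` and the input values are hypothetical.

```python
def tied_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of equal values that begins at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end hold ranks start+1..end+1; use their mean.
        average = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


print(tied_ranks([5.7, 6.4, 5.7, 4.18]))  # [2.5, 4.0, 2.5, 1.0]
```

The two values of 5.7 would occupy ranks 2 and 3, so each receives 2.5.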


Nonparametric Tests in the CoStat Manual

CoStat's manual has:

  • An introduction to nonparametric testing.
  • A description of the calculation methods that are used by the program.
  • 8 complete sample runs.
The sample runs show how to do 8 different types of nonparametric tests. Here is sample run #2:

Statistics : Nonparametric : Rank Correlation

Correlation is a measure of the association between two variables (X1 and X2), neither of which is treated as dependent on the other. This procedure is analogous to the Pearson product moment correlation coefficient, which measures linear association, but it works with the ranks of the values in each column, so it makes no assumptions about the distribution of the values.

Related Procedures

Read the general description of Statistics : Nonparametric (page 333).

Statistics : Correlation (page 275) calculates the Pearson product moment correlation coefficient.

References

See Sokal and Rohlf (1981 and 1995): Box 15.6 (1981) or Box 15.7 (1995), "Kendall's Coefficient of Rank Correlation, tau", and Section 15.8 (1981 or 1995), "Nonparametric Tests for Association" (for Spearman's Coefficient of Rank Correlation).

Data Format

The data file must have two or more columns. The correlation of all pairs of columns will be tested for the whole data file. Missing values (NaNs, page 70) are allowed; a row of data is rejected only if it has a missing value in either of the two columns currently being tested.
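
The row-rejection rule can be illustrated with a short Python sketch. It assumes missing values are stored as NaN; the function name `complete_pairs` and the three small columns are hypothetical, and the code is not taken from CoStat.

```python
import math


def complete_pairs(x1, x2):
    """Return the (x1, x2) pairs in which neither value is missing (NaN)."""
    return [(a, b) for a, b in zip(x1, x2)
            if not (math.isnan(a) or math.isnan(b))]


nan = float("nan")
col1 = [8.7, nan, 9.4, 10.0]
col2 = [5.95, 5.65, nan, 5.7]
col3 = [1.0, 2.0, 3.0, nan]

print(complete_pairs(col1, col2))  # [(8.7, 5.95), (10.0, 5.7)]
print(complete_pairs(col1, col3))  # [(8.7, 1.0), (9.4, 3.0)]
```

The third row is rejected when col1 is tested against col2 but kept when col1 is tested against col3, because only the two columns under test are inspected.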

Options

X1:
Choose the first data column.
X2:
Choose the second data column.
Keep If:
Lets you enter a boolean expression (for example, (col(1)>50) and (col(2)<col(3))). The expression is evaluated for each row of the data file. If it evaluates to true, that row of data will be used in the calculations. If false, that row of data will be ignored. See "Using Equations" (page 66).
A
This leads to a list of characters (#32 to #255, as defined by the ISO 8859-1 Character Encoding). If you click on a character, it will be inserted into the equation at the current insertion point.
f()
The f() button leads to a list of built-in functions and other parts of equations. If you click on an item, it will be inserted into the equation at the current insertion point. The list includes:
  • Data file column numbers and names (for example, "col(3) Height") - so you can refer to values in various columns in the data file. Note that equations shouldn't refer to column names, so only the column reference is inserted (for example, "col(3)" is inserted, not "col(3) Height").
  • Built-in Functions (for example, "sin(x) d") - The parameters for the functions are described tersely, but basically: b=any boolean expression, d=any numeric (double) expression, i=any integer expression, s=any string expression, and v=void (no return value). The letter at the end of the function's signature indicates the type of the return value.
  • Constants (for example, "pi").
  • Operators (for example, "*").
See "Using Equations" (page 66).
OK
Run the procedure.
Close
Close the dialog box.

Details

For both the Kendall and Spearman correlation tests, the test statistics are similar to the product moment correlation coefficient, r, and range from -1 to 1.

If n>40, the significance of Kendall's tau can be tested by calculating a test statistic, ts, which the procedure compares to tabulated values of Student's t distribution:

ts = tau / sqrt(2*(2*n+5)/(9*n*(n-1)))

where n is the number of data pairs.
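
As a worked illustration of this formula, the following Python sketch evaluates ts for a hypothetical tau of 0.3 with n = 50 data pairs. The function name `kendall_ts` and the input values are assumptions chosen for the example; they do not come from CoStat or from the sample run.

```python
import math


def kendall_ts(tau, n):
    """Large-sample test statistic for Kendall's tau (n = number of data pairs)."""
    if n <= 40:
        raise ValueError("this approximation is described only for n > 40")
    return tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))


# Hypothetical inputs: tau = 0.3 from n = 50 pairs.
print(round(kendall_ts(0.3, 50), 4))  # 3.0741
```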

If n>10, the significance of Spearman's r can be tested by calculating a test statistic, ts, which the procedure compares to tabulated values of Student's t distribution:

ts = r / sqrt( (1-r^2) / (n-2) )

If n<=10, Spearman's r must be compared to tabular values which are not included with CoStat, but can be found in Sokal and Rohlf (1995).
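
The Spearman formula can be evaluated the same way. This Python sketch uses the r and n reported in the sample run further down (r = 0.64910714286, n = 15); the function name `spearman_ts` is hypothetical, and the code stops at ts because converting it to P requires Student's t distribution, which is not reproduced here.

```python
import math


def spearman_ts(r, n):
    """Test statistic for Spearman's r (n = number of data pairs)."""
    if n <= 10:
        raise ValueError("for n <= 10, compare r to tabulated critical values")
    return r / math.sqrt((1 - r * r) / (n - 2))


# r and n as printed in the sample run output.
print(round(spearman_ts(0.64910714286, 15), 2))  # 3.08
```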

The Sample Run

Data for the sample run is from Sokal and Rohlf (Box 15.6, 1981; or Box 15.7, 1995): "Computation of rank correlation coefficient between the total length (Y1) of 15 aphid stem mothers and the mean thorax length (Y2) of their parthenogenetic offspring."

PRINT DATA
2000-08-04 14:11:40
Using: c:\cohort6\box156.dt
  First Column: 1) Y1
  Last Column:  2) Y2
  First Row:    1
  Last Row:     15

   Y1        Y2     
--------- --------- 
      8.7      5.95 
      8.5      5.65 
      9.4         6 
       10       5.7 
      6.3       4.7 
      7.8      5.53 
     11.9       6.4 
      6.5      4.18 
      6.6      6.15 
     10.6      5.93 
     10.2       5.7 
      7.2      5.68 
      8.6      6.13 
     11.1       6.3 
     11.6      6.03 

For the sample run, use File : Open to open the file called box156.dt in the cohort directory and specify:

  1. From the menu bar, choose: Statistics : Nonparametric : Rank Correlation
  2. X1: 1) Y1
  3. X2: 2) Y2
  4. Keep If:
  5. OK
RANK CORRELATION (Kendall and Spearman Tests)
2000-08-04 14:13:05
Using: c:\cohort6\box156.dt
  Y1 Column: 1) Y1
  Y2 Column: 2) Y2
Keep If: 

The test statistics, Kendall's tau and Spearman's r, are similar to
  the product moment correlation coefficient, r, ranging from -1 to 1.
If the sample size is large enough (n>40 for tau and n>10 for r),
  additional test statistics can be calculated and compared to
  Student's t distribution (two-tailed, df=infinity).  Otherwise, see
  specially tabulated critical values of tau in Table S in 'Statistical
  Tables' (F.J. Rohlf and R.R. Sokal, 1995).
If P<=0.05, tau or r is significantly different from 0 and the values
  in the two columns probably are correlated.

Y1 column: 1) Y1

Y2 column                 n   Kendall tau     P        Spearman r     P
------------------- ------- ------------- --------- ------------- ---------
2) Y2                    15 0.49761335153 (n<=40)   0.64910714286 .0088 ** 

P is the probability of obtaining a correlation at least this strong if the variates were in fact not correlated. The low P value (<=0.05) for Spearman's r in this data set indicates that the two variates probably are correlated.
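
The two coefficients in this output can be checked independently with a short Python sketch. The manual excerpt does not state the exact formulas, so the ones used here are assumptions: a tie-corrected Kendall's tau (the tau-b form) and Spearman's r computed as 1 - 6*sum(d^2)/(n*(n^2-1)) on average ranks. With the sample data they agree with the printed values, but this is not CoStat's own code, and the function names are hypothetical.

```python
import math

# Sample-run data (Y1, Y2) as printed above.
y1 = [8.7, 8.5, 9.4, 10, 6.3, 7.8, 11.9, 6.5, 6.6, 10.6, 10.2, 7.2, 8.6, 11.1, 11.6]
y2 = [5.95, 5.65, 6, 5.7, 4.7, 5.53, 6.4, 4.18, 6.15, 5.93, 5.7, 5.68, 6.13, 6.3, 6.03]


def sign(v):
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau with a correction for ties (assumed tau-b form)."""
    n = len(x)
    score = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            ties_x += dx == 0
            ties_y += dy == 0
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
            score += sign(dx) * sign(dy)
    pairs = n * (n - 1) // 2
    return score / math.sqrt((pairs - ties_x) * (pairs - ties_y))


def average_ranks(values):
    """Ranks from 1 to n; tied values share the mean of the ranks they span."""
    # A value v spans ranks (count of values < v) + 1 .. (count of values <= v).
    return [(sum(u < v for u in values) + 1 + sum(u <= v for u in values)) / 2
            for v in values]


def spearman_r(x, y):
    """Spearman's r from squared rank differences (assumed formula)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


print(round(kendall_tau(y1, y2), 4))  # 0.4976
print(round(spearman_r(y1, y2), 4))   # 0.6491
```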

 

