Introduction
 Nominal Data
A measurement scale that only names categories is called nominal data. It is the most basic type of data: numbers act only as labels, with no inherent order or numerical value. For example
 Gender: male, female
 Marital status: unmarried, married, divorced
 Blood group: A, B, AB, O
 Ordinal Data
A measurement scale that only orders categories is called ordinal data; it indicates a specific order or rank. Examples of such ordered data are given below.
 Education level: high school, bachelor's, master's, doctorate
 Satisfaction rating: very dissatisfied, dissatisfied, neutral, satisfied, very satisfied
 Product ranking: 1st, 2nd, 3rd
 Continuous Data
A measurement scale that expresses numerical quantity is called continuous data; it can take any value within a given range, including fractional values. For example
 Temperature
 Time
 Ratio Data
Ratio data is a type of continuous data that has an absolute zero, for example
 Height
 Weight
 Number of children
Introduction
A nonparametric test is a statistical method for testing hypotheses, typically used when the data are nominal or ordinal. Nonparametric tests make fewer and weaker assumptions than parametric tests, and as a result they are usually less powerful. For this reason, parametric tests are preferred whenever their assumptions hold.
A nonparametric test is generally used in the following situations:
 the data are on a nominal or ordinal scale
 the distribution of the population is not specified, or is not normal
Parametric Tests | Nonparametric Tests
Population is normal, assumed normal, or approximately normal | Population is not normal, but continuous
Uses the population parameter | Does not use the population parameter
Data are on an interval or ratio scale | Data can be on a nominal or ordinal scale
Shape of the distribution is required | Shape of the distribution is not required
Populations have equal variance | Populations may not have equal variance
Mean is the index of central location | Median is the index of central location
Types of Nonparametric Test
The common nonparametric tests are
 Sign test
 Wilcoxon's test
 Mann-Whitney U test
 Kruskal-Wallis test
 Friedman's test
 Run test
Sign Test
The sign test is one of the simplest nonparametric tests. It is used for
 one sample
 two repeated (or correlated) samples
The usual null hypothesis for this test is that there is no difference between the two treatments. If this is so, then the number of + signs (or - signs) should follow a binomial distribution with \( p=0.5 \) and \( q=0.5 \), where \( n \) is the number of subjects.
The sign test therefore proceeds according to the sample size:
 Small sample: binomial distribution
 Large sample: normal approximation with \( z= \frac{x - np}{\sqrt{npq}} \)
Scoring procedure
To carry out the sign test, we follow this procedure:
 subtract the hypothesized value from each score, \( (x - \mu) \)
 write down the sign of the difference \( (x - \mu) \)
 write “-” if the difference is negative, and “+” if the difference is positive
 if the difference is zero, discard the observation; it receives no sign
 alternatively, tied scores can be split evenly: with two tied scores, make one a “+” and the other a “-”
If there is an even number of subjects with tied scores, make half of them “+” signs and half “-” signs. For an odd number, drop one randomly selected subject, and then proceed as for an even number.
Test Procedure
 Count the number of plus signs, \( x \)
 Count the total number of signed observations, \( n \)
 Use the binomial distribution if \( n \leq 20 \)
 Use the Z statistic if \( n > 20 \)
 For a right-tailed test, calculate \( P (X \geq x) \)
 For a left-tailed test, calculate \( P (X \leq x) \)
 For a two-tailed test, calculate \( 2 \min \{ P (X \leq x), P (X \geq x) \} \)
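The small-sample procedure can be sketched in Python using only the standard library. The function names `binom_tail` and `sign_test_pvalue` are illustrative assumptions, not part of any library, and the two-tailed p-value is taken here as twice the smaller tail probability, a common convention that is an assumption of this sketch.

```python
from math import comb


def binom_tail(n: int, lo: int, hi: int, p: float = 0.5) -> float:
    """Return P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def sign_test_pvalue(x: int, n: int, tail: str) -> float:
    """Exact sign-test p-value for x plus signs among n signed observations."""
    upper = binom_tail(n, x, n)  # P(X >= x)
    lower = binom_tail(n, 0, x)  # P(X <= x)
    if tail == "right":
        return upper
    if tail == "left":
        return lower
    if tail == "two":
        # Assumed convention: double the smaller tail, capped at 1.
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("tail must be 'right', 'left' or 'two'")


# Illustrative counts: 15 plus signs among 19 signed observations.
print(round(sign_test_pvalue(15, 19, "right"), 4))  # 0.0096
```

The result is compared with the significance level \( \alpha \): the null hypothesis is rejected when the p-value is less than or equal to \( \alpha \).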

The following are measurements of height in cm of college students. Use the sign test to test the null hypothesis \( \mu= 160 \) against the alternative hypothesis \( \mu < 160 \) at the 0.05 level of significance.
163, 165, 160, 189, 161, 171, 158, 151, 169, 162, 163
139, 172, 165, 148, 166, 172, 163, 187, 173
Solution
Given that \( \mu= 160 \), we assign a “+” sign to each data value \( > 160 \) and a “-” sign to each data value \( < 160 \); we discard any data value \( =160 \).
Then we get
163+, 165+, 160 (discard), 189+, 161+, 171+, 158-, 151-, 169+, 162+, 163+
139-, 172+, 165+, 148-, 166+, 172+, 163+, 187+, 173+
Reading the signs, we get
number of positive signs x = 15
number of signs n = 19
Now, \( H_0:\mu \ge 160 \)
\( H_1:\mu < 160 \) [Left-tailed]
\( \alpha =0.05\)
Since the sample size is small, we use the binomial distribution.
Thus, we reject \(H_0\) if the p-value is less than or equal to 0.05.
According to the binomial distribution table
\( P(x \leq 15, n=19,p=0.5) \)
= \(1 - P(x =16,x =17,x =18,x =19) \)
= \( 0.998\)
Since the probability is P = 0.998, which is greater than \( \alpha =0.05 \), we cannot reject \(H_0\).
Interpretation: The data do not provide sufficient evidence that \(\mu < 160 \).
 The following are measurements of height in cm of college students. Use the sign test to test the null hypothesis \( \mu= 160 \) against the alternative hypothesis \( \mu > 160 \) at the 0.05 level of significance.
163, 165, 160, 189, 161, 171, 158, 151, 169, 162, 163, 139
172, 165, 148, 166, 172, 163, 187, 173
Solution
Given that \( \mu= 160 \), we assign a “+” sign to each data value \( > 160 \) and a “-” sign to each data value \( < 160 \); we discard any data value \( =160 \).
Then we get
163+, 165+, 160 (discard), 189+, 161+, 171+, 158-, 151-, 169+, 162+, 163+, 139-
172+, 165+, 148-, 166+, 172+, 163+, 187+, 173+
Reading the signs, we get
number of positive signs x = 15
number of signs n = 19
Now, \( H_0:\mu \le 160 \)
\( H_1:\mu > 160 \) [Right-tailed]
\( \alpha =0.05\)
Since the sample size is small, we use the binomial distribution.
Thus, we reject \(H_0\) if the p-value is less than or equal to 0.05.
According to the binomial distribution table
\( P(x \geq 15, n=19,p=0.5)\)
= \( P(x =15, x =16,x =17,x =18,x =19)\)
= \( 0.0096\)
Since the probability is P = 0.0096, which is less than \( \alpha =0.05 \), we reject \(H_0\).
Interpretation: The data provide sufficient evidence that \( \mu > 160 \).
 The following are measurements of height in cm of college students. Use the sign test to test the null hypothesis \( \mu =160 \) against the alternative hypothesis \( \mu \ne 160 \) at the 0.05 level of significance.
163, 165, 160, 189, 161, 171, 158, 151, 169, 162, 163, 139
172, 165, 148, 166, 172, 163, 187, 173
Solution
Given that \( \mu= 160 \), we assign a “+” sign to each data value \( > 160 \) and a “-” sign to each data value \( < 160 \); we discard any data value \( =160 \).
Then we get
163+, 165+, 160 (discard), 189+, 161+, 171+, 158-, 151-, 169+, 162+, 163+, 139-
172+, 165+, 148-, 166+, 172+, 163+, 187+, 173+
Reading the signs, we get
number of positive signs x = 15
number of signs n = 19
Now, \( H_0:\mu= 160 \)
\( H_1:\mu \ne 160 \) [Two-tailed]
\( \alpha =0.05\)
Since the sample size is small, we use the binomial distribution.
Thus, we reject \(H_0\) if the p-value is less than or equal to 0.05.
According to the binomial distribution table, the smaller tail is
\( P(x \geq 15, n=19,p=0.5) =0.0096 \)
so the two-tailed p-value is \( 2 \times 0.0096 = 0.0192 \).
Since the probability is P = 0.0192, which is less than \( \alpha =0.05 \), we reject \(H_0\).
Interpretation: The data provide sufficient evidence that \(\mu \neq 160 \).

The following data are the amounts of sulfur oxides emitted by a large industrial plant. Use the sign test to test \( \mu=21.5 \) against \( \mu \neq 21.5 \) at the 0.05 level.
17, 15, 20, 29, 19, 18, 22, 25
27, 9, 24, 20, 17, 6, 24, 14
15, 23, 24, 26, 19, 23, 28, 19
16, 22, 24, 17, 20, 13, 19, 10
23, 18, 31, 13, 20, 17, 24, 14
Solution
Given that \( \mu=21.5 \), we assign a “+” sign to each data value \( > 21.5 \) and a “-” sign to each data value \( < 21.5 \).
We discard any data value \( =21.5 \).
Then we get
17-, 15-, 20-, 29+, 19-, 18-, 22+, 25+
27+, 9-, 24+, 20-, 17-, 6-, 24+, 14-
15-, 23+, 24+, 26+, 19-, 23+, 28+, 19-
16-, 22+, 24+, 17-, 20-, 13-, 19-, 10-
23+, 18-, 31+, 13-, 20-, 17-, 24+, 14-
Reading the signs, we get
number of positive signs x = 16
number of signs n = 40
Now \( H_0:\mu= 21.5 \)
\( H_1:\mu \neq 21.5 \) [Two-tailed]
Since the sample size is large, we use the Z statistic.
Thus,
\( Z_{\frac {\alpha}{2}}=Z_{\frac{0.05}{2}}=Z_{0.025}=1.96 \)
Upper critical value 1.96 (we reject \(H_0\) if the Z value is greater than or equal to 1.96)
Lower critical value -1.96 (we reject \(H_0\) if the Z value is less than or equal to -1.96)
According to the formula
\( Z=\frac{x - np}{\sqrt{npq}}=\frac{16 - 40 \times 0.5}{\sqrt{40 \times 0.5 \times 0.5}}=\frac{-4}{\sqrt{10}}=-1.26 \)
Since Z = -1.26 does not lie in the critical region, we cannot reject \(H_0\).
Interpretation: The data do not provide sufficient evidence that \( \mu \neq 21.5 \).
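As a cross-check, the large-sample calculation can be sketched in Python. The helper names `sign_test_z` and `two_sided_normal_p` are illustrative assumptions; the counts 16 and 40 are the numbers of plus signs and of signed observations from this example, and the two-sided p-value uses the standard normal distribution via `math.erfc`.

```python
from math import erfc, sqrt


def sign_test_z(x: int, n: int, p: float = 0.5) -> float:
    """Normal approximation to the sign test: z = (x - np) / sqrt(npq)."""
    q = 1 - p
    return (x - n * p) / sqrt(n * p * q)


def two_sided_normal_p(z: float) -> float:
    """Two-sided p-value P(|Z| >= |z|) for a standard normal Z."""
    return erfc(abs(z) / sqrt(2))


z = sign_test_z(16, 40)
print(round(z, 2))  # -1.26
print(two_sided_normal_p(z) > 0.05)  # True, so H0 is not rejected at the 0.05 level
```

Comparing the p-value with \( \alpha \) leads to the same decision as comparing Z with the critical values.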

To determine the effectiveness of a new traffic control system, the number of accidents that occurred at 12 dangerous intersections during the four weeks before and the four weeks after the installation of the new system was observed, and the following data were obtained.
Number of accidents before: 3, 5, 2, 3, 3, 3, 0, 4, 1, 6, 4, 1
Number of accidents after: 1, 2, 0, 2, 2, 0, 2, 3, 3, 4, 1, 0
Use the sign test to test the null hypothesis that the new traffic control system is only as effective as the old system. Use the 0.05 level.
Solution
The data are paired, so the differences are
difference = before - after: 2, 3, 2, 1, 1, 3, -2, 1, -2, 2, 3, 1
Now we assign a + sign to each positive difference and a - sign to each negative difference. Then we get
sign: +, +, +, +, +, +, -, +, -, +, +, +
Reading the signs, we get
number of positive signs x = 10
number of signs n = 12
Now \( H_0:\mu_1= \mu_2 \)
\( H_1:\mu_1 > \mu_2 \) [Right-tailed]
\( \alpha =0.05 \)
Since the sample size is small, we use the binomial distribution.
Thus, we reject \(H_0\) if the p-value is less than or equal to 0.05.
According to the binomial distribution table
\( P(x \geq 10, n=12,p=0.5) =P(x=10,x=11,x=12)=0.0193 \)
Since the p-value 0.0193 is less than \( \alpha = 0.05 \), we reject \(H_0\).
Interpretation:
The data provide sufficient evidence to indicate that the new traffic control system is more effective than the old system.
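The paired calculation above can be reproduced with a short Python sketch. The variable names are illustrative; the data are the accident counts from this example, and zero differences would be dropped before counting signs.

```python
from math import comb

before = [3, 5, 2, 3, 3, 3, 0, 4, 1, 6, 4, 1]
after = [1, 2, 0, 2, 2, 0, 2, 3, 3, 4, 1, 0]

# Paired differences (before - after); zero differences carry no sign.
diffs = [b - a for b, a in zip(before, after)]
signed = [d for d in diffs if d != 0]

x = sum(d > 0 for d in signed)  # number of plus signs
n = len(signed)                 # number of signed observations

# Right-tailed exact p-value: P(X >= x) with X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(x, n + 1)) / 2**n
print(x, n, round(p_value, 4))  # 10 12 0.0193
```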
Wilcoxon Signed-rank Test
The Wilcoxon signed-rank test is a nonparametric test. It can be used for
 one sample
 a paired sample
In the sign test, we consider only the direction of the difference, not its magnitude. In the signed-rank test, we also consider the magnitude (to some degree) of the difference.
To carry out the Wilcoxon signed-rank test, we sort the differences by absolute value (i.e., discarding the sign). Then we assign ranks to the differences, ignoring the signs (i.e. assign rank 1 to the smallest difference, rank 2 to the next, etc.). If there are ties, we give each tied value the mean of the ranks they would have if they were not tied. If the null hypothesis is true, the sum of the positive ranks and the sum of the negative ranks are expected to be roughly equal. If the null hypothesis is false, we expect one of the sums to be quite small and the other quite large.
Critical Values
Alternative hypothesis | Reject \( H_0 \) if | Test
\( \ne \) | \( T \le T_\alpha \) | Two-tailed
\( < \) | \( T^+ \le T_{2\alpha} \) | Left-tailed
\( > \) | \( T^- \le T_{2\alpha} \) | Right-tailed
Summary Steps
 Calculate the differences \( X - \mu \)
 Ignore any zero difference and reduce n accordingly
 Rank the absolute differences, ignoring their signs
 Calculate the sum of the ranks of the positive differences, \( T^+ \)
 Calculate the sum of the ranks of the negative differences, \( T^- \)
 Calculate \( T=\min\{ T^+, T^- \} \)
 If \( n \le 15 \), use the table of Wilcoxon signed-rank critical values
The rule is that if T is equal to or less than \( T_{critical} \), we reject the null hypothesis.
Otherwise, use the Z test with
\( Z=\frac{T - \mu}{\sigma} \)
where \( \mu=\frac{n(n+1)}{4}, \sigma^2=\frac{n(n+1)(2n+1)}{24} \)
 The following are 15 measurements of weights in kg:
97.5, 95.2, 97.3, 96.0, 96.8, 100.3, 97.4, 95.3
93.2, 99.1, 96.1, 97.6, 98.2, 98.5, 94.9
Use the signed-rank test at the 0.05 level of significance to test whether the mean weight is 98.5.
Solution
Given that \( \mu=98.5\), we calculate the differences as \( d=x-\mu\) and ignore the data value 98.5.
X: 97.5, 95.2, 97.3, 96.0, 96.8, 100.3, 97.4, 95.3, 93.2, 99.1, 96.1, 97.6, 98.2, 94.9
d: -1.0, -3.3, -1.2, -2.5, -1.7, 1.8, -1.1, -3.2, -5.3, 0.6, -2.4, -0.9, -0.3, -3.6
|d|: 1.0, 3.3, 1.2, 2.5, 1.7, 1.8, 1.1, 3.2, 5.3, 0.6, 2.4, 0.9, 0.3, 3.6
R: 4, 12, 6, 10, 7, 8, 5, 11, 14, 2, 9, 3, 1, 13
From the sample data, \( n=14\)
 \( T^+ \) = sum of the ranks of the positive differences = 2+8 = 10
 \( T^- \) = sum of the ranks of the negative differences = 95
 \( T=\min\{ T^+,T^-\} = 10 \)
Now,
 \( H_0: \mu =98.5 \)
\( H_1: \mu \ne 98.5 \)
\( \alpha = 0.05 \)
Since the test concerns signed ranks, we use the \( T\) statistic.
Thus, \( T_{\alpha}=T_{0.05}=21 \) for n=14.
Based on the sample data, the value of the test statistic is T=10.
 Here, \( T=10 \) is less than 21, so \( H_0 \) is rejected.
Interpretation: The mean weight differs from 98.5, i.e. \( \mu \ne 98.5 \).
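The rank sums in this example can be checked with a small Python sketch. The helpers `average_ranks` and `signed_rank_sums` are illustrative names, not library functions; differences are rounded to six decimals, an assumption made here only to suppress floating-point noise before comparing with zero.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_sums(data, mu0):
    """Return (T+, T-, n) after dropping zero differences."""
    diffs = [round(x - mu0, 6) for x in data]
    diffs = [d for d in diffs if d != 0]
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, len(diffs)


weights = [97.5, 95.2, 97.3, 96.0, 96.8, 100.3, 97.4, 95.3,
           93.2, 99.1, 96.1, 97.6, 98.2, 98.5, 94.9]
t_plus, t_minus, n = signed_rank_sums(weights, 98.5)
print(t_plus, t_minus, n)  # 10.0 95.0 14
```

The smaller of the two sums is then compared with the tabulated critical value for the given n.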
Kruskal-Wallis H-Test
The Kruskal-Wallis test (H-test) is also called the Kruskal-Wallis one-way analysis of variance by ranks. It is used with k independent samples, where k is equal to or greater than 3 and the measurement is at least ordinal. (When k = 2, we use the Mann-Whitney U-test instead.) Since the samples are independent, they can be of different sizes.
To carry out the Kruskal-Wallis H-test, we combine the data and rank the pooled values (i.e. assign rank 1 to the smallest, rank 2 to the next, etc.). If there are tied values, each receives the mean of the ranks they would have if they were not tied.
Summary Steps
 Rank the data as a whole.
 Calculate the sum of the ranks of each sample, \( R_i \).
 Compute the statistic
\( H=\left [ \frac{12}{n(n+1)} \displaystyle \sum_{i=1}^k \frac{R_i^2}{n_i}\right ] - 3(n+1) \)
where k = the number of samples, \( n_i \) = the size of the i-th sample and \( n \) = the total number of observations
 Reject the null hypothesis if H is at least the critical value; for samples that are not small, H is compared with the chi-square distribution with \( k-1 \) degrees of freedom
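The summary steps can be sketched in Python as follows. The function name `kruskal_wallis_h` and the three small groups are hypothetical, chosen only for illustration; no tie correction is applied to H, which matches the formula above.

```python
def kruskal_wallis_h(samples):
    """Return (H, rank_sums) for a list of independent samples."""
    # Pool all observations, remembering which sample each came from.
    pooled = [(value, g) for g, sample in enumerate(samples) for value in sample]
    pooled.sort(key=lambda pair: pair[0])
    n = len(pooled)

    rank_sums = [0.0] * len(samples)
    i = 0
    while i < n:
        j = i
        # Tied values share the mean of the ranks they would occupy.
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mean_rank
        i = j + 1

    total = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, rank_sums


# Hypothetical data: three completely separated groups of size 3.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, rank_sums = kruskal_wallis_h(groups)
print(rank_sums, round(h, 2))  # [6.0, 15.0, 24.0] 7.2
```

A useful sanity check is that the rank sums always add up to \( n(n+1)/2 \).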
 The final grades of samples from three groups of students who were taught the same mathematics course by three different methods are as follows
1st method: 94, 88, 91, 74
2nd method: 82, 82, 79
3rd method: 98, 67, 72, 76
Use the H-test to test the null hypothesis that the three methods are equally effective at the 0.05 level of significance.
Solution
Ranking all 11 grades together (the two grades of 82 are tied for ranks 6 and 7, so each gets 6.5) gives, with ranks in parentheses,
1st method: 74 (3), 88 (8), 91 (9), 94 (10); sum of ranks = 30
2nd method: 79 (5), 82 (6.5), 82 (6.5); sum of ranks = 18
3rd method: 67 (1), 72 (2), 76 (4), 98 (11); sum of ranks = 18
From the sample data
 \( R_1= \) sum of ranks occupied by the 1st sample = 30
 \( n_1= \) the number of cases in the 1st sample = 4
 \( R_2= \) sum of ranks occupied by the 2nd sample = 18
 \( n_2= \) the number of cases in the 2nd sample = 3
 \( R_3= \) sum of ranks occupied by the 3rd sample = 18
 \( n_3= \) the number of cases in the 3rd sample = 4
 \( n= \) the total number of cases in all samples = 11
Now,
 \( H_0: \) All means are equal
\( H_1: \) Not all means are equal
\( \alpha = 0.05 \)
Since the test concerns 3 samples, we use the \( H\) statistic.
Thus,
\( H_{\alpha, 4,4,3}=H_{0.05}=5.57 \) for \(n_1=4,n_2=3,n_3=4\)
Based on the sample data, the value of the test statistic is
\( H=\left [ \frac{12}{n(n+1)} \displaystyle \sum_{i=1}^k \frac{R_i^2}{n_i}\right ] - 3(n+1) = \frac{12}{11 \times 12} \left ( \frac{30^2}{4}+\frac{18^2}{3}+\frac{18^2}{4} \right ) - 3 \times 12 = \frac{414}{11} - 36 = 1.64\)
Here, \( H=1.64 \) is less than 5.57, so \( H_0 \) cannot be rejected.
Interpretation: The data do not provide sufficient evidence that the three methods differ in effectiveness.
Run Test: Test for Randomness
The run test is a statistical procedure to examine whether a string of data occurs randomly or not. It is a nonparametric statistical test that checks the randomness of a two-valued data sequence.
A run is defined as a set of identical (or related) symbols contained between two different symbols, i.e. a sequence of letters of one kind surrounded by letters of another kind. For instance, if a sample of 22 responses is
MMMMFFFMFFFFMFFFFMMMFF
M = male, F = female
then the runs are counted starting from MMMM and ending with FF, so there are 8 runs.
For numerical data, a run is defined as a series of increasing values or a series of decreasing values. The number of increasing, or decreasing, values is the length of the run.
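Counting runs in a two-valued sequence is easy to automate. The following minimal sketch uses `itertools.groupby` from the Python standard library on the 22-response sequence above; the variable names are illustrative.

```python
from itertools import groupby

sequence = "MMMMFFFMFFFFMFFFFMMMFF"

# groupby collects consecutive identical symbols, so each group is one run.
runs = [(symbol, len(list(group))) for symbol, group in groupby(sequence)]

print(len(runs))  # 8
print(runs[0], runs[-1])  # ('M', 4) ('F', 2)
```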
Scoring Procedure
 Count the runs, \( R \)
 Compute the numbers \( n_1,n_2\) of symbols of each kind
 Use the R statistic if \( n_1,n_2\) are both less than 15
The rule is that if R lies outside the interval between \( u_\alpha \) and \( u_\alpha ' \), that is, if \( R \le u_\alpha \) or \( R \ge u_\alpha ' \), we reject the null hypothesis.
Otherwise use Z with
\( Z=\frac{R - \mu}{\sigma} \)
where \( \mu=\frac{2n_1n_2}{n_1+n_2}+1, \sigma^2=\frac{2n_1n_2(2n_1n_2 - n_1 - n_2)}{(n_1+n_2)^2(n_1+n_2 - 1)} \)
 In 22 tosses of a coin, the following sequence of heads (H) and tails (T) is obtained:
HHHHTTTHHHHHHHTTHHTTTT
Test at the 0.05 significance level whether the sequence is random.
Solution
 \( H_0: \) sequence is random
\( H_1: \) sequence is not random
\( \alpha = 0.05 \)
Since the test concerns runs, we use the \( R\) statistic.
Thus,
\( u_{\alpha}=u_{0.025}=6 \) and \( u_{\alpha}'=u_{0.025}'=17 \) for \( n_1=13,n_2=9\)
Based on the sample data, the value of the test statistic is R=6.
Here, \( R=6 \) lies in the critical region, so \( H_0 \) is rejected.
Interpretation: The sequence of coin tosses is not random.
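The counts used in this example, and the large-sample Z statistic as a cross-check, can be computed with a short Python sketch. The function name `runs_test` is an illustrative assumption; the worked solution uses the small-sample table, so the Z value here is only a supplementary check.

```python
from itertools import groupby
from math import sqrt


def runs_test(sequence):
    """Return (R, n1, n2, z) for a sequence made of exactly two symbols."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    r = sum(1 for _ in groupby(sequence))  # number of runs
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (r - mu) / sqrt(var)
    return r, n1, n2, z


tosses = "HHHHTTTHHHHHHHTTHHTTTT"
r, n1, n2, z = runs_test(tosses)
print(r, n1, n2)  # 6 13 9
print(z < -1.96)  # True: the normal approximation also rejects at the 0.05 level
```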