Correlation in Grouped Data
When the number of observations is large, the data are often classified into a two-way frequency distribution known as a correlation table.
The class intervals of Y are listed in the captions (column headings) and those of X in the stubs at the left side of the table; the order may also be reversed. The frequency of each cell is determined by either tallying or card sorting, just as in the case of the frequency distribution of a single variable.
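The tallying described above can be sketched in a few lines of Python. This is only an illustration: the function names `class_index` and `tally`, the class width of 4, the convention that each interval includes its lower limit but not its upper limit, and the sample observations are all assumptions made for the example.

```python
from collections import Counter

def class_index(value, width=4, lower=0):
    """Return the 0-based index of the class interval containing value.

    Intervals are assumed to be [lower, lower + width),
    [lower + width, lower + 2 * width), and so on.
    """
    return (value - lower) // width

def tally(pairs, width=4, lower=0):
    """Count how many (x, y) pairs fall into each (X-class, Y-class) cell."""
    return Counter(
        (class_index(x, width, lower), class_index(y, width, lower))
        for x, y in pairs
    )

# Hypothetical observations, chosen only to show the mechanics.
pairs = [(1, 2), (3, 1), (5, 9), (6, 2), (10, 11)]
cells = tally(pairs)

for (i, j), f in sorted(cells.items()):
    print(f"X in {4 * i}-{4 * i + 4}, Y in {4 * j}-{4 * j + 4}: f = {f}")
```

Each key of `cells` identifies one cell of the correlation table, and its value is the frequency that would be written in that cell.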
The formula for calculating the coefficient of correlation is as shown:
$$
r = \frac{N \sum f\,dx\,dy - \sum f\,dx \sum f\,dy}{\sqrt{N \sum f\,dx^{2} - \left(\sum f\,dx\right)^{2}}\;\sqrt{N \sum f\,dy^{2} - \left(\sum f\,dy\right)^{2}}}
$$
This formula is the same as the assumed-mean formula. The only difference is that the deviations are also multiplied by the frequencies.
The steps for calculating the coefficient from the correlation table are as follows:
1) Take the step deviations of the variable X and denote them by dx.
2) In the same way, take the step deviations of the variable Y and denote them by dy.
3) Multiply dx, dy and the respective frequency of every cell, and write the figure obtained in the upper right-hand corner of the cell.
4) Add together all the cornered values calculated in step 3 to obtain the total Σ f dx dy.
5) Multiply the frequencies of the variable X by the deviations of X and obtain the total Σ f dx.
6) Take the squares of the deviations of the variable X, multiply them by the respective frequencies and obtain Σ f dx².
7) Multiply the frequencies of the variable Y by the deviations of Y and obtain the total Σ f dy.
8) Take the squares of the deviations of the variable Y, multiply them by the respective frequencies and obtain Σ f dy².
9) Substitute the values of Σ f dx dy, Σ f dx, Σ f dx², Σ f dy and Σ f dy² in the above formula to obtain the value of r.
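The steps above can be sketched as a small Python function. It is a minimal illustration rather than a definitive implementation: the name `grouped_correlation`, the layout of `table` (rows for Y classes, columns for X classes) and the 3 x 3 frequency table are all assumptions made for the example.

```python
import math

def grouped_correlation(table, dx, dy):
    """Pearson's r from a two-way frequency table using step deviations.

    table[i][j] is the frequency of the cell in Y-class i and X-class j;
    dx[j] and dy[i] are the step deviations of the X and Y classes.
    """
    n = sum(sum(row) for row in table)                      # total frequency N
    col_totals = [sum(row[j] for row in table) for j in range(len(dx))]
    row_totals = [sum(row) for row in table]

    # Sum of the "cornered" values f * dx * dy over every cell.
    sum_fdxdy = sum(
        f * dx[j] * dy[i]
        for i, row in enumerate(table)
        for j, f in enumerate(row)
    )
    # Marginal sums for X (column totals) and Y (row totals).
    sum_fdx = sum(f * d for f, d in zip(col_totals, dx))
    sum_fdx2 = sum(f * d * d for f, d in zip(col_totals, dx))
    sum_fdy = sum(f * d for f, d in zip(row_totals, dy))
    sum_fdy2 = sum(f * d * d for f, d in zip(row_totals, dy))

    numerator = n * sum_fdxdy - sum_fdx * sum_fdy
    denominator = (math.sqrt(n * sum_fdx2 - sum_fdx ** 2)
                   * math.sqrt(n * sum_fdy2 - sum_fdy ** 2))
    return numerator / denominator

# Hypothetical 3 x 3 correlation table, for illustration only.
table = [
    [3, 1, 0],
    [1, 4, 1],
    [0, 1, 3],
]
r = grouped_correlation(table, dx=[-1, 0, 1], dy=[-1, 0, 1])
print(round(r, 3))  # prints 0.75
```

For this table N = 14, Σ f dx dy = 6, Σ f dx = Σ f dy = 0 and Σ f dx² = Σ f dy² = 8, so r = 84 / 112 = 0.75.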
Illustration:
The following are the marks obtained by the students of a class in statistics and accountancy.
Roll no. of students | Marks in statistics | Marks in accountancy | Roll no. of students | Marks in statistics | Marks in accountancy
1 | 15 | 13 | 13 | 14 | 11
2 | 0 | 1 | 14 | 9 | 3
3 | 1 | 2 | 15 | 8 | 5
4 | 3 | 7 | 16 | 13 | 11
5 | 16 | 8 | 17 | 10 | 10
6 | 2 | 9 | 18 | 13 | 11
7 | 18 | 12 | 19 | 11 | 14
8 | 5 | 9 | 20 | 11 | 7
9 | 4 | 17 | 21 | 12 | 18
10 | 17 | 16 | 22 | 18 | 15
11 | 6 | 6 | 23 | 15 | 15
12 | 19 | 18 | 24 | 7 | 3
Prepare a correlation table, taking the magnitude of each class interval as four marks and the first interval as 0 and less than 4. Then calculate Karl Pearson's coefficient of correlation between the marks in statistics and the marks in accountancy, and comment on the correlation table.
Solution:
Preparation of correlation table:
<----- Marks in statistics ----->
Marks in accountancy ↓ | 0-4 | 4-8 | 8-12 | 12-16 | 16-20 | Total
0-4 | 2 | 1 | 1 | - | - | 4
4-8 | 1 | 1 | 2 | 1 | - | 5
8-12 | 1 | 1 | 1 | 2 | 1 | 6
12-16 | - | - | 2 | 1 | 2 | 5
16-20 | - | 1 | - | 1 | 2 | 4
Total | 4 | 4 | 6 | 5 | 5 | 24
Let marks in statistics be denoted by X and marks in accountancy by Y.
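One way the calculation can proceed from the correlation table above is sketched here. It assumes class midpoints of 2, 6, 10, 14 and 18 for both variables, an assumed mean of 10 and a class width of 4; these choices are assumptions made for the sketch, and the value of r does not depend on them. The frequencies are the column totals (statistics) and row totals (accountancy) of the table.

$$
dx = \frac{X - 10}{4}, \qquad dy = \frac{Y - 10}{4}, \qquad dx,\, dy \in \{-2, -1, 0, 1, 2\}
$$

$$
\begin{aligned}
\sum f\,dx &= 4(-2) + 4(-1) + 6(0) + 5(1) + 5(2) = 3 \\
\sum f\,dx^{2} &= 4(4) + 4(1) + 6(0) + 5(1) + 5(4) = 45 \\
\sum f\,dy &= 4(-2) + 5(-1) + 6(0) + 5(1) + 4(2) = 0 \\
\sum f\,dy^{2} &= 4(4) + 5(1) + 6(0) + 5(1) + 4(4) = 42 \\
\sum f\,dx\,dy &= 10 + 2 + 0 + 5 + 8 = 25
\end{aligned}
$$

$$
r = \frac{24(25) - (3)(0)}{\sqrt{24(45) - 3^{2}}\;\sqrt{24(42) - 0^{2}}} = \frac{600}{\sqrt{1071}\,\sqrt{1008}} \approx 0.58
$$

In the sum Σ f dx dy, the five terms are the row-wise totals of the cornered values for the accountancy classes 0-4, 4-8, 8-12, 12-16 and 16-20 respectively.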