Statistics
Name of Student
Name of Institution
Date of submission
Assignment Three
Question One
Definitions of dependent and independent variables
A dependent variable is one whose value results from changes in other variables; it represents the outcome being measured. An independent variable is one that drives those changes and represents the reason the outcome varies. In this problem, the weekly rent depends on the size of the apartment. This makes the weekly rent the dependent variable and the apartment size the independent variable.
i) Finding regression coefficients
The number of values is N = 25.
We must first find XY and X² for each apartment,
where X = weekly rent, Y = size, XY = weekly rent × size, and X² = the square of the weekly rent.
Next, we find the sum of each column separately, i.e. ∑X, ∑Y, ∑XY, ∑X².
Apartment   Weekly Rent ($), X   Size (square meters), Y   XY        Y²        X²
1           219                  79                        17,301    6,241     47,961
2           369                  135                       49,815    18,225    136,161
3           277                  101                       27,977    10,201    76,729
4           346                  114                       39,444    12,996    119,716
5           219                  67                        14,673    4,489     47,961
6           392                  138                       54,096    19,044    153,664
7           381                  106                       40,386    11,236    145,161
8           216                  67                        14,472    4,489     46,656
9           202                  65                        13,130    4,225     40,804
10          265                  89                        23,585    7,921     70,225
11          323                  102                       32,946    10,404    104,329
12          381                  119                       45,339    14,161    145,161
13          531                  184                       97,704    33,856    281,961
14          415                  127                       52,705    16,129    172,225
15          323                  109                       35,207    11,881    104,329
16          335                  114                       38,190    12,996    112,225
17          254                  116                       29,464    13,456    64,516
18          392                  117                       45,864    13,689    153,664
19          277                  107                       29,639    11,449    76,729
20          265                  83                        21,995    6,889     70,225
21          369                  126                       46,494    15,876    136,161
22          381                  97                        36,957    9,409     145,161
23          277                  70                        19,390    4,900     76,729
24          185                  93                        17,205    8,649     34,225
25          404                  111                       44,844    12,321    163,216
Totals      7,998                2,636                     888,822   295,132   2,725,894
Using the regression equation y = a + bx:
Slope (b) = (NΣXY − (ΣX)(ΣY)) / (NΣX² − (ΣX)²)
Intercept (a) = (ΣY − b(ΣX)) / N
Where
x and y are the variables
b = the slope of the regression line
a = the intercept point of the regression line and the y axis
N = number of values or elements
X = first score
Y = second score
ΣXY = sum of the products of the first and second scores
ΣX = sum of the first scores
ΣY = sum of the second scores
ΣX² = sum of the squared first scores
We substitute the values we have found into these formulas.
We first find the value of b:
Slope (b) = (NΣXY − (ΣX)(ΣY)) / (NΣX² − (ΣX)²)
Here ΣXY = 888,822 is the sum of the 25 XY products.
b = (25 × 888,822 − 7,998 × 2,636) / (25 × 2,725,894 − 7,998²)
b = (22,220,550 − 21,082,728) / (68,147,350 − 63,968,004)
b = 1,137,822 / 4,179,346
b = 0.27
Intercept (a) = {ΣY – b(ΣX)} / N
= {2,636-0.27*7,998}/25
= {2,636-2,159.46}/25
= 476.54/25
= 19.06
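Having found the values of (a) and (b), it is now easy to evaluate the regression equation (y).
Taking x (the weekly rent) to be 277:
y = a + bx
= 19.06 + 0.27 × 277
= 19.06 + 74.79
= 93.85
Because X is the weekly rent and Y is the size, this value is a fitted size of about 93.85 square meters for a weekly rent of $277.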
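The sums and coefficients above can be checked with a short script. This is a minimal sketch that applies the same sums formulas to the 25 rows of the table; the function name `least_squares` and the variable names are illustrative choices, not part of the assignment.

```python
# Weekly rent in dollars (X) and size in square meters (Y) for the 25 apartments.
rent = [219, 369, 277, 346, 219, 392, 381, 216, 202, 265, 323, 381, 531,
        415, 323, 335, 254, 392, 277, 265, 369, 381, 277, 185, 404]
size = [79, 135, 101, 114, 67, 138, 106, 67, 65, 89, 102, 119, 184,
        127, 109, 114, 116, 117, 107, 83, 126, 97, 70, 93, 111]


def least_squares(x, y):
    """Return (slope, intercept) of the line y = a + b*x from the sums formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return b, a


# Same orientation as the worked calculation: x = weekly rent, y = size.
b, a = least_squares(rent, size)
sum_xy = sum(r * s for r, s in zip(rent, size))
print(f"sum of XY = {sum_xy}")
print(f"slope b = {b:.4f}, intercept a = {a:.2f}")
```

Run on these rows, the slope rounds to 0.27. The intercept comes out near 18.34 when the unrounded slope is carried through, against 19.06 when the slope is first rounded to 0.27, so the difference is rounding only.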
ii) A simple linear regression equation statistically relates two variables and describes the connection between them.
The positive slope (b = 0.27) shows that weekly rent and apartment size rise together: in the fitted equation, each additional dollar of weekly rent is associated with about 0.27 square meters of extra size.
The weekly rent for a 100 m² apartment can be estimated at about $277, given that the 101 m² apartment in the sample rents for $277 a week.
No. The smallest apartment in the sample is 65 square meters, so the data do not support predictions for apartment sizes below that.
I would recommend the 100 square meter apartment that goes for $294 because its price per square meter ($2.94) is higher than that of the 120 square meter apartment that goes for $329 ($2.74).
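A rent estimate for a given size can also be read from a least-squares line that takes rent as the response and size as the predictor, which is the direction implied by the definitions in Question One. The sketch below is one way to do that from the column totals (with ΣXY recomputed from the 25 rows); the helper `predict_rent` is a hypothetical name, and the two offers compared are the ones quoted in the text.

```python
# Column totals from the table (X = weekly rent, Y = size).
n = 25
sum_x, sum_y = 7998, 2636
sum_xy = 888822   # recomputed from the 25 rows
sum_y2 = 295132

# Least-squares line with rent as the response: rent = a + b * size.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)
a = (sum_x - b * sum_y) / n


def predict_rent(size_m2):
    """Fitted weekly rent in dollars for an apartment of the given size."""
    return a + b * size_m2


print(f"rent = {a:.2f} + {b:.4f} * size")
for size_m2, offered in [(100, 294), (120, 329)]:
    fitted = predict_rent(size_m2)
    print(f"{size_m2} m2: fitted ${fitted:.2f}, offered ${offered}, "
          f"difference ${offered - fitted:.2f}")
```

Under these assumptions the fitted rent for 100 m² is roughly $306 a week, and both quoted offers sit below the fitted line, the 120 m² one by the wider margin. Both sizes lie inside the sampled range of 65 to 184 square meters.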
Question Two
To find the linear correlation coefficient for the values in the above question, the following equation is used:
r = ∑(xy) / √[(∑x²)(∑y²)]
Where ∑ = summation symbol
x = xᵢ − x̄
xᵢ = x value for observation i
x̄ = mean x value
y = yᵢ − ȳ
yᵢ = y value for observation i
ȳ = mean y value
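The mean values for x and y are:
Mean value for x = 7,998/25 = 319.92
Mean value for y = 2,636/25 = 105.44
Because x and y in the formula are deviations from these means, the sums are obtained from the column totals (with ΣXY = 888,822):
∑xy = ΣXY − (ΣX)(ΣY)/N = 888,822 − 843,309.12 = 45,512.88
∑x² = ΣX² − (ΣX)²/N = 2,725,894 − 2,558,720.16 = 167,173.84
∑y² = ΣY² − (ΣY)²/N = 295,132 − 277,939.84 = 17,192.16
The linear correlation coefficient (r) = ∑(xy) / √[(∑x²)(∑y²)]
r = 45,512.88 / √(167,173.84 × 17,192.16)
r = 45,512.88 / 53,610.4
r = 0.85
This value indicates a strong positive linear relationship: the smaller the apartment, the lower the weekly rent.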
It should be noted that for a single independent variable, R² = r²
Where R² = coefficient of determination and r = simple correlation coefficient
In this case, 0 ≤ R² ≤ 1
When R² = 1, there is a perfect linear relationship between x and y
When R² = 0, there is no linear relationship between x and y
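The correlation coefficient and the coefficient of determination can be computed directly from the column totals. The sketch below assumes the totals of the table in Question One, with ΣXY recomputed from the 25 rows, and uses the deviation form of the formula for r; the variable names are illustrative.

```python
import math

# Column totals (X = weekly rent, Y = size).
n = 25
sum_x, sum_y = 7998, 2636
sum_xy, sum_x2, sum_y2 = 888822, 2725894, 295132

# Sums of deviations from the means, obtained from the raw totals.
sxy = sum_xy - sum_x * sum_y / n    # sum of (x - mean_x) * (y - mean_y)
sxx = sum_x2 - sum_x ** 2 / n       # sum of (x - mean_x) squared
syy = sum_y2 - sum_y ** 2 / n       # sum of (y - mean_y) squared

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

A correlation coefficient always lies between -1 and 1, so a result outside that range signals that raw totals were substituted where deviation sums were required.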
Estimated standard error of the regression slope
First, we find the standard error of estimate SE, which is given by:
SE = √[SSE / (n − k − 1)]
Where SSE = sum of squared errors, n = sample size, and k = number of independent variables
With one independent variable (k = 1), SE = √[SSE / (n − 2)].
In deviation form, ∑xy = ΣXY − (ΣX)(ΣY)/n = 45,512.88, ∑x² = ΣX² − (ΣX)²/n = 167,173.84, and ∑y² = ΣY² − (ΣY)²/n = 17,192.16. Using the slope before rounding, b1 = 0.27225:
SSE = ∑y² − b1∑xy = 17,192.16 − 0.27225 × 45,512.88 = 4,801.3
SE = √[4,801.3 / (25 − 2)] = √208.75 = 14.45
Then we find the estimated standard error of the slope:
Sb1 = SE / √[∑X² − (∑X)² / n]
Where Sb1 = estimate of the standard error of the least squares slope
Sb1 = 14.45 / √[2,725,894 − 7,998² / 25]
= 14.45 / √[2,725,894 − 2,558,720.16]
= 14.45 / √167,173.84
= 14.45 / 408.87 = 0.0353
Testing for evidence of a linear relationship
t = (b1 − β1) / Sb1
Where b1 = sample regression slope and β1 = hypothesized slope.
Taking the sample regression slope b1 = 0.27225 and the hypothesized slope β1 = 0 (no linear relationship), t becomes:
t = (0.27225 − 0) / 0.0353
= 7.7 (approximately)
Taking a 0.05 level of significance, the two-tailed critical value of t with n − 2 = 23 degrees of freedom is 2.0687. Since 7.7 > 2.0687, the hypothesis of no linear relationship is rejected.
The linear correlation coefficient (r) shows a relationship between weekly rent and apartment size, and the t test in (c) above confirms that this linear relationship is statistically significant.
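The chain from SSE to the t statistic can be sketched in a few lines. The deviation sums below are derived from the column totals of the table (with ΣXY recomputed from the 25 rows), the hypothesized slope of 0 is the usual "no linear relationship" null, and the critical value 2.0687 is the tabulated two-tailed t value for 23 degrees of freedom at an assumed 0.05 level; the text itself does not state a significance level for this question.

```python
import math

n = 25
# Deviation sums for X = weekly rent and Y = size.
sxy = 45512.88     # sum of (x - mean_x) * (y - mean_y)
sxx = 167173.84    # sum of (x - mean_x) squared
syy = 17192.16     # sum of (y - mean_y) squared

b1 = sxy / sxx                    # least-squares slope
sse = syy - b1 * sxy              # sum of squared errors around the line
se = math.sqrt(sse / (n - 2))     # standard error of estimate
sb1 = se / math.sqrt(sxx)         # standard error of the slope

beta1 = 0.0                       # hypothesized slope under H0
t = (b1 - beta1) / sb1

T_CRITICAL = 2.0687               # assumed: alpha = 0.05, two-tailed, 23 df
reject_h0 = abs(t) > T_CRITICAL

print(f"SSE = {sse:.1f}, SE = {se:.2f}, Sb1 = {sb1:.4f}, t = {t:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

The t statistic is the same whichever of the two variables is treated as the response, so this check does not depend on the orientation of the regression.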
Question Three
Since we are not provided with the standard deviation, we must find it in order to carry on with this question.
Number of samples (n) = 27, mean (x̄) = 45
Standard deviation (s) = √{∑(X − x̄)² / n}
= √{(1,179 − 45)² / 27}
= √{(1,134)² / 27}
= √47,628
= 218.24
Because the standard deviation was derived from the sample, this becomes a t-test.
The hypotheses are:
H0: μ ≥ 30
H1: μ < 30
The test statistic is:
t = (x̄ − μ) / (s / √n)
= (45 − 30) / (218.24 / √27)
= 15 / 42.00
= 0.357
The critical value associated with this lower-tail test is:
−tα(n − 1) = −t0.05(26) = −1.7056
The test statistic does not fall in the rejection region because 0.357 > −1.7056, so H0 is not rejected.
There is not enough evidence that the average number of processing days has changed.
Using the five-number summary, we get:
Mean (x̄) = (16 + 21.75 + 39 + 64.25 + 92) / 5 = 46.6
Sample number (n) = 5
Standard deviation (s) = √{∑(X − x̄)² / n}
= √{[(16 − 46.6)² + (21.75 − 46.6)² + (39 − 46.6)² + (64.25 − 46.6)² + (92 − 46.6)²] / 5}
= √{3,984.325 / 5}
= √796.865
= 28.23
Because the standard deviation was derived from the sample, this becomes a t-test.
The hypotheses are:
H0: μ ≥ 30
H1: μ < 30
The test statistic is:
t = (x̄ − μ) / (s / √n)
= (46.6 − 30) / (28.23 / √5)
= 16.6 / 12.62
= 1.315
The critical value associated with this lower-tail test is:
−tα(n − 1) = −t0.05(4) = −2.1318
The test statistic does not fall in the rejection region because 1.315 > −2.1318, so H0 is not rejected.
In both (a) and (c), H0 is not rejected because the p-value is greater than alpha.
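The one-sample t statistic for the five-number summary can be reproduced as follows. This is a sketch only: it treats the five summary values as the sample, as the text does, and divides by n in the standard deviation to match the formula used above (a sample standard deviation would divide by n − 1). The function name `one_sample_t` is illustrative.

```python
import math


def one_sample_t(values, mu0):
    """Return (mean, s, t) for H0: mu = mu0, with s computed using divisor n."""
    n = len(values)
    mean = sum(values) / n
    # Squared deviations are taken value by value, then summed.
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    t = (mean - mu0) / (s / math.sqrt(n))
    return mean, s, t


five_number_summary = [16, 21.75, 39, 64.25, 92]
mean, s, t = one_sample_t(five_number_summary, mu0=30)
print(f"mean = {mean:.1f}, s = {s:.2f}, t = {t:.3f}")
```

The key step is that the standard error divides s by √n; subtracting √n from s instead yields a much smaller statistic.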
Question Four
From the information given, standard deviation (σ) = 0.05, random sample (n) = 100, and a mean (x) of 1.99. Since the standard deviation did not come from the sample, this becomes a Z-test.
The null and alternatives then become:
H0: μ ≤ 2
H1: μ > 2
At the 0.05 level of significance, using the critical value approach:
Z = (x̄ − μ) / (σ / √n)
= (1.99 − 2) / (0.05 / √100)
= −0.01 / 0.005
= −2.00
The upper-tail critical value at this level is 1.645. Since −2.00 < 1.645, the test statistic does not fall in the rejection region and H0 is not rejected.
Probability (p) value when σ = 0.05, and its meaning
p-value = P(Z > −2.00) = 0.9772
The probability of getting a sample whose mean is 1.99 liters or more when H0 is true is 0.9772.
When σ = 0.95, the test statistic becomes:
Z = (x̄ − μ) / (σ / √n)
= (1.99 − 2) / (0.95 / √100)
= −0.01 / 0.095
= −0.105
Comparing conclusions in (a) and (c).
When the test statistic is large in magnitude, it exceeds the critical value and its p-value is smaller than alpha. A rejection is made in part (a) if the test statistic falls in the rejection region, and a rejection is made in part (c) if alpha is greater than the p-value.
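The Z statistic and its upper-tail p-value can be sketched with the standard library. The example uses the figures given for this question (x̄ = 1.99, μ = 2, σ = 0.05, n = 100) and the upper-tail alternative H1: μ > 2 stated above; `z_statistic` is an illustrative helper name.

```python
import math
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """Z statistic for a sample mean when the population sigma is known."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


z = z_statistic(xbar=1.99, mu0=2.0, sigma=0.05, n=100)

# Upper-tail p-value for H1: mu > 2, i.e. P(Z > z) under the standard normal.
p_upper = 1 - NormalDist().cdf(z)

print(f"Z = {z:.2f}, upper-tail p-value = {p_upper:.4f}")
```

The standard error here is σ/√n = 0.05/10 = 0.005, so a sample mean 0.01 below the hypothesized value sits two standard errors under it.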
Question Five
The probability that a share fund lost 18%
Event A1. 10% share fund loss.
Event A2. 18% share fund loss.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 18% = -0.18
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*-0.18 = -0.0144
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P(A1 | B) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (-0.18)(-0.0144)]
P(A1 | B) = 0.236
Probability that a share fund gained in value i.e. 1% profit
Event A1. 10% share fund loss.
Event A2. 1% share fund profit.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 1% = 0.01
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*0.01 = 0.0008
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P(A1 | B) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.01)(0.0008)]
P(A1 | B) = 0.990
The probability that share fund gained at least 10%
Event A1. 10% share fund loss.
Event A2. 10% share fund profit.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 10% = 0.1
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*0.1 = 0.008
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P( A1 | B ) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.1)(0.008)]
P( A1 | B ) = 0.5
The return for 80% of share fund was greater than what value?
Event A1. 10% share fund loss.
Event A2. 80% share fund profit.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 80% = 0.8
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*0.8 = 0.064
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P( A1 | B ) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.8)(0.064)]
P( A1 | B ) = 0.015
The return of 80% share fund value was greater than the value attained at 10%
The return of 90% of share funds was less than what value?
Event A1. 10% share fund loss.
Event A2. 90% share fund profit.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 90% = 0.9
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*0.9 = 0.072
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P( A1 | B ) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.9)(0.072)]
P( A1 | B ) = 0.012
The return of 90% of share funds cannot be compared to any other and is therefore not less than any returns made.
Event A1. 10% share fund loss.
Event A2. 95% share fund profit.
Event B. 8% standard deviation
In terms of probability, we know that:
P(A1) = 10% = -0.1
P(A2) = 95% = 0.95
P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008
P(B│A2) = 8% of A2 = 0.08*0.95 = 0.076
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
P( A1 | B ) = (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.95)(0.076)]
P( A1 | B ) = 0.011
At 95% share funds return, the value 0.011 can be symmetrically distributed between mean values of 0.012 and 0.015.
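An alternative reading of this question, which the text does not state, treats the fund returns as normally distributed with a mean of −10% (the 10% loss) and a standard deviation of 8%. Under that assumption only, the six parts reduce to normal tail probabilities and percentiles, which can be sketched with Python's `statistics.NormalDist`; the variable names and the "central 95%" interpretation of the last part are assumptions as well.

```python
from statistics import NormalDist

# Assumption (not stated in the text): returns in percent are normally
# distributed with mean -10 and standard deviation 8.
returns = NormalDist(mu=-10, sigma=8)

p_lost_18_or_more = returns.cdf(-18)          # P(return <= -18%)
p_gained = 1 - returns.cdf(0)                 # P(return > 0%)
p_gained_10_or_more = 1 - returns.cdf(10)     # P(return >= 10%)

exceeded_by_80_pct = returns.inv_cdf(0.20)    # 80% of funds returned more than this
below_for_90_pct = returns.inv_cdf(0.90)      # 90% of funds returned less than this
central_95 = (returns.inv_cdf(0.025), returns.inv_cdf(0.975))

print(f"P(lost 18% or more)   = {p_lost_18_or_more:.4f}")
print(f"P(gained in value)    = {p_gained:.4f}")
print(f"P(gained 10% or more) = {p_gained_10_or_more:.4f}")
print(f"80% returned more than {exceeded_by_80_pct:.2f}%")
print(f"90% returned less than {below_for_90_pct:.2f}%")
print(f"central 95% between {central_95[0]:.2f}% and {central_95[1]:.2f}%")
```

Every probability produced this way lies between 0 and 1, and the answers to the last three parts are return values in percent rather than probabilities.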