Definitions of dependent and independent variables

Statistics


Assignment Three

Question One

Definitions of dependent and independent variables

A dependent variable is the outcome being measured: its value responds to changes in other variables and represents the amount produced. An independent variable is the explanatory variable: it is taken to cause the change and represents the reason why things happen. In this problem, the weekly rent depends on the size of the apartment. This makes the weekly rent the dependent variable and the apartment size the independent variable.

i) Finding regression coefficients

The number of paired values is N = 25.

For each apartment we must now find XY and X²,

where X is the weekly rent, Y is the size, XY = weekly rent × size, and X² = the square of the weekly rent.

Next, we sum each column separately to obtain ΣX, ΣY, ΣXY and ΣX².

Apartment  Weekly rent X ($)  Size Y (square meters)  XY       Y²       X²
1          219                79                      17,301   6,241    47,961
2          369                135                     49,815   18,225   136,161
3          277                101                     27,977   10,201   76,729
4          346                114                     39,444   12,996   119,716
5          219                67                      14,673   4,489    47,961
6          392                138                     54,096   19,044   153,664
7          381                106                     40,386   11,236   145,161
8          216                67                      14,472   4,489    46,656
9          202                65                      13,130   4,225    40,804
10         265                89                      23,585   7,921    70,225
11         323                102                     32,946   10,404   104,329
12         381                119                     45,339   14,161   145,161
13         531                184                     97,704   33,856   281,961
14         415                127                     52,705   16,129   172,225
15         323                109                     35,207   11,881   104,329
16         335                114                     38,190   12,996   112,225
17         254                116                     29,464   13,456   64,516
18         392                117                     45,864   13,689   153,664
19         277                107                     29,639   11,449   76,729
20         265                83                      21,995   6,889    70,225
21         369                126                     46,494   15,876   136,161
22         381                97                      36,957   9,409    145,161
23         277                70                      19,390   4,900    76,729
24         185                93                      17,205   8,649    34,225
25         404                111                     44,844   12,321   163,216
Totals     7,998              2,636                   888,822  295,132  2,725,894

The least-squares regression equation is ŷ = a + bx, where

Slope: b = (NΣXY − (ΣX)(ΣY)) / (NΣX² − (ΣX)²)

Intercept: a = (ΣY − bΣX) / N

Where

x and y are the variables
b = the slope of the regression line
a = the intercept of the regression line with the y axis
N = number of paired values
X = first score (weekly rent)
Y = second score (size)
ΣXY = sum of the products of the first and second scores
ΣX = sum of the first scores
ΣY = sum of the second scores
ΣX² = sum of the squared first scores

We substitute the totals from the table, starting with the slope:

b = (25 × 888,822 − 7,998 × 2,636) / (25 × 2,725,894 − 7,998²)

b = (22,220,550 − 21,082,728) / (68,147,350 − 63,968,004)

b = 1,137,822 / 4,179,346

b = 0.27225, or 0.27 to two decimal places

Intercept: a = (ΣY − bΣX) / N

= (2,636 − 0.27225 × 7,998) / 25

= (2,636 − 2,177.46) / 25

= 458.54 / 25

= 18.34

Having found the values of a and b, the regression equation is ŷ = 18.34 + 0.27225x.

Taking x to be the value 277:

ŷ = a + bx

= 18.34 + 0.27225 × 277

= 18.34 + 75.41

= 93.75
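The summation formulas above can be checked mechanically. The following Python sketch recomputes the slope and intercept from the 25 rent and size pairs in the table, taking X as weekly rent and Y as size, as the calculation above does. The function and variable names are illustrative assumptions, not part of the assignment.

```python
# Paired observations copied from the table: weekly rent ($) and size (m^2).
rent = [219, 369, 277, 346, 219, 392, 381, 216, 202, 265, 323, 381, 531,
        415, 323, 335, 254, 392, 277, 265, 369, 381, 277, 185, 404]
size = [79, 135, 101, 114, 67, 138, 106, 67, 65, 89, 102, 119, 184,
        127, 109, 114, 116, 117, 107, 83, 126, 97, 70, 93, 111]


def least_squares(x, y):
    """Return (slope, intercept) using the summation formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


# Following the text, X is weekly rent and Y is apartment size.
b, a = least_squares(rent, size)
predicted = a + b * 277  # fitted value at x = 277
print(f"b = {b:.5f}, a = {a:.2f}, fitted value at 277 = {predicted:.2f}")
```

Carrying the unrounded slope into the intercept keeps rounding error from compounding in the later steps.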

ii) A simple linear regression equation statistically relates two variables and describes the linear connection between them.

These regression coefficients show that weekly rent depends strongly on apartment size.

The weekly rental cost for a 100 m² apartment might be about $277, given that a 101 m² apartment goes for $277 a week.

No. The data given above do not support predictions for an apartment smaller than 65 square meters, the smallest size in the sample.

I would recommend the 100 square meter apartment that goes for $294 because its price per size ratio is higher than that of the 120 square meter apartment that goes for $329.
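The price per size ratio mentioned above is simply weekly rent divided by floor area. A minimal Python sketch of that arithmetic follows; the dictionary keys and variable names are illustrative assumptions.

```python
# Asking rents ($ per week) and sizes (m^2) of the two apartments in the text.
options = {"100 m2 at $294": (294, 100), "120 m2 at $329": (329, 120)}

# Weekly rent divided by floor area gives dollars per square meter.
price_per_m2 = {name: rent / size for name, (rent, size) in options.items()}

for name, value in price_per_m2.items():
    print(f"{name}: ${value:.2f} per square meter per week")
```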

Question Two

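To find the linear correlation coefficient for the values in the above question, the following equation is used:

r = Σxy / √[(Σx²)(Σy²)]

Where Σ = summation symbol

x = xi − x̄, the deviation of observation i from the mean of x

xi = x value for observation i

x̄ = mean x value

y = yi − ȳ, the deviation of observation i from the mean of y

yi = y value for observation i

ȳ = mean y value

The mean values for x and y are:

Mean value for x = 7,998/25 = 319.92

Mean value for y = 2,636/25 = 105.44

The sums of squared deviations and of cross-products follow from the totals in Question One:

Σx² = ΣX² − (ΣX)²/n = 2,725,894 − 7,998²/25 = 2,725,894 − 2,558,720.16 = 167,173.84

Σy² = ΣY² − (ΣY)²/n = 295,132 − 2,636²/25 = 295,132 − 277,939.84 = 17,192.16

Σxy = ΣXY − (ΣX)(ΣY)/n = 45,512.88

The linear correlation coefficient (r) = Σxy / √[(Σx²)(Σy²)]

r = 45,512.88 / √(167,173.84 × 17,192.16)

r = 45,512.88 / 53,610.44

r = 0.85

This positive value, close to 1, indicates a strong linear association: the smaller the apartment, the cheaper the weekly rent.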
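The deviation form of the correlation formula can be evaluated directly from the raw pairs. The Python sketch below does this for the 25 apartments in Question One; the helper name `pearson_r` is an illustrative assumption.

```python
import math

# Weekly rent ($) and size (m^2) for the 25 apartments in Question One.
rent = [219, 369, 277, 346, 219, 392, 381, 216, 202, 265, 323, 381, 531,
        415, 323, 335, 254, 392, 277, 265, 369, 381, 277, 185, 404]
size = [79, 135, 101, 114, 67, 138, 106, 67, 65, 89, 102, 119, 184,
        127, 109, 114, 116, 117, 107, 83, 126, 97, 70, 93, 111]


def pearson_r(x, y):
    """Correlation coefficient from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]  # x = xi - mean of x
    dy = [v - mean_y for v in y]  # y = yi - mean of y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r = pearson_r(rent, size)
print(f"r = {r:.3f}")
```

Because the formula uses deviations from the means rather than raw totals, the result is the same whichever variable is labelled x.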

It should be noted that for a single independent variable, R² = r²

Where R² = coefficient of determination and r = simple correlation coefficient

In this case, 0 ≤ R² ≤ 1

When R² = 1, there is a perfect linear relationship between x and y

When R² = 0, there is no linear relationship between x and y
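The identity between the coefficient of determination and the squared correlation coefficient can be checked numerically. The sketch below fits the least-squares line to the Question One data (X as weekly rent, Y as size), computes R² as 1 − SSE/SST, and compares it with r². All variable names are illustrative assumptions.

```python
import math

# Weekly rent ($) and size (m^2) for the 25 apartments in Question One.
rent = [219, 369, 277, 346, 219, 392, 381, 216, 202, 265, 323, 381, 531,
        415, 323, 335, 254, 392, 277, 265, 369, 381, 277, 185, 404]
size = [79, 135, 101, 114, 67, 138, 106, 67, 65, 89, 102, 119, 184,
        127, 109, 114, 116, 117, 107, 83, 126, 97, 70, 93, 111]

n = len(rent)
mean_x = sum(rent) / n
mean_y = sum(size) / n

# Sums of squared deviations and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in rent)
syy = sum((y - mean_y) ** 2 for y in size)  # total variation in y (SST)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rent, size))

# Least-squares line with X = rent and Y = size.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R^2 is the share of the variation in y explained by the line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(rent, size))
r_squared = 1 - sse / syy

r = sxy / math.sqrt(sxx * syy)
print(f"R^2 = {r_squared:.4f}, r^2 = {r * r:.4f}")
```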

Estimated standard error of the regression slope

First, we find the standard error of the estimate, SE, which is found by:

SE = √[SSE / (n − k − 1)]

Where SSE = sum of squared errors, n = sample size, and k = number of independent variables

With one independent variable (k = 1):

SE = √[SSE / (n − 2)]

SSE is the variation in y that the regression line leaves unexplained. With Σy² = ΣY² − (ΣY)²/n = 17,192.16, Σxy = ΣXY − (ΣX)(ΣY)/n = 45,512.88 and Σx² = ΣX² − (ΣX)²/n = 167,173.84:

SSE = Σy² − (Σxy)² / Σx²

= 17,192.16 − (45,512.88)² / 167,173.84

= 17,192.16 − 12,390.83

= 4,801.33

SE = √[4,801.33 / (25 − 2)]

SE = √(4,801.33 / 23) = √208.75 = 14.45

Then we find the estimated standard error of the regression slope:

Sb1 = SE / √[ΣX² − (ΣX)²/n]

Where Sb1 = estimate of the standard error of the least-squares slope

Sb1 = 14.45 / √[2,725,894 − 7,998²/25]

= 14.45 / √(2,725,894 − 63,968,004/25)

= 14.45 / √(2,725,894 − 2,558,720.16)

= 14.45 / √167,173.84

= 14.45 / 408.87 = 0.0353

Testing for evidence of a linear relationship

t = (b1 − β1) / Sb1

Where b1 = sample regression slope and β1 = hypothesized slope.

Under the null hypothesis of no linear relationship the hypothesized slope is 0, and the sample regression slope from Question One is 0.2722, so t becomes:

t = (0.2722 − 0) / 0.0353

= 7.7 (approximately)

With n − 2 = 23 degrees of freedom, the two-tailed critical value at the 0.05 level of significance (a level assumed here, since none is stated) is 2.069. Because 7.7 > 2.069, the null hypothesis of a zero slope is rejected.

The linear correlation coefficient (r) shows that there is a relationship between weekly rent and apartment size, and the calculations in (c) above agree: there is evidence of a linear relationship between apartment size and weekly rent.
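The standard error of the estimate, the standard error of the slope, and the t statistic can all be computed from the residuals of the fitted line. The following Python sketch does so for the Question One data, with X as weekly rent and Y as size and a hypothesized slope of zero (the usual null hypothesis of no linear relationship, assumed here). Variable names are illustrative.

```python
import math

# Weekly rent ($) and size (m^2) for the 25 apartments in Question One.
rent = [219, 369, 277, 346, 219, 392, 381, 216, 202, 265, 323, 381, 531,
        415, 323, 335, 254, 392, 277, 265, 369, 381, 277, 185, 404]
size = [79, 135, 101, 114, 67, 138, 106, 67, 65, 89, 102, 119, 184,
        127, 109, 114, 116, 117, 107, 83, 126, 97, 70, 93, 111]

n = len(rent)
k = 1  # number of independent variables
mean_x = sum(rent) / n
mean_y = sum(size) / n

sxx = sum((x - mean_x) ** 2 for x in rent)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rent, size))

# Least-squares slope and intercept.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Sum of squared errors about the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(rent, size))

se = math.sqrt(sse / (n - k - 1))  # standard error of the estimate
s_b1 = se / math.sqrt(sxx)         # standard error of the slope
t_stat = (b1 - 0.0) / s_b1         # hypothesized slope of zero

print(f"SE = {se:.2f}, Sb1 = {s_b1:.4f}, t = {t_stat:.2f}")
```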

Question Three

Since we are not provided with the standard deviation, we must find it in order to carry on with this question.

Number of samples (n) = 27, mean (x̄) = 45

Standard deviation (s) = √{Σ(X − x̄)² / n}

= √{(1,179 − 45)² / 27}

= √{(1,134)² / 27}

= √47,628

= 218.24

Because the standard deviation was derived from the sample, a t test is used.

The hypotheses are:

H0: μ ≥ 30

H1: μ < 30

The test statistic is:

t = (x̄ − μ) / (s / √n)

= (45 − 30) / (218.24 / √27)

= 15 / 42.00

= 0.357

The t critical value associated with this is:

−tα(n − 1) = −t.2605 = −2.3661

The test statistic does not fall in the rejection region because 0.357 > −2.3661. H0 is not rejected.

There is enough evidence that the average processing days has changed from 45 days.

Using the five number summary, we get:

Mean (x) = (16+21.75+39+64.25+92) / 5 = 46.6

Standard deviation (s) = √{Σ(X − x̄)² / n}

= √{(233 − 46.6)² / 5}

= √{(186.4)² / 5}

= √6,948.992

= 83.36

Sample number (n) = 5

The standard deviation was derived from the samples; this becomes a t-test

The hypothesis here becomes:

H0: μ ≥ 30

H1: μ < 30

The test statistic is:

t = (x̄ − μ) / (s / √n)

= (46.6 − 30) / (83.36 / √5)

= 16.6 / 37.28

= 0.445

The t critical value associated with this is:

−tα(n − 1) = −t.405 = −0.364

The test statistic does not fall in the rejection region because 0.445 > −0.364. H0 is not rejected.
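For reference, the standard one-sample t statistic divides the sample standard deviation by √n before dividing into the difference of means. The Python sketch below applies that form to the figures quoted for the five-number summary (mean 46.6, hypothesized mean 30, s = 83.36, n = 5); the function name is an illustrative assumption.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Figures as quoted in the text for the five-number summary.
t_stat = one_sample_t(46.6, 30, 83.36, 5)
print(f"t = {t_stat:.3f}")
```

For a lower-tail test (H1: μ < 30), H0 is rejected only when t falls below the negative critical value.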

In both (a) and (c), H0 is not rejected because the p-value is greater than alpha.

Question Four

From the information given, the population standard deviation is σ = 0.05, the random sample size is n = 100, and the sample mean is x̄ = 1.99. Since the standard deviation did not come from the sample, a Z test is used.

The null and alternative hypotheses then become:

H0: μ ≤ 2

H1: μ > 2

At the 0.05 level of significance, using the critical value approach:

Z = (x̄ − μ) / (σ / √n)

= (1.99 − 2) / (0.05 / √100)

= −0.01 / 0.005

= −2.00

Probability (p) value when σ = 0.05, and its meaning

p-value = P(Z > −2.00) = 0.9772

If H0 is true with μ = 2, the probability of getting a sample whose mean is 1.99 liters or more is 0.9772.
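The Z statistic and its upper-tail p-value can be reproduced with Python's standard library. The sketch below uses the standard form Z = (x̄ − μ)/(σ/√n) with the figures given for this question and the upper-tail alternative H1: μ > 2 stated above; variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

# Figures given in the question.
x_bar = 1.99   # sample mean (liters)
mu_0 = 2.0     # hypothesized population mean (liters)
sigma = 0.05   # known population standard deviation
n = 100        # sample size

# Z statistic: difference of means over the standard error of the mean.
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Upper-tail p-value, P(Z > z), for H1: mu > 2.
p_upper = 1 - NormalDist().cdf(z)

print(f"Z = {z:.2f}, p-value = {p_upper:.4f}")
```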

When σ = 0.95, the test statistic for the population mean in bottles is:

Z = (x̄ − μ) / (σ / √n)

= (1.99 − 2) / (0.95 / √100)

= −0.01 / 0.095

= −0.105

Comparing conclusions in (a) and (c).

For a large test statistic, the critical value is large and the p-value is smaller than even the alpha value. A rejection is made in part (a) if the test statistic is found in the rejection region, and a rejection is made in part (c) if alpha is greater than the p-value.

Question Five

The probability that a share fund lost 18%

Event A1. 10% share fund loss.

Event A2. 18% share fund loss.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 18% = -0.18

P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008

P(B│A2) = 8% of A2 = 0.08*-0.18 = -0.0144

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P(A1 | B) = (−0.1)(−0.008) / [(−0.1)(−0.008) + (−0.18)(−0.0144)]

P(A1 | B) = 0.0008 / 0.003392 = 0.236
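The two-event formula used above is a plain ratio, so its arithmetic is easy to reproduce. The Python sketch below evaluates it on the values exactly as the text writes them, signs included; the function name is an illustrative assumption, and the sketch checks only the arithmetic of the formula, not the choice of inputs.

```python
def posterior_a1(p_a1, p_b_given_a1, p_a2, p_b_given_a2):
    """Two-event form of Bayes' rule: P(A1 | B)."""
    numerator = p_a1 * p_b_given_a1
    denominator = numerator + p_a2 * p_b_given_a2
    return numerator / denominator


# Inputs copied as written in the text for the 18% case, signs included.
result = posterior_a1(-0.1, -0.008, -0.18, -0.0144)
print(f"P(A1 | B) = {result:.3f}")
```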

Probability that a share fund gained in value i.e. 1% profit

Event A1. 10% share fund loss.

Event A2. 1% share fund profit.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 1% = 0.01

P(B│A1) = 8% of A1 = 0.08 × −0.1 = −0.008

P(B│A2) = 8% of A2 = 0.08 × 0.01 = 0.0008

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P(A1 | B) = (−0.1)(−0.008) / [(−0.1)(−0.008) + (0.01)(0.0008)]

P(A1 | B) = 0.0008 / 0.000808 = 0.990

The probability that share fund gained at least 10%

Event A1. 10% share fund loss.

Event A2. 10% share fund profit.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 10% = 0.1

P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008

P(B│A2) = 8% of A2 = 0.08*0.1 = 0.008

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P( A1 | B ) =  (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.1)(0.008)]

P( A1 | B ) = 0.5

The return for 80% of share fund was greater than what value?

Event A1. 10% share fund loss.

Event A2. 80% share fund profit.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 80% = 0.8

P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008

P(B│A2) = 8% of A2 = 0.08*0.8 = 0.064

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P( A1 | B ) =  (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.8)(0.064)]

P( A1 | B ) = 0.015

The return of 80% share fund value was greater than the value attained at 10%

The return of 90% share fund was greater than what value?

Event A1. 10% share fund loss.

Event A2. 90% share fund profit.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 90% = 0.9

P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008

P(B│A2) = 8% of A2 = 0.08*0.9 = 0.072

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P( A1 | B ) =  (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.9)(0.072)]

P( A1 | B ) = 0.012

The return of 90% share fund cannot be compared to any other, and therefore it is not less than any return made.

Event A1. 10% share fund loss.

Event A2. 95% share fund profit.

Event B. 8% standard deviation

In terms of probability, we know that:

P(A1) = 10% = -0.1

P(A2) = 95% = 0.95

P(B│A1) = 8% of A1 = 0.08*-0.1 = -0.008

P(B│A2) = 8% of A2 = 0.08*0.95 = 0.076

P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]

P( A1 | B ) =  (-0.1)(-0.008) / [(-0.1)(-0.008) + (0.95)(0.076)]

P( A1 | B ) = 0.011

At 95% share funds return, the value 0.011 can be symmetrically distributed between mean values of 0.012 and 0.015.