Statistical Analysis

Regression model:

 

Investigation of possible relationship between 2 variables:

 

Examples:

Is there any relationship between cigarette smoking and cancer?

Is there any relationship between environmental factors and cancer?

Is there any relationship between genetic factors and cancer?

Is there any relationship between X and Y?

 

Is there any relationship between Advertising Expenditure and Sales?

Is there any relationship between Price and Sales?

Is there any relationship between Quality and Sales?

Is there any relationship between Location and Sales?

Is there any relationship between X and Y?

 

Investigation of possible relationship between 2 variables:

  • Tabular Methods
  • Graphical Methods
  • Numerical Methods

 

  • Tabular Methods

The use of cross tabulations (cross tabs) allows us to comment on the possible relationship between X and Y.
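A cross tabulation simply counts how often each combination of X and Y categories occurs. The sketch below builds one with Python's standard library. The ten (smoker, cancer) records are made-up illustrative values, not data from these notes, and the helper name `cross_tab` is an assumption.

```python
from collections import Counter

def cross_tab(pairs):
    """Count how often each (x, y) combination occurs."""
    counts = Counter(pairs)
    x_levels = sorted({x for x, _ in pairs})
    y_levels = sorted({y for _, y in pairs})
    # One row per X category, one column per Y category.
    return {x: {y: counts[(x, y)] for y in y_levels} for x in x_levels}

# Hypothetical records: (smoker?, cancer?) -- illustrative values only.
records = [
    ("yes", "yes"), ("yes", "yes"), ("yes", "yes"), ("yes", "no"),
    ("no", "no"), ("no", "no"), ("no", "no"), ("no", "no"),
    ("no", "yes"), ("yes", "no"),
]

table = cross_tab(records)
for x, row in table.items():
    print(f"smoker={x}: {row}")
```

When most observations fall in the matching cells (yes/yes and no/no), as in this made-up sample, the table hints at a possible relationship between X and Y.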

 

  • Graphical Methods

The use of the scatter plot allows us to comment on the possible relationship between X and Y:

X and Y appear to be:

  • Not related
  • Related in a non-linear fashion (Polynomial, Exponential, Log etc.)
  • Related in a linear fashion:
      • Is it a direct (positive) linear relationship?
      • Is it an indirect (negative) linear relationship?

 

  • Numerical Methods

Regression and correlation analysis will be used to make a comment on the possible relationship between X and Y. The main idea is to:

  • Calculate and interpret the intercept (b0) and the slope (b1).
  • Calculate and interpret the Coefficient of Determination (r²).
  • Calculate and interpret the Correlation Coefficient (r).
  • Test the significance of the relationship (generalizing from the sample to the population).
  • Conduct forecasting.

 

 

 

Steps in Regression Analysis:

  • Step 1- Model Building – Purpose is to propose a cause-and-effect relationship between the dependent variable (Y) and the independent variables (X1, X2, …).

Y = f (X1, X2, X3, X4, …)    Multivariate Model

 

  • Step 2- Specification Step – Purpose is to simplify the model to a “Simple”, “Linear” model.

 

Number of Independent variables:

  • Simple model ( only ONE independent variable)

        Y = f (X)    Simple Model

 

  • Multivariate model ( More than ONE independent variable)

Y = f (X1, X2, X3, X4,…….)

Type of Functional relationship:

  • Linear relationship

Y = b0 + b1X

 

 

Intercept (b0): At X = 0, Y is expected (or estimated) to be b0.

Slope (b1): For every one-unit increase in X, Y is estimated to increase by b1 (or to decrease, if b1 is negative).

 

 

  • Non-linear relationship, e.g. Exponential, Cubic, Log, etc.

Y = b0 + b1X²

Y = b0 + b1X^(1/3) * X^(1/2)

 

 

 

 

  • Step 3- Data collection step
  • Time series Data (Historical Data)- Collecting data across different Time Periods.
Elements Variable 1 Variable 2
Time X Y
2008    
2009    
2010    
2011    
2012    

 

 

 

Elements Variable 1 Variable 2
Time X Y
January    
February    
March    
April    
May    

 

 

  • Cross-Sectional Data – Collecting data across different observations (elements)
Elements Variable 1 Variable 2
Countries X Y
Afghanistan
…
Zimbabwe

 

Elements Variable 1 Variable 2
Companies X Y
IBM
Google    
Microsoft    
     
     

 

 

 

 

 

  • Step 4- Visualization Step – Purpose is to visualize (see) the pattern. Does there appear to be any type of relationship? Is the relationship non-linear? If linear, does it appear to be a direct (positive) or an indirect (negative) relationship?

 

 

  • Step 5- Estimation Step– Purpose is to estimate (and Interpret) the Intercept and the slope.

 

  • Step 6- Forecasting Step – Purpose is to use the pattern to conduct forecasting.

 

 

 

Chapter 12 Regression and Correlation Analysis

 

  1. Develop the scatter diagram for the data set. Comment on the possible relationship between X and Y.
  2. Calculate the intercept (b0) and the slope (b1). Interpret your findings.
  3. For a given level of X, forecast Y.
  4. Calculate the Coefficient of Determination (r²) and interpret it.
  5. Calculate the Correlation Coefficient (r) and Interpret.

 

                           True Values             Estimated Values
                           (All)                   (Some)
                           Population Parameter    Sample Statistic
Average                    μ                       X-bar
Standard Deviation         σ                       S
Proportion (%)             π                       P-bar
Intercept                  β0                      b0
Slope                      β1                      b1
Correlation Coefficient    ρ                       r

 

Part 1. Comment on the possible relationship between X and Y

Make a comment about the following:

 

The use of the scatter Plot allows us to make a comment on the possible relationship between X and Y:

X and Y appear to be:

  • Not related
  • Related in a non-linear fashion (Polynomial, Exponential, Log etc.)
  • Related in a linear fashion :
  1. Is it a direct linear relationship?
  2. Is it an indirect linear relationship?

 

Do you see any pattern? What does the pattern look like? Does it appear to be a linear or a non-linear relationship? If linear, does it appear to be “direct” or “indirect” linear relationship?

 

It appears that there is a direct linear relationship between “# of TV ads” and “# of cars sold”.

Part 2 – Interpretation of the intercept (b0) and the slope (b1):

 

Y^ = b0 + b1X

 

 

Intercept (b0): At X = 0, Y is expected (or estimated) to be b0.

Slope (b1): For every one-unit increase in X, Y is estimated to increase by b1 (or to decrease, if b1 is negative).

 

 

Y^ = 10 + 5X

If we run no TV ads, “# of cars sold” is expected to be 10 cars.

For every additional TV ad, “# of cars sold” is expected to increase by 5 cars.
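The two interpretations can be checked by evaluating the estimated equation directly. This is a minimal sketch using the coefficients b0 = 10 and b1 = 5 from the example; the function name `estimated_sales` is an assumption made for illustration.

```python
def estimated_sales(tv_ads, b0=10, b1=5):
    """Estimated regression equation Y^ = b0 + b1*X from the example."""
    return b0 + b1 * tv_ads

# Intercept: the estimated Y when X = 0.
intercept_reading = estimated_sales(0)

# Slope: the change in estimated Y when X rises by one unit (here from 2 to 3 ads).
slope_reading = estimated_sales(3) - estimated_sales(2)

print(intercept_reading)  # 10
print(slope_reading)      # 5
```

Because the model is linear, the one-unit change is the same whichever pair of consecutive X values is chosen.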

 

 

 

 

 

Part 3 – For a given level of X, forecast Y.

 

Y^ = b0 + b1X

 

In a typical case (homework), the value of X will be provided. Simply insert the given value of X in the above equation and finish the calculation.

 

Y^ = 10 + 5(5) = 35

 

If we run 5 TV ads, “# of cars sold” is expected to be 35.

 

Part 4. Coefficient of Determination (r²)

Calculate the Coefficient of Determination and interpret it.

The Coefficient of Determination (r²) is a measure of the goodness of fit of a linear model to the observed data. Its value lies in the range 0 ≤ r² ≤ 1.00 and is usually read as a percentage.

Overall, the larger the r², the better the fit.

 

For a complete interpretation, make the following three comments about r²:

  • Comment on the goodness of fit. (Tip: if r² > 0.65 it is a good fit; if r² > 0.80 it is an excellent fit.)

Example: For a sample of …. randomly selected ……, the linear model provides a …(good, excellent etc)…. fit.

  • What % of the variation in Y is EXPLAINED by the variation in X (the main factor)? This is r².
  • What % of the variation in Y is EXPLAINED by the variation in other influencing factors? This is 1 − r².

 

  • For a sample of 5 randomly selected time periods, the linear model provides an excellent fit to the observed data (r² = 0.877).
  • About 88% of the variation in “# of cars sold” is explained by “# of TV ads”.
  • About 12% of the variation in “# of cars sold” is explained by other influencing factors, e.g. Management, Price …
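The percentages come from comparing the total variation in Y with the variation the fitted line leaves unexplained. The sketch below works this out for the Reed Auto sample listed in Example 1; the variable names (`sst`, `sse`, and so on) are assumptions chosen for readability.

```python
# Reed Auto sample from Example 1: (# of TV ads, # of cars sold).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates of the slope and the intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Total variation in Y, and the part the fitted line leaves unexplained.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst
print(f"r2 = {r_squared:.3f}")               # r2 = 0.877
print(f"1 - r2 = {1 - r_squared:.3f}")       # 1 - r2 = 0.123
```

Here SST = 114 and SSE = 14, so r² = 1 − 14/114 ≈ 0.877.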

 

 

 

 

 

 

 

 

Part 5. Correlation Coefficient

The Correlation Coefficient (r) is a measure of the strength as well as the direction of the linear relationship between X and Y. Its value lies in the range -1.00 ≤ r ≤ 1.00.

Overall, the larger the absolute value of r, the stronger the relationship.

 

Tips regarding the DIRECTION of the linear relationship:

If the sign of “r” is positive, use the adjective “Direct”. If the sign of “r” is negative, use the adjective “Indirect”.

 

Tips regarding the STRENGTH of the linear relationship:

Disregard the sign. If the absolute value of “r” falls within one of the following ranges, use the suggested adjective to interpret your findings:

|r| > 0.85 Extremely Strong
0.70 < |r| ≤ 0.85 Strong
0.60 < |r| ≤ 0.70 Semi-strong
0.40 < |r| ≤ 0.60 Medium
0.30 < |r| ≤ 0.40 Semi-weak
0.15 ≤ |r| ≤ 0.30 Weak
|r| < 0.15 Extremely Weak

 

For a sample of 5 randomly selected time periods, there is a direct, extremely strong linear relationship between “# of cars sold” and “# of TV ads”.
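As a rough illustration, r can be obtained as the square root of r² carrying the sign of the slope, and then mapped to the adjectives from the tips. The helper `describe_r` and its handling of values that fall exactly on a boundary are assumptions of this sketch, not part of the notes.

```python
import math

def describe_r(r):
    """Map r to (direction, strength) adjectives following the tips.

    How exact boundary values (e.g. 0.85) are classified is an assumption.
    """
    if r > 0:
        direction = "direct"
    elif r < 0:
        direction = "indirect"
    else:
        direction = "none"
    size = abs(r)
    if size > 0.85:
        strength = "extremely strong"
    elif size > 0.70:
        strength = "strong"
    elif size > 0.60:
        strength = "semi-strong"
    elif size > 0.40:
        strength = "medium"
    elif size > 0.30:
        strength = "semi-weak"
    elif size >= 0.15:
        strength = "weak"
    else:
        strength = "extremely weak"
    return direction, strength

# Reed Auto sample from Example 1.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

b1 = s_xy / s_xx
r_squared = s_xy ** 2 / (s_xx * s_yy)
# r is the square root of r2 and takes the sign of the slope b1.
r = math.copysign(math.sqrt(r_squared), b1)

print(round(r, 4), describe_r(r))  # 0.9366 ('direct', 'extremely strong')
```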

Regression model:

Purpose: In regression analysis, a model is used to describe the relationship between a single dependent variable y and one or more independent variables x. With one independent variable the model is a simple regression; with several it is a multiple regression.

Model Building: Either a simple or a multiple regression model is initially posed as a hypothesis concerning the relationship among the dependent and independent variables.

Example: As an illustration of regression analysis and the least squares method, suppose a university medical centre is investigating the relationship between stress and blood pressure. Assume that both a stress test score and a blood pressure reading have been recorded for a sample of 20 patients. The data are shown graphically in the figure below, called a scatter diagram. Values of the independent variable, stress test score, are given on the horizontal axis, and values of the dependent variable, blood pressure, are shown on the vertical axis. The line passing through the data points is the graph of the estimated regression equation: y = 42.3 + 0.49x. The parameter estimates, b0 = 42.3 and b1 = 0.49, were obtained using the least squares method.

Correlation:
Correlation and regression analysis are related in the sense that both deal with relationships among variables. The correlation coefficient is a measure of linear association between two variables. Values of the correlation coefficient are always between -1 and +1.

A correlation coefficient of +1 indicates that two variables are perfectly related in a positive linear sense, a correlation coefficient of -1 indicates that two variables are perfectly related in a negative linear sense, and a correlation coefficient of 0 indicates that there is no linear relationship between the two variables.
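The three reference values can be reproduced with small constructed data sets. In the sketch below the x values and the three formulas for y are illustrative assumptions; the last case shows that r = 0 rules out only a linear relationship, since y = x² is perfectly related to x in a non-linear way.

```python
import math

def correlation(x, y):
    """Sample correlation coefficient r between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [-2, -1, 0, 1, 2]  # illustrative values, symmetric around zero

r_positive = correlation(x, [2 * xi + 1 for xi in x])   # exact rising line: r = +1
r_negative = correlation(x, [-3 * xi + 4 for xi in x])  # exact falling line: r = -1
r_curved = correlation(x, [xi ** 2 for xi in x])        # symmetric curve: r = 0

print(r_positive, r_negative, r_curved)
```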

Neither regression nor correlation analyses can be interpreted as establishing cause-and-effect relationships. They can indicate only how or to what extent variables are associated with each other. The correlation coefficient measures only the degree of linear association between two variables. Any conclusions about a cause-and-effect relationship must be based on the judgment of the analyst.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Application of the Simple Linear Regression

Example 1:  Simple regression line

Reed Auto periodically has a special week-long sale.  As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale.

Data from a sample of 5 previous sales are shown below.

# of TV Ads (X)    # of Cars Sold (Y)
1                  14
3                  24
2                  18
1                  17
3                  27

 

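Y^ = b0 + b1 X     Estimated Regression Equation

  • Slope for the Estimated Regression Equation

b1 = Σ(X − X-bar)(Y − Y-bar) / Σ(X − X-bar)²

  • y-Intercept for the Estimated Regression Equation

b0 = Y-bar − b1(X-bar)

For the Reed Auto sample, X-bar = 2 and Y-bar = 20, so b1 = 20 / 4 = 5 and b0 = 20 − 5(2) = 10.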

Let’s forecast the # of auto sales if we run “5” TV ads.

 

Estimated Regression Equation

Y^ = b0 + b1 X = 10 + 5(5) = 35 cars
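The least squares estimates and the forecast for the Reed Auto sample can be reproduced in a few lines. This is a sketch of the standard least squares formulas; the function name `fit_line` is an assumption.

```python
def fit_line(x, y):
    """Least squares intercept b0 and slope b1 for Y^ = b0 + b1*X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

b0, b1 = fit_line(tv_ads, cars_sold)
forecast = b0 + b1 * 5  # estimated cars sold with 5 TV ads

print(b0, b1, forecast)  # 10.0 5.0 35.0
```

Note that 5 ads lies outside the sampled range of 1 to 3 ads, so this forecast extrapolates the fitted line beyond the observed data.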

 

 

Example:  Relationship between Total Cost and Production Volume

 

The monthly data below show the relationship between Total Cost and Production Volume for the last 6 months.

Observation    Production Volume (X)    Total Cost (Y)
Jan.           400                      $4,000
Feb.           450                      $5,000
March          550                      $5,400
April          600                      $5,900
May            700                      $6,400
June           750                      $7,000

 

 

Let’s answer the following questions:

 

 

  1. Develop the scatter diagram for the data set. Comment on the possible relationship between X and Y.
  2. Calculate the intercept (b0) and the slope (b1). Interpret your findings.
  3. Test of ( b1 ). Conduct a test of the significance of linear relationship between X and Y. Interpret your findings.
  4. Calculate the Coefficient of Determination (r²) and interpret it.
  5. Calculate the Correlation Coefficient (r) and Interpret.
  6. Test of ( r ). Conduct a test of the significance of the Strength of linear relationship. Interpret your findings.
  7. For a given level of X, forecast Y.
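One possible way to work through the numerical questions for the Total Cost data is sketched below. The 5% significance level and the tabled critical value 2.776 (two-tailed, n − 2 = 4 degrees of freedom) are assumptions of this sketch, since the notes do not state a significance level. In simple regression the t statistic for the slope and the t statistic for r are the same number, so one comparison serves both tests.

```python
import math

volume = [400, 450, 550, 600, 700, 750]        # X: production volume
cost = [4000, 5000, 5400, 5900, 6400, 7000]    # Y: total cost in dollars

n = len(volume)
x_bar = sum(volume) / n
y_bar = sum(cost) / n
s_xx = sum((x - x_bar) ** 2 for x in volume)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(volume, cost))
sst = sum((y - y_bar) ** 2 for y in cost)

# Question 2: slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Questions 4 and 5: coefficient of determination and correlation coefficient.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(volume, cost))
r_squared = 1 - sse / sst
r = math.copysign(math.sqrt(r_squared), b1)

# Question 3: t test of H0: beta1 = 0 (no linear relationship).
s = math.sqrt(sse / (n - 2))          # standard error of the estimate
se_b1 = s / math.sqrt(s_xx)           # standard error of the slope
t_stat = b1 / se_b1

# Question 6: t test of H0: rho = 0, algebraically equal to t_stat.
t_r = r * math.sqrt((n - 2) / (1 - r_squared))

T_CRITICAL = 2.776  # assumed: two-tailed 5% value for 4 df, from a t table
significant = abs(t_stat) > T_CRITICAL

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")          # b0 = 1246.67, b1 = 7.60
print(f"r2 = {r_squared:.4f}, r = {r:.4f}")     # r2 = 0.9587, r = 0.9791
print(f"t = {t_stat:.2f}, significant: {significant}")
```

Under these assumptions, each additional unit of production is estimated to add about $7.60 to total cost, and about 96% of the variation in total cost is explained by production volume.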

 

 

 

 

 

 

 

 

 

 
