Scatter diagram - correlation analysis, Applied Statistics

Scatter Diagram

The first step in correlation analysis is to visualize the relationship. For each unit of observation in correlation analysis there is a pair of numerical values. One is considered the independent variable; the other is considered dependent upon it and is called the dependent variable. One of the easiest ways of studying the correlation between the two variables is with the help of a scatter diagram.

A scatter diagram can give us two types of information. Visually, we can look for patterns that indicate whether the variables are related. Then, if the variables are related, we can see what kind of line, or estimating equation, describes this relationship.

The scatter diagram gives an indication of the nature of the potential relationship between the variables.

Example 

A sample of 10 employees of the Universal Computer Corporation was examined to relate each employee's score on an aptitude test, taken at the beginning of their employment, to their monthly sales volume. The Universal Computer Corporation wishes to estimate the nature of the relationship between these two variables.

Aptitude Test Score | Monthly Sales (Thousands of Rupees) | Aptitude Test Score | Monthly Sales (Thousands of Rupees)
X  | Y  | X  | Y
50 | 30 | 70 | 60
50 | 35 | 70 | 45
60 | 40 | 80 | 55
60 | 50 | 80 | 50
70 | 55 | 90 | 65

To determine the nature of the relationship in this example, we first draw a graph of the data points.

Figure 1: Scatter diagram of monthly sales (Y) against aptitude test score (X)

On the vertical axis, we plot the dependent variable, monthly sales. On the horizontal axis, we plot the independent variable, aptitude test score. This visual display is called a scatter diagram.
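As a rough illustration of this layout, the following sketch prints a text-only scatter diagram of the ten observations from the table. The function name text_scatter and the grid steps are hypothetical choices made for this illustration, and the sketch assumes every point falls exactly on a grid tick (true for this data set, where X is a multiple of 10 and Y a multiple of 5). In practice a plotting library would normally be used instead.

```python
# Data from the example table: aptitude test score (X) and
# monthly sales in thousands of rupees (Y) for the 10 employees.
scores = [50, 50, 60, 60, 70, 70, 70, 80, 80, 90]
sales = [30, 35, 40, 50, 55, 60, 45, 55, 50, 65]


def text_scatter(xs, ys, x_step=10, y_step=5):
    """Return text rows with Y on the vertical axis and X on the horizontal.

    Assumes every (x, y) pair lies exactly on a grid tick.
    """
    x_ticks = list(range(min(xs), max(xs) + 1, x_step))
    y_ticks = list(range(max(ys), min(ys) - 1, -y_step))  # top row = largest Y
    points = set(zip(xs, ys))
    rows = []
    for y in y_ticks:
        # '*' marks an observed point, '.' an empty grid position.
        cells = ["  *" if (x, y) in points else "  ." for x in x_ticks]
        rows.append(f"{y:>3} |" + "".join(cells))
    rows.append("    +" + "---" * len(x_ticks))
    rows.append("     " + "".join(f"{x:>3}" for x in x_ticks))
    return rows


chart = text_scatter(scores, sales)
print("\n".join(chart))
```

Reading the printed grid from the lower left to the upper right, the marked points drift upward as the test score increases.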

In the figure given above, we see that larger monthly sales are associated with larger test scores. If we wish, we can draw a straight line through the points plotted in the figure. This hypothetical line enables us to further describe the relationship. A line that slopes upward to the right indicates that a direct, or positive, relation is present between the two variables. In the figure given above, we see that this upward-sloping line appears to approximate the relationship being studied.
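The text does not say how the straight line should be drawn. One common choice, used here purely as an assumption, is the least-squares line; the sketch below fits it to the ten observations and also computes the Pearson correlation coefficient r as a numerical check on the direction of the relation. The function name fit_line_and_r is hypothetical.

```python
from math import sqrt

# Data from the example table: aptitude test score (X) and
# monthly sales in thousands of rupees (Y).
scores = [50, 50, 60, 60, 70, 70, 70, 80, 80, 90]
sales = [30, 35, 40, 50, 55, 60, 45, 55, 50, 65]


def fit_line_and_r(xs, ys):
    """Return (slope, intercept, r) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line_and_r(scores, sales)
print(f"Y = {intercept:.2f} + {slope:.2f} X, r = {r:.2f}")
```

For this data the fitted slope is positive (about 0.72) and r is about 0.85, which agrees with the upward-sloping line seen in the scatter diagram.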

The figures below show additional relations that may exist between two variables. In figure 2(a), the nature of the relationship is linear. In this case, the line slopes downward. Thus, smaller values of Y are associated with larger values of X. This relation is called an inverse (linear) relation.

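Figure 2: Other possible relations between X and Y: (a) inverse linear, (b) inverse curvilinear, (c) direct curvilinear, (d) no relation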

Figure 2(b) represents a relationship that is not linear. The nature of the relationship is better represented by a curve than by a straight line - that is, it is a curvilinear relation. The relationship is inverse since smaller values of Y are associated with larger values of X.

Figure 2(c) is another curvilinear relation. In this case, however, larger values of Y are associated with larger values of X. Hence, the relation is direct and curvilinear.

In figure 2(d), there is no relation between X and Y. We can draw neither a straight line nor a curve that adequately describes the data. The two variables are not associated.
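The direct/inverse distinction across the four cases can be checked numerically with a small sketch. The data sets below are hypothetical, invented only to mimic the four patterns described above (they are not read from the figure), and the function name association_direction is likewise an assumption. The sketch uses the sign of the sum of cross-products of deviations, which reflects direction only; it cannot tell a linear relation from a curvilinear one.

```python
def association_direction(xs, ys, tol=1e-9):
    """Classify the direction of association by the sign of the
    sum of cross-products of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy > tol:
        return "direct"
    if sxy < -tol:
        return "inverse"
    return "none"


# Hypothetical data, made up to imitate the four patterns described.
x = [1, 2, 3, 4, 5]
patterns = {
    "a": [10, 8, 6, 4, 2],    # straight line sloping downward
    "b": [25, 16, 9, 4, 1],   # curve falling as X rises
    "c": [1, 4, 9, 16, 25],   # curve rising as X rises
    "d": [10, 5, 15, 5, 10],  # no upward or downward tendency
}

results = {name: association_direction(x, y) for name, y in patterns.items()}
for name, direction in results.items():
    print(f"pattern ({name}): {direction}")
```

A result of "none" here only means that there is no overall upward or downward tendency. A U-shaped pattern would give the same result, which is one reason to look at the scatter diagram rather than rely on a single number.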

Posted Date: 9/15/2012 4:09:26 AM | Location : United States






