Reference no: EM132364651
SAS: Regression, Statistics, and Variables
Use the data provided in the Unit 4 Data Analytics spreadsheet, linked in the Resources. Import the worksheet "Data" and convert it to a SAS data file. Then run a series of regression analysis experiments.
1. Run a simple regression test against the data using SAS.
• Which variable is the dependent?
• Which variable is the independent?
• How strong is the relationship between the variables?
• What are the best values of the intercept and slope relating to the data?
The last question defines "best fit": the line that SAS recommends after evaluating the data. It is up to you to provide reasonable definitions and values; that is, experiment with changing the estimated values and compare them against the actual values contained in the dataset. Even reasonable estimates will differ from the actual values; these differences are called residuals. To understand the distance between the estimated values and the actual values, use the method of least squares, which chooses the line that minimizes the sum of the squared residuals.
• What values did you use for your estimates for each observation?
Explain why you used the estimate (plugged) value for the observation itself.
• What was the error?
• What is the value of the intercept?
• What is the value of the slope?
• Where is the upper boundary?
• Where is the lower boundary?
• How does answering these questions help you understand the dataset?
• How can this understanding help you predict expected levels in the future?
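The least-squares questions above can be checked by hand before comparing against the SAS output. The following is a minimal sketch in Python rather than SAS, assuming two numeric columns; the sample values, the hand-picked line (intercept 2.0, slope 0.5), and the function names are hypothetical and are not taken from the spreadsheet.

```python
# Hypothetical sample values; replace with the two columns from the "Data" worksheet.
x = [1.0, 2.0, 3.0, 4.0, 5.0]  # independent variable
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # dependent variable


def least_squares(x, y):
    """Return (intercept, slope, r) for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # correlation: strength of the linear relationship
    return intercept, slope, r


def residuals(x, y, intercept, slope):
    """Return actual minus estimated value for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def sum_squared_residuals(x, y, intercept, slope):
    """Return the quantity that the method of least squares minimizes."""
    return sum(e ** 2 for e in residuals(x, y, intercept, slope))


intercept, slope, r = least_squares(x, y)
sse_fit = sum_squared_residuals(x, y, intercept, slope)

# A hand-picked ("plugged") line for comparison; these two values are guesses.
sse_guess = sum_squared_residuals(x, y, 2.0, 0.5)

print(f"intercept={intercept:.4f} slope={slope:.4f} r={r:.4f}")
print(f"sum of squared residuals: fitted={sse_fit:.4f} guessed={sse_guess:.4f}")
```

Any guessed line should produce a sum of squared residuals at least as large as the fitted line, which is one way to explain why the fitted intercept and slope are the "best" values.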
Even with the best model and a high level of confidence, there are unknowns:
• What is one example of an unknown that can affect the actual outcome and render the model obsolete?
• Why is it important for a data analyst to continually evaluate and adjust analytical models on an ongoing basis?
Provide the following basic statistics and elements in your paper for two variables:
2. In a table, provide the following for each variable:
• Variable name.
• Mean.
• Standard deviation.
• Minimum.
• Maximum.
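As a cross-check on the first table, the same four statistics can be computed outside SAS. This is a sketch in Python under stated assumptions: the variable names and values are hypothetical, and the standard deviation uses the sample formula (divisor n - 1), which should be verified against the setting used in your SAS run.

```python
import statistics

# Hypothetical sample values; replace with two columns from the "Data" worksheet.
variables = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 5.0, 4.0, 5.0],
}


def describe(values):
    """Return the mean, sample standard deviation, minimum, and maximum."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample formula, divisor n - 1
        "min": min(values),
        "max": max(values),
    }


summary = {name: describe(values) for name, values in variables.items()}

# Print one row per variable, in the column order the assignment lists.
print(f"{'Variable':<10}{'Mean':>8}{'Std Dev':>10}{'Min':>8}{'Max':>8}")
for name, s in summary.items():
    print(f"{name:<10}{s['mean']:>8.2f}{s['std']:>10.4f}{s['min']:>8.2f}{s['max']:>8.2f}")
```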
3. In a second table, provide:
• Frequency distribution.
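A frequency distribution lists each distinct value with its count, and usually the percent and running totals as well. The sketch below shows that arithmetic in Python; the sample values and the function name `frequency_table` are hypothetical, and the column layout is an assumption about what the table should contain.

```python
from collections import Counter

# Hypothetical sample values; replace with one column from the "Data" worksheet.
values = [2, 4, 5, 4, 5]


def frequency_table(values):
    """Return rows of (value, count, percent, cumulative count, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value], 100 * counts[value] / n,
                     cumulative, 100 * cumulative / n))
    return rows


rows = frequency_table(values)

print(f"{'Value':>6}{'Freq':>6}{'Pct':>8}{'CumFreq':>9}{'CumPct':>8}")
for value, freq, pct, cum_freq, cum_pct in rows:
    print(f"{value:>6}{freq:>6}{pct:>8.1f}{cum_freq:>9}{cum_pct:>8.1f}")
```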
4. In a third table, provide percentile calculations. The table should include:
• Quantile.
• Percentile.
• (k/100)*n.
• Result.
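The (k/100)*n column suggests a position-based percentile rule. The sketch below assumes one common textbook convention: if (k/100)*n is a whole number i, average the i-th and (i+1)-th ordered values; otherwise round up and take that ordered value. Other conventions exist and can give different results, so confirm which one your SAS output uses. The sample values and the function name `percentile` are hypothetical.

```python
import math

# Hypothetical sample values; replace with one column from the "Data" worksheet.
values = [15, 20, 35, 40, 50, 55, 60, 70]


def percentile(values, k):
    """Return ((k/100)*n, result) for the k-th percentile, 0 < k < 100."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    position = k * n / 100  # the (k/100)*n column
    if position == int(position):
        i = int(position)
        # Whole number: average the i-th and (i+1)-th ordered values (1-based).
        result = (ordered[i - 1] + ordered[i]) / 2
    else:
        # Not a whole number: round up and take that ordered value (1-based).
        result = ordered[math.ceil(position) - 1]
    return position, result


print(f"{'Quantile':>9}{'Percentile':>12}{'(k/100)*n':>11}{'Result':>8}")
for k in (25, 50, 75, 90):
    position, result = percentile(values, k)
    print(f"{k / 100:>9.2f}{k:>12}{position:>11.2f}{result:>8.1f}")
```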
5. With a screen print, provide the outcome of a SAS box-and-whisker plot. A SAS box-and-whisker plot includes the same information as a five-number summary, which you learned in Unit 1, plus outliers.
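To read the plot, it can help to compute the five-number summary and the outliers by hand. The sketch below is in Python and rests on two assumptions: outliers are flagged with the common 1.5 x IQR rule, and quartiles come from the default method of `statistics.quantiles`. Quartile conventions differ between tools, so small differences from the SAS plot are possible. The sample values are hypothetical.

```python
import statistics

# Hypothetical sample values; replace with one column from the "Data" worksheet.
values = [12, 15, 18, 20, 22, 24, 25, 27, 30, 33, 80]


def box_plot_summary(values):
    """Return the five-number summary plus outliers under the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "whisker_low": min(inside),   # whiskers stop at the last non-outlier
        "whisker_high": max(inside),
        "outliers": outliers,
    }


summary = box_plot_summary(values)
for name, value in summary.items():
    print(f"{name:<13}{value}")
```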
6. Finally, include a screen print of a histogram for one experiment (run).
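A histogram groups the values into equal-width bins and counts how many fall in each. The sketch below shows that binning in Python as a text chart, assuming three bins and hypothetical sample values; SAS chooses its own bin edges, so the counts in your screen print may be grouped differently.

```python
# Hypothetical sample values; replace with one column from the "Data" worksheet.
values = [12, 15, 18, 20, 22, 24, 25, 27, 30, 33]


def histogram(values, bins):
    """Return (edges, counts) for equal-width bins spanning min to max."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are equal; cannot form bins")
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it to the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = histogram(values, bins=3)
for i, count in enumerate(counts):
    print(f"{edges[i]:>6.1f} to {edges[i + 1]:<6.1f} {'*' * count} ({count})")
```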
Write a paper (12-15 pages for the body section). Use APA (6th edition) style and format; include your analysis, diagrams, figures, and answers to the exercises above, with a minimum of five references. Cover the following topics:
1. Explain how SAS is used to aggregate data into information using different methods, procedures, and variables.
2. Demonstrate how SAS methods and processes are used to manipulate data to support data analytics.
3. Apply SAS to aggregate and manipulate data for data analytics.
Assignment Requirements
• Written communication: Written communication is free of errors that detract from the overall message.
• APA formatting: Resources and citations are formatted according to APA (6th edition) style and formatting.
• Length of paper: 12-15 pages, excluding the references page.
• Font and font size: Times New Roman, 12 point.
Attachment: Data analytics.rar