Discuss discretization and removing variables

Reference no: EM132241083

Exercise

Deliverables: Two files: (1) Submit this lab report with answers to all questions, including output screenshots. (2) Submit an R script that contains all commands, with comments that briefly describe each command's purpose.

All questions must be answered in your own words with any paraphrased references properly cited using in-text citations and a reference list as needed.

Part 2 - Run an exercise on the CreditApproval data set, completing this report and providing the commands, output screenshots, and discussion/interpretation as requested. Ensure that all commands are saved in this report and in an R script.

ii. Run the read.csv() command to load the data into a variable named 'credit'. Then, run the command to preview the first 10 data rows in 'credit'.

Include the command and output screenshot. Note: Ensure that you use the utils::read.csv() command and not any other similar commands from other packages.

Command: >

Output:
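As a minimal sketch of the load-and-preview step, the commands might look like the following. The file contents and the column names (Age, Debt, Approved) are hypothetical stand-ins written to a temporary file so the sketch runs on its own; they are not taken from the CreditApproval file.

```r
# Hypothetical sample data written to a temporary file so the sketch is self-contained.
# In the lab, set csv_path to the location of the CreditApproval CSV file instead.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Age,Debt,Approved",
  "30.83,0.000,+",
  "58.67,4.460,+",
  "24.50,0.500,-"
), csv_path)

# Load the file with the utils version of read.csv()
credit <- utils::read.csv(csv_path)

# Preview the first 10 rows (fewer are shown when the data has fewer rows)
head(credit, 10)
```

Writing the call as utils::read.csv() makes it explicit which package supplies the function, which is what the note above asks for.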

iii. Run the str() command on the 'credit' dataset. Include the command, output screenshot, and a brief description of how the structure is presented.

Command: >

Output:

Description:
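For orientation, here is a small sketch of str() on a toy data frame. The columns (Age, Employed, Approved) are hypothetical and only illustrate the layout of the output.

```r
# Toy data frame with hypothetical columns, standing in for 'credit'
credit <- data.frame(
  Age      = c(30.83, 58.67, 24.50),
  Employed = factor(c("t", "f", "t")),
  Approved = factor(c("+", "+", "-"))
)

# str() prints the object class and dimensions first,
# then one line per variable: name, type, and the first few values
str(credit)
```

For a data frame, the first line of output gives the number of observations and variables; each following line describes one column.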

b. Descriptive Statistics:

i. Run the summary() command on the 'credit' dataset to display the descriptive statistics for all variables. Include the command and output screenshot.

Command: >

Output:

ii. Choose two numeric attributes from 'credit', run the summary() command on both, and provide your interpretation of each of the six descriptive statistics.

Command: >

Output:

Command: >

Output:

Interpretation:
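A short sketch of summary() on a numeric vector shows the six statistics the question refers to. The values below are hypothetical, not drawn from the CreditApproval data.

```r
# Hypothetical numeric attribute
debt <- c(0, 0.5, 1.5, 4.46, 5.625, 11)

# For a numeric vector, summary() reports six statistics:
# Min., 1st Qu., Median, Mean, 3rd Qu., Max.
debt_summary <- summary(debt)
debt_summary

# The labels of the six statistics
names(debt_summary)
```

If the attribute contains missing values, summary() adds a seventh entry, NA's, with the count of missing observations.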

iii. Choose two factor attributes from 'credit', run the summary() command on both, and provide your interpretation of the output. Note that for a factor, summary() reports the number of observations at each level rather than the six numeric statistics.

Command: >

Output:

Command: >

Output:

Interpretation:
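For comparison with the numeric case, this sketch runs summary() on a factor. The attribute name and levels are hypothetical.

```r
# Hypothetical factor attribute with two levels
employed <- factor(c("t", "f", "t", "t", "f"))

# For a factor, summary() returns the number of observations per level,
# with levels listed in their stored (by default alphabetical) order
employed_counts <- summary(employed)
employed_counts
```

The result is a named count per level, so interpretation focuses on how observations are distributed across categories.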

iv. What differences did you observe between the output of the str() and summary() commands (50 words)?

c. Variable Filters - Discretization and Removing Variables:

ii. Run the three different discretization methods discussed in the tutorial (equal interval, equal frequency, k-means clustering). For each method, include the command and output screenshot. For all commands, provide a one-paragraph discussion (100 words) of the input parameters used, the number of bins, and your interpretation of the output.

Command: >

Output:

Command: >

Output:

Command: >

Output:

Discussion:
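The three discretization methods can be sketched with base R alone. This is an assumption made for illustration: the tutorial may use a dedicated package with different function names and parameters. The age values and the choice of three bins are hypothetical.

```r
set.seed(42)  # k-means starts from random centers; fix the seed for repeatability

# Hypothetical numeric attribute and number of bins
age <- c(18, 21, 22, 25, 29, 33, 37, 41, 48, 52, 60, 75)
k <- 3

# Equal interval: k bins of equal width across the range of the data
equal_interval <- cut(age, breaks = k)

# Equal frequency: breakpoints at quantiles, so each bin holds about the same count
freq_breaks <- quantile(age, probs = seq(0, 1, length.out = k + 1))
equal_frequency <- cut(age, breaks = freq_breaks, include.lowest = TRUE)

# k-means clustering: each bin is a cluster found by stats::kmeans();
# the cluster numbers are arbitrary labels, not an ordering
km <- kmeans(age, centers = k, nstart = 10)
kmeans_bin <- factor(km$cluster)

# Count the observations that fall in each bin under each method
table(equal_interval)
table(equal_frequency)
table(kmeans_bin)
```

Comparing the three tables shows the trade-off: equal-interval bins have the same width but uneven counts, equal-frequency bins have even counts but uneven widths, and k-means bins follow natural groupings in the values.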

iii. Compare and contrast the discretization methods above, providing at least one example of when you would use each one (150-200 words).

iv. Run a command to remove one of the attributes from the 'credit' dataset. Run another command to demonstrate that the attribute was successfully removed. Include both commands and output screenshots as well as a discussion of when and why variables should be removed from a dataset.

Command: >

Output:

Command: >

Output:

Discussion:
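One common way to drop a column and confirm the removal is sketched below. The data frame and the ZipCode column are hypothetical examples.

```r
# Toy data frame with a hypothetical identifier-like column to drop
credit <- data.frame(
  Age     = c(30.83, 58.67, 24.50),
  Debt    = c(0, 4.46, 0.5),
  ZipCode = c("00202", "00043", "00280")
)

# Remove the attribute by assigning NULL to it
credit$ZipCode <- NULL

# Verify the removal: list the remaining column names
names(credit)

# A direct check that returns FALSE once the column is gone
"ZipCode" %in% names(credit)
```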


d. Row Filters - Handling Missing Values and Sorting:

i. Run a command to check if the 'credit' dataset has any missing values. Your command and output should show all attributes along with how many observations total have missing values. Include the command and output screenshot.

Command: >

Output:
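A minimal sketch of a missing-value check, using a toy data frame with hypothetical columns and NA entries:

```r
# Toy data frame with hypothetical missing values
credit <- data.frame(
  Age      = c(30.83, NA, 24.50, NA),
  Debt     = c(0, 4.46, NA, 1.5),
  Approved = c("+", "+", "-", "-")
)

# Number of missing values per attribute (every column is listed, including those with 0)
na_per_column <- colSums(is.na(credit))
na_per_column

# Total number of observations (rows) with at least one missing value
rows_with_na <- sum(!complete.cases(credit))
rows_with_na
```

This only counts values that R already treats as NA. If a file codes missing values with a placeholder string, the na.strings argument of read.csv() can map that placeholder to NA at load time.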

ii. Choose one of the numeric attributes with missing values and run the command to replace the missing values with the attribute mean. Then run the command to verify that the variable no longer has missing values. Include both commands and output screenshots.

Command: >

Output:

Command: >

Output:
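Mean imputation and its verification can be sketched as follows. The Age column and its values are hypothetical.

```r
# Toy data frame with a hypothetical numeric attribute containing missing values
credit <- data.frame(Age = c(30.83, NA, 24.50, NA, 41.17))

# Mean of the observed values only; na.rm = TRUE skips the NA entries
age_mean <- mean(credit$Age, na.rm = TRUE)

# Replace each missing value with that mean
credit$Age[is.na(credit$Age)] <- age_mean

# Verify: the count of missing values should now be 0
sum(is.na(credit$Age))
```

Without na.rm = TRUE, mean() returns NA whenever the vector contains a missing value, so the replacement would have no effect.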

v. Run the command to sort the 'credit' dataset by one of the attributes. Then run the command to validate the sorting. Include both commands and output screenshots as well as a discussion where you provide at least two reasons why data should be sorted.

Command: >

Output:

Command: >

Output:

Discussion:
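Sorting and validating the sort might be sketched like this, again with a toy data frame and hypothetical columns:

```r
# Toy data frame with hypothetical columns
credit <- data.frame(
  Age      = c(58.67, 24.50, 30.83),
  Approved = c("+", "-", "+")
)

# order() returns the row positions that put Age in ascending order;
# indexing the rows with it keeps each row's values together
credit_sorted <- credit[order(credit$Age), ]

# Validate visually by previewing the sorted rows
head(credit_sorted)

# Validate programmatically: is.unsorted() returns FALSE for ascending data
is.unsorted(credit_sorted$Age)
```

For descending order, order(credit$Age, decreasing = TRUE) can be used instead.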


e. Data Visualization:

i. Run the plot() function for one of the variables in the 'credit' dataset. Include the command, output screenshot, and a one-paragraph (100 words), master's-level interpretation of what the plot shows.

Command: >

Output:

Discussion:
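As a rough sketch, plot() behaves differently depending on the variable type. The data frame, column names, and titles below are hypothetical.

```r
# Toy data frame with one factor and one numeric attribute (hypothetical)
credit <- data.frame(
  Age      = c(30.83, 58.67, 24.50, 27.83, 20.17),
  Approved = factor(c("+", "+", "-", "+", "-"))
)

# For a factor, plot() draws a bar chart of the count at each level
plot(credit$Approved,
     main = "Approval outcome",
     xlab = "Outcome", ylab = "Number of applicants")

# For a numeric vector, plot() draws each value against its row index
plot(credit$Age,
     main = "Applicant age by row",
     xlab = "Row index", ylab = "Age")
```

In an interactive session the plots appear in the graphics window; when the script is run non-interactively, R writes them to a file instead.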

References
