Discuss discretization and removing variables

Reference no: EM132241083

Exercise

Deliverables: Two files: (1) Submit this lab report with answers to all questions, including output screenshots. (2) Submit an R script that contains all commands, with comments that briefly describe each command's purpose.

All questions must be answered in your own words with any paraphrased references properly cited using in-text citations and a reference list as needed.

Part 2 - Run an exercise on the CreditApproval data set, completing this report and providing the commands, output screenshots, and discussion/interpretation as requested. Ensure that all commands are saved in this report and in an R script.

ii. Run the read.csv() command to load the data into a variable named 'credit'. Then, run the command to preview the first 10 data rows in 'credit'.
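As a rough illustration of the loading and preview steps, the sketch below writes a small made-up CSV to a temporary file and reads it back. The temporary file, the name credit_demo, and the columns Age and Debt are assumptions for illustration only; they are not the real CreditApproval file or its attributes.

```r
# Build a small, made-up CSV so the example is self-contained
# (the path and column names are hypothetical, not the real CreditApproval data)
csv_path <- tempfile(fileext = ".csv")
demo_rows <- data.frame(Age = seq(21, 32), Debt = seq(0, 5.5, by = 0.5))
utils::write.csv(demo_rows, csv_path, row.names = FALSE)

# Load the file with utils::read.csv(), as the exercise requires
credit_demo <- utils::read.csv(csv_path)

# Preview the first 10 data rows
first_ten <- head(credit_demo, n = 10)
print(first_ten)
```

In the actual exercise, the path would point to the CreditApproval file and the result would be stored in 'credit'.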

Include the command and output screenshot. Note: Ensure that you use the utils::read.csv() command and not any other similar commands from other packages.

Command: >

Output:

iii. Run the str() command on the 'credit' dataset. Include the command, output screenshot, and a brief description of how the structure is presented.

Command: >

Output:

Description:

b. Descriptive Statistics:

i. Run the summary() command on the 'credit' dataset to display the descriptive statistics for all variables. Include the command and output screenshot.

Command: >

Output:

ii. Choose two numeric attributes from 'credit', run the summary() command on both, and provide your interpretation of each of the six descriptive statistics.
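For orientation, here is a minimal sketch of what summary() returns for a numeric vector. The vector age and its values are invented stand-ins for a numeric column of 'credit'.

```r
# Hypothetical numeric values standing in for one numeric attribute
age <- c(22.1, 24.5, 27.8, 30.8, 41.2, 58.7)

# summary() on a numeric vector returns six descriptive statistics
s <- summary(age)
print(s)

# The six statistics: minimum, first quartile, median, mean,
# third quartile, and maximum
names(s)
```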

Command: >

Output:

Command: >

Output:

Interpretation:

iii. Choose two factor attributes from 'credit', run the summary() command on both, and provide your interpretation of the output for each (for factors, summary() reports the count of observations per level rather than the six numeric statistics).

Command: >

Output:

Command: >

Output:

Interpretation:

iv. What differences did you observe between the output of the str() and summary() commands (50 words)?

c. Variable Filters - Discretization and Removing Variables:

ii. Run the three different discretization methods discussed in the tutorial (equal interval, equal frequency, k-means clustering). For each method, include the command and output screenshot. For all commands, provide a one-paragraph discussion (100 words) of the input parameters used, the number of bins, and your interpretation of the output.
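The three methods named above can be sketched in base R as follows. This is only an illustration under stated assumptions: the vector x is simulated rather than taken from 'credit', the choice of three bins is arbitrary, and the tutorial may use a dedicated package function instead of cut(), quantile(), and kmeans().

```r
# Simulated numeric values standing in for one numeric column of 'credit'
set.seed(42)
x <- c(rnorm(30, mean = 20, sd = 2),
       rnorm(30, mean = 40, sd = 3),
       rnorm(30, mean = 70, sd = 5))
k <- 3  # number of bins (an assumed choice)

# Equal interval: k bins of equal width across the range of x
equal_interval <- cut(x, breaks = k, include.lowest = TRUE)

# Equal frequency: bin edges at quantiles, so each bin holds
# roughly the same number of observations
edges <- quantile(x, probs = seq(0, 1, length.out = k + 1))
equal_frequency <- cut(x, breaks = edges, include.lowest = TRUE)

# k-means clustering: each observation is assigned to one of k clusters,
# and the cluster label serves as the bin
km <- kmeans(x, centers = k, nstart = 10)
kmeans_bins <- factor(km$cluster)

# Compare how many observations fall into each bin
print(table(equal_interval))
print(table(equal_frequency))
print(table(kmeans_bins))
```

Comparing the three tables shows the trade-off: equal interval keeps bin widths constant but may leave bins unevenly filled, equal frequency balances the counts, and k-means places the boundaries where the data naturally separate.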

Command: >

Output:

Command: >

Output:

Command: >

Output:

Discussion:

iii. Compare and contrast the discretization methods above, providing at least one example of when you would use each one (150-200 words).

iv. Run a command to remove one of the attributes from the 'credit' dataset. Run another command to demonstrate that the attribute was successfully removed. Include both commands and output screenshots as well as a discussion of when and why variables should be removed from a dataset.
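One common way to drop a column and confirm the result is sketched below. The data frame credit_demo and its column names are hypothetical; assigning NULL is only one of several ways to remove a variable.

```r
# Hypothetical stand-in for 'credit'; column names are invented
credit_demo <- data.frame(
  Age = c(30.8, 58.7, 24.5),
  Debt = c(0.0, 4.46, 0.5),
  Approved = factor(c("+", "+", "-"))
)

# Remove the 'Debt' column by assigning NULL to it
credit_demo$Debt <- NULL

# Verify the removal: list the remaining columns and test for the name
print(names(credit_demo))
print("Debt" %in% names(credit_demo))
```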

Command: >

Output:

Command: >

Output:

Discussion:


d. Row Filters - Handling Missing Values and Sorting:

i. Run a command to check if the 'credit' dataset has any missing values. Your command and output should show all attributes along with how many observations total have missing values. Include the command and output screenshot.
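A minimal sketch of such a check, using an invented data frame in place of 'credit': colSums(is.na()) counts missing values per attribute, and complete.cases() identifies rows with at least one missing value.

```r
# Hypothetical stand-in for 'credit' with a few missing values
credit_demo <- data.frame(
  Age = c(30.8, NA, 24.5, 27.8),
  Debt = c(0.0, 4.46, NA, 1.5),
  Approved = factor(c("+", NA, "-", "+"))
)

# Number of missing values in each attribute
na_per_column <- colSums(is.na(credit_demo))
print(na_per_column)

# Number of observations (rows) with at least one missing value
rows_with_na <- sum(!complete.cases(credit_demo))
print(rows_with_na)
```

If a file marks missing values with some other symbol, is.na() will not count them unless that symbol is passed to read.csv() through its na.strings argument.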

Command: >

Output:

ii. Choose one of the numeric attributes with missing values and run the command to replace the missing values with the attribute mean. Then, run the command to verify that the variable no longer has missing values. Include both commands and output screenshots.
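Mean imputation can be sketched as follows, assuming a hypothetical numeric vector age in place of a column of 'credit'. The mean is computed with na.rm = TRUE so that the missing values do not turn the mean itself into NA.

```r
# Hypothetical numeric attribute with one missing value
age <- c(30.8, NA, 24.5, 27.8)

# Replace each missing value with the mean of the observed values
age[is.na(age)] <- mean(age, na.rm = TRUE)

# Verify that no missing values remain
remaining_na <- sum(is.na(age))
print(remaining_na)
```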

Command: >

Output:

Command: >

Output:

v. Run the command to sort the 'credit' dataset by one of the attributes. Then, run the command to validate the sorting. Include both commands and output screenshots, as well as a discussion in which you provide at least two reasons why data should be sorted.
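A minimal sketch of sorting and validating, again with an invented data frame: order() returns the row positions in ascending order of the chosen attribute, and is.unsorted() reports whether a vector is out of order.

```r
# Hypothetical stand-in for 'credit'; column names are invented
credit_demo <- data.frame(
  Age = c(30.8, 58.7, 24.5),
  Debt = c(0.0, 4.46, 0.5)
)

# Sort the rows in ascending order of Age
credit_sorted <- credit_demo[order(credit_demo$Age), ]
print(credit_sorted)

# Validate: is.unsorted() returns FALSE when the values are in ascending order
print(is.unsorted(credit_sorted$Age))
```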

Command: >

Output:

Command: >

Output:

Discussion:


e. Data Visualization:

i. Run the plot() function for one of the variables in the 'credit' dataset. Include the command, output screenshot, and a one-paragraph (100 words), master's-level interpretation of what the plot shows.

Command: >

Output:

Discussion:

References
