Prepare heritage data for classification learning

Reference no: EM131299370

Database Assignment 1:

1. Using heritage data (release 1) in SQL

a. Find support for all single itemsets

b. List all itemsets with 2 elements and support of at least 0.2

c. List all itemsets with 3 elements and support at least 0.2
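The support of an itemset is the fraction of transactions that contain every item in the set. The sketch below illustrates parts (a) and (b) on a made-up toy table, run through Python's standard-library `sqlite3` module. The table name `claims`, the columns `member_id` and `item`, and the rows themselves are hypothetical and are not the heritage schema. Three-element itemsets would follow the same pattern with one more self-join.

```python
import sqlite3

# Toy transactions: (member_id, item). Names and values are illustrative only.
rows = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "A"), (2, "B"),
    (3, "A"), (3, "C"),
    (4, "B"),
    (5, "A"), (5, "B"), (5, "C"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (member_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO claims VALUES (?, ?)", rows)

# Support of single items: distinct members with the item / all members.
single_sql = """
SELECT item,
       COUNT(DISTINCT member_id) * 1.0
         / (SELECT COUNT(DISTINCT member_id) FROM claims) AS support
FROM claims
GROUP BY item
ORDER BY item
"""

# Support of 2-itemsets via a self-join; a.item < b.item lists each pair once.
pair_sql = """
SELECT a.item, b.item,
       COUNT(DISTINCT a.member_id) * 1.0
         / (SELECT COUNT(DISTINCT member_id) FROM claims) AS support
FROM claims a
JOIN claims b ON a.member_id = b.member_id AND a.item < b.item
GROUP BY a.item, b.item
HAVING COUNT(DISTINCT a.member_id) * 1.0
         / (SELECT COUNT(DISTINCT member_id) FROM claims) >= 0.2
ORDER BY a.item, b.item
"""

single_support = {item: s for item, s in conn.execute(single_sql)}
pair_support = {(x, y): s for x, y, s in conn.execute(pair_sql)}

for itemset, s in pair_support.items():
    print(itemset, round(s, 2))
```

The `HAVING` clause applies the minimum-support threshold of 0.2 from part (b); the same queries should carry over to other SQL dialects with little change.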

2. In Weka

a. Load heritage data (release 1)

b. Apply at least two association rule generation algorithms and compare results

c. Apply the FP-growth (FP-tree) algorithm with at least two rule metrics

Assignment 2:

1. In SQL/Weka:

a. Prepare heritage data for classification learning

b. Load heritage data release 3 (preprocessed to binary representation, including demographics and output attribute(s))

c. Perform exploratory analysis

d. Create at least three classification models for predicting hospitalization based on Year 1 data.

e. Which model performs best on Year 2 data?

f. Create regression model for predicting hospitalization days.

g. What is the difference between regression and classification models?

h. Present your results in the form of a short report that includes screenshots, tables, and any needed description.

Assignment 3:

Classification Part 2

1. Using the heritage release 3 data prepared in the last assignment

a. Include drug information into data

b. Include laboratory information into data

c. Import newly created data into Weka and run classification algorithms

d. Does inclusion of the information improve predictions?

There are many ways to complete question 4, so you will need to make your own design decisions.

Try not to overcomplicate the problem.

2. In Weka, using the heritage release 3 dataset

a. Apply the k-means algorithm for k = 2, 3, 5, 10

b. Apply EM algorithm. What is the optimal number of clusters obtained by EM?

c. Compare the created clusters to classification based on hospitalization in year 2.
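Weka runs the clustering itself, but it can help to see what k-means does at each value of k. The following is a minimal sketch of the standard assign/update loop (Lloyd's algorithm) on invented two-dimensional points. Seeding the centroids with the first k points is an assumption made here for reproducibility; it is not how Weka initializes its clusters.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids
    (an assumption for reproducibility, not Weka's initialization)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented toy data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Cross-tabulating the resulting cluster labels against a known class column is one way to approach the comparison asked for in part (c).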

Assignment 4:

Using the data table shown below.

a. Calculate distance between all points in 1-norm, 2-norm and infinity-norm. Show dissimilarity matrix.

b. Is there any need to preprocess the data to be more suitable for clustering? If so, describe the operations and show the resulting data table.

c. Apply k-means clustering algorithm with k=2.

ID   Age   BMI   Gender   Total Cholesterol
1    30    24    M        180
2    70    19    M        190
3    65    26    M        220
4    40    32    F        260
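As a sketch of part (a), the three distances can be computed with one Minkowski function. The example uses only the numeric attributes (Age, BMI, Total Cholesterol) from the data table, unscaled. Leaving Gender out is an assumption made for this illustration; how to encode it belongs to the preprocessing decision in part (b).

```python
# Numeric attributes from the data table: (Age, BMI, Total Cholesterol).
# Gender is left out here; how to encode it is a modelling choice.
points = {
    1: (30, 24, 180),
    2: (70, 19, 190),
    3: (65, 26, 220),
    4: (40, 32, 260),
}

def minkowski(u, v, p):
    """p-norm distance; p=float('inf') gives the infinity (max) norm."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)

def dissimilarity_matrix(points, p):
    """Square matrix of pairwise distances, rows and columns ordered by ID."""
    ids = sorted(points)
    return [[minkowski(points[i], points[j], p) for j in ids] for i in ids]

for p in (1, 2, float("inf")):
    print(f"{p}-norm")
    for row in dissimilarity_matrix(points, p):
        print("  ".join(f"{d:7.2f}" for d in row))
```

On the raw values, the cholesterol differences tend to dominate the distances because that attribute has the widest range, which is one reason to consider rescaling before clustering.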

Assignment 5:

Text Mining

1. Write regular expression to:

a. detect zip codes in text

b. Find last names of all patients whose first name is John (note that regular expressions may have some false positives/false negatives).
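One possible pair of patterns is sketched below with Python's `re` module. Both rest on assumptions stated in the comments: US-style ZIP codes, and a last name that is a single capitalized word following "John" with an optional middle initial. The sample text is invented.

```python
import re

# US ZIP code: five digits, optionally followed by a hyphen and four digits.
# \b keeps the pattern from matching inside longer digit runs.
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")

# "John <Lastname>": assumes the last name is one capitalized word that
# follows the first name, with an optional middle initial in between.
JOHN_RE = re.compile(r"\bJohn\s+(?:[A-Z]\.\s+)?([A-Z][a-zA-Z'-]+)")

text = (
    "John Smith, 22030, was seen with John A. O'Neil of Fairfax, VA 22030-4444. "
    "Mary Johnson was not seen."
)

zip_codes = ZIP_RE.findall(text)
last_names = JOHN_RE.findall(text)

print(zip_codes)
print(last_names)
```

The false positives and negatives mentioned in the task show up quickly: any standalone five-digit number matches the ZIP pattern, and the name pattern misses multi-word last names or records written as "Smith, John".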

2. List challenges in automatically retrieving ICD-9 codes from clinical notes. Search the literature to find relevant published work. Also include your own observations and comments.

3. Using the SMS data

a. Split data into training (80%) and testing (20%) sets

b. Build naïve Bayes classifier for detecting spam based on bag of words

i. List all words in the documents

ii. Count occurrences in spam and ham

iii. Assign likelihoods P(word|spam) and P(word|ham) for all words

iv. Convert test data into a list of words. For each message you need two columns: message id and word

v. Classify test data. This can be done by a series of joins with the data prepared in (iii).

vi. Evaluate your model (accuracy, precision, recall)
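Steps i to vi can be sketched in a few lines of Python. The messages below are invented stand-ins for the SMS data, the 80/20 split is assumed to have been done already, and add-one (Laplace) smoothing is an assumption added here so that a word unseen in one class does not force its probability to zero. Dictionaries play the role that joins would play in an SQL solution.

```python
import math
from collections import Counter

# Toy labelled messages standing in for the SMS data (invented for illustration).
train = [
    ("spam", "win cash now"),
    ("spam", "free cash prize"),
    ("ham", "see you at lunch"),
    ("ham", "call me now"),
]
test = [("spam", "free cash"), ("ham", "lunch now")]

# Steps i-ii: vocabulary and per-class word counts.
counts = {"spam": Counter(), "ham": Counter()}
docs = Counter()
for label, msg in train:
    counts[label].update(msg.split())
    docs[label] += 1
vocab = set(counts["spam"]) | set(counts["ham"])

# Step iii: P(word|class) with add-one smoothing (an assumption of this sketch).
def likelihood(word, label):
    return (counts[label][word] + 1) / (sum(counts[label].values()) + len(vocab))

# Steps iv-v: split each message into words, then pick the class with the
# larger log prior plus summed log likelihoods.
def classify(msg):
    scores = {}
    for label in counts:
        score = math.log(docs[label] / sum(docs.values()))
        for word in msg.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log(likelihood(word, label))
        scores[label] = score
    return max(scores, key=scores.get)

# Step vi: accuracy, precision and recall with spam as the positive class.
pred = [classify(msg) for _, msg in test]
tp = sum(p == "spam" and t == "spam" for p, (t, _) in zip(pred, test))
fp = sum(p == "spam" and t == "ham" for p, (t, _) in zip(pred, test))
fn = sum(p == "ham" and t == "spam" for p, (t, _) in zip(pred, test))
accuracy = sum(p == t for p, (t, _) in zip(pred, test)) / len(test)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0

print(pred, accuracy, precision, recall)
```

Working in log space avoids numeric underflow when messages contain many words.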
