Data mining project

Reference no: EM132514918

We selected the Leaf data set for our data mining project.

Business and data understanding

The Leaf data set is multivariate. It covers 40 different plant species and contains 340 instances and 16 attributes. The attributes are listed below:
• Class(Species)
• Specimen Number
• Eccentricity
• Aspect Ratio
• Elongation
• Solidity
• Stochastic Convexity
• Isoperimetric Factor
• Maximal Indentation Depth
• Lobedness
• Average Intensity
• Average Contrast
• Smoothness
• Third moment
• Uniformity
• Entropy

The Class attribute assigns the plants to 36 classes, and the Specimen Number identifies the individual specimens recorded for each class, whereas the remaining attributes describe the characteristics of the leaf. All attributes are real-valued, and the data set has no missing values. The associated task is classification: we predict the class of a plant from the characteristics of its leaf.
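The attribute layout described above can be sketched in plain Python. This is only an illustration: the column order follows the list above, the constant and function names are our own, and the sample values are made up rather than taken from the data set.

```python
# Attribute names in the order listed above (16 in total).
ATTRIBUTES = [
    "Class (Species)", "Specimen Number", "Eccentricity", "Aspect Ratio",
    "Elongation", "Solidity", "Stochastic Convexity", "Isoperimetric Factor",
    "Maximal Indentation Depth", "Lobedness", "Average Intensity",
    "Average Contrast", "Smoothness", "Third Moment", "Uniformity", "Entropy",
]

LABEL = "Class (Species)"      # the attribute we want to predict
ID_COLUMN = "Specimen Number"  # identifies a specimen, not a leaf characteristic

# The remaining 14 attributes describe the leaf and serve as model inputs.
FEATURES = [name for name in ATTRIBUTES if name not in (LABEL, ID_COLUMN)]


def to_record(values):
    """Pair one row of raw values with the attribute names."""
    if len(values) != len(ATTRIBUTES):
        raise ValueError("expected %d values, got %d" % (len(ATTRIBUTES), len(values)))
    return dict(zip(ATTRIBUTES, values))


# Hypothetical row, for illustration only (not real data).
example = to_record([1, 1] + [0.5] * 14)
has_missing = any(value is None for value in example.values())
```

Here `has_missing` is `False` for the example row, which mirrors the statement that the data set has no missing values.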

Modelling
According to the CRISP-DM model, the next phase is modelling. In this phase we select the operators that will be used to classify the class attribute. We tried four different operators: Decision Tree, Random Forest, Deep Learning, and Naive Bayes. Naive Bayes was unsuccessful on the leaf data set in our process, which we attributed to all attributes being numerical (variants of Naive Bayes that assume a Gaussian distribution can handle numerical attributes, but we did not pursue this). As a result, we selected the Decision Tree, Random Forest, and Deep Learning operators for the process.

Next, we select the root mean squared error criterion to analyse the results. In addition, we select the other operators needed to complete the process: select attributes, set role, split data, apply model, and performance.
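As a reminder of what this criterion measures, here is a minimal sketch of root mean squared error in plain Python. The function name is our own, and the sketch assumes the actual and predicted values are given as numbers (for this data set, numeric class labels).

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared difference between two sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the predictions are closer to the actual values; identical sequences give 0.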

Select attributes operator: We used this operator to select the attributes for our process. It lets us choose which attributes should be part of the resulting example set.

Set role operator: This operator sets the role of an attribute. We used it to mark the target attribute as the label.
Split data: This operator splits the data into two or more example sets.
Apply model: This operator applies a model, previously trained by another operator, to an example set. The goal is to predict or classify unseen data, or to transform the data by applying a pre-processing model.
Performance: This operator is used for statistical performance evaluation of classification tasks. It provides a list of performance criteria for the classification task.
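The roles of the split data, apply model, and performance operators can be sketched in plain Python. This is not how RapidMiner implements them; the function names, the 70/30 ratio, the fixed seed, and the toy threshold model are all assumptions made for illustration.

```python
import random


def split_data(rows, train_ratio=0.7, seed=42):
    """Shuffle the rows and cut them into a training and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def apply_model(model, rows):
    """Run a trained model (any callable on features) over each row."""
    return [model(features) for features, _label in rows]


def accuracy(predictions, rows):
    """Share of rows whose predicted label equals the true label."""
    correct = sum(pred == label for pred, (_features, label) in zip(predictions, rows))
    return correct / len(rows)


# Toy example set: one made-up feature, label 1 if it exceeds 0.5, else 2.
toy_rows = [((i / 10,), 1 if i / 10 > 0.5 else 2) for i in range(10)]


def toy_model(features):
    # Stand-in for a trained model: the same threshold rule as the labels.
    return 1 if features[0] > 0.5 else 2


train_rows, test_rows = split_data(toy_rows)
test_predictions = apply_model(toy_model, test_rows)
test_accuracy = accuracy(test_predictions, test_rows)
```

Because the toy model uses the same rule that generated the labels, it classifies every test row correctly; a real model would be trained on `train_rows` only and evaluated on `test_rows`.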

Evaluation
This is the evaluation phase. In this phase, we run the operators described above in RapidMiner to carry out the classification.

1. Decision tree: A decision tree is a tree-like collection of nodes designed to make decisions about assigning values to classes or estimating numerical target values. Each node represents a split rule for a particular attribute. This operator can handle an example set containing both nominal and numerical attributes. In this process, we first import the data.
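To make the idea of a split rule per node concrete, here is a small hand-built tree in plain Python. The attributes chosen, the thresholds, and the class labels are invented for illustration; they are not the tree learned in this project.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int  # class predicted when this leaf is reached


@dataclass
class Split:
    attribute: str                 # attribute tested at this node
    threshold: float               # split rule: value <= threshold goes "below"
    below: Union["Split", Leaf]
    above: Union["Split", Leaf]


def classify(node, record):
    """Follow the split rules from the root until a leaf is reached."""
    while isinstance(node, Split):
        if record[node.attribute] <= node.threshold:
            node = node.below
        else:
            node = node.above
    return node.label


# Hypothetical tree with two split nodes and three leaves.
tree = Split(
    "Solidity", 0.9,
    below=Split("Lobedness", 0.5, below=Leaf(1), above=Leaf(2)),
    above=Leaf(3),
)

predicted_class = classify(tree, {"Solidity": 0.8, "Lobedness": 0.7})
```

In this example the record goes below the Solidity split and above the Lobedness split, so it ends at the leaf for class 2.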

Attachment:- Data mining project.rar





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd