Construct the final dataset

Reference no: EM131469671

Data Mining Assignment

Download a good dataset from the Internet. It should have hundreds of records and a good number of parameters (deadline for downloading your dataset: March 30, 2017).

Mention the source of the dataset.

Describe the problem the dataset is about. Explain in a few paragraphs.

Select which one of the following data-mining tasks you will apply:

  • Prediction
  • Association rule mining
  • Clustering

Upload your dataset and the above details to Google Drive and ask me for a recommendation.

Once it is approved, apply all steps of the CRISP-DM process in RapidMiner. Write a detailed report of all of your work and submit that report to me.

Business Understanding

This initial phase focuses on understanding the project objectives and requirements from a business perspective, and then converting this knowledge into a data mining problem definition and a preliminary plan designed to achieve the objectives. A decision model, especially one built using the Decision Model and Notation standard, can be used.

Data Understanding

The data understanding phase starts with an initial data collection and proceeds with activities in order to get familiar with the data, to identify data quality problems, to discover first insights into the data, or to detect interesting subsets to form hypotheses for hidden information.
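A first pass at spotting data quality problems can be sketched in a few lines. The example below is in plain Python rather than RapidMiner, purely to illustrate the idea; the records are invented toy values and the attribute names `faces_in_poster` and `rating` are hypothetical, not taken from any real dataset.

```python
from statistics import mean

# Toy records, invented purely for illustration (not real data).
# The attribute names "faces_in_poster" and "rating" are hypothetical.
records = [
    {"title": "A", "faces_in_poster": 0, "rating": 7.1},
    {"title": "B", "faces_in_poster": 2, "rating": None},
    {"title": "C", "faces_in_poster": None, "rating": 5.4},
    {"title": "D", "faces_in_poster": 1, "rating": 6.3},
]

def profile(rows, attribute):
    """Count missing values and summarise the values that are present."""
    present = [r[attribute] for r in rows if r[attribute] is not None]
    return {
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
    }

# Print one summary line per numeric attribute.
for name in ("faces_in_poster", "rating"):
    print(name, profile(records, name))
```

A summary like this makes missing values and implausible ranges visible before any modeling starts.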

Data Preparation

The data preparation phase covers all activities to construct the final dataset (data that will be fed into the modeling tool(s)) from the initial raw data. Data preparation tasks are likely to be performed multiple times, and not in any prescribed order. Tasks include table, record, and attribute selection as well as transformation and cleaning of data for modeling tools.
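The three kinds of task named above (attribute selection, record cleaning, and transformation) might look like this in a minimal Python sketch. It is only one possible illustration, not the RapidMiner workflow the assignment asks for; the function name `prepare`, the attribute names, and the toy rows are all assumptions made for the example, and min-max scaling is just one of many possible transformations.

```python
def prepare(rows, attributes):
    """Keep only `attributes`, drop incomplete records, min-max scale to [0, 1]."""
    # Attribute selection: keep only the requested columns.
    selected = [{a: r.get(a) for a in attributes} for r in rows]
    # Record cleaning: drop any record with a missing value.
    complete = [r for r in selected if all(v is not None for v in r.values())]
    if not complete:
        return []
    # Transformation: rescale each attribute to the range [0, 1].
    scaled = [dict(r) for r in complete]
    for a in attributes:
        values = [r[a] for r in complete]
        low, high = min(values), max(values)
        span = (high - low) or 1  # avoid dividing by zero for constant columns
        for r in scaled:
            r[a] = (r[a] - low) / span
    return scaled

# Toy rows, invented for illustration; attribute names are hypothetical.
raw = [
    {"title": "A", "faces_in_poster": 0, "budget": 10.0},
    {"title": "B", "faces_in_poster": 4, "budget": None},
    {"title": "C", "faces_in_poster": 2, "budget": 30.0},
    {"title": "D", "faces_in_poster": 1, "budget": 20.0},
]
final = prepare(raw, ["faces_in_poster", "budget"])
print(final)
```

Because these tasks are often repeated in a different order, keeping them in one small function makes it easy to rerun the preparation after each change.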

Modeling

In this phase, various modeling techniques are selected and applied, and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, stepping back to the data preparation phase is often needed.
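Calibrating a parameter usually means trying several candidate values and keeping the one that performs best on data held back from training. The sketch below illustrates that idea with a tiny k-nearest-neighbours regressor written in plain Python; the choice of technique, the candidate values of `k`, and the toy numbers are all assumptions made for the example, not something the assignment prescribes.

```python
def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k nearest training points (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_abs_error(train, validation, k):
    """Average absolute prediction error on the held-out validation points."""
    total = sum(abs(knn_predict(train, x, k) - y) for x, y in validation)
    return total / len(validation)

# Toy (feature, target) pairs, invented for illustration only.
train = [(0, 4.0), (1, 7.0), (2, 6.0), (3, 7.0), (4, 4.0)]
validation = [(1, 6.0), (3, 6.0)]

# Try each candidate k and keep the one with the lowest validation error.
errors = {k: mean_abs_error(train, validation, k) for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

In RapidMiner the same sweep would be set up with its own parameter optimization tooling; the point here is only the loop of "fit, score on held-out data, compare".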

Evaluation

At this stage in the project you have built a model (or models) that appears to have high quality, from a data analysis perspective. Before proceeding to final deployment of the model, it is important to more thoroughly evaluate the model, and review the steps executed to construct the model, to be certain it properly achieves the business objectives. A key objective is to determine if there is some important business issue that has not been sufficiently considered. At the end of this phase, a decision on the use of the data mining results should be reached.

Deployment

Creation of the model is generally not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the knowledge gained will need to be organized and presented in a way that is useful to the customer. Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data scoring (e.g. segment allocation) or data mining process. In many cases it will be the customer, not the data analyst, who will carry out the deployment steps. Even if the analyst deploys the model, it is important for the customer to understand up front the actions that will need to be carried out in order to actually make use of the created models.

Data Description

Background

How can we tell the greatness of a movie before it is released in cinemas?

This question puzzled me for a long time, since there is no universal way to judge the quality of a movie.

Many people rely on critics to gauge the quality of a film, while others use their instincts. But it takes time to gather a reasonable number of critics' reviews after a movie is released, and human instinct is sometimes unreliable.

Question

Given that thousands of movies are produced each year, is there a better way for us to tell the greatness of a movie without relying on critics or our own instincts?

Does the number of human faces in a movie poster correlate with the movie rating?

To answer this question, I scraped data on 5000+ movies from the IMDB website using a Python library called "scrapy".
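One simple way to check for such a correlation is the Pearson coefficient. The following is a minimal sketch in plain Python, assuming the face counts and ratings have already been extracted into two lists; the numbers shown are invented toy values, not figures from the scraped dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Toy values, invented for illustration only (not from the real dataset).
faces = [0, 1, 2, 3, 4]
ratings = [7.0, 6.5, 6.8, 6.0, 6.2]

r = pearson(faces, ratings)
print(round(r, 3))
```

A value near +1 or -1 would suggest a strong linear relationship and a value near 0 a weak one; either way, correlation alone would not show that poster faces cause a rating to change.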

Attachment:- Assignment Files.rar




