Clusters of whiskeys that can help business decision makers

Reference no: EM13981794

1. A one-page (single or double spaced, does not matter) Word or PDF file that has the recommendations and a brief outline of the approach you took for a) and b) below.

2. A ZIP file of your RapidMiner repository that has your data and process.

Whiskey Analytics

In Chapter 6 of the book "Data Science for Business" by Provost and Fawcett, there is a reference (page 144) to NYU colleague Foster Provost's desire to find whiskeys that are similar to Bunnahabhain (he really likes this drink!). We will use a data science approach to help Professor Provost's friend Professor Johnson. The relevant data and data dictionary are posted below and were originally curated by François-Joseph Lapointe and Pierre Legendre (1994) of the University of Montréal. You will of course use machine learning to address the issues at hand:

a) Clustering: Your goal is to suggest a few interesting whiskies to Professor Johnson, whose favorite is the Dalwhinnie. Try both hierarchical and k-means clustering, and then choose one of the two methods to find some meaningful clusters of whiskeys that can help business decision makers gain insights from the Whiskey dataset. Based on the cluster that Professor Johnson's favorite whiskey falls in, suggest 4-5 other whiskies to him.
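As a rough illustration of the k-means half of task a), here is a minimal pure-Python sketch. It is not the RapidMiner process the assignment asks for. The toy whiskies, the four trait columns, and the helper names (`kmeans`, `cluster_mates`) are all made up for illustration, and seeding the centroids with the first k points is a simplification.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, max_iter=100):
    """Basic k-means. Seeding with the first k points is a simplification."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_mates(names, points, favorite, k):
    """Return the other items that share a cluster with `favorite`."""
    labels = kmeans(points, k)
    target = labels[names.index(favorite)]
    return [n for n, lab in zip(names, labels) if lab == target and n != favorite]

# Made-up toy data, NOT the real dataset.
# Columns are hypothetical binary traits: [smoky, peaty, sweet, fruity].
toy = {
    "favorite": [0, 0, 1, 1],  # stand-in for Professor Johnson's favorite
    "A": [1, 1, 0, 0],
    "B": [0, 0, 1, 0],
    "C": [1, 0, 0, 0],
    "D": [0, 1, 1, 1],
    "E": [1, 1, 1, 0],
}
names = list(toy)
points = list(toy.values())
suggestions = cluster_mates(names, points, "favorite", k=2)
print(suggestions)  # ['B', 'D']
```

On the real data, the number of clusters and the seeding both need experimentation, and the hierarchical alternative should be compared before choosing one method.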

b) Association rules: Professor Johnson and Professor Provost were overheard having a heated argument about whiskey makers' preferences and understanding of the market. Provost claimed that there is a higher than random chance that drinkers who like a dry palate and a dry finish also like a whiskey that is dry on the nose, "and that's why any distiller worth his name in salt would make em' that way." Provost claimed that his Scottish grandmother told him so. You have been hired by Bapna as a well-trained data scientist to verify this claim from the actual compositions of whiskeys (hint: this time using association rule mining). Please also suggest a few interesting patterns of association that you can discern from the data with respect to the traits/characteristics of Scotch whiskies.
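The claim in b) can be read as the rule {dry palate, dry finish} -> {dry nose}. "Higher than random chance" then corresponds to a lift above 1. The sketch below computes support, confidence, and lift for one rule in plain Python. The eight trait sets are made up for illustration and say nothing about the real dataset, and the `category=value` item labels are an assumed naming scheme.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}

# Made-up toy data: each set lists the traits of one hypothetical whiskey.
toy_whiskies = [
    {"palate=dry", "finish=dry", "nose=dry"},
    {"palate=dry", "finish=dry", "nose=dry"},
    {"palate=dry", "finish=dry", "nose=sweet"},
    {"palate=sweet", "finish=dry", "nose=dry"},
    {"palate=sweet", "finish=long", "nose=sweet"},
    {"palate=dry", "finish=long", "nose=fresh"},
    {"palate=sweet", "finish=full", "nose=fresh"},
    {"palate=fruity", "finish=full", "nose=sweet"},
]

metrics = rule_metrics(
    toy_whiskies,
    antecedent={"palate=dry", "finish=dry"},
    consequent={"nose=dry"},
)
print({name: round(value, 3) for name, value in metrics.items()})
# {'support': 0.25, 'confidence': 0.667, 'lift': 1.778}
```

In this toy set the lift is above 1, so a dry nose is more common among whiskies with a dry palate and dry finish than among whiskies overall. Whether the same holds for the real compositions is exactly what the assignment asks you to check.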

c) BONUS: See if you can replicate the table below from the book. You only have to worry about the Distance column, not the labels that go with it (see page 146 of the attached book pages).

It is important to note that these category values are not mutually exclusive (e.g., Aberlour's palate is described as medium, full, soft, round and smooth). In general, any of the values can co-occur (though some of them, like Color being both light and smoky, never do) but because they can co-occur, each value of each variable was coded as a separate feature by Lapointe and Legendre. Consequently there are 68 binary features of each whiskey.
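A minimal sketch of this kind of coding, where each (category, value) pair becomes its own 0/1 feature. It uses only the two descriptions quoted in this text: Bunnahabhain's full description and Aberlour's palate. The function names are illustrative, and the resulting vocabulary has 15 features here, not the 68 of the full dataset.

```python
def build_vocabulary(descriptions):
    """Collect every (category, value) pair seen, in a stable sorted order."""
    return sorted(
        {
            (category, value)
            for description in descriptions.values()
            for category, values in description.items()
            for value in values
        }
    )

def encode(description, vocabulary):
    """Return a 0/1 vector with one entry per (category, value) feature."""
    return [
        1 if value in description.get(category, ()) else 0
        for category, value in vocabulary
    ]

descriptions = {
    "Bunnahabhain": {
        "color": ["gold"],
        "nose": ["fresh", "sea"],
        "body": ["firm", "medium", "light"],
        "palate": ["sweet", "fruity", "clean"],
        "finish": ["full"],
    },
    # Partial on purpose: only Aberlour's palate is given in the text.
    "Aberlour": {"palate": ["medium", "full", "soft", "round", "smooth"]},
}

vocabulary = build_vocabulary(descriptions)
vectors = {name: encode(d, vocabulary) for name, d in descriptions.items()}

print(len(vocabulary))               # 15
print(sum(vectors["Bunnahabhain"]))  # 10
print(sum(vectors["Aberlour"]))      # 5
```

Keying features by category as well as value keeps a "full" palate and a "full" finish as two different features, which is why several values of one variable can be switched on at once.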

Foster likes Bunnahabhain, so we can use Lapointe and Legendre's representation of whiskeys with Euclidean distance to find similar ones for him. For reference, here is their description of Bunnahabhain:
• Color: gold
• Nose: fresh and sea
• Body: firm, medium, and light
• Palate: sweet, fruity, and clean
• Finish: full

Here is Bunnahabhain's description and the five single-malt Scotches most similar to Bunnahabhain, by increasing distance:

Whiskey        Distance  Descriptors
Bunnahabhain   -         gold; firm,med,light; sweet,fruit,clean; fresh,sea; full
Glenglassaugh  0.643     gold; firm,light,smooth; sweet,grass; fresh,grass
Tullibardine   0.647     gold; firm,med,smooth; sweet,fruit,full,grass,clean; sweet; big,arome,sweet
Ardbeg         0.667     sherry; firm,med,full,light; sweet; dry,peat,sea; salt
Bruichladdich  0.667     pale; firm,light,smooth; dry,sweet,smoke,clean; light; full
Glenmorangie   0.667     p.gold; med,oily,light; sweet,grass,spice; sweet,spicy,grass,sea,fresh; full,long

Using this list we could find a Scotch similar to Bunnahabhain. At any particular shop we might have to go down the list a bit to find one they stock, but since the Scotches are ordered by similarity we can easily find the most similar Scotch (and also have a vague idea as to how similar the closest available Scotch is as compared to the alternatives that are not available).
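As a starting point for the bonus task, the sketch below recomputes distances to Bunnahabhain from the descriptor strings in the table (spelling normalized). It rests on two assumptions. First, descriptors are matched by their position in the semicolon-separated groups; the category labels in `CATEGORIES` are placeholders. Second, it computes a Jaccard distance next to the Euclidean one. The text names only Euclidean distance, so the comparison is an observation from the arithmetic, not something the source states. For 0/1 features, the unnormalized Euclidean distance between two different whiskies is the square root of the number of features they disagree on, so it can never fall below 1, while the listed distances are all below 1.

```python
import math

# Placeholder labels; only the position of each group matters here.
CATEGORIES = ("color", "body", "palate", "nose", "finish")

def parse(descriptor):
    """Turn 'gold; firm,med' into {('color', 'gold'), ('body', 'firm'), ...}."""
    groups = [group.strip() for group in descriptor.split(";")]
    return {
        (category, value.strip())
        for category, group in zip(CATEGORIES, groups)
        for value in group.split(",")
        if value.strip()
    }

def jaccard_distance(a, b):
    """1 minus (shared features / features present in either set)."""
    return 1 - len(a & b) / len(a | b)

def euclidean_distance(a, b):
    """For 0/1 vectors: sqrt of the number of features in exactly one set."""
    return math.sqrt(len(a ^ b))

target = parse("gold; firm,med,light; sweet,fruit,clean; fresh,sea; full")
table = {
    "Glenglassaugh": "gold; firm,light,smooth; sweet,grass; fresh,grass",
    "Tullibardine": "gold; firm,med,smooth; sweet,fruit,full,grass,clean; sweet; big,arome,sweet",
    "Ardbeg": "sherry; firm,med,full,light; sweet; dry,peat,sea; salt",
    "Bruichladdich": "pale; firm,light,smooth; dry,sweet,smoke,clean; light; full",
    "Glenmorangie": "p.gold; med,oily,light; sweet,grass,spice; sweet,spicy,grass,sea,fresh; full,long",
}

jaccard = {n: round(jaccard_distance(target, parse(d)), 3) for n, d in table.items()}
euclid = {n: round(euclidean_distance(target, parse(d)), 3) for n, d in table.items()}

for name in table:
    print(f"{name:14s} jaccard={jaccard[name]:.3f} euclidean={euclid[name]:.3f}")
# Glenglassaugh  jaccard=0.615 euclidean=2.828
# Tullibardine   jaccard=0.647 euclidean=3.317
# Ardbeg         jaccard=0.667 euclidean=3.162
# Bruichladdich  jaccard=0.667 euclidean=3.162
# Glenmorangie   jaccard=0.667 euclidean=3.464
```

Under these assumptions, the Jaccard values for four of the five rows agree with the Distance column. The Glenglassaugh row comes out at 0.615 rather than 0.643. Its descriptor list shows only four groups, so one feature may be missing from the table. Treat this as a hint to verify against scotch1.xlsx, not as a confirmed answer.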

This is an example of the direct application of similarity to solve a problem. Once we understand this fundamental notion, we have a powerful conceptual tool for approach
Attachment:- scotch1.xlsx

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd