Reference no: EM132274546
Assignment: The purpose of this exercise is to make practical sense of mining data streams. This assignment DOES NOT require using R Studio. The assignment consists of the two parts below. Put all of your answers in the spaces provided. Answer all questions in your own words. This assignment is designed so that external research is not needed. Should you need to include outside material, ensure that you properly cite and attribute all non-original content.
Part 1 - Website Optimization
a. Visit websiteoptimization's website and analyze a web page of your choice by entering its URL in the ‘Enter URL to diagnose’ field. For this exercise, analysis will be easier if you choose a web page that responds to your requests with noticeable latency.
i. Discuss what page you elected to use for analysis and then what the website optimization report reveals about that web page. Include the report and report interpretation below.
(Hint: Focus on the ‘Analysis and Recommendations’ section)
ii. Imagine that you are a website manager. Explain how you could use the website optimization report to improve the performance of your website. Provide at least two examples from the website optimization report.
Part 2 - Time Series Analysis
a. Visit datamarket's website and choose one of the datasets in the list that would be good for time series analysis.
i. Discuss at a high level what data is in the dataset and why you selected this data for time series analysis.
ii. Imagine that your course project will be using this dataset for time series analysis. Now, imagine what insight or purpose your project would try to uncover by exploring this dataset with time series analysis. (Note: This is a purposefully abstract prompt. The only wrong answers are those that do not use your imagination and analytical abilities).
iii. Choose one of the following time series methods and discuss how you would use the method in your study:
• Lossy Counting
• Random Sampling
• Very Fast Decision Tree (VFDT)
• Concept-Adapting Very Fast Decision Tree (CVFDT)
• Hoeffding Tree
• CluStream
• Sequential Pattern Mining
(Hint: Your weekly lab projects and course project have been building towards this form of thought so look to the structure/implementation of these for your answer.)
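For orientation only, the following is a minimal sketch of Lossy Counting, the first method in the list above. It assumes the standard formulation in which the stream is split into buckets of width ceil(1/epsilon) and rarely seen items are pruned at each bucket boundary. The class name LossyCounter, its method names, and the sample stream are hypothetical and are not taken from the course labs or from any library.

```python
import math


class LossyCounter:
    """Approximate frequency counts over a stream (illustrative sketch)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be between 0 and 1")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # items per bucket
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, max possible undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been seen and pruned in earlier buckets,
            # so its true count could be up to (bucket - 1) higher.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop items that cannot be frequent enough.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        """Items whose stored count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.entries.items()
            if value[0] >= threshold
        }


# Hypothetical stream: "a" is half of the items, "b" a quarter,
# and the rest are items that each occur only once.
stream = [
    "a" if i % 2 == 0 else "b" if i % 4 == 1 else f"x{i}"
    for i in range(100)
]
counter = LossyCounter(epsilon=0.1)
for item in stream:
    counter.add(item)
result = counter.frequent(support=0.2)
print(result)  # {'a': 50, 'b': 25}
```

In this formulation a stored count never exceeds the true count and falls short of it by at most epsilon times the number of items seen. That trade-off between memory and accuracy is the kind of drawback that question iv asks you to weigh.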
iv. Would the method you selected meet the purpose of the study? Are there any potential drawbacks and/or any additional considerations that must be made?