Calculating the performance of the system

Reference no: EM133879934, Length: 2000 words

Machine Learning Applications

Assessment - Design a Text Retrieval System

Task

Design a text retrieval system to find similar movies/shows based on the descriptions.

Assessment Description

We humans communicate using different languages, either by speaking or writing. Text data is abundant in the real world. It's a challenging task to work with natural languages. Your team lead has assigned you one such task of recommending movies based on the movie description.

Data

A movies/shows dataset with descriptions has been curated by pre-processing the Kaggle IMDb Movies/Shows with Descriptions dataset and is provided to you in MyKBS. You are encouraged to explore the original source.

The pre-processed dataset is provided in two files, train.csv and test.csv, both available in MyKBS. Each file contains the following columns:

title: Title of the movie/show.
description: Description of the movie/show.

Problem Statement

As an individual, you are required to download the data sets, i.e., the train.csv and test.csv files, from MyKBS. You must build a text retrieval system to find similar movies/shows based on the descriptions. You should approach the problem systematically by addressing the tasks below:

Load the data sets and pre-process them to fit your requirements. You must use at least two pre-processing techniques.
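The loading and pre-processing step could be sketched as follows. This is a minimal illustration, not a prescribed solution: the function names `preprocess` and `load_descriptions` and the short stop-word list are assumptions made for the example, and the column names `title` and `description` are taken from the data description above. It applies lowercasing, punctuation stripping (through tokenisation) and stop-word removal.

```python
import re

# Hypothetical, deliberately short stop-word list; a fuller list
# (for example from NLTK) could be substituted.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "to", "is", "with", "for"}


def preprocess(text):
    """Lowercase, tokenise on alphanumeric runs, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def load_descriptions(path):
    """Read a CSV with 'title' and 'description' columns using pandas.

    Returns (titles, tokenised_descriptions). The import is local so the
    pre-processing function above can be used without pandas installed.
    """
    import pandas as pd

    frame = pd.read_csv(path)
    descriptions = frame["description"].fillna("")
    return frame["title"].tolist(), [preprocess(d) for d in descriptions]


sample = preprocess("The Crew of a Starship, lost in SPACE!")
```

Calling `load_descriptions("train.csv")` would then give the titles and token lists for the training set, assuming the file sits in the working directory.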

Design a text retrieval system using the TF-IDF (with inverted file) algorithm.
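One way to build TF-IDF weights on top of an inverted file, using only the standard library, is sketched below. The weighting choices are assumptions, since the brief does not fix them: raw term counts for TF and log(N / df) for IDF. The function name `build_inverted_index` is hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Build an inverted file and TF-IDF statistics from tokenised documents.

    docs is a list of token lists. Returns (index, idf, norms) where
    index maps term -> {doc_id: raw term frequency},
    idf maps term -> log(N / document frequency), and
    norms maps doc_id -> Euclidean length of the document's TF-IDF vector.
    """
    index = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for term, count in Counter(tokens).items():
            index[term][doc_id] = count

    n_docs = len(docs)
    idf = {term: math.log(n_docs / len(postings)) for term, postings in index.items()}

    squared = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            squared[doc_id] += (tf * idf[term]) ** 2
    norms = {doc_id: math.sqrt(total) for doc_id, total in squared.items()}
    return dict(index), idf, norms


# Toy corpus of three already-tokenised descriptions (made-up data).
docs = [["space", "war"], ["space", "love"], ["love", "story"]]
index, idf, norms = build_inverted_index(docs)
```

Storing postings per term means a query only touches documents that share at least one term with it, which is the main benefit of the inverted file over a dense document-term matrix.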

Find the top 3 movies/shows matches in the train.csv based on the descriptions provided in the test.csv.
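Retrieving the top 3 matches for one test description could look like the sketch below. It assumes an inverted file of the form term -> {doc_id: term frequency}, a per-term IDF table and per-document vector norms are already available; the function name `top_k` and the tiny hand-built index are hypothetical. Scores are cosine similarities up to the query's own norm, which is constant for a given query and therefore does not change the ranking.

```python
import heapq
import math
from collections import Counter, defaultdict


def top_k(query_tokens, index, idf, norms, k=3):
    """Return up to k (doc_id, score) pairs, best match first."""
    scores = defaultdict(float)
    for term, query_tf in Counter(query_tokens).items():
        if term not in index:
            continue  # term never seen in the training descriptions
        query_weight = query_tf * idf[term]
        for doc_id, tf in index[term].items():
            scores[doc_id] += query_weight * tf * idf[term]

    # Normalise by document length; skip documents with a zero-length vector.
    ranked = (
        (score / norms[doc_id], doc_id)
        for doc_id, score in scores.items()
        if norms[doc_id] > 0
    )
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, ranked)]


# Tiny hand-built index over three made-up documents.
index = {
    "space": {0: 1, 1: 1},
    "war": {0: 1},
    "love": {1: 1, 2: 1},
    "story": {2: 1},
}
n_docs = 3
idf = {term: math.log(n_docs / len(postings)) for term, postings in index.items()}
squared = defaultdict(float)
for term, postings in index.items():
    for doc_id, tf in postings.items():
        squared[doc_id] += (tf * idf[term]) ** 2
norms = {doc_id: math.sqrt(total) for doc_id, total in squared.items()}

result = top_k(["space", "war"], index, idf, norms)
```

Here document 0 ranks first because it shares both query terms, document 1 follows with one shared term, and document 2 is never scored because it shares none.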

You are to record a 5-minute video, accompanied by PowerPoint slides, to elaborate on the approach and performance of the system using relevant metric(s). The accompanying PowerPoint slides must be clear, concise, of the required quality, and referenced in accordance with the Kaplan Harvard Referencing style.

Learning Objective 1: Explore programming functions to source, store and prepare data for machine learning applications.
Learning Objective 2: Design algorithmic models for the application of machine learning in information technology.
Learning Objective 3: Create advanced insights of strategic organisational value with the aid of machine learning.

Assessment Guidelines

You are required to follow the below guidelines:

You should write your Text Retrieval System code in the Python programming language.

The use of any Python third-party package(s) is restricted to the following tasks:
Loading the datasets. E.g., Pandas.
Any necessary text pre-processing steps. E.g., Natural Language Toolkit, etc.
Performing necessary calculations during the building of the system. E.g., NumPy.
Calculating the performance of the system. E.g., Scikit Learn, Matplotlib, Plotly, etc.

You should NOT use any third-party package for calculating TF-IDF (with inverted file).
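The brief does not name a specific performance metric. As one possible illustration, precision at k can be computed if relevance judgements are available for each test description. The relevance sets below are invented for the example, and the function names are hypothetical.

```python
def precision_at_k(retrieved, relevant, k=3):
    """Fraction of the top-k retrieved ids that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def mean_precision_at_k(all_retrieved, all_relevant, k=3):
    """Average precision@k over several queries."""
    scores = [
        precision_at_k(retrieved, relevant, k)
        for retrieved, relevant in zip(all_retrieved, all_relevant)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up retrieval results and relevance judgements for two queries.
retrieved_lists = [[4, 7, 1], [2, 3, 5]]
relevant_sets = [{7, 9}, {2, 3}]
single = precision_at_k(retrieved_lists[0], relevant_sets[0])
average = mean_precision_at_k(retrieved_lists, relevant_sets)
```

If no relevance labels exist, reporting the similarity scores of the top 3 matches per query is an alternative that needs no ground truth, though it measures confidence rather than correctness.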




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd