Application of machine learning in information technology

Reference no: EM133974464, Length: 2000 words

Machine Learning Applications

Assessment - Design a Text Retrieval System

Type: Coding and Presentation

Task

Design a text retrieval system to find similar movies/shows based on the descriptions.

Assessment Description

Humans communicate in many languages, through both speech and writing, so text data is abundant in the real world. Working with natural language is nonetheless a challenging task. Your team lead has assigned you one such task: recommending movies based on their descriptions.

Data

A movies/shows dataset with descriptions has been curated by pre-processing the Kaggle IMDb Movies/Shows with Descriptions dataset and is provided to you in MyKBS. You are encouraged to explore the original source.

The pre-processed dataset is provided in two files, train.csv and test.csv, both available in MyKBS. Each file contains the following columns:

title: Title of the movie/show.
description: Description of the movie/show.

You are required to train a text retrieval system using the train.csv file and to test the system using the test.csv file.

Problem Statement

As an individual, you are required to download the data sets, i.e., the train.csv and test.csv files, from MyKBS. You must build a text retrieval system to find similar movies/shows based on the descriptions. You should approach the problem systematically by addressing the tasks below:

Load the data sets and pre-process them to fit your requirements. You must use at least two pre-processing techniques.
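The pre-processing task above can be sketched as follows. This is only a minimal illustration using three common techniques (lowercasing, regex tokenisation that drops punctuation and digits, and stop-word removal). The function name preprocess and the tiny STOP_WORDS set are assumptions made for the example; a fuller stop-word list, such as the one shipped with the Natural Language Toolkit, would be a reasonable substitute.

```python
import re

# Tiny illustrative stop-word list (an assumption for this sketch, not a complete list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "with", "for"}


def preprocess(text):
    """Return a list of cleaned tokens for one description."""
    # Technique 1: case folding.
    text = text.lower()
    # Technique 2: tokenisation that keeps alphabetic runs only,
    # which also strips punctuation and digits.
    tokens = re.findall(r"[a-z]+", text)
    # Technique 3: drop stop words and single-letter leftovers.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


print(preprocess("A detective hunts the killer in New York!"))
# ['detective', 'hunts', 'killer', 'new', 'york']
```

The same function would be applied to every description in both train.csv and test.csv so that queries and indexed documents share one vocabulary.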

Design a text retrieval system using the TF-IDF (with inverted file) algorithm.
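One possible shape for a TF-IDF inverted file, written without any third-party package, is sketched below. It assumes each description has already been tokenised, uses length-normalised term frequency, and uses the plain idf = log(N / df) variant; other weighting variants are equally valid. The name build_index and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build a TF-IDF inverted file from a list of token lists.

    Returns (index, idf), where index maps term -> {doc_id: tf-idf weight}
    and idf maps term -> inverse document frequency.
    """
    n_docs = len(docs)
    postings = defaultdict(dict)  # term -> {doc_id: normalised term frequency}
    for doc_id, tokens in enumerate(docs):
        for term, count in Counter(tokens).items():
            postings[term][doc_id] = count / len(tokens)

    # Document frequency is simply the length of each posting list.
    idf = {term: math.log(n_docs / len(plist)) for term, plist in postings.items()}
    index = {
        term: {doc_id: tf * idf[term] for doc_id, tf in plist.items()}
        for term, plist in postings.items()
    }
    return index, idf


# Toy corpus standing in for pre-processed train.csv descriptions.
docs = [["space", "war"], ["space", "love"], ["love", "story"]]
index, idf = build_index(docs)
print(sorted(index["space"]))  # documents containing "space": [0, 1]
```

Storing weights per term rather than per document means a query only touches the posting lists of the terms it contains, which is the main reason to use an inverted file.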

Find the top 3 matching movies/shows in train.csv for each description provided in test.csv.
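A hedged sketch of the top-3 lookup follows. It assumes cosine similarity between TF-IDF vectors as the ranking score (the brief does not prescribe one) and rebuilds a small inverted file so the example runs on its own. The names build_index and top_k and the toy token lists are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Return (index, idf, norms) for a list of token lists."""
    n_docs = len(docs)
    df = Counter(term for tokens in docs for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    index = defaultdict(dict)  # term -> {doc_id: tf-idf weight}
    squared = [0.0] * n_docs
    for doc_id, tokens in enumerate(docs):
        for term, count in Counter(tokens).items():
            weight = (count / len(tokens)) * idf[term]
            if weight > 0:
                index[term][doc_id] = weight
                squared[doc_id] += weight * weight
    norms = [math.sqrt(value) for value in squared]
    return index, idf, norms


def top_k(query_tokens, index, idf, norms, k=3):
    """Rank indexed documents by cosine similarity to the query tokens."""
    q_weights = {
        term: (count / len(query_tokens)) * idf[term]
        for term, count in Counter(query_tokens).items()
        if term in idf  # terms unseen in training carry no weight
    }
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))
    if q_norm == 0:
        return []
    # Only posting lists of query terms are visited.
    scores = defaultdict(float)
    for term, q_weight in q_weights.items():
        for doc_id, d_weight in index.get(term, {}).items():
            scores[doc_id] += q_weight * d_weight
    ranked = [(doc_id, dot / (q_norm * norms[doc_id])) for doc_id, dot in scores.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Toy stand-ins for pre-processed train.csv descriptions and one test.csv query.
train_docs = [
    ["space", "war", "rebels"],
    ["space", "love", "story"],
    ["love", "story", "paris"],
    ["cooking", "show", "chefs"],
]
index, idf, norms = build_index(train_docs)
results = top_k(["rebels", "space", "war"], index, idf, norms, k=3)
print([doc_id for doc_id, _ in results])  # document ids, best match first
```

Documents that share no term with the query never receive a score, so fewer than three matches may be returned for very short or unusual descriptions.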

You are to record a 5-minute video, accompanied by PowerPoint slides, that explains the approach and the performance of the system using relevant metric(s). The slides must be clear, concise, of the required quality, and referenced in accordance with the Kaplan Harvard Referencing style.
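The brief leaves the choice of metric open. As one hedged possibility, precision at k and reciprocal rank can be computed as below, assuming a set of relevant titles is available for each test query. The provided files do not include such relevance labels, so that set is a hypothetical input you would have to define yourself, and the function names are likewise illustrative.

```python
def precision_at_k(retrieved, relevant, k=3):
    """Fraction of the top-k retrieved titles that are in the relevant set."""
    top = retrieved[:k]
    return sum(1 for title in top if title in relevant) / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant title, or 0.0 if none is retrieved."""
    for rank, title in enumerate(retrieved, start=1):
        if title in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical ranked output and hypothetical relevance judgements.
retrieved = ["Title B", "Title A", "Title C"]
relevant = {"Title A", "Title C"}
print(precision_at_k(retrieved, relevant, k=3))
print(reciprocal_rank(retrieved, relevant))
```

Averaging either value over all test queries gives a single figure that can be reported on the slides.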

Learning Objective 1: Explore programming functions to source, store and prepare data for machine learning applications.

Learning Objective 2: Design algorithmic models for the application of machine learning in information technology.

Learning Objective 3: Create advanced insights of strategic organisational value with the aid of machine learning.

Assessment Guidelines

You are required to follow the below guidelines:

You should write your Text Retrieval System code in the Python 3 programming language.

The use of any Python third-party package(s) is restricted to the following tasks:
Loading the datasets. E.g., Pandas.
Any necessary text pre-processing steps. E.g., Natural Language Toolkit, etc.
Performing necessary calculations during the building of the system. E.g., NumPy.
Calculating the performance of the system. E.g., Scikit Learn, Matplotlib, Plotly, etc.
You should NOT use any third-party package for calculating TF-IDF (with inverted file).
You should ONLY use the provided files, i.e., train.csv and test.csv for training/testing your system.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd