Design algorithmic models for the application

Reference no: EM133707674, Length: 2000 words

Machine Learning Applications

Assessment - Design a Text Retrieval System

Coding and Presentation

Your Task

Design a text retrieval system to find similar movies/shows based on the descriptions.

Assessment Description

Humans communicate in many languages, in speech and in writing, so text data is abundant in the real world. Working with natural language is challenging. Your team lead has assigned you one such task: recommending movies based on their descriptions.

Data

A movies/shows dataset with descriptions has been curated by pre-processing the Kaggle IMDb Movies/Shows with Descriptions dataset and is provided to you in MyKBS. You are encouraged to explore the original source.

The pre-processed dataset is provided in two files, train.csv and test.csv, both available in MyKBS. Each file contains the following columns:

title: Title of the movie/show.
description: Description of the movie/show.

You are required to train a text retrieval system using the train.csv file and test the system using the test.csv file.
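As a starting point, the two files described above could be loaded along the following lines. This is a minimal sketch, not a prescribed approach: the helper name load_records is hypothetical, UTF-8 encoding is an assumption about the files, and the standard-library csv module is used here only to keep the example self-contained (the guidelines also permit Pandas for this step).

```python
import csv
from pathlib import Path


def load_records(path):
    """Read a CSV with 'title' and 'description' columns into a list of dicts.

    Hypothetical helper: assumes a header row and UTF-8 encoding.
    """
    records = []
    with Path(path).open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            title = (row.get("title") or "").strip()
            description = (row.get("description") or "").strip()
            # Skip rows without a description: nothing to index or query.
            if description:
                records.append({"title": title, "description": description})
    return records
```

Calling load_records("train.csv") and load_records("test.csv") would then give the collection to index and the queries to run against it.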

Problem Statement

As an individual, you are required to download the data sets, i.e., the train.csv and test.csv files, from MyKBS. You must build a text retrieval system to find similar movies/shows based on the descriptions. You should approach the problem systematically by addressing the tasks below:

Load the data sets and pre-process them to fit your requirements. You must use at least two pre-processing techniques. (5 marks)

Design a text retrieval system using the TF-IDF algorithm (with an inverted file). (10 marks)

Find the top 3 matching movies/shows in train.csv based on the descriptions provided in test.csv. (5 marks)

You are to record a 5-minute video with accompanying PowerPoint slides to explain the approach and the performance of the system using relevant metric(s). The slides must be clear, concise, of the required quality, and referenced in accordance with the Kaplan Harvard Referencing style. (20 marks)
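The first three tasks above (pre-processing, TF-IDF with an inverted file, and top-3 matching) can be sketched in one short pipeline. This is an illustrative sketch under stated assumptions, not the required solution: the function names, the tiny stop-word list, the smoothed IDF formula log(N/df) + 1, and the use of cosine similarity for ranking are all choices made for the example. The brief itself only fixes TF-IDF, an inverted file, and at least two pre-processing techniques (lowercasing and stop-word removal are used here).

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; a fuller list could be used).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "on", "for", "with"}


def preprocess(text):
    """Lowercase, tokenise on runs of letters/digits, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(descriptions):
    """Return (inverted index, idf table, document norms) for the descriptions."""
    n_docs = len(descriptions)
    term_counts = [Counter(preprocess(d)) for d in descriptions]
    doc_freq = Counter(term for counts in term_counts for term in counts)
    # Assumed IDF variant: log(N / df) + 1, so every term keeps a positive weight.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}

    index = defaultdict(list)  # term -> [(doc_id, tf-idf weight), ...]
    norms = [0.0] * n_docs
    for doc_id, counts in enumerate(term_counts):
        for term, tf in counts.items():
            weight = tf * idf[term]
            index[term].append((doc_id, weight))
            norms[doc_id] += weight * weight
    norms = [math.sqrt(n) for n in norms]
    return index, idf, norms


def search(query, index, idf, norms, k=3):
    """Return up to k (doc_id, cosine score) pairs, best match first."""
    counts = Counter(preprocess(query))
    # Query terms never seen in the training collection are ignored.
    q_weights = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))

    # Only documents on the posting lists of the query terms are scored.
    scores = defaultdict(float)
    for term, q_w in q_weights.items():
        for doc_id, d_w in index[term]:
            scores[doc_id] += q_w * d_w

    ranked = sorted(
        ((doc_id, s / (q_norm * norms[doc_id])) for doc_id, s in scores.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:k]
```

In use, build_index would be called once on the train.csv descriptions, and search once per test.csv description; the returned document ids map back to training titles. The inverted file is what keeps each query cheap: only documents that share at least one term with the query are ever scored.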

Learning Objective 1: Explore programming functions to source, store and prepare data for machine learning applications.

Learning Objective 2: Design algorithmic models for the application of machine learning in information technology.

Learning Objective 3: Create advanced insights of strategic organisational value with the aid of machine learning.

You are required to follow the guidelines below:

You should write your Text Retrieval System code in the Python 3 programming language.

The use of any Python third-party package(s) is restricted to the following tasks:
Loading the datasets, e.g., Pandas.
Any necessary text pre-processing steps, e.g., Natural Language Toolkit.
Performing necessary calculations during the building of the system, e.g., NumPy.
Calculating the performance of the system, e.g., scikit-learn, Matplotlib, Plotly.
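The brief asks for relevant metric(s) without naming one. Two common ranking metrics, precision at k and reciprocal rank, are sketched below in plain Python. Both need a set of relevant training titles for each test query. The dataset as described contains only title and description columns, so those relevance judgements are an assumption here (for example, a small manually labelled sample); the function names are likewise hypothetical.

```python
def precision_at_k(retrieved, relevant, k=3):
    """Fraction of the top-k retrieved items that appear in the relevant set.

    Missing results (fewer than k retrieved) count as non-relevant.
    """
    return sum(1 for item in retrieved[:k] if item in relevant) / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    """Average of a sequence of scores; 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```

Averaging either metric over all labelled test queries with mean gives a single figure that can be reported on the slides.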

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd