Designing and implementing a mini search engine

Reference no: EM132109281

In this project, you will be designing and implementing a mini search engine.

You are probably familiar with Google, Bing or Yahoo, which are some of the most popular search engines that you can find on the Web.

The task performed by a search engine is, as the name says, to search through a collection of documents.

Given a set of texts and a query, the search engine will locate all documents that contain the keywords in the query.
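As a baseline, the task can be sketched as a direct scan over every document. The example below is a minimal illustration, not part of the assignment: the names `tokenize` and `linear_search`, the sample documents, and the choice to require all keywords (AND semantics) are assumptions.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def linear_search(docs, query):
    """Return the ids of documents that contain every query keyword.

    Assumption: a document matches only if it contains ALL keywords.
    """
    keywords = set(tokenize(query))
    if not keywords:
        return []
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if keywords <= set(tokenize(text))  # subset test: all keywords present
    )

# Hypothetical sample collection (the assignment asks for at least 10 documents).
docs = {
    "d1": "The quick brown fox",
    "d2": "A quick search engine",
    "d3": "Search engines index documents",
}

print(linear_search(docs, "quick search"))
```

This scan re-reads every document for every query, so its cost grows with the total size of the collection each time a query is issued.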

The problem can therefore be reduced to a search problem, which can be solved efficiently with the data structures we have studied in this class.

Your task is to design and implement an algorithm that searches a collection of documents. A minimum of 10 documents should be used.

You are free to select the data structures and algorithms that you consider most efficient for this task. Of course, you will have to justify your decisions.

First, you will process the documents and store their content (i.e., words or tokens) in the data structures that you selected. In information retrieval, this phase is called indexing.
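One possible structure for the indexing phase is an inverted index built on a hash map, which maps each token to the set of documents that contain it. This is a sketch under assumptions: the assignment leaves the choice of data structure open, and the names `tokenize` and `build_index` as well as the sample documents are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Build an inverted index: token -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

# Hypothetical sample collection (the assignment asks for at least 10 documents).
docs = {
    "d1": "The quick brown fox",
    "d2": "A quick search engine",
    "d3": "Search engines index documents",
}

index = build_index(docs)
print(sorted(index.get("search", set())))
```

With a hash map, looking up one token takes expected constant time, independent of the number of documents. A trie or a balanced search tree would be alternative choices with different trade-offs, for example support for prefix queries or sorted traversal.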

Next, for every input query, you will process the query and search for its keywords in the documents, using the previously implemented data structures and an algorithm of your choice. This phase is called retrieval.

For each such query, you will have to display the documents that satisfy the query.
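The retrieval and display steps could look like the following sketch, which intersects the document sets of the query keywords. It assumes an inverted index stored in a plain dictionary and AND semantics for multi-word queries. The function names, sample documents, and sample queries are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Indexing phase: map each token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index

def search(index, query):
    """Retrieval phase: return sorted ids of documents containing ALL keywords."""
    keywords = tokenize(query)
    if not keywords:
        return []
    postings = [index.get(word, set()) for word in keywords]
    postings.sort(key=len)          # start from the rarest keyword
    result = set(postings[0])       # copy so the index is not modified
    for posting in postings[1:]:
        result &= posting           # set intersection
    return sorted(result)

# Hypothetical sample collection and queries.
docs = {
    "d1": "The quick brown fox",
    "d2": "A quick search engine",
    "d3": "Search engines index documents",
}
index = build_index(docs)

for query in ["quick", "quick search", "zebra"]:
    matches = search(index, query)
    print(f"{query!r}: {', '.join(matches) if matches else 'no matching documents'}")
```

Starting the intersection from the smallest set keeps the intermediate result small, which is a common optimization but not the only reasonable design.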





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd