Design three boolean queries

Reference no: EM132221478

Question 1:

Suppose you have joined a search engine development team to design a search algorithm based on both the Vector model and the Boolean model. You have collected the following unstructured documents and plan to apply an indexing technique to convert them into an inverted index.

Doc 1: Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on full-text or other content-based indexing.

Doc 2: Information retrieval is finding material of an unstructured nature that satisfies an information need from within large collections.

Doc 3: Information systems is the study of complementary networks of hardware and software that people and organizations use to collect, filter, process, create, and distribute data.

In the process of creating the inverted index, please complete the following steps:

a. Remove all stop words and punctuation, and then apply Porter's stemming algorithm to the documents. The list of stop words for this task is provided as follows: Is, The, Of, To, An, A, From, Can, Be, On, Or, That, Within, And, Use

b. Create a merged inverted list including the within-document frequencies for each term.

c. Use the index created in part (b) to create a dictionary and the related posting file.

d. You may like to test the inverted index by using the following keywords: information, system, index

e. Please design three Boolean queries (for example, web AND search) and list the relevant documents for each query.

f. Please use the Vector model to query the inverted index, and compare the result with the Boolean model. (Hint: you can use cosine similarity and set a similarity threshold.)

Question 2 (IR Evaluation)

In this question, you are required to evaluate the performance of different search engines.

First, please find two search engines you are familiar with, such as Google, Bing, Yahoo!, etc.

Second, please choose one target from the following list, and design two queries to search in both search engines. So both query 1 and query 2 have to be tested in both search engines.

i. Target 1: obtain the course information for S779.

ii. Target 2: obtain the price of the new Samsung Tablet.

iii. Target 3: obtain the manual for installing Tera Term.

iv. Target 4: obtain the Oracle SQL tutorial.

v. Target 5: obtain the price of the new Xbox One.

Third, select the first 20 results in both search engines. If a result returns the target, mark it as a relevant document; otherwise, mark it as irrelevant. Assume that there are 14 relevant documents in total (retrieved and not retrieved).
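The precision and recall values at each rank can be computed from the relevance judgments as in the sketch below. The function name `precision_recall_at_ranks` and the sample judgments are hypothetical; only the total of 14 relevant documents comes from the task statement.

```python
def precision_recall_at_ranks(relevance, total_relevant=14):
    """Return a (recall, precision) pair after each retrieved document.

    relevance: booleans for the ranked results, in rank order.
    total_relevant: relevant documents overall (14 in this assignment).
    """
    points = []
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    return points

# Hypothetical judgments for the first four results of one query.
sample = [True, False, True, True]
for recall, precision in precision_recall_at_ranks(sample):
    print(f"recall={recall:.3f} precision={precision:.3f}")
```

With only 20 results inspected and 14 relevant documents assumed, recall reaches 1.0 only if all 14 relevant documents appear among the first 20 results.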

The following questions are based on your search results.

a) List your target, results, and designed search queries (you can use any keywords you think are related to the target).

Get the precision and recall values for 20 documents for query 1 in search engine 1. Interpolate them to 11 standard recall levels. Then plot them into a chart.

Get the precision and recall values for 20 documents for query 2 in search engine 1. Interpolate them to 11 standard recall levels. Then plot them into the same chart as above.

Now find the average precision of query 1 and query 2 for search engine 1 and plot it into the same chart.

So you will have a total of three curves in a single chart.

b) List your target, results, and designed search queries.

Get the precision and recall values for 20 documents for query 1 in search engine 2. Interpolate them to 11 standard recall levels. Then plot them into a chart.

Get the precision and recall values for 20 documents for query 2 in search engine 2. Interpolate them to 11 standard recall levels. Then plot them into the same chart as above.

Now find the average precision of query 1 and query 2 for search engine 2 and plot it into the same chart.

So you will have a total of three curves in a single chart, separate from that of part (a).

Plot the averages for Search Engine 1 and Search Engine 2 on a separate chart, and compare the algorithms in terms of precision and recall. Which search engine do you think is superior? Why?
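The interpolation and averaging steps in parts (a) and (b) can be sketched as follows. This assumes the usual interpolation rule, where the precision at a standard recall level is the highest precision observed at any recall greater than or equal to that level, and 0 if that recall is never reached. The function names and the sample points are hypothetical.

```python
def interpolate_11pt(points):
    """Interpolate (recall, precision) pairs to recall levels 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    eps = 1e-9  # tolerance for floating-point comparison of recall values
    return [max((p for r, p in points if r >= level - eps), default=0.0)
            for level in levels]

def average_curves(curves):
    """Average several interpolated curves level by level."""
    return [sum(values) / len(values) for values in zip(*curves)]

# Hypothetical (recall, precision) points for two queries on one engine.
query1 = [(0.5, 1.0), (0.5, 0.5), (1.0, 0.6)]
query2 = [(0.2, 0.5), (0.4, 0.4)]

curve1 = interpolate_11pt(query1)
curve2 = interpolate_11pt(query2)
print(average_curves([curve1, curve2]))
```

The three lists `curve1`, `curve2`, and their average correspond to the three curves per search engine; any plotting library can draw them against the recall levels 0.0 to 1.0.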






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd