What features would you use for your machine learning

Reference no: EM133284824

Natural Language Processing

Question 1. What is the Distributional Hypothesis in the context of distributional semantics?

Give a short explanation with some examples.
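As one possible illustration of the idea that words occurring in similar contexts tend to have similar meanings, the sketch below counts neighbouring words in a tiny corpus and compares the resulting context vectors. The corpus, the window size and the function names are assumptions made up for this example, not part of the assignment.

```python
import math
from collections import Counter, defaultdict

# Toy corpus invented for illustration; it is not taken from the assignment.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat eats fish",
    "the dog eats meat",
]


def context_vectors(sentences, window=1):
    """Count, for each word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = context_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat~dog: {cat_dog:.2f}  cat~milk: {cat_milk:.2f}")
```

In this toy corpus "cat" and "dog" share exactly the same neighbours, so their vectors are far more similar than those of "cat" and "milk", even though "cat" and "dog" never appear in the same sentence.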

Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two widely used techniques for topic modeling. Give a short overview of the two approaches and any similarities/differences between them.

Question 2. a. You are a data scientist for an electronics e-commerce site that also supports third-party sellers. You would like to build a system that finds and matches identical products listed by different sellers on your website so that you can present them on a single product page. You decide to use product titles to compute product similarity. Which similarity metric, Jaccard or cosine, would you use and why?
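For reference, the two metrics can be sketched on tokenized titles as follows. This is a minimal sketch that assumes naive whitespace tokenization and raw term counts; the two titles are hypothetical and do not come from the assignment's table.

```python
import math
from collections import Counter


def tokenize(title: str) -> list[str]:
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return title.lower().split()


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity: size of the token intersection over the union."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a: str, b: str) -> float:
    """Cosine similarity over raw term-count vectors."""
    vec_a, vec_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical titles, invented for illustration only.
title_1 = "Acme 55 inch 4K smart TV"
title_2 = "Acme smart TV 55 inch 4K UHD"

print(jaccard(title_1, title_2))
print(cosine(title_1, title_2))
```

Jaccard ignores how often a token occurs and treats every token equally, while cosine works on vectors and therefore accepts term weights, which matters once some tokens are known to be more informative than others.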

b. Consider the following table, which lists electronic items for sale on two e-commerce shopping websites. Products in row 1 are the same product, products in row 2 are different TV models of the same brand, and products in row 3 are different products.

Considering your answer to 2a, will your similarity calculation approach work on this dataset? Explain with examples.

c. Suppose that you are given IDF scores for all tokens. Can this help you come up with a better approach for computing title similarity? Explain with examples.
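One way IDF scores could be used is to weight each token before computing cosine similarity, so that rare tokens such as model numbers count for more than tokens shared by most titles. The sketch below assumes binary token presence multiplied by IDF; the titles, the IDF values and the `default_idf` fallback are all invented for illustration.

```python
import math


def idf_weighted_cosine(a: str, b: str, idf: dict[str, float],
                        default_idf: float = 1.0) -> float:
    """Cosine similarity where each token present in a title has weight idf[token]."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())

    def weight(token: str) -> float:
        # Tokens missing from the table fall back to default_idf (an assumption).
        return idf.get(token, default_idf)

    dot = sum(weight(t) ** 2 for t in tokens_a & tokens_b)
    norm_a = math.sqrt(sum(weight(t) ** 2 for t in tokens_a))
    norm_b = math.sqrt(sum(weight(t) ** 2 for t in tokens_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical titles for two different TV models of the same brand.
model_a = "Acme smart TV 55 inch X100"
model_b = "Acme smart TV 55 inch X200"

# Hypothetical IDF scores: frequent tokens score low, model numbers score high.
idf_scores = {"acme": 1.0, "smart": 0.5, "tv": 0.5, "55": 1.0,
              "inch": 0.5, "x100": 4.0, "x200": 4.0}

unweighted = idf_weighted_cosine(model_a, model_b, idf={})
weighted = idf_weighted_cosine(model_a, model_b, idf=idf_scores)
print(f"unweighted: {unweighted:.3f}  idf-weighted: {weighted:.3f}")
```

With uniform weights the two titles look nearly identical because they differ in a single token; with the assumed IDF scores the differing model numbers dominate and the similarity drops sharply.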

Question 3. a. Recommender systems are a subtype of information filtering systems that help users discover new and relevant items by presenting items similar to their previous interactions or preferences. Some famous examples of recommender systems are Amazon's "Books you may like" and Netflix's "Because you watched" carousels.

You are building a recommender system for your food delivery service startup and have data on co-purchases for food items f1, f2, . . ., fn (for example, food item f1 is commonly bought together with food item f4). How can you use techniques such as Word2Vec to recommend similar items to users who have bought or shown interest in any one of the items?

b. Word2Vec implements two different neural models: skip-gram and continuous bag of words (CBOW). Briefly explain the differences between the two models. Under which circumstances would you prefer the skip-gram model over CBOW?
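The difference between the two models shows up in how training examples are formed. The sketch below treats one co-purchase basket as a "sentence" of item IDs and builds the examples each model would train on; it does not train a network. The basket contents, the window size and the function names are assumptions for illustration.

```python
def skipgram_pairs(sequence, window=2):
    """Skip-gram: each center item predicts every item in its context window."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


def cbow_examples(sequence, window=2):
    """CBOW: the whole context window is combined to predict the center item."""
    examples = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        context = [sequence[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples


# Hypothetical basket: items f1, f4 and f7 were bought together.
basket = ["f1", "f4", "f7"]

for center, context_item in skipgram_pairs(basket, window=1):
    print(f"skip-gram: {center} -> {context_item}")

for context, center in cbow_examples(basket, window=1):
    print(f"CBOW: {context} -> {center}")
```

Skip-gram yields one training pair per (center, context) combination, while CBOW yields one example per position with the context items combined into a single input. Because item order inside a basket is usually arbitrary, a window wide enough to cover the whole basket may be a reasonable choice.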

Question 4. You are building a product classification system for an online electronics store. The system should classify an incoming stream of millions of products into one of the 3000+ leaf-level product types in the taxonomy, such as laptops, smart TVs, wireless headphones and car speakers, among others. The system should be very precise because it is important to assign products to the right category to facilitate the customer shopping experience. Each instance in your dataset has product title, description and image fields. See example below:

What features would you use for your machine learning-based classifier?

Assume that you only have access to product titles in your dataset (i.e., you have less data to work with) rather than product titles, descriptions and images. How will this affect feature engineering and the NLP pipeline for your classifier?

Obtaining training data is paramount for a large-scale classification system. You have a limited budget and can't hire an army of analysts to manually label every single instance. Discuss some strategies for obtaining training data for the classifier.

How would you handle products that are misclassified?

Question 5. a. Sentiment analysis: consider the following review of a restaurant:

"I took my father out for dinner to Le Bistro on New Year's Eve. The décor and service were fantastic. We enjoyed the food, especially their French countryside specials and their Chardonnay collections. However, my father thought the menu prices were a bit on the high side. Valet parking was also expensive. Overall, we definitely recommend Le Bistro for special occasions!"

Overall rating: 8 stars out of 10

Identify the opinion object(s), feature(s), opinion(s), opinion holder(s) and opinion time in this review.

b. Design a sentiment analysis system for restaurant reviews (see the example in 5a). Your answer should make use of the techniques discussed in class. The output of the system should assign a sentiment label of Positive or Negative to reviews.
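The techniques discussed in class are not listed here, so the sketch below swaps in a very simple lexicon-based baseline to show the expected input and output of such a system. The word lists, the tie-breaking rule and the function name are assumptions made up for illustration; a real design would more likely use a trained classifier.

```python
import re

# Hypothetical mini-lexicons, invented for illustration only.
POSITIVE = {"fantastic", "enjoyed", "recommend", "great", "delicious"}
NEGATIVE = {"expensive", "overpriced", "rude", "bland", "slow"}


def sentiment_label(review: str) -> str:
    """Label a review by counting lexicon hits; ties default to Positive."""
    tokens = re.findall(r"[a-z']+", review.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return "Positive" if positive_hits >= negative_hits else "Negative"


print(sentiment_label("The service was fantastic and we enjoyed the food."))
print(sentiment_label("Rude staff and bland, overpriced food."))
```

A baseline like this ignores negation ("not great"), intensity and aspect-level opinions, which is where features such as n-grams or a supervised model would come in.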
