Write some code to create a list named categories

Reference no: EM132361630

Exercise 1 - Write some code to create a list named 'categories' that lists unique categories sorted alphabetically from the 'category' column of the data.
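One possible approach to Exercise 1 is sketched below. It assumes each entry of the 'category' column is a comma-separated string, as described in Exercise 2. The helper name `build_categories` and the sample values are hypothetical, and skipping missing entries is an assumption rather than part of the exercise.

```python
def build_categories(column):
    """Return the unique categories found in `column`, sorted alphabetically.

    `column` is any iterable of comma-separated strings, for example
    data['category']. Non-string entries (such as NaN or None) are skipped.
    """
    unique = set()
    for entry in column:
        if not isinstance(entry, str):
            continue  # assumption: missing values carry no categories
        for name in entry.split(","):
            name = name.strip()
            if name:
                unique.add(name)
    return sorted(unique)


# Hypothetical sample standing in for data['category']
sample_column = ["Card Game, Fantasy", "Economic", "Fantasy, Economic", None]
categories = build_categories(sample_column)
print(categories)  # ['Card Game', 'Economic', 'Fantasy']
```

With the real dataset the call would presumably be `categories = build_categories(data['category'])`.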

Exercise 2 - Write a function

python

def get_normalized_category_vector(game_categories,categories):

that takes two inputs

1. 'game_categories', a string of comma-separated categories (think of this input as one entry from the 'category' column of the data); and

2. 'categories', a list of alphabetically sorted categories created in Exercise 1;

and returns a _normalized category vector_, defined below, as a 1-D numpy array.

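A _category vector_ is defined as a vector of 1's and 0's where an entry is 1 if the board game has the corresponding category as one of its categories, or 0 otherwise. A _normalized category vector_ is a category vector divided by its Euclidean norm, so that it has unit length; for example, a game with two categories has 1/sqrt(2) in each of its two non-zero entries.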
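A minimal sketch of such a function follows. It assumes that "normalized" means dividing the category vector by its Euclidean norm, which is consistent with the 1/sqrt(2) and 1/sqrt(3) entries in the Exercise 3 example. Returning an all-zero vector for a game with no recognized category is an assumption, and the sample category list is hypothetical.

```python
import numpy as np


def get_normalized_category_vector(game_categories, categories):
    """Return the unit-length category vector of one game as a 1-D array."""
    # Categories of this game, with surrounding whitespace removed
    present = {name.strip() for name in game_categories.split(",")}
    # 1.0 where the game has the category, 0.0 otherwise
    vector = np.array([1.0 if c in present else 0.0 for c in categories])
    norm = np.linalg.norm(vector)
    # Assumption: a game with no known category keeps an all-zero vector
    return vector / norm if norm > 0 else vector


# Hypothetical category list and entry
categories = ["Card Game", "Economic", "Fantasy"]
vec = get_normalized_category_vector("Fantasy, Card Game", categories)
print(vec)  # first and last entries equal 1/sqrt(2), the middle one is 0
```

Because the vector contains only 0's and 1's before scaling, its norm is the square root of the number of categories the game has.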

Exercise 3 - Write a function

python

def get_similarity_score(v1, v2):

that takes two normalized category vectors (as 1-D numpy arrays) as inputs and returns a _cosine similarity score_ as an output.

The cosine similarity of two normalized vectors is their dot product. As an example:

python

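import numpy as np
from math import sqrt

v1 = np.array([0, 1/sqrt(2), 0, 0, 0, 1/sqrt(2), 0])
v2 = np.array([1/sqrt(3), 1/sqrt(3), 0, 0, 0, 0, 1/sqrt(3)])

# Compare with a tolerance: exact floating-point equality is fragile.
assert np.isclose(get_similarity_score(v1, v2), 1/sqrt(6))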

If you feel you need more details, see: https://en.wikipedia.org/wiki/Cosine_similarity.
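The general formula is cos(v1, v2) = (v1 . v2) / (|v1| |v2|); when both vectors already have unit length the denominator is 1, which leaves only the dot product. In the example above the two vectors share a single non-zero position, so the score is (1/sqrt(2)) * (1/sqrt(3)) = 1/sqrt(6). A minimal sketch, assuming both inputs are already normalized:

```python
from math import sqrt

import numpy as np


def get_similarity_score(v1, v2):
    """Cosine similarity of two already-normalized 1-D vectors."""
    # For unit-length vectors the cosine similarity is just the dot product
    return float(np.dot(v1, v2))


v1 = np.array([0, 1/sqrt(2), 0, 0, 0, 1/sqrt(2), 0])
v2 = np.array([1/sqrt(3), 1/sqrt(3), 0, 0, 0, 0, 1/sqrt(3)])
score = get_similarity_score(v1, v2)
# Floats rarely match exactly, so the check uses a tolerance
assert np.isclose(score, 1/sqrt(6))
```

Note that this sketch does not re-normalize its inputs; passing unnormalized vectors would return a plain dot product rather than a cosine similarity.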

Exercise 4 - Write some code to create a sparse CSR matrix named 'game_graph' that represents a game graph as described previously.

A few points to note:

1. The input dataset, named 'data', has 4999 games.

2. Take the index of a game in the input dataframe to be the game's index. Index 0 of the input dataframe should correspond to row 0 and column 0 of the output sparse matrix.

3. You will need to calculate the normalized category vector for each game.

4. You will then need to find the similarity between each pair of games.

5. The final output **game_graph** should be a 4999x4999 CSR sparse matrix.

A few more points to note:

1. 4999x4999 is a fairly large matrix.

2. The 4999 normalized category vectors, each of size 1x84, also form a large matrix (4999x84).

3. Be cautious when using for loops with normal numpy arrays, as they will take a considerable amount of time to run.

4. Storing these large matrices in a sparse matrix format should improve performance significantly.

5. For efficiency's sake, sparse matrix operations like 'vstack()', 'transpose()', and 'dot()' may prove convenient.
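The hints above can be sketched with `scipy.sparse` as follows. The game graph description is not reproduced in this section, so the sketch assumes the graph is simply the matrix whose (i, j) entry is the similarity score between games i and j; any threshold or diagonal handling from the original description is not reflected here. The helper name `build_game_graph` and the sample data are hypothetical.

```python
import numpy as np
from scipy import sparse


def build_game_graph(category_column, categories):
    """Pairwise cosine similarities of all games as a CSR sparse matrix."""
    position = {name: j for j, name in enumerate(categories)}
    rows = []
    for entry in category_column:
        # Column indices of the categories this game has
        cols = np.array(
            sorted({position[c.strip()] for c in entry.split(",")
                    if c.strip() in position}),
            dtype=int,
        )
        # Each non-zero entry of a normalized 0/1 vector is 1/sqrt(count)
        values = np.full(len(cols), 1.0 / np.sqrt(max(len(cols), 1)))
        row = sparse.csr_matrix(
            (values, (np.zeros(len(cols), dtype=int), cols)),
            shape=(1, len(categories)),
        )
        rows.append(row)
    # Stack the 1 x n_categories rows into an n_games x n_categories matrix
    vectors = sparse.vstack(rows, format="csr")
    # All pairwise dot products in one sparse matrix product
    return vectors.dot(vectors.transpose()).tocsr()


# Hypothetical stand-ins for data['category'] and the Exercise 1 list
categories = ["Card Game", "Economic", "Fantasy"]
sample_column = ["Card Game, Fantasy", "Economic", "Fantasy, Economic"]
game_graph = build_game_graph(sample_column, categories)
print(game_graph.shape)  # (3, 3)
```

The single sparse product replaces a double loop over all pairs of games. With the full dataset, `build_game_graph(data['category'], categories)` would be expected to return a 4999x4999 matrix, provided every entry of the column is a string.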
