Solution-Write some code to create a list named categories

Reference no: EM132361630

Exercise 1 - Write some code to create a list named 'categories' that lists unique categories sorted alphabetically from the 'category' column of the data.
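One possible approach to Exercise 1 is sketched below. It assumes each entry of the 'category' column is a comma-separated string, as Exercise 2 describes. The helper name `build_categories` and the sample entries are hypothetical and only serve the illustration; with the real dataframe you would pass `data['category']` instead.

```python
def build_categories(column):
    """Collect unique category names from comma-separated entries and sort them."""
    unique = set()
    for entry in column:
        if not isinstance(entry, str):
            continue  # skip missing values such as None or NaN
        for name in entry.split(","):
            name = name.strip()  # tolerate spaces after commas
            if name:
                unique.add(name)
    return sorted(unique)


# Hypothetical stand-in for data['category']; not taken from the real dataset.
sample_column = ["Economic,Negotiation", "Card Game, Fantasy", None, "Economic,Card Game"]
categories = build_categories(sample_column)
print(categories)  # ['Card Game', 'Economic', 'Fantasy', 'Negotiation']
```

Note that `sorted` orders strings by code point, so uppercase names sort before lowercase ones. If the data mixes cases, a different sort key may be needed.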

Exercise 2 - Write a function

```python
def get_normalized_category_vector(game_categories, categories):
```

that takes two inputs

1. 'game_categories', a string of comma-separated categories (think of this input as an entry from the 'category' column of the data); and

2. 'categories', a list of alphabetically sorted categories created in Exercise 1;

and returns a _normalized category vector_, defined below, as a 1-D numpy array.

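A _category vector_ is a vector of 1's and 0's in which an entry is 1 if the board game has the corresponding category as one of its categories, and 0 otherwise. A _normalized category vector_ is a category vector scaled to unit Euclidean length, so a game with exactly two categories has two entries equal to 1/sqrt(2), as in the Exercise 3 example.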
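A minimal sketch of Exercise 2 follows. It assumes that "normalized" means scaled to unit Euclidean (L2) length, which is consistent with the 1/sqrt(2) and 1/sqrt(3) entries in the Exercise 3 example. Returning an all-zero vector for a game with no recognized category is a choice made here, not something the exercise specifies, and the sample category names are hypothetical.

```python
import numpy as np


def get_normalized_category_vector(game_categories, categories):
    """Return a unit-length 0/1 indicator vector over `categories`."""
    present = {name.strip() for name in game_categories.split(",")}
    vector = np.array([1.0 if name in present else 0.0 for name in categories])
    norm = np.linalg.norm(vector)
    # Assumption: leave an all-zero vector unchanged to avoid dividing by zero.
    return vector / norm if norm > 0 else vector


# Hypothetical category list for illustration only.
example_categories = ["Card Game", "Economic", "Fantasy", "Negotiation"]
example_vector = get_normalized_category_vector("Economic, Negotiation", example_categories)
print(example_vector)
```

Here the two matching positions each hold 1/sqrt(2) and the rest are 0, so the vector has length 1.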

Exercise 3 - Write a function

```python
def get_similarity_score(v1, v2):
```

that takes two normalized category vectors (as 1-D numpy arrays) as inputs and returns a _cosine similarity score_ as an output.

The cosine similarity of two normalized vectors is their dot product. As an example:

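```python
from math import sqrt

import numpy as np

v1 = np.array([0, 1/sqrt(2), 0, 0, 0, 1/sqrt(2), 0])
v2 = np.array([1/sqrt(3), 1/sqrt(3), 0, 0, 0, 0, 1/sqrt(3)])

# Compare with a tolerance: exact == on floats can fail because of rounding.
assert np.isclose(get_similarity_score(v1, v2), 1/sqrt(6))
```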

If you feel you need more details, see: https://en.wikipedia.org/wiki/Cosine_similarity.
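Because both inputs are already unit length, Exercise 3 reduces to a single dot product. The sketch below shows one way to write it; it does not re-normalize its inputs, which is an assumption that follows from the exercise statement.

```python
import numpy as np


def get_similarity_score(v1, v2):
    """Cosine similarity of two unit-length vectors, which is their dot product."""
    return float(np.dot(v1, v2))


# The two vectors from the exercise's example, written as scaled 0/1 vectors.
v1 = np.array([0, 1, 0, 0, 0, 1, 0]) / np.sqrt(2)
v2 = np.array([1, 1, 0, 0, 0, 0, 1]) / np.sqrt(3)
score = get_similarity_score(v1, v2)
print(score)
```

The vectors share exactly one category, so the score is (1/sqrt(2)) * (1/sqrt(3)) = 1/sqrt(6). A vector compared with itself scores 1.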

Exercise 4 - Write some code to create a sparse CSR matrix named 'game_graph' that represents a game graph as described previously.

A few points to note:

1. The input dataset, named 'data', has 4999 games.

2. Take the index of a game in the input dataframe to be the game's index. Index 0 of the input dataframe should correspond to row 0 and column 0 of the output sparse matrix.

3. You will need to calculate the normalized category vector for each of the games.

4. You will then need to find the similarity between each pair of games.

5. The final output **game_graph** should be a 4999x4999 CSR sparse matrix.

A few more points to note:

1. 4999x4999 is a fairly large matrix.

2. The 4999 normalized game category vectors, each of size 1x84, also form a large matrix.

3. Be cautious when using for loops with normal numpy arrays as they will take a considerable amount of time to run.

4. Storing these large matrices in a sparse matrix format would improve performance significantly.

5. For efficiency's sake, sparse matrix operations like 'vstack()', 'transpose()', and 'dot()' may prove convenient.
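The notes above suggest stacking the normalized vectors into one sparse matrix and multiplying it by its transpose, which yields every pairwise dot product at once. The sketch below follows that idea with `scipy.sparse`. It assumes that entry (i, j) of the game graph is the cosine similarity between games i and j; the graph definition itself is not restated here, so whether the diagonal (self-similarity) should be kept is left open. The helper name `build_game_graph` and the sample data are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack


def build_game_graph(category_entries, categories):
    """Return a CSR matrix of pairwise cosine similarities between games."""
    column_of = {name: j for j, name in enumerate(categories)}
    rows = []
    for entry in category_entries:
        names = {name.strip() for name in entry.split(",")}
        cols = np.array(sorted(column_of[n] for n in names if n in column_of), dtype=int)
        # Each nonzero entry of a unit-length 0/1 vector with k ones is 1/sqrt(k).
        values = np.full(len(cols), 1.0 / np.sqrt(len(cols))) if len(cols) else np.zeros(0)
        row_index = np.zeros(len(cols), dtype=int)
        rows.append(csr_matrix((values, (row_index, cols)), shape=(1, len(categories))))
    vectors = vstack(rows, format="csr")  # shape: (number of games, number of categories)
    # One sparse product gives all pairwise dot products.
    return vectors.dot(vectors.transpose()).tocsr()


# Hypothetical stand-ins for data['category'] and the Exercise 1 list.
sample_entries = ["Economic,Negotiation", "Card Game,Fantasy", "Economic,Card Game"]
sample_categories = ["Card Game", "Economic", "Fantasy", "Negotiation"]
game_graph = build_game_graph(sample_entries, sample_categories)
print(game_graph.toarray())
```

With the real data, `data['category']` would take the place of `sample_entries`, and the result would be 4999x4999. Row order follows input order, so row 0 matches index 0 of the dataframe.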
