Module: Principles of Data Science
Learning Outcome 1: Demonstrate systematic knowledge and critical understanding of topics in linear algebra, probability and statistical models relevant to data science.
Task: This Individual Coursework involves investigating the different properties and operations involving vectors, matrices and probability, including Markov chains and Eigenfaces. You are encouraged to explore the topic, use your initiative, and show some originality, within the time available. Make sure you read each task through carefully, and answer all parts of each task.
Task 1: Markov Chains
Example: Flowers. Consider a particular variety of flower which can take one of three colours: red, white and pink. Suppose after each year, 40% of red flowers remain red but 60% of red flowers mutate to pink. For pink flowers, 30% mutate to red, 50% remain as pink, and 20% mutate to white. For white flowers, 90% remain as white and 10% mutate to pink. We can capture these probabilities in a transition matrix P where the columns correspond to red, pink and white flowers in the current year and the rows correspond to red, pink and white flowers in the next year.
        [0.4  0.3  0.0]
    P = [0.6  0.5  0.1]
        [0.0  0.2  0.9]
If a certain garden initially has 80 of this variety of red flowers and 20 of this variety of white flowers then after one year:
      [80]   [32]
    P [ 0] = [50]
      [20]   [18]
so we would expect to see 32 red, 50 pink and 18 white flowers in the garden. If we look at the garden after 5 years:
          [80]       [80]   [20.88]
    PPPPP [ 0] = P^5 [ 0] = [37.08]
          [20]       [20]   [42.04]
so we would expect to see approximately 21 red, 37 pink and 42 white flowers in the garden. This is an example of a system called a "Markov chain". The next state of the system depends only on its present state and does not depend on the earlier history of the system.
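For reference, the flower calculations above can be reproduced in numpy (a quick sketch; numpy.linalg.matrix_power() is the same helper suggested in Task 1 below):

import numpy as np

# Columns correspond to the current colour (red, pink, white); rows to the
# colour in the next year.
P = np.array([[0.4, 0.3, 0.0],
              [0.6, 0.5, 0.1],
              [0.0, 0.2, 0.9]])

x0 = np.array([80, 0, 20])                  # 80 red and 20 white flowers initially
print(P @ x0)                               # after one year: [32. 50. 18.]
print(np.linalg.matrix_power(P, 5) @ x0)    # after five years: approx [20.88 37.08 42.04]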
Assessment: Mice in a maze. Consider the closed maze shown below containing seven rooms (labelled 0 to 6) with doorways connecting the rooms. A number of mice are placed in the maze and the mice then move randomly from room to room. Each time step, a mouse either stays in the same room (with probability 0.6) or leaves the room it is in by choosing at random (equally likely) one of the doors out of the room.
(1) Construct a 7 × 7 transition matrix P (as a numpy array) to model the mice moving around the maze as a Markov chain.
(2) Suppose that 50 mice are placed in Room 0 and 90 mice are placed in Room 1. All other rooms are initially empty. Use numpy to calculate how many mice we would expect to see in each room at the end of each time step for each of the next 30 time steps. Use Python to plot your results on a line graph showing one line for each room, together with a legend. Briefly comment on what you observe. You might find numpy.linalg.matrix_power() helpful.
(3) Use numpy to find the eigenvectors of the transition matrix P from part (1), and explain how one of these eigenvectors is related to the number of mice we would expect to see in each room in the "long-run steady state". Also, for each room j, find the expected number of time steps for a particular mouse initially located in room j to first return to room j. Hint: if πj is the probability that the particular mouse is in room j in the "long-run steady state" then the expected first return time is given by 1/πj.
(4) Suppose the mouse population is in the "long-run steady state" from part (3) when a large piece of poisoned cheese is placed in Room 6, so that any mouse that enters that room instantly dies.
(a) Modify the transition matrix P from part (1) and estimate the number of time steps until 80% of the mice have died.
(b) The expected number of time steps needed for a mouse starting from room j to reach room k for the first time can be calculated using matrix operations and is denoted mjk. For a particular destination room k, let Q be the transition matrix P from part (4)(a) but with row k and column k both deleted, and let I be the 6 × 6 identity matrix. The sum of each column of the matrix (I - Q)^-1 gives the values mjk. Use this method to find the expected number of time steps needed for a mouse to die, for each of the possible rooms in the maze that the mouse could start from. Briefly comment on what you observe.
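To illustrate the hints in parts (3) and (4)(b), the sketch below applies both methods to the small flower chain from the worked example above, not to the maze (the 7 × 7 maze matrix is yours to construct); treating white as the absorbing state is purely illustrative:

import numpy as np

# Flower transition matrix from the worked example (columns = current state).
P = np.array([[0.4, 0.3, 0.0],
              [0.6, 0.5, 0.1],
              [0.0, 0.2, 0.9]])

# Part (3) hint: the steady state is the eigenvector with eigenvalue 1,
# rescaled so that its entries sum to 1.
vals, vecs = np.linalg.eig(P)
i = np.argmin(np.abs(vals - 1))    # index of the eigenvalue closest to 1
pi = np.real(vecs[:, i])
pi = pi / pi.sum()
print(pi)                          # approx [1/7, 2/7, 4/7]
print(1 / pi)                      # expected first return times: 7, 3.5, 1.75

# Part (4)(b) hint: treat white (state 2) as absorbing, delete its row and
# column to obtain Q, and take column sums of (I - Q)^-1 to get the expected
# numbers of steps to absorption from each remaining starting state.
Q = P[:2, :2]
N = np.linalg.inv(np.eye(2) - Q)
print(N.sum(axis=0))               # approx [9.17, 7.5]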
Task 2. Eigenfaces
The Olivetti Faces dataset consists of 400 images of human faces. Five examples are given below.
Each image is 64 pixels wide by 64 pixels high; the Python code below flattens each image into a 4096-element vector and assembles the 400 images as the columns of the matrix Xall. Each image is greyscale (not colour) and all pixel values are between 0 and 1. The matrix Xsub consists of the last 300 columns of Xall.
The only Python libraries you may use in this task are numpy and matplotlib, except that you may use sklearn only to load the dataset as in the Python code below.
import sklearn.datasets
faces = sklearn.datasets.fetch_olivetti_faces()
Xall = faces.data.T
print(Xall.shape)
print(Xall.min(), Xall.max())
Xsub = Xall[:,100:]
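For reference, the first print statement should display (4096, 400), confirming that each of the 400 images has been flattened into a 4096-element column.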
(1) Use numpy to find the mean of each row of Xsub and reshape as a 4096 × 1 vector (call this xbar). Adapt the Python code below to visualise this "mean face" and compare it to the image stored in the column of Xall corresponding to the last two digits of your Student ID.
plt.imshow(**something**, cmap=plt.cm.gray, vmin=0, vmax=1)
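Note that plt.imshow() expects a 2D array, so the 4096 × 1 vector will need to be reshaped to 64 × 64 before it can be displayed.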
(2) Use numpy.cov() to calculate the covariance matrix of Xsub, giving a 4096 × 4096 matrix (call this matrix C). Then use numpy.linalg.eigh() to find the eigenvalues (V) and eigenvectors (columns of the matrix P) of C. Use the Python code below to reverse the entries in V and the columns of P. Let D be the diagonal matrix consisting of the elements of V along its diagonal. Use numpy to confirm that P is an orthogonal matrix and that C = PDP^T.
Make sure you use numpy.linalg.eigh() not numpy.linalg.eig() for this part.
V = V[::-1]
P = P[:,::-1]
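A minimal sketch of the checks this part asks for, assuming C, V and P have been computed as described above:

import numpy as np

D = np.diag(V)    # the eigenvalues of C along the diagonal of a 4096 x 4096 matrix

# P is orthogonal when its transpose is its inverse, i.e. P^T P = I.
print(np.allclose(P.T @ P, np.eye(P.shape[0])))

# The eigendecomposition should satisfy C = P D P^T up to floating-point error.
print(np.allclose(C, P @ D @ P.T))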
(3) In the following Python code, we apply P^T (as a matrix transformation) to the faces in Xsub, where xbar is the "mean face" (a 4096 × 1 vector) from part (1).
Ysub = P.T@(Xsub-xbar)
(a) The eigenvectors of C (i.e. the columns of P) are called "eigenfaces" (Turk and Pentland, 1991). Visualise the eigenfaces corresponding to the largest 10 eigenvalues. Comment on what you discover. You will need to remove vmin and vmax from the plt.imshow() example given above.
(b) Show (by example) that if y is any one column of Ysub then P@y+xbar perfectly recreates the corresponding column of Xsub. This shows that each "face" in Xsub is a "linear combination" of eigenfaces plus the "mean face".
(c) Although P is a 4096 × 4096 matrix, generally only those eigenfaces corresponding to the largest eigenvalues of ?? are "useful". If there are k such eigenvalues (with k ≥ 2), let Xpartial be the matrix formed from matrix multiplication of the first k columns of P and the first k rows of Ysub, and adding xbar. Investigate how well Xpartial approximates Xsub for different values of k by using the numpy.linalg.norm() function applied to Xpartial minus Xsub (this provides a measure for "goodness of fit"). Build an appropriate line graph as part of your investigation.
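A minimal sketch of the reconstructions described in parts (3)(b) and (3)(c), assuming P, Ysub, xbar and Xsub as defined above; the grid of k values is only an illustrative choice:

import numpy as np
import matplotlib.pyplot as plt

# (3)(b): a single face is perfectly recovered from its coordinates in Ysub.
y = Ysub[:, [0]]                                   # first face, kept as a column
print(np.allclose(P @ y + xbar, Xsub[:, [0]]))     # expect True

# (3)(c): reconstruct from only the top k eigenfaces and measure the error.
ks = list(range(2, 403, 50))                       # illustrative values of k
errors = [np.linalg.norm(P[:, :k] @ Ysub[:k, :] + xbar - Xsub) for k in ks]

plt.plot(ks, errors)
plt.xlabel("number of eigenfaces k")
plt.ylabel("norm of Xpartial - Xsub")
plt.show()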