Reference no: EM132287884
Overview
The Institute for Statistics Education at Statistics.com asks students to rate a variety of aspects of a course as soon as the student completes it. The Institute is contemplating instituting a recommendation system that would provide students with recommendations for additional courses as soon as they submit their rating for a completed course. Consider the excerpt from student ratings of online statistics courses shown in Table 1 below, and the problem of what to recommend to student E.N.
Table 1
Ratings of online statistics courses: 4 = best, 1 = worst, blank = not taken.
[Table image: association table week 6.png]
In R, your job is to:
Consider a user-based collaborative filter. This requires computing correlations between all student pairs. For which students is it possible to compute correlations with E.N.? Compute them.
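As a rough illustration of the mechanics, here is a minimal sketch in base R. The `ratings` matrix holds made-up toy values, not the Table 1 data, and the student and course labels (`S1`..`S4`, `C1`..`C4`) and the helper `user_cor` are hypothetical names. It assumes the correlation is taken over the courses that both students have rated; conventions differ on whether the means come from co-rated courses only or from all of a student's ratings.

```r
# Toy ratings matrix (made-up values, NOT Table 1): rows = students, columns = courses.
ratings <- matrix(
  c( 4, NA,  3,  2,
     3,  4, NA,  4,
     4, NA,  2,  3,
    NA,  4,  2, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3", "S4"), c("C1", "C2", "C3", "C4"))
)

# Pearson correlation between two students over the courses both have rated.
# Returns NA when fewer than two courses overlap, or when either student's
# co-rated ratings are constant (the correlation is undefined in both cases).
user_cor <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  if (sum(both) < 2) return(NA_real_)
  x <- a[both]
  y <- b[both]
  if (sd(x) == 0 || sd(y) == 0) return(NA_real_)
  cor(x, y)
}

target <- "S3"  # hypothetical stand-in for the student we want to recommend to
others <- setdiff(rownames(ratings), target)
cors <- sapply(others, function(s) user_cor(ratings[target, ], ratings[s, ]))
print(cors)
```

In this toy matrix `S4` shares only one rated course with `S3`, so the function returns `NA` for that pair; the same check applied to the real table shows which students can be compared with E.N.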
Then, tell me:
Which single course should we recommend to E.N. based on the single nearest student to E.N.? Explain why.
Based on the cosine similarities of the nearest students to E.N., which course should be recommended to E.N.?
What is the conceptual difference between using the correlation as opposed to cosine similarities? [Hint: how are the missing values in the matrix handled in each case?]
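To make the hint about missing values concrete, the sketch below computes cosine similarity on a toy matrix (made-up values, not Table 1). It assumes one common convention, in which a blank rating is replaced by 0 before the similarity is computed; the helper name `cosine_sim` and the labels are hypothetical.

```r
# Toy ratings matrix (made-up values, NOT Table 1): rows = students, columns = courses.
ratings <- matrix(
  c( 4, NA,  3,  2,
     3,  4, NA,  4,
     4, NA,  2,  3,
    NA,  4,  2, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3", "S4"), c("C1", "C2", "C3", "C4"))
)

# Cosine similarity with missing ratings treated as 0 (an assumed convention).
# Every course contributes to the vector lengths, rated or not.
cosine_sim <- function(a, b) {
  a[is.na(a)] <- 0
  b[is.na(b)] <- 0
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

target <- "S3"  # hypothetical stand-in for the student we want to recommend to
others <- setdiff(rownames(ratings), target)
cos_sims <- sapply(others, function(s) cosine_sim(ratings[target, ], ratings[s, ]))
print(round(cos_sims, 3))
```

Under this convention a similarity is defined even for a pair such as `S3` and `S4`, who share only one rated course, whereas a correlation restricted to co-rated courses would be undefined for them.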
Then:
With large datasets, it is computationally difficult to compute user-based recommendations in real time, and an item-based approach is used instead. Returning to the rating data (not the binary matrix), let's now take that approach.
If the goal is still to find a recommendation for E.N., for which course pairs is it possible and useful to calculate correlations?
Just looking at the data, and without yet calculating course pair correlations, which course would you recommend to E.N., relying on item-based filtering? Calculate two course pair correlations involving your guess and report the results.
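One way to see which course pairs can support a correlation is to count, for every pair, how many students rated both courses. The sketch below does this on a toy matrix (made-up values, not Table 1). The target label `T`, the minimum of two co-raters, and the reading of "useful" as "one course rated by the target, the other not" are all assumptions made for illustration.

```r
# Toy ratings matrix (made-up values, NOT Table 1); "T" is a hypothetical target student.
ratings <- matrix(
  c( 4,  3,  2,  4,  1,
     3,  4,  1,  3,  2,
     2,  2,  4,  1,  4,
     4, NA,  3,  4, NA,
    NA,  3,  4,  2,  3,
     4,  3,  1, NA, NA),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D", "E", "T"),
                  c("C1", "C2", "C3", "C4", "C5"))
)

# 1 where a rating exists, 0 where it is blank.
rated <- (!is.na(ratings)) * 1

# Entry [i, j] = number of students who rated both course i and course j.
co_counts <- crossprod(rated)
print(co_counts)

# Assumed reading of "possible and useful": at least two co-raters (possible),
# and the pair links a course the target has rated (row) to one not yet taken (column).
target_rated <- !is.na(ratings["T", ])
useful <- outer(target_rated, !target_rated) & co_counts >= 2
print(useful)
```

A correlation based on only two co-raters is always +1 or -1 when it is defined, so a higher threshold may be more sensible on real data.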
Finally:
Apply item-based collaborative filtering to this dataset (using R) and, based on the results, recommend a course to E.N.
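A minimal base R sketch of item-based filtering follows, again on a toy matrix (made-up values, not Table 1). It assumes item similarity is the Pearson correlation over students who rated both courses, and scores each course the target has not taken by a similarity-weighted average of the target's own ratings, using only positively correlated courses. The scoring rule, the target label `T`, and the helper `score_item` are illustrative choices, not a prescribed method.

```r
# Toy ratings matrix (made-up values, NOT Table 1); "T" is a hypothetical target student.
ratings <- matrix(
  c( 4,  3,  2,  4,  1,
     3,  4,  1,  3,  2,
     2,  2,  4,  1,  4,
     4, NA,  3,  4, NA,
    NA,  3,  4,  2,  3,
     4,  3,  1, NA, NA),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D", "E", "T"),
                  c("C1", "C2", "C3", "C4", "C5"))
)

# Course-by-course correlations, each computed over the students who rated both courses.
item_sim <- cor(ratings, use = "pairwise.complete.obs")

target  <- "T"
rated   <- names(which(!is.na(ratings[target, ])))
unrated <- names(which(is.na(ratings[target, ])))

# Score an unrated course as the weighted average of the target's ratings on
# positively correlated courses (weights = correlations). NA if none qualify.
score_item <- function(j) {
  sims <- item_sim[j, rated]
  keep <- !is.na(sims) & sims > 0
  if (!any(keep)) return(NA_real_)
  sum(sims[keep] * ratings[target, rated[keep]]) / sum(sims[keep])
}

scores <- sapply(unrated, score_item)
print(round(scores, 3))

recommended <- names(which.max(scores))
print(recommended)
```

To apply the same steps to the assignment, replace the toy matrix with the ratings from Table 1 and set `target` to E.N.'s row label. Packages such as recommenderlab offer a ready-made item-based method, but the base R version keeps each step visible.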