PCI – Chapter Two – Recommending Movies For Multiple Users

In order to provide recommendations for multiple critics, I have revised the SQL Schema and the queries for both saving pearson similarity metric for obtaining the movie rankings for a user.

First step is to rename table PearsonSim944 to PearsonSimCritics and add a column ‘CurrentUserId’.

For every user who you want to find movie recommendations for, run the following query:

INSERT INTO PearsonSimCritics(UserId,MoviesRatedCommon,PearsonSim,CurrentUserId)
SELECT b.UserId ‘UserId’,
count(a.rating) ‘MoviesRatedCommon’,
(sum(a.rating*b.rating)-(sum(a.rating)*sum(b.rating)/count(*))) / (sqrt((sum(pow(a.rating,2))-(pow(sum(a.rating),2)/count(*)))*(sum(pow(b.rating,2))-(pow(sum(b.rating),2)/count(*))))) ‘PearsonSim’,
FROM data a INNER JOIN data b ON a.itemid = b.itemid AND a.userId b.userId
INNER JOIN data c ON b.userId = c.userId
WHERE a.userId = 944
GROUP BY b.userid
ORDER BY ‘PearsonSim’ DESC

…and for obtaining movie recommendations for such users you’ve calculated similarity metrics for:

SELECT sum(b.rating*a.PearsonSim)/sum(a.PearsonSim) ‘Suggested Rating’,
k.rating ‘my rating’ , c.Title, a.CurrentUserId
FROM data b INNER JOIN PearsonSimCritics a ON b.userId = a.UserId
INNER JOIN item c ON b.itemId = c.movieId
LEFT JOIN data k ON c.movieId = k.itemId and k.userId = a.CurrentUserId
WHERE a.CurrentUserId = 944
HAVING count(b.ItemId) > 15

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s