Collaborative filtering

When building a model from a user’s profile, a distinction is often made between explicit and implicit forms of data collection. Examples of explicit data collection include the following:

Asking a user to rate an item on a sliding scale.
Asking a user to search.
Asking a user to rank a collection of items from favorite to least favorite.
Presenting two items to a user and asking them to choose the better one.
Asking a user to create a list of items that they like.

Examples of implicit data collection include the following:

Observing the items that a user views in an online store.
Analyzing how long a user spends viewing each item.
Keeping a record of the items that a user purchases online.
Obtaining a list of items that a user has listened to or watched on their computer.
Analyzing the user’s social network and discovering similar likes and dislikes.

The recommender system compares the collected data to similar and dissimilar data collected from others and calculates a list of recommended items for the user. Several commercial and non-commercial examples are listed in the article on collaborative filtering systems.
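
The comparison step described above can be illustrated with a minimal sketch. It assumes explicit ratings stored as one dictionary per user and uses cosine similarity as the measure of how alike two users are; the function names, the sample users, and the choice of similarity measure are illustrative assumptions, not something the article specifies.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[item] * b[item] for item in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    average of the ratings given by other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 4, "C": 5, "D": 1},
    "carol": {"A": 1, "B": 2, "C": 1, "D": 5},
}
print(recommend(ratings, "alice"))
```

Because alice's ratings resemble bob's far more than carol's, bob's opinion of the unseen items carries more weight in the ranking.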

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com’s recommender system. Other examples include:

Last.fm recommends music based on a comparison of the listening habits of similar users.
Facebook, MySpace, LinkedIn, and other social networks use collaborative filtering to recommend new friends, groups, and other social connections (by examining the network of connections between a user and their friends). Twitter uses many signals and in-memory computations to recommend accounts for its users to follow.
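
The "people who buy x also buy y" idea can be sketched by counting how often pairs of items appear in the same purchase. This is a deliberately simplified illustration and not Amazon's actual algorithm; the function names and the sample baskets are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_co_purchase_counts(baskets):
    """Count, for every pair of items, how many baskets contain both."""
    counts = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for x, y in combinations(sorted(set(basket)), 2):
            counts[x][y] += 1
            counts[y][x] += 1
    return counts

def also_bought(counts, item, top_n=2):
    """Items most often bought together with `item`, most frequent first."""
    related = counts.get(item, {})
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(related, key=lambda other: (-related[other], other))[:top_n]

# Hypothetical purchase histories, one list per order.
baskets = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["book", "pen"],
    ["lamp", "desk"],
    ["book", "lamp", "desk"],
]
counts = build_co_purchase_counts(baskets)
print(also_bought(counts, "book"))
```

Raw co-occurrence counts favor popular items, so a practical system would normally normalize them, for example with a similarity measure computed between item vectors.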

Collaborative filtering approaches often suffer from three problems: cold start, scalability, and sparsity.

Cold Start: These systems often require a large amount of existing data on a user in order to make accurate recommendations.
Scalability: In many of the environments in which these systems make recommendations, there are millions of users and products. Thus, a large amount of computing power is often necessary to calculate recommendations.
Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.
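
To give a sense of scale for the sparsity problem, the fraction of user-item pairs that carry a rating can be computed directly. The figures below are hypothetical and chosen only for illustration.

```python
def rating_density(n_users, n_items, n_ratings):
    """Fraction of all possible user-item pairs that have a rating."""
    return n_ratings / (n_users * n_items)

# Hypothetical store: 1 million users, 100,000 items, 50 million ratings.
density = rating_density(1_000_000, 100_000, 50_000_000)
print(f"{density:.4%} of user-item pairs have a rating")
```

Even with 50 ratings per user on average, the rating matrix in this example is almost entirely empty.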

A particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique.
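
A minimal sketch of the matrix factorization idea follows. It assumes ratings are given as (user, item, rating) triples and fits two low-rank factor matrices by stochastic gradient descent on the observed entries only. The function names, hyperparameters, and sample data are illustrative assumptions rather than a reference implementation.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Approximate the rating matrix as P times Q-transpose of the given rank."""
    rng = random.Random(seed)
    # Small random starting values for the user (P) and item (Q) factors.
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step with L2 regularization on both factors.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical observed ratings: (user index, item index, rating).
observed = [
    (0, 0, 5), (0, 1, 4),
    (1, 0, 5), (1, 1, 4), (1, 2, 5), (1, 3, 1),
    (2, 0, 1), (2, 1, 2), (2, 2, 1), (2, 3, 5),
]
P, Q = factorize(observed, n_users=3, n_items=4)
print("training RMSE:", rmse(observed, P, Q))
print("predicted rating for user 0, item 2:", predict(P, Q, 0, 2))
```

The missing entries of the rating matrix are never used during training; once the factors are fitted, the dot product for an unobserved (user, item) pair serves as the predicted rating.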

From: Wikipedia
