When building a model from a user’s profile, a distinction is often made between explicit and implicit forms of data collection. Examples of explicit data collection include the following:

  • Asking a user to rate an item on a sliding scale.
  • Asking a user to search.
  • Asking a user to rank a collection of items from favorite to least favorite.
  • Presenting two items to a user and asking them to choose the better one.
  • Asking a user to create a list of items that they like.

Examples of implicit data collection include the following:

  • Observing the items that a user views in an online store.
  • Analyzing item/user viewing times.
  • Keeping a record of the items that a user purchases online.
  • Obtaining a list of items that a user has listened to or watched on their computer.
  • Analyzing the user’s social network and discovering similar likes and dislikes.

The recommender system compares the collected data to similar and dissimilar data collected from others and calculates a list of recommended items for the user. Several commercial and non-commercial examples are listed in the article on collaborative filtering systems.
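The comparison described above can be sketched as user-based collaborative filtering. This is a minimal illustration, not any particular system's implementation: the `ratings` data, the choice of cosine similarity, and the function names `cosine` and `recommend` are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical explicit ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 5, "B": 5, "D": 4},
    "carol": {"A": 1, "C": 5, "E": 3},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, r in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * r
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("alice", ratings))  # ['D', 'E']
```

In this toy data, "alice" shares high ratings with "bob", so the item only "bob" has rated highly ("D") ranks above the item that comes from the less similar "carol".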

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com’s recommender system. Other examples include:

  • Last.fm recommends music based on a comparison of the listening habits of similar users.
  • Facebook, MySpace, LinkedIn, and other social networks use collaborative filtering to recommend new friends, groups, and other social connections (by examining the network of connections between a user and their friends). Twitter uses many signals and in-memory computations to recommend accounts for its users to follow.
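The item-to-item idea mentioned above (people who buy x also buy y) can be illustrated with simple co-purchase counting. This is only a sketch of the general idea under assumed data, not Amazon.com's actual algorithm; the `purchases` histories and the helper names `co_purchase_counts` and `also_bought` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase histories: user -> set of purchased items.
purchases = {
    "u1": {"x", "y", "z"},
    "u2": {"x", "y"},
    "u3": {"x", "w"},
    "u4": {"y", "z"},
}

def co_purchase_counts(purchases):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def also_bought(item, counts, top_n=2):
    """Items most often bought together with `item` (ties broken by name)."""
    related = counts.get(item, {})
    return sorted(related, key=lambda other: (-related[other], other))[:top_n]

counts = co_purchase_counts(purchases)
print(also_bought("x", counts))  # ['y', 'w']
```

Because the counts depend only on items, they can be precomputed offline and looked up per item, which is one reason item-based approaches are often considered when the user base is large.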

Collaborative filtering approaches often suffer from three problems: cold start, scalability, and sparsity.

  • Cold Start: These systems often require a large amount of existing data on a user in order to make accurate recommendations.
  • Scalability: Many of the environments in which these systems make recommendations have millions of users and products. Thus, a large amount of computation power is often necessary to calculate recommendations.
  • Sparsity: The number of items sold on major e-commerce sites is extremely large. Even the most active users will have rated only a small subset of the overall database. Thus, even the most popular items have very few ratings.
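To make the sparsity problem concrete, the fraction of missing user-item ratings can be computed directly. The figures below are invented for illustration, and `sparsity` is a hypothetical helper rather than a standard function.

```python
# Hypothetical ratings: user -> {item: rating}; the catalog holds 10 items.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob":   {"A": 4},
    "carol": {"C": 2},
}

def sparsity(ratings, n_items):
    """Fraction of user-item pairs that have no rating."""
    n_users = len(ratings)
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - observed / (n_users * n_items)

# 4 observed ratings out of 3 * 10 possible user-item pairs.
print(round(sparsity(ratings, n_items=10), 3))  # 0.867
```

With millions of users and items, the same calculation typically yields values very close to 1, which leaves little overlap between users from which to compute similarities.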

A particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique.
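As a rough sketch of how matrix factorization can work, the example below learns low-rank user and item factor matrices by stochastic gradient descent on the observed ratings. The training method, the hyperparameters (`rank`, `epochs`, `lr`, `reg`), the toy data, and the names `factorize` and `predict` are assumptions chosen for illustration; practical systems use more refined variants.

```python
import random

# Hypothetical observed ratings as (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]

def factorize(ratings, n_users, n_items, rank=2, epochs=3000,
              lr=0.01, reg=0.02, seed=0):
    """Approximate the rating matrix R as P @ Q^T using SGD on observed entries."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][k] * Q[i][k] for k in range(rank))
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

P, Q = factorize(observed, n_users=3, n_items=3)
print(predict(P, Q, 0, 2))  # estimate for an unobserved user-item pair
```

Only the observed entries contribute to the updates, so the learned factors fill in the missing cells of the matrix; those filled-in values serve as the predicted ratings.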

From: Wikipedia
