DEEP DIVE · 12 MIN READ · NOVEMBER 2024

Recommender Systems: How they evolved over the past six years

Over the past six years, recommender systems have undergone a remarkable transformation. A technical look at data sources, modeling techniques, and where the field is heading.

Three forces drove that change: advances in machine learning, wider data availability, and a growing range of applications. This post walks through the technical details of each.

Expanding data sources

Recommender systems now draw on a wider and richer set of data sources, grouped below into social media, IoT and sensor, and unstructured data.

Social media data

  • Graph-based features: analyzing social connections using techniques like node2vec or GraphSAGE to capture latent user representations.
  • Content engagement: leveraging likes, shares, and comments to understand user interests.
  • Text analysis: applying NLP to user-generated content for sentiment analysis and topic modeling.
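To make the graph-based bullet above more concrete, here is a minimal sketch of the walk-sampling step that node2vec-style methods build on. It uses uniform random walks, which correspond to the special case of node2vec with p = q = 1 (the DeepWalk-style setting), not the full biased walk. The `friends` graph and the function name `random_walks` are hypothetical. In a real pipeline the resulting walks would be fed to a skip-gram model to learn the user embeddings; that step is omitted here.

```python
import random

def random_walks(graph, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks from every node of an adjacency-list graph."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical toy friendship graph (undirected, stored as adjacency lists).
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "cho"],
    "cho": ["ana", "ben", "dee"],
    "dee": ["cho"],
}
walks = random_walks(friends, walk_length=5, walks_per_node=2)
```

Each walk is a short "sentence" of user IDs, so users who appear in similar walk contexts end up with similar embeddings once an embedding model is trained on them.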

IoT and sensor data

  • Location data: using GPS coordinates for context-aware recommendations.
  • Activity tracking: incorporating data from wearables.
  • Smart home data: inferring user routines from connected devices.
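As an illustration of the location bullet above, one simple way to use GPS coordinates is to filter candidates by great-circle distance before ranking. The sketch below uses the haversine formula. The venue names, coordinates, and the `nearby` helper are hypothetical, and a production system would likely use a spatial index rather than a linear scan.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearby(user_pos, venues, radius_km):
    """Return venue names within radius_km of the user, nearest first."""
    scored = [(haversine_km(*user_pos, *pos), name) for name, pos in venues.items()]
    return [name for dist, name in sorted(scored) if dist <= radius_km]

# Hypothetical venues; coordinates are approximate and for illustration only.
venues = {
    "cafe": (48.8584, 2.2945),
    "museum": (48.8606, 2.3376),
    "airport": (49.0097, 2.5479),
}
user = (48.8580, 2.2950)
close_by = nearby(user, venues, radius_km=5.0)
```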

Unstructured data

  • Image processing: CNNs extract features from product images or user-generated photos.
  • Natural language processing: transformer models like BERT or GPT analyze text descriptions, reviews, and queries.
  • Audio analysis: converting audio to mel-spectrograms and processing them with RNNs (or CNNs) for music or podcast recommendations.
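Whatever model extracts the features above, the output is usually a fixed-length embedding per item, and a common next step is nearest-neighbor lookup by cosine similarity. The sketch below assumes the embeddings already exist. The three-dimensional vectors and track names are hypothetical stand-ins for real CNN, BERT, or audio-model outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(query, catalog, top_k=2):
    """Rank catalog items by cosine similarity to the query embedding."""
    ranked = sorted(catalog, key=lambda name: cosine(query, catalog[name]), reverse=True)
    return ranked[:top_k]

# Hypothetical 3-d item embeddings.
catalog = {
    "jazz_track": [0.9, 0.1, 0.0],
    "rock_track": [0.1, 0.9, 0.1],
    "lofi_track": [0.8, 0.2, 0.1],
}
liked = [1.0, 0.0, 0.0]  # embedding of an item the user engaged with
suggestions = most_similar(liked, catalog)
```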

Advances in modeling

Deep learning architectures

  • Autoencoders: Variational Autoencoders (VAEs) for collaborative filtering, learning latent representations of users and items.
  • Neural Collaborative Filtering: combining matrix factorization with multi-layer perceptrons to model complex user-item interactions.
  • Wide & Deep Learning: jointly training a wide linear model, which memorizes sparse feature crosses, with a deep network that generalizes through dense embeddings.
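The architectures above all learn latent user and item vectors in some form. As a reference point, here is a minimal sketch of plain matrix factorization trained with SGD, the component that Neural Collaborative Filtering combines with an MLP. It is not an implementation of NCF, VAEs, or Wide & Deep. The rating triples, hyperparameters, and function names are assumptions chosen for illustration.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn user factors P and item factors Q so that P[u] . Q[i] approximates r."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def mse(P, Q, ratings):
    """Mean squared error over the observed ratings."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)

# Hypothetical (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = train_mf(ratings, 3, 3, epochs=0)  # untrained baseline
P, Q = train_mf(ratings, 3, 3)
```

Comparing `mse(P0, Q0, ratings)` with `mse(P, Q, ratings)` shows how much the training loop reduces the error on the observed ratings. Held-out evaluation is left out for brevity.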

Hybrid approaches

  • Content-collaborative hybrid: integrating content-based features with collaborative filtering, often via factorization machines.
  • Context-aware models: incorporating contextual information using tensor factorization or contextual bandits.
  • Multi-modal fusion: combining text, images, and behavior using attention or gated networks.
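The factorization machines mentioned above score a feature vector with a bias, a linear term, and pairwise interactions whose weights are dot products of per-feature latent vectors. The sketch below shows the prediction step only (no training), in both the direct O(n²) form and the standard O(kn) reformulation. The feature layout and parameter values are hypothetical.

```python
def fm_predict_naive(x, w0, w, V):
    """FM score: w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise

def fm_predict_fast(x, w0, w, V):
    """Same score using 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical layout: [user_A, user_B, item_X, item_Y, content_feature].
x = [1.0, 0.0, 1.0, 0.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0, 0.4]
V = [[0.1, 0.2], [0.0, -0.3], [0.4, 0.1], [-0.2, 0.2], [0.3, -0.1]]
score_naive = fm_predict_naive(x, w0, w, V)
score_fast = fm_predict_fast(x, w0, w, V)
```

Because user IDs, item IDs, and content features share one vector, the same model covers both the collaborative and the content-based signal.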

Reinforcement learning

  • Deep Q-Networks: modeling recommendation as a Markov Decision Process to optimize long-term engagement.
  • Policy gradient methods: REINFORCE-style algorithms to directly optimize recommendation policies.
  • Actor-critic: combining value estimation and policy optimization for more stable learning.
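To illustrate the policy gradient bullet above, here is a minimal REINFORCE-style sketch with a softmax policy and a running-average baseline. It is deliberately simplified to one-step episodes (a bandit), so there is no state or discounting as there would be in a full Markov Decision Process. The click probabilities and all names are hypothetical.

```python
import math
import random

def softmax(prefs):
    """Turn preference scores into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(click_prob, steps=5000, lr=0.1, seed=0):
    """Learn a softmax recommendation policy from simulated click feedback."""
    rng = random.Random(seed)
    prefs = [0.0] * len(click_prob)
    baseline = 0.0  # running mean reward, used to reduce gradient variance
    for t in range(1, steps + 1):
        policy = softmax(prefs)
        item = rng.choices(range(len(prefs)), weights=policy)[0]
        reward = 1.0 if rng.random() < click_prob[item] else 0.0
        advantage = reward - baseline
        for k in range(len(prefs)):
            # Gradient of log pi(item) with respect to prefs[k] for a softmax policy.
            grad = (1.0 if k == item else 0.0) - policy[k]
            prefs[k] += lr * advantage * grad
        baseline += (reward - baseline) / t
    return softmax(prefs)

# Hypothetical per-item click probabilities, unknown to the learner.
policy = reinforce([0.1, 0.2, 0.8])
```

The learned policy should shift probability mass toward the item with the highest click rate. An actor-critic method would replace the running-average baseline with a learned value estimate.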

Session-based recommendations

  • RNNs: models like GRU4Rec for sequential user behavior.
  • Self-attention: transformer-based models like SASRec for complex dependencies in user sessions.
  • Graph neural networks: session graphs to model item-to-item transitions.
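The models above all learn from the order of items within a session. A useful baseline for comparison is a first-order transition graph that counts how often one item directly follows another. This is the kind of item-to-item structure a session graph encodes, but without any neural network on top. The sessions and function names below are hypothetical.

```python
from collections import Counter, defaultdict

def build_transitions(sessions):
    """Count how often each item is immediately followed by each other item."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions

def recommend_next(transitions, last_item, top_k=2):
    """Recommend the most frequent successors of the last item in the session."""
    return [item for item, _ in transitions[last_item].most_common(top_k)]

# Hypothetical browsing sessions.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case", "screen_protector"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]
graph = build_transitions(sessions)
next_items = recommend_next(graph, "phone")
```

GRU4Rec and SASRec go beyond this baseline by conditioning on the whole session rather than only the last item.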

Evolving applications

Media and entertainment

  • Netflix: ensemble methods combining personalized video rankers and contextual bandits.
  • Spotify: collaborative filtering, NLP on lyrics, and audio feature analysis.

Social media

  • Facebook: deep learning models like DeepText for content understanding.
  • TikTok: multi-modal recommendation systems analyzing video content, user interactions, and creator information.

Healthcare

  • Treatment recommendations: collaborative filtering and content-based approaches based on patient characteristics and historical outcomes.
  • Clinical trial matching: hybrid models combining structured patient data with unstructured medical records.

Education

  • Knewton: probabilistic graphical models to create knowledge graphs and personalize learning paths.
  • Duolingo: spaced repetition and adaptive difficulty based on user performance.
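Spaced repetition is usually built on a forgetting curve. The sketch below assumes a generic exponential model in which recall probability halves every `half_life_days`, and it schedules any item whose predicted recall has fallen to a threshold. The doubling and halving update rule, the vocabulary items, and all names are assumptions for illustration, not a description of any production system.

```python
def recall_probability(days_since_review, half_life_days):
    """Exponential forgetting curve: recall halves every half_life_days."""
    return 2.0 ** (-days_since_review / half_life_days)

def update_half_life(half_life_days, correct, growth=2.0, shrink=0.5):
    """Assumed rule: lengthen the half-life after a correct answer, shorten it otherwise."""
    return half_life_days * (growth if correct else shrink)

def due_items(items, today, threshold=0.5):
    """Return items whose predicted recall is at or below threshold, weakest first."""
    due = []
    for name, (last_review_day, half_life) in items.items():
        p = recall_probability(today - last_review_day, half_life)
        if p <= threshold:
            due.append((p, name))
    return [name for p, name in sorted(due)]

# Hypothetical items: name -> (day of last review, current half-life in days).
items = {"hola": (0, 1.0), "gracias": (0, 4.0), "biblioteca": (2, 1.0)}
to_review = due_items(items, today=3)
```

Adaptive difficulty follows the same idea: performance on each review updates the half-life, which in turn changes when the item is shown again.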

Challenges and future directions

Algorithmic bias

  • Fairness-aware recommendation: balancing accuracy with fairness metrics like equal opportunity or demographic parity.
  • Debiasing techniques: adversarial debiasing or regularization to mitigate biases in training data.
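The two fairness metrics named above can be computed directly from recommendation logs. The sketch below measures the demographic parity gap (difference in recommendation rates between groups) and the equal opportunity gap (difference in recommendation rates among relevant candidates only). The two group labels and the toy data are hypothetical.

```python
def rate(flags):
    """Fraction of 1s in a list of 0/1 flags (0.0 for an empty list)."""
    return sum(flags) / len(flags) if flags else 0.0

def demographic_parity_gap(recommended, group):
    """Absolute difference in recommendation rate between groups A and B."""
    a = [r for r, g in zip(recommended, group) if g == "A"]
    b = [r for r, g in zip(recommended, group) if g == "B"]
    return abs(rate(a) - rate(b))

def equal_opportunity_gap(recommended, relevant, group):
    """Absolute difference in true positive rate between groups A and B."""
    a = [r for r, y, g in zip(recommended, relevant, group) if g == "A" and y == 1]
    b = [r for r, y, g in zip(recommended, relevant, group) if g == "B" and y == 1]
    return abs(rate(a) - rate(b))

# Hypothetical candidates: provider group, whether recommended, whether truly relevant.
group       = ["A", "A", "A", "A", "B", "B", "B", "B"]
recommended = [1, 1, 1, 0, 1, 0, 0, 0]
relevant    = [1, 1, 0, 1, 1, 1, 0, 1]

dp_gap = demographic_parity_gap(recommended, group)
eo_gap = equal_opportunity_gap(recommended, relevant, group)
```

A fairness-aware recommender would track gaps like these alongside accuracy and trade them off, for example through a regularization term.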

Explainability and transparency

  • LIME: fitting a simple local surrogate model around a single prediction to produce instance-level explanations for black-box models.
  • Attention visualization: highlighting features influencing recommendations.

Privacy and data governance

  • Federated learning: training on decentralized data without sharing raw user information.
  • Differential privacy: adding calibrated noise to bound how much any single user's data can affect the output, while maintaining model utility.
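For the differential privacy bullet above, the simplest concrete instance is the Laplace mechanism applied to a count. The sketch below assumes each user contributes at most one click, so the sensitivity is 1, and adds Laplace noise with scale sensitivity / epsilon. The click count and epsilon value are hypothetical, and the stdlib `random` module is not suitable for real privacy guarantees. It is used here only to show the arithmetic.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two exponentials with mean scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
true_clicks = 120  # hypothetical number of users who clicked an item
noisy_clicks = private_count(true_clicks, epsilon=0.5, rng=rng)
```

A smaller epsilon means more noise and stronger privacy. The noise has zero mean, so aggregate statistics remain usable even though any single released value is perturbed.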

Emerging research directions

  • Causal inference: instrumental variables or propensity score matching to understand true recommendation impact.
  • Multi-task learning: simultaneously optimizing for engagement, diversity, and revenue.
  • Cross-domain recommendations: transfer learning to improve recommendations in domains with limited data.
  • Quantum computing: exploratory work on quantum algorithms for matrix factorization and similarity computations; practical advantages at scale have yet to be demonstrated.
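To give the causal inference bullet above a concrete shape, here is a minimal sketch of inverse propensity scoring (IPS), a close relative of the propensity score matching named above, not the same technique. It estimates how a new recommendation policy would have performed using logs collected under a different policy. The log format, the policies, and all names are hypothetical.

```python
def ips_estimate(logs, target_policy):
    """Estimate the target policy's average reward from logged interactions.

    Each log entry is (context, action, reward, logging_prob), where logging_prob
    is the probability that the logging policy showed that action in that context.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_policy(context, action) / logging_prob
        total += weight * reward
    return total / len(logs)

def target_policy(context, action):
    """Hypothetical deterministic policy: show 'a' to user u1 and 'b' to user u2."""
    preferred = {"u1": "a", "u2": "b"}
    return 1.0 if preferred[context] == action else 0.0

# Hypothetical logs gathered by showing one of two items uniformly at random.
logs = [
    ("u1", "a", 1, 0.5),
    ("u1", "b", 0, 0.5),
    ("u2", "a", 0, 0.5),
    ("u2", "b", 1, 0.5),
]
logged_ctr = sum(reward for _, _, reward, _ in logs) / len(logs)
estimated_ctr = ips_estimate(logs, target_policy)
```

In this toy log the random logging policy achieves a click rate of 0.5, while the estimate for the target policy is 1.0, because it always picks the item each user clicked. Real logs need care with small propensities, which inflate the variance of the estimate.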

Conclusion

Recommender systems have progressed quickly on three fronts: modeling techniques, data sources, and application domains. Bias, explainability, and privacy remain open problems, and they are where much of the next round of research and engineering effort is likely to go.

WRITTEN BY
AI Transformers
Practical AI for businesses that actually have to ship.