Improving recommendation systems and search in the age of LLMs

Search and recommendation systems are increasingly integrating language models and multimodal content to address long-standing limitations of ID-based approaches, such as cold-start items and poor generalization. Semantic IDs derived from content are replacing hash-based item IDs in recommendation models, improving prediction efficiency. M3CSR clusters multimodal content embeddings into trainable category IDs, outperforming baselines, particularly in cold-start scenarios (see the sketch below). FLIP aligns ID-based recommendation models with language models by jointly learning from masked tabular and language data, surpassing prior approaches. EmbSum uses precomputed textual summaries of user engagement history to improve recommendations, with promising results. CALRec fine-tunes a pretrained LLM for sequential recommendation in a two-stage framework, outperforming several baselines. Finally, Yelp, Indeed, and Bing share how they applied LLMs in production to improve data quality and the effectiveness of their search and recommendation systems.
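
To make the semantic-ID idea concrete, here is a minimal sketch: made-up sizes and random vectors stand in for real content embeddings, and the clustering step is plain k-means rather than the model-specific pipelines the papers describe. It shows how frozen multimodal content embeddings can be clustered into category IDs that replace hashed item IDs, so a cold-start item immediately inherits a trained embedding. All names and parameters below are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical setup: each item has a frozen multimodal content embedding
# (in practice produced by a vision/text encoder; random here for illustration).
rng = np.random.default_rng(0)
num_items, content_dim, num_clusters = 5_000, 64, 100
content_embeddings = rng.normal(size=(num_items, content_dim)).astype("float32")

# Cluster content embeddings; the cluster index acts as a "semantic/category ID"
# in place of a hashed item ID.
kmeans = KMeans(n_clusters=num_clusters, n_init=4, random_state=0)
semantic_ids = kmeans.fit_predict(content_embeddings)  # shape: (num_items,)

# The recommender learns one trainable embedding per semantic ID
# (a table of num_clusters rows) instead of one per raw item ID.
id_embedding_dim = 32
semantic_id_table = rng.normal(size=(num_clusters, id_embedding_dim)).astype("float32")

def item_id_features(item_indices: np.ndarray) -> np.ndarray:
    """Map raw item indices -> semantic IDs -> trainable ID embeddings."""
    return semantic_id_table[semantic_ids[item_indices]]

# A brand-new (cold-start) item only needs its content embedding: assign it to
# the nearest centroid and reuse that cluster's already-trained embedding.
new_item_content = rng.normal(size=(1, content_dim)).astype("float32")
new_item_semantic_id = kmeans.predict(new_item_content)[0]
new_item_id_embedding = semantic_id_table[new_item_semantic_id]
```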

https://eugeneyan.com/writing/recsys-llm/