AI-Generated Data Can Poison Future AI Models

Generative artificial intelligence is booming: tools that create text, code, images, and music are now accessible to the average person, and content produced by “large language models” fills numerous websites. However, AI-generated content may introduce errors when it is used as training data for new models, leading to a phenomenon called “model collapse.” This process could cause models to lose diversity and amplify existing biases. Potential solutions include standardized, human-curated data sets and strategies for distinguishing human-generated data from synthetic content. The future of AI models and their effect on training data remains uncertain as researchers navigate the evolving landscape of generative AI technology.

https://www.scientificamerican.com/article/ai-generated-data-can-poison-future-ai-models/
