Devika Rao
Artificial intelligence is trained on data that is largely taken from the internet. However, given the sheer volume of data required to train AI, many models end up consuming AI-generated data, which can in turn degrade the model as a whole. With AI both producing and consuming data, the internet risks becoming overrun with bots, with far less content being produced by humans.
Is AI cannibalization bad?
AI is eating itself. Artificial intelligence is growing rapidly, and the human-created data needed to train models is running out. "As they trawl the web for new data to train their next models on — an increasingly challenging task — [AI bots are] likely to ingest some of their own AI-generated content, creating an unintentional feedback loop in which what was once the output from one AI becomes the input for another," said The New York Times. "When generative AI is trained on its own content, its output can also drift away from reality." This is known as model collapse.
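One way to see this feedback loop is to simulate it with a toy model: fit a simple statistical model to data, sample from it, refit on only those samples, and repeat. The sketch below is an illustration rather than anything described in the article; the Gaussian "model," sample size, and number of generations are all assumptions chosen to make the drift visible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human-created" data drawn from the true distribution N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=200)

for generation in range(20):
    # "Train" a toy model: estimate the mean and standard deviation from the data.
    mu, sigma = data.mean(), data.std()
    print(f"generation {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")

    # The next generation trains only on data sampled from the previous model,
    # standing in for AI-generated content replacing human-written content.
    data = rng.normal(loc=mu, scale=sigma, size=200)
```

Because each generation fits itself to the previous generation's samples rather than to the original data, the estimated parameters wander away from the true values, and each generation's sampling error compounds into the next; a crude analogue of output drifting away from reality.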
Still, AI companies have their hands tied. "To develop ever more advanced AI products, Big Tech might have no choice but to feed its programs AI-generated content, or just might not be able to sift human fodder from the synthetic," said The Atlantic. As it stands, synthetic data is needed to keep pace with the technology's growth. "Despite stunning advances, chatbots and other generative tools such as the image-making Midjourney and Stable Diffusion remain sometimes shockingly dysfunctional — their outputs filled with biases, falsehoods and absurdities." These inaccuracies then carry through to the next iteration of the AI model.