Curated by THEOUTPOST
On Fri, 23 Aug, 8:01 AM UTC
2 Sources
[1]
Why GenAI might pose a risk to itself
In 2017, Facebook's AI bots Alice and Bob developed their own language, leading to the experiment's halt and sparking global AI concerns. A 2023 study highlights that large language models (LLMs) trained on synthetic data can suffer from Model Autophagy Disorder (MAD), degrading over time due to amplified biases and reduced novelty. Research also indicates that reliance on synthetic data may lower content quality and SEO performance, affecting both digital content and the effectiveness of AI models.

In 2017, Facebook introduced two experimental AI bots, Alice and Bob, to negotiate trades involving valuable items. As the bots learned and evolved, they created their own language to communicate, which caused concern among engineers. The team overseeing the experiment chose to shut down the bots when their conversations became incomprehensible. This unsettling development captured the attention of the global AI community.

The MAD-ness syndrome

A peer-reviewed paper published in 2023 by a group of researchers at Rice and Stanford Universities provides an indication of what might have happened to Alice and Bob. The paper talks about GenAI's susceptibility to go MAD (model autophagy disorder) without sufficient backstops. A large language model (LLM) evolves by continuously processing both real and synthetic data, though it often relies heavily on the latter, as has been the case with many popular models. When LLMs are trained on synthetic data -- potentially biased content generated by AI -- the model may deteriorate over successive generations or, as the researchers describe it, develop MAD.

Shankar J, a brain-computer interface researcher, dives deep into domain specifics like neural networks, data analysis and electronics design as part of his coursework at the National Institute of Technology in Calicut. Shankar says models trained using GenAI content tend to amplify their own biases and errors. Degradation of models, reduced novelty, bias amplification and hallucination, he says, are some common symptoms of MAD-ness. A study by Gartner finds that by the end of 2024, 60% of the data used for developing AI and analytics will be synthetic. Shankar says some popular LLMs, including GPT-3, GPT-4, BERT, Claude and LLaMA, and image generation models such as DALL-E and Stable Diffusion, are prone to MAD.

The Rice-Stanford paper draws analogies to mathematical concepts (contraction mapping, unstable feedback loops) and the biological phenomenon of mad cow disease, pointing to how training models on synthetic data could risk GenAI going berserk. How soon a model degrades, the researchers say, depends on the number of iterations of training solely on synthetic data, on how complex the task or application is (more complex ones degrade faster), and on the model architecture.

Decreasing quality

Another research paper, published last February and co-written by five researchers at King Abdullah University of Science and Technology, the University of Macau and the China-based Ant Group, shares the same concerns. The paper, titled 'Autophagy makes large models achieving local optima', says that when LLMs rely on AI content for 'refined' learning, they tend to change the content. For text, they might alter the style or add details; in the case of images, some features might be changed. This, according to the research, is sufficient to cancel out the diversity in the data used to train future AI models. It also might affect the variety of information people are exposed to. This trend limits next-generation model performance.

"Given the decreasing quality of Gen AI-generated content (blogs, articles, LinkedIn posts), Gen AI has very likely entered this mad phase now," says Robin Alex Panicker, software entrepreneur and coder. Studies show that AI content generated for websites and social media posts can lack quality when people do not edit it. Among the affected aspects are SEO and content quality (the same information repeated with minor syntax changes, regurgitated material). Google now has policies in place to promote expert-backed content, and unedited AI content without human intervention could see a drop in search rankings. Experts say blogs and websites that leaned heavily on GenAI for content have seen big drops in traffic and ranking since Google's policy revisions in March. (With TOI inputs)
[2]
Why GenAI can become a threat to itself - Times of India
Generative AI's rapid advancement raises concerns about its sustainability and potential risks. Experts warn about the technology's ability to create content that could undermine its own training data and reliability.
Generative AI (GenAI) has emerged as a groundbreaking technology, captivating industries and individuals alike with its ability to create human-like content. From text to images and even code, GenAI has demonstrated remarkable capabilities that have the potential to revolutionize various sectors [1]. However, as the technology continues to advance at an unprecedented pace, experts are raising concerns about its long-term sustainability and the risks it may pose to itself.
One of the primary concerns surrounding GenAI is its ability to generate vast amounts of content that could potentially contaminate its own training data. As these AI models continue to learn and evolve, they risk incorporating their own generated content into future training sets, potentially leading to a degradation of quality and reliability over time [2].
Experts emphasize the critical importance of maintaining high-quality, authentic data for training GenAI models. As these systems become more sophisticated, distinguishing between human-generated and AI-generated content becomes increasingly difficult. This blurring of lines poses a significant challenge for developers and researchers who rely on clean, reliable data to improve and refine AI algorithms [1].
The proliferation of AI-generated content raises concerns about the integrity of information available online. As GenAI becomes more prevalent, there is a risk of flooding the internet with synthetic text, images, and videos. This could potentially lead to a scenario where distinguishing between authentic and artificially created information becomes increasingly challenging for both humans and machines [2].
The rapid advancement of GenAI also brings forth a host of ethical and legal considerations. Questions arise regarding copyright infringement, intellectual property rights, and the potential misuse of AI-generated content for malicious purposes such as deepfakes or misinformation campaigns [1]. These concerns highlight the need for robust regulatory frameworks and ethical guidelines to govern the development and deployment of GenAI technologies.
As the potential risks associated with GenAI come to light, there is a growing call for responsible development and deployment of these technologies. Experts emphasize the importance of implementing safeguards, such as watermarking AI-generated content and developing more sophisticated detection methods to differentiate between human and AI-created materials [2]. A rough sketch of what such a safeguard could look like in a data pipeline follows below.
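As a rough illustration of the kind of safeguard described above, the sketch below screens a training corpus so that documents flagged as AI-generated are excluded or capped before retraining. The detect_watermark and ai_likelihood functions are hypothetical placeholders standing in for a provider-supplied watermark check and an AI-text classifier; they are not real APIs, and the thresholds are arbitrary assumptions.

```python
# Hypothetical sketch of a data-curation safeguard: cap the share of
# likely-synthetic documents admitted into a training corpus. The two
# detector functions below are placeholders, not real library APIs.
from typing import Iterable, List

def detect_watermark(text: str) -> bool:
    """Placeholder for a provider-supplied watermark check."""
    return False  # stub: assume no watermark is found

def ai_likelihood(text: str) -> float:
    """Placeholder for a classifier estimating P(text is AI-generated)."""
    return 0.0  # stub

def curate_corpus(documents: Iterable[str],
                  max_synthetic_fraction: float = 0.1,
                  threshold: float = 0.8) -> List[str]:
    """Keep documents judged human-written; admit at most a small fraction
    of likely-synthetic ones, since heavy reliance on synthetic data is what
    the cited research links to model degradation."""
    human, synthetic = [], []
    for doc in documents:
        if detect_watermark(doc) or ai_likelihood(doc) >= threshold:
            synthetic.append(doc)
        else:
            human.append(doc)
    budget = int(max_synthetic_fraction * max(len(human), 1))
    return human + synthetic[:budget]

# Usage (hypothetical): corpus = curate_corpus(load_documents())
```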
Despite the challenges, many experts remain optimistic about the future of GenAI. They believe that with proper oversight, ethical considerations, and continued research, the technology can be harnessed to benefit society while mitigating potential risks. The key lies in striking a balance between innovation and responsible development, ensuring that GenAI remains a powerful tool for progress rather than a threat to its own existence [1].
Reference
[1] Why GenAI might pose a risk to itself
[2] Why GenAI can become a threat to itself - Times of India