Is AI content cannibalization degrading model quality, and is Threads a loss leader for AI training data?

ChatGPT Hype Fading as Users Notice Decreased Performance

In recent months, the excitement around OpenAI’s ChatGPT has cooled noticeably. Google searches for “ChatGPT” are down 40% from their April peak, and web traffic to OpenAI’s ChatGPT website has fallen nearly 10% over the past month. Users of the latest model, GPT-4, also report that it seems considerably dumber, though faster, than its predecessor. One theory holds that OpenAI has split GPT-4 into multiple smaller models that each specialize in a certain area, trading capability for speed (a rough sketch of that idea follows below). Another, more intriguing possibility involves AI cannibalism.
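
One way to picture that first theory: a lightweight router dispatches each prompt to a smaller specialist model instead of a single monolith handling everything. The sketch below is purely illustrative, with invented experts and a crude keyword router standing in for a learned gating network; nothing about it reflects OpenAI’s actual architecture.

```python
# Toy illustration of the "several smaller specialist models" theory.
# The experts, keywords and routing rule are invented for this sketch;
# nothing here reflects OpenAI's actual design.

EXPERTS = {
    "code": lambda p: f"[code specialist] answer to: {p}",
    "math": lambda p: f"[math specialist] answer to: {p}",
    "chat": lambda p: f"[generalist] answer to: {p}",
}

# Crude keyword router standing in for a learned gating network.
KEYWORDS = {
    "code": ("python", "bug", "function", "compile"),
    "math": ("integral", "prove", "equation", "derivative"),
}

def route(prompt):
    lowered = prompt.lower()
    for expert, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return expert
    return "chat"  # fall back to the generalist

def answer(prompt):
    # Only one small expert runs per query: cheaper and faster than a
    # monolith, but potentially weaker outside its specialty.
    return EXPERTS[route(prompt)](prompt)

print(answer("Why does this Python function raise a TypeError?"))
print(answer("Summarize the plot of Hamlet."))
```

If something like this is what changed, it would square with users’ impressions: answers arrive faster, but a query handled by a weaker or wrongly chosen specialist comes back dumber.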

The internet is now awash with AI-generated text and images, which in turn become training data for other AI models. This creates a degenerative feedback loop: the more AI output a model consumes, the worse the coherence and quality of its own output becomes, like making a photocopy of a photocopy. And although GPT-4’s official training data only extends to September 2021, the model evidently draws on more recent information, meaning newer, increasingly AI-polluted web content may already be seeping in. (Notably, OpenAI recently took its web-browsing plugin offline.) Researchers from Rice University and Stanford have coined the term “Model Autophagy Disorder” (MAD) for the phenomenon, warning that without a sufficient supply of fresh real data at each iteration, future generative models are doomed to decline in both quality and diversity.
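
A toy simulation makes the photocopy analogy concrete. The sketch below (an invented miniature, not the Rice/Stanford experimental setup) models each generation as a token distribution fit purely to the previous generation’s output; once a token stops being sampled, no later generation can ever produce it again, so diversity only ratchets downward.

```python
# Minimal sketch of generative "self-consumption": each generation is
# trained only on the previous generation's output, modeled here as
# refitting and resampling a token distribution. Illustrative toy only.

import random
from collections import Counter

random.seed(42)

VOCAB = [f"tok{i}" for i in range(20)]
SAMPLES_PER_GEN = 100

def train(corpus):
    """'Fit a model': estimate token frequencies from the corpus."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def generate(model, k):
    """'Sample from the model' according to its learned frequencies."""
    tokens, weights = zip(*model.items())
    return random.choices(tokens, weights=weights, k=k)

corpus = random.choices(VOCAB, k=SAMPLES_PER_GEN)  # gen 0: "human" data
for gen in range(1, 16):
    model = train(corpus)                      # train on last gen's output
    corpus = generate(model, SAMPLES_PER_GEN)  # which floods the next "web"
    print(f"gen {gen:2d}: {len(set(corpus))}/{len(VOCAB)} tokens survive")
```

The ratchet is the point: rare modes of the data drop out by chance, nothing ever brings them back, and the fix the researchers point to is fresh real data injected at every iteration.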

The silver lining is that AI developers now have a strong incentive to keep humans in the loop: if human-generated content can be reliably identified and prioritized, the decline in output quality could be mitigated. OpenAI CEO Sam Altman is pursuing one version of this with his Worldcoin project, which uses eyeball-scanning hardware and a blockchain to verify that an account belongs to a unique human. Feeding verifiably human content into AI training could help preserve the performance and diversity of future generative models.
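
In data-pipeline terms, “prioritizing human content” could be as simple as weighting provenance-verified documents more heavily when sampling a training mix. The sketch below is a hypothetical illustration: the documents, labels, and weights are invented, and a proof-of-personhood scheme like Worldcoin would merely be one possible upstream source of the “human” tag.

```python
# Hypothetical sketch: upweight provenance-verified human text when
# assembling a training mix. Documents, labels and weights are invented
# for illustration; no real pipeline or API is implied.

import random

random.seed(1)

# Each document carries a provenance tag, assumed to come from an
# upstream step (proof of personhood, watermark detection, etc.).
documents = [
    {"text": "hand-written forum post", "provenance": "human"},
    {"text": "verified blog article",   "provenance": "human"},
    {"text": "detected model output",   "provenance": "ai"},
    {"text": "unverified scraped page", "provenance": "unknown"},
]

# Sampling weights: favor human text, discount likely-synthetic text.
WEIGHTS = {"human": 1.0, "unknown": 0.3, "ai": 0.05}

def sample_training_mix(docs, k):
    weights = [WEIGHTS[doc["provenance"]] for doc in docs]
    return random.choices(docs, weights=weights, k=k)

for doc in sample_training_mix(documents, 5):
    print(doc["provenance"], "->", doc["text"])
```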

Threads: A Loss Leader for AI Model Training?

Mark Zuckerberg’s decision to launch the Twitter clone Threads seems peculiar at first glance: it diverts users away from Instagram, a photo-sharing platform that generates around $50 billion a year, and even in the unlikely event that Threads captures 100% of Twitter’s market share, it is projected to earn only a fraction of what Instagram does. Alex Valaitis of Big Brain Daily predicts Threads will be shut down or folded into Instagram within a year, and speculates that the real motive behind the launch is to generate more text-based content for training Meta’s AI models.

Elon Musk, wary of rivals training AI models on Twitter’s data, has moved to impede exactly that, charging for API access and imposing rate limits. Zuckerberg has been down this road before: Meta trained its image-recognition AI, SEER, on Instagram data, with users’ consent tucked into the platform’s privacy policy. Notably, the Threads app collects an extensive array of user data, from health information to religious beliefs and race, and that data will almost certainly be put to work training models such as Meta’s LLaMA (Large Language Model Meta AI). Musk, for his part, has just launched xAI, an OpenAI competitor that intends to leverage Twitter’s data for its own large language model.

Religious Chatbots and Misinterpretation

Training AI models on religious texts has raised ethical concerns of its own. In India, several Hindu chatbots impersonating Krishna have been advising users that killing people is acceptable if it aligns with their dharma, or duty. At least five chatbots trained on the 700-verse scripture the Bhagavad Gita have emerged in recent months, yet the Indian government has shown no intention of regulating the technology.

Mumbai-based lawyer Lubna Yusuf, coauthor of The AI Book, argues that the danger lies in miscommunication and misinformation: the texts carry immense philosophical value, but the bots hand back literal interpretations, and a literal answer from a bot can be problematic. Concerns about training AI models on religious texts show no sign of going away.

AI Doomers versus AI Optimists

Prominent decision theorist Eliezer Yudkowsky, known for his pessimism about AI, recently gave a TED talk warning that superintelligent AI poses an existential threat to humanity. Because an artificial general intelligence (AGI) would be incomprehensibly smarter than us, he argues, we would struggle to predict why or how it might harm us: it might kill humans incidentally while pursuing other objectives, or deliberately, to eliminate competition. Yudkowsky notes that the inner workings of modern AI systems remain inscrutable, being giant matrices of floating point numbers. His only proposed solution is a global shutdown of AI development, enforced even at the risk of triggering World War III, though he concedes this is unlikely to happen.

Marc Andreessen of a16z counters that such positions are unscientific, offering no testable hypothesis and nothing falsifiable. Microsoft co-founder Bill Gates, in his essay “The risks of AI are real but manageable,” argues that society has successfully navigated transformative technologies before, predicts massive benefits from AI, and holds that public awareness and understanding of both its benefits and its risks are vital to a healthy debate.

Data scientist Jeremy Howard has weighed in with a paper of his own, arguing against outlawing the technology or restricting it to a handful of centralized AI models. He likens that fear-driven response to the pre-Enlightenment era, when education and power were confined to an elite, and suggests the enlightened approach is to encourage open-source development of AI, trusting that most people will put the technology to positive use. Harnessing the collective diversity and expertise of an AI-assisted society, he believes, is the most effective way to identify and respond to potential threats.

Code Interpreter Upgrade for GPT-4

OpenAI has rolled out an impressive upgrade for GPT-4 in the form of Code Interpreter, which lets the AI write code on demand and actually execute it. Users have already found practical applications: generating useful charts from uploaded company reports, converting files between formats, creating video effects, and transforming still images into video. One user uploaded an Excel file of every lighthouse location in the U.S. and had GPT-4 produce an animated map of them.
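
For a flavor of what a Code Interpreter session generates behind the scenes, here is the sort of script it might write for the lighthouse request: load coordinates from a spreadsheet and plot them. The file name and column names are assumptions for the sake of the example, and this static version is only a plausible first step, not the user’s actual output.

```python
# The sort of script a Code Interpreter session might write for the
# lighthouse example. "lighthouses.xlsx" and its "lat"/"lon" columns
# are assumed for illustration.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel("lighthouses.xlsx")  # one row per lighthouse

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(df["lon"], df["lat"], s=8, color="gold", edgecolors="black")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("U.S. lighthouse locations")
fig.savefig("lighthouses.png", dpi=150)
```

An animation pass, for instance with matplotlib.animation, would take this from a static scatter to the animated map described above.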

In summary: ChatGPT’s popularity has dipped and users have noticed a decline in performance, with AI cannibalism and model fragmentation among the possible explanations. Threads, Mark Zuckerberg’s foray into Twitter’s territory, raises questions about its true purpose and its likely absorption into Instagram. Training AI models on religious texts has sparked concerns in India over literal and misleading answers. The debate between AI doomers and optimists turns on whether the risks are existential or manageable. And GPT-4’s new Code Interpreter marks a significant advance, letting the AI generate and execute code on the fly.
