2 Sources
[1]
To understand the future of AI, take a look at the failings of Google Translate
The computer scientists Rich Sutton and Andrew Barto have been recognised for a long track record of influential ideas with this year's Turing Award, the most prestigious in the field. Sutton's 2019 essay The Bitter Lesson, for instance, underpins much of today's feverishness around artificial intelligence (AI). He argues that methods to improve AI that rely on heavy-duty computation rather than human knowledge are "ultimately the most effective, and by a large margin". This is an idea whose truth has been demonstrated many times in AI history. Yet there's another important lesson in that history from some 20 years ago that we ought to heed.
Today's AI chatbots are built on large language models (LLMs), which are trained on huge amounts of data that enable a machine to "reason" by predicting the next word in a sentence using probabilities. Useful probabilistic language models were formalised by the American polymath Claude Shannon in 1948, citing precedents from the 1910s and 1920s. Language models of this form were then popularised in the 1970s and 1980s for use by computers in translation and speech recognition, in which spoken words are converted into text.
The first language model on the scale of contemporary LLMs was published in 2007 and was a component of Google Translate, which had been launched a year earlier. Trained on trillions of words using over a thousand computers, it is the unmistakeable forebear of today's LLMs, even though it was technically different. It relied on probabilities computed from word counts, whereas today's LLMs are based on what is known as transformers. First developed in 2017 - also originally for translation - these are artificial neural networks that make it possible for machines to better exploit the context of each word.
The pros and cons of Google Translate
Machine translation (MT) has improved relentlessly in the past two decades, driven not only by tech advances but also by the size and diversity of training data sets. Whereas Google Translate started by offering translations between just three languages in 2006 - English, Chinese and Arabic - today it supports 249. Yet while this may sound impressive, it's still less than 4% of the world's estimated 7,000 languages.
Between a handful of those languages, like English and Spanish, translations are often flawless. Yet even in these languages, the translator sometimes fails on idioms, place names, legal and technical terms, and various other nuances. Between many other languages, the service can help you to get the gist of a text, but often contains serious errors. The largest annual evaluation of machine translation systems - which now includes translations done by LLMs that rival those of purpose-built translation systems - bluntly concluded in 2024 that "MT is not solved yet".
Machine translation is widely used in spite of these shortcomings: as far back as 2021, the Google Translate app reached 1 billion installs. Yet users still appear to understand that they should use such services cautiously: a 2022 survey of 1,200 people found that they mostly used machine translation in low-stakes settings, like understanding online content outside of work or study. Only about 2% of respondents' translations involved higher stakes settings, including interacting with healthcare workers or police. Sure enough, there are high risks associated with using machine translations in these settings.
Studies have shown that machine-translation errors in healthcare can potentially cause serious harm, and there are reports that machine translation has harmed credible asylum cases. It doesn't help that users tend to trust machine translations that are easy to understand, even when they are misleading.
Knowing the risks, the translation industry overwhelmingly relies on human translators in high-stakes settings like international law and commerce. Yet these workers' marketability has been diminished by the fact that the machines can now do much of their work, leaving them to focus more on assuring quality. Many human translators are freelancers in a marketplace mediated by platforms with machine-translation capabilities. It's frustrating to be reduced to wrangling inaccurate output, not to mention the precarity and loneliness endemic to platform work. Translators also have to contend with the real or perceived threat that their machine rivals will eventually replace them - researchers refer to this as automation anxiety.
Lessons for LLMs
The recent unveiling of the Chinese AI model DeepSeek, which appears to be close to the capabilities of market leader OpenAI's latest GPT models but at a fraction of the price, signals that very sophisticated LLMs are on a path to being commoditised. They will be deployed by organisations of all sizes at low costs - just as machine translation is today.
Of course, today's LLMs go far beyond machine translation, performing a much wider range of tasks. Their fundamental limitation is data: they have already exhausted most of what is available on the internet. For all its scale, their training data is likely to underrepresent most tasks, just as it underrepresents most languages for machine translation. Indeed the problem is worse with generative AI: unlike with languages, it is difficult to know which tasks are well represented in an LLM.
There will undoubtedly be efforts to improve training data that make LLMs better at some underrepresented tasks. But the scope of the challenge dwarfs that of machine translation. Tech optimists may pin their hopes on machines being able to keep increasing the size of the training data by making their own synthetic versions, or on learning from human feedback through chatbot interactions. These avenues have already been explored in machine translation, with limited success.
So the foreseeable future for LLMs is one in which they are excellent at a few tasks, mediocre in others, and unreliable elsewhere. We will use them where the risks are low, while they may harm unsuspecting users in high-risk settings - as has already happened to lawyers who trusted ChatGPT output containing citations to non-existent case law. These LLMs will aid human workers in industries with a culture of quality assurance, like computer programming, while making the experience of those workers worse. Plus we will have to deal with new problems such as their threat to human artistic works and to the environment. The urgent question: is this really the future we want to build?
[2]
To understand the future of AI, take a look at the failings of Google Translate
The computer scientists Rich Sutton and Andrew Barto have been recognized for a long track record of influential ideas with this year's Turing Award, the most prestigious in the field. Sutton's 2019 essay The Bitter Lesson, for instance, underpins much of today's feverishness around artificial intelligence (AI). He argues that methods to improve AI that rely on heavy-duty computation rather than human knowledge are "ultimately the most effective, and by a large margin." This is an idea whose truth has been demonstrated many times in AI history. Yet there's another important lesson in that history from some 20 years ago that we ought to heed.
Today's AI chatbots are built on large language models (LLMs), which are trained on huge amounts of data that enable a machine to "reason" by predicting the next word in a sentence using probabilities. Useful probabilistic language models were formalized by the American polymath Claude Shannon in 1948, citing precedents from the 1910s and 1920s. Language models of this form were then popularized in the 1970s and 1980s for use by computers in translation and speech recognition, in which spoken words are converted into text.
The first language model on the scale of contemporary LLMs was published in 2007 and was a component of Google Translate, which had been launched a year earlier. Trained on trillions of words using over a thousand computers, it is the unmistakable forebear of today's LLMs, even though it was technically different. It relied on probabilities computed from word counts, whereas today's LLMs are based on what is known as transformers. First developed in 2017 -- also originally for translation -- these are artificial neural networks that make it possible for machines to better exploit the context of each word.
The pros and cons of Google Translate
Machine translation (MT) has improved relentlessly in the past two decades, driven not only by tech advances but also by the size and diversity of training data sets. Whereas Google Translate started by offering translations between just three languages in 2006 -- English, Chinese and Arabic -- today it supports 249. Yet while this may sound impressive, it's still less than 4% of the world's estimated 7,000 languages.
Between a handful of those languages, like English and Spanish, translations are often flawless. Yet even in these languages, the translator sometimes fails on idioms, place names, legal and technical terms, and various other nuances. Between many other languages, the service can help you to get the gist of a text, but often contains serious errors. The largest annual evaluation of machine translation systems -- which now includes translations done by LLMs that rival those of purpose-built translation systems -- bluntly concluded in 2024 that "MT is not solved yet."
Machine translation is widely used in spite of these shortcomings: as far back as 2021, the Google Translate app reached 1 billion installs. Yet users still appear to understand that they should use such services cautiously: a 2022 survey of 1,200 people found that they mostly used machine translation in low-stakes settings, like understanding online content outside of work or study. Only about 2% of respondents' translations involved higher stakes settings, including interacting with health care workers or police. Sure enough, there are high risks associated with using machine translations in these settings.
Studies have shown that machine-translation errors in health care can potentially cause serious harm, and there are reports that machine translation has harmed credible asylum cases. It doesn't help that users tend to trust machine translations that are easy to understand, even when they are misleading.
Knowing the risks, the translation industry overwhelmingly relies on human translators in high-stakes settings like international law and commerce. Yet these workers' marketability has been diminished by the fact that the machines can now do much of their work, leaving them to focus more on assuring quality. Many human translators are freelancers in a marketplace mediated by platforms with machine-translation capabilities. It's frustrating to be reduced to wrangling inaccurate output, not to mention the precarity and loneliness endemic to platform work. Translators also have to contend with the real or perceived threat that their machine rivals will eventually replace them -- researchers refer to this as automation anxiety.
Lessons for LLMs
The recent unveiling of the Chinese AI model DeepSeek, which appears to be close to the capabilities of market leader OpenAI's latest GPT models but at a fraction of the price, signals that very sophisticated LLMs are on a path to being commoditized. They will be deployed by organizations of all sizes at low costs -- just as machine translation is today.
Of course, today's LLMs go far beyond machine translation, performing a much wider range of tasks. Their fundamental limitation is data: they have already exhausted most of what is available on the internet. For all its scale, their training data is likely to underrepresent most tasks, just as it underrepresents most languages for machine translation. Indeed the problem is worse with generative AI: unlike with languages, it is difficult to know which tasks are well represented in an LLM.
There will undoubtedly be efforts to improve training data that make LLMs better at some underrepresented tasks. But the scope of the challenge dwarfs that of machine translation. Tech optimists may pin their hopes on machines being able to keep increasing the size of the training data by making their own synthetic versions, or on learning from human feedback through chatbot interactions. These avenues have already been explored in machine translation, with limited success.
So the foreseeable future for LLMs is one in which they are excellent at a few tasks, mediocre in others, and unreliable elsewhere. We will use them where the risks are low, while they may harm unsuspecting users in high-risk settings -- as has already happened to lawyers who trusted ChatGPT output containing citations to non-existent case law. These LLMs will aid human workers in industries with a culture of quality assurance, like computer programming, while making the experience of those workers worse. Plus we will have to deal with new problems such as their threat to human artistic works and to the environment. The urgent question: is this really the future we want to build?
An analysis of AI's future through the lens of Google Translate's successes and shortcomings, highlighting the challenges faced by large language models and their implications for various industries.
The recent Turing Award recognition of computer scientists Rich Sutton and Andrew Barto has brought attention to the ongoing debate about the future of artificial intelligence (AI). Sutton's 2019 essay, "The Bitter Lesson," argues that AI methods relying on heavy-duty computation are ultimately more effective than those based on human knowledge [1]. This principle has been demonstrated repeatedly in AI history, including the development of large language models (LLMs) that power today's AI chatbots.
The history of language models dates back to Claude Shannon's work in 1948, with significant advancements in the 1970s and 1980s for translation and speech recognition. The first language model comparable to contemporary LLMs was published in 2007 as part of Google Translate [1]. Today's LLMs use transformer technology, developed in 2017, which allows machines to better understand word context.
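As the source article notes, that 2007-era model computed next-word probabilities from word counts rather than with a neural network. The snippet below is a minimal illustrative sketch of that count-based idea in Python, not the actual Google Translate system; the tiny corpus and the function name are hypothetical, and real systems of that era used much longer n-grams, smoothing, and trillions of words.

    from collections import Counter, defaultdict

    def train_bigram_model(corpus):
        # Count how often each word follows each other word.
        counts = defaultdict(Counter)
        for sentence in corpus:
            words = sentence.lower().split()
            for prev, nxt in zip(words, words[1:]):
                counts[prev][nxt] += 1
        # Turn raw counts into conditional probabilities P(next word | previous word).
        return {
            prev: {word: n / sum(followers.values()) for word, n in followers.items()}
            for prev, followers in counts.items()
        }

    # Hypothetical toy corpus; the 2007 system was trained on trillions of words.
    corpus = [
        "the cat sat on the mat",
        "the cat ate the fish",
    ]
    model = train_bigram_model(corpus)
    print(model["the"])  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}

A transformer-based LLM replaces these explicit word counts with learned neural representations, which is what lets it take the whole preceding context into account rather than just the previous word or two.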
Machine translation has seen significant improvements over the past two decades, driven by technological advancements and larger, more diverse training datasets. Google Translate now supports 249 languages, a substantial increase from its initial three in 2006 [2]. However, this still represents less than 4% of the world's estimated 7,000 languages.
While translations between some language pairs, like English and Spanish, are often flawless, the service still struggles with idioms, place names, and technical terms. For many language pairs, Google Translate provides only a general understanding of the text, often containing serious errors. The 2024 annual evaluation of machine translation systems concluded that "MT is not solved yet" [1].
Despite its limitations, machine translation is widely used, with Google Translate reaching 1 billion app installs by 2021. Users seem to understand the need for caution, primarily using these services in low-stakes settings. A 2022 survey found that only about 2% of translations involved high-stakes situations like healthcare or law enforcement interactions [2].
However, the risks associated with machine translation errors in critical settings are significant. Studies have shown potential for serious harm in healthcare scenarios, and there are reports of machine translations negatively impacting asylum cases [1].
The translation industry still relies heavily on human translators for high-stakes settings like international law and commerce. However, the rise of machine translation has altered the landscape for these professionals. Many now focus more on quality assurance rather than primary translation work, leading to concerns about job security and what researchers term "automation anxiety" [2].
The recent introduction of the Chinese AI model DeepSeek, which rivals OpenAI's GPT models at a fraction of the cost, suggests that sophisticated LLMs are becoming commoditized. This trend mirrors the widespread adoption of machine translation technologies [1].
However, LLMs face significant challenges. Their fundamental limitation is data: they have already exhausted much of what's available on the internet. The training data likely underrepresents many tasks, similar to how it underrepresents most languages in machine translation. This problem is even more pronounced with generative AI, where it's difficult to determine which tasks are well-represented in an LLM [2].
While there are efforts to improve training data and explore avenues like synthetic data generation and learning from human feedback, these approaches have shown limited success in machine translation. The scope of the challenge for LLMs is significantly larger, suggesting that the path forward for AI may not be as straightforward as some tech optimists believe.
Summarized by Navi