Curated by THEOUTPOST
On Mon, 23 Dec, 12:00 AM UTC
8 Sources
[1]
We Did Not Reach the AI Promised Land in 2024
'Agentic' AI is the talk of the town in Silicon Valley and beyond, but can it avoid the hype pitfalls of AI in 2024? The old saying goes that, with tech, you should never buy the first generation of anything new. Wait for the devs to work out the kinks, then check back. We're now two years into the AI "revolution," and we're being dragged into the third. AI should be the next big thing already; the wrinkles should have been ironed out, and the puzzle pieces should all fit. It's not there yet. This year was big on AI, but next year will show the true promise of on-device artificial intelligence come alive. Where have we heard that one before? AI has not lived up to many of the promises put forth by tech companies, both big and small. In 2024, AI-specific devices fell flat. AI on Mac or PC hasn't made a strong impression either. There hasn't been a wave of AI applications that use new laptops' neural processors, and most applications rely on cloud computing. The main AI applications seem to be coders finding ways to kill their own industry. Otherwise, grifters are using AI to fill the internet with fakes, junk, and slop. On-device AI pushes regular consumers to write or summarize emails with AI. That doesn't exactly sound like the killer AI app. That's why big tech is now pushing "agentic" AI. Companies promise large language models will do all your busywork for you, seamlessly and non-intrusively. Perhaps, with agents, AI can come alive in 2025. We have only seen a few demos of how this AI will work. Surveys show that current AI features don't enthuse Apple and Android users. In essence, big tech needs agentic AI to take off. Without it, regular users will wonder what all the fuss was for. We don't know how these AI agents will work next year, but we know exactly how Silicon Valley will push them to users, whether we want them or not. This year brought us a slew of AI wearables and handheld devices, like the Humane AI Pin and the Rabbit R1. Both devices launched far too soon, with obtuse software that effectively provided little more than quick access to an AI chatbot like ChatGPT. There was an avalanche of bad products so big we didn't have time to cover it all. I've used Timekettle's X1 Interpreter Hub, a pocket-sized translator stick that touts its AI translation capabilities. It could hold its own going back and forth between English and Spanish in our tests. However, trying English to Urdu would insert random Pakistani celebrities or references to God into the middle of an interpretation. It was insulting and hilarious in equal measure to my Urdu-speaking colleague. It did worse in some other languages than the Google Translate app. And it wasn't just smaller brands that couldn't meet the full promise of device-specific AI. Meta's Ray-Ban glasses' AI image recognition features sometimes struggle to comprehend what's in front of them. At least those glasses can still take pictures without needing cloud-based AI, something other devices can't manage. The $700 Humane AI Pin didn't live up to its lofty promises. Reviewers noted it would often fail to identify objects in front of it correctly, and even when it was accurate, it was hampered by poor battery life and heat issues. Humane later recalled the charging case due to concerns over fire risks. Once valued at around $850 million, the company reportedly saw more returns than sales by the middle of the year. The promise of device-specific AI was squashed again and again. The Rabbit R1 launched a few weeks after the Humane pin.
CEO Jesse Lyu directly compared his $200 device to his rivals' and claimed his "personalized operating system" and "Large Action Model" would be your true AI assistant. The launch was a disaster. Users quickly dug into the LAM and found that the Android-based OS could run on phones. Most of its capabilities were facilitated through the cloud. The device could also connect to some outside apps, but white hat hackers and developers found they could access user data that was also available to internal Rabbit staff. There has been more AI-centric hardware, like the Plaud NotePin, which offers AI-based transcription and note-taking. It works thanks to a limited use case. Inevitably, you will ask whether your current device can handle these same capabilities. Google has Pixel Recorder, and iPhones and Macs have Voice Memos with transcription capabilities. To their credit, AI hardware developers have tried to improve their devices. In November, Rabbit updated its OS to allow "custom AI agents" with a Teach mode. This was essentially what the LAM promised half a year ago. The mode is still in beta, but the problem remains that the device does not have direct access to the apps you want it to use. In December, Humane started promoting its CosmOS, "built from the ground up for AI," for devices beyond the AI Pin. It wants to put it in cars, use it for smart home tech, and even stick it in your TV to analyze on-screen action. The "intelligent conductor" will essentially operate like any other agentic offering, digging into your devices and information to perform tasks on your behalf. The switch from "AI device" to "AI agent device" was seamless. The promise of these devices failed to impress, but they now use the same hype strategy for agentic AI. We expect more of these kinds of devices at CES 2025 next month. They'll use the same language about an "AI assistant," but it will be in the new agentic flavor of the week. The jury is out on whether they'll be good, but it doesn't bode well if these devices can't offer something your phone doesn't already do. Chipmakers like Intel and Qualcomm hammered home the point about their neural processors, or NPUs. That was the story with Qualcomm's Snapdragon X Elite and X Plus chips. Microsoft christened any PC with Qualcomm's ARM-based chip a "Copilot+ PC." All those "AI PCs" with Intel's Meteor Lake were left out in the cold. I sat in front of Intel in January and asked one of the company's senior VPs, Sachin Katti, whether the initial run of "AI PCs" was truly capable of running AI on-device. Yes, they could, he told me. The only issue was the lack of apps. For the first time in the history of tech, the technology outpaced the available applications. It was up to the developers to meet demand, he said. The biggest AI apps in 2024 were chatbots -- like Perplexity, Claude, ChatGPT, and more -- none of which required on-device AI processing. Then came Copilot+. It was the turning point for ARM-based chips on PC with the new Qualcomm Snapdragon X Elite and X Plus. Each chip had an NPU capable of 45 TOPS, or trillions of operations per second (a derived value that's arguably not great at describing AI capabilities). None of those previous Intel chips met the requirements to be Copilot+. It wouldn't be until AMD's Strix Point and Intel's Lunar Lake months later that Team Red and Team Blue could claim the coveted Copilot+ moniker. Using those features was another matter. The PCs shipped with the new Copilot button for instant access to Microsoft's favored chatbot.
However, the only on-device AI features included were a few AI image generators and live captions on video calls or in videos. Microsoft's premier AI feature, Recall, was supposed to give your PC a "photographic memory" by screenshotting everything you did and then transcribing it with AI. Microsoft delayed the feature just before many OEMs planned to release their first Copilot+ laptops. Security researchers proved that screenshot transcriptions could be accessed without any real security layer. Microsoft only allowed Windows 11 beta testers access to the feature in November. Judging by the latest beta build, Recall still requires some fine-tuning. It works, and if you're okay with your life and some potentially sensitive info being screenshotted, it's handy for those with bad memories. Then you get to Apple, whose AI features arrived so late in 2024 that they might as well have all been delayed until 2025. The latest macOS Sequoia 15.2 stable build arrived in December, bringing the Image Playground and ChatGPT integration with Siri to Macs. At the very least, you only need an M-series Mac to access these features, unlike the iPhone, which requires an iPhone 15 Pro or iPhone 16 model. If you have an older Apple device, you're not missing anything. Image Playground creates cartoonish images of you or your friends with faces that look like a cross between a lazy caricature artist and big-head mode in an old-school video game. ChatGPT integration offers little more than a typical Google search. It also makes it difficult to find past chats through the built-in widget, which now sits prominently on the top toolbar. The NPUs in these devices can only run simplistic or background AI tasks. For more complex AI tasks, like running the top-end AI models promoted by these companies, you need a GPU. An Nvidia GeForce RTX 4090 can do upwards of 1,300 TOPS, 26 times what today's top-end on-chip NPUs can do. In December, Nvidia launched the $249 Jetson Orin Nano Super, which was built specifically for running AI applications locally. The processor promises 67 TOPS. The latest and greatest Gemini models are available to new Chromebook Plus owners, so I've become acquainted with Google's on-device AI, even beyond phones. In December, Google brought out Gemini 2.0 in an experimental advanced mode for Gemini Advanced subscribers. You would have to be a very dedicated user to tell the difference between models. The new version should have better coding and language ability, but if you only use it for text, the main difference is that 2.0 is more verbose than 1.5 Pro. A big reason AI is becoming "agentic" is "the wall." In AI circles, it's the colloquial term for how providing more training data to AI results in diminishing returns. OpenAI cofounder Ilya Sutskever, who hasn't minced words about his former employer, told a conference crowd in Vancouver that AI developers are running out of data to train AI models, saying, "We have to deal with the data that we have. There's only one internet." That's not to say AI models can't improve. Sutskever, now a co-founder of the startup Safe Superintelligence, previously told Reuters that the age of "scaling" is over and that now is the time of "discovery." Newer models, like OpenAI's o1, are designed with better reasoning in mind. But better benchmarks don't necessarily result in better results for a base user. If you're not already impressed with today's AI models, you probably won't be with next year's big releases.
That's why OpenAI is promoting AI agents, and reports hint Sam Altman's big AI company will launch an autonomous AI agent codenamed "Operator." That's why agents have to take off. Anthropic, the makers of Claude, offered us a taste of what this entails in a demo released in October. Demos show how users could ask Claude 3.5 Sonnet to access Google Chrome, type out a Google search, and then add an event to the user's calendar. It's an entertaining demo, though you're offering the AI a deep look into your personal life. Anthropic noted that, at one point, the AI stopped the company's screen recording entirely on its own. If the AI fails at any one part of a long chain of tasks, it can cause a cascade of issues for the entire prompt. Imagine if it books the wrong flight for you or puts the wrong time on your calendar for when you're supposed to pick up your mother from the airport. Late last year, I speculated about the rise of AI on PC. This was before Microsoft brought the Copilot key kicking and screaming into this world. I wondered what it would be like if AI could take over my PC and control settings without my digging through Windows menus. Imagine telling your PC to bring up the controls for your laptop's brightness without needing to surf through either Windows or whatever bloatware came preinstalled on your device. What if it could do this without an internet connection, using models housed on-device, so I don't have to worry about outside agencies accessing my emails or calendars? Settings aren't sexy, but making them easier for users would be a boon. Apple has promised that Apple Intelligence will be that kind of everyday-life assistant. It wants you to imagine every iPhone, iPad, or Mac user having a butler capable of diving into your emails, pulling out the necessary information, and turning it into a calendar event. Agentic applications give AI access to a lot of your sensitive information. This isn't the sort of AI that can be handled entirely on-device; it requires cloud processing. Apple promises to keep your information safe with Private Cloud Compute, a structure that creates a firewall between your information and the company's servers. So far, Microsoft's agent initiatives have focused on the enterprise end, specifically for those using 365 apps in business settings. It promotes Copilot Studio for businesses to create their own in-house AI agents. As its FAQ states, OpenAI has direct access to your chat logs on ChatGPT, though it claims access is limited to "authorized personnel." Google has not spelled out its privacy plans for when Gemini goes agentic, but the company does have access to your activity, including your chats. It claims it uses this information to "improve Google products and machine-learning technologies." Agentic AI is coming. Over time, it will slide onto our phones, computers, and other devices under the banner of "experimental" or "beta" features. Major chipmakers will continue to tout the TOPS value of their new chips, and Google, Microsoft, and Apple will try to outrace each other with their AI-based assistants. It will be the same old story in the endless march of hype.
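To make the "agentic" idea concrete, here is a minimal, hypothetical sketch of the loop these products describe: a planner (standing in for a large language model) is given a request plus a list of local "tools," picks an action at each step, observes the result, and continues. The tool names, the fake planner, and the data are all invented for illustration, not any vendor's implementation; a real agent swaps the stub for a hosted model, which is exactly where the privacy and cascade-failure concerns above come in.

```python
# Hypothetical illustration of an "agentic" loop, not any vendor's actual system.
# A planner stub stands in for an LLM and chooses one tool call per step; note how
# a single wrong decision early on would cascade into every later step.

calendar = []                      # pretend on-device calendar
settings = {"brightness": 40}      # pretend device settings

def add_event(title, time):
    calendar.append({"title": title, "time": time})
    return f"added '{title}' at {time}"

def set_brightness(level):
    settings["brightness"] = int(level)
    return f"brightness set to {level}"

TOOLS = {"add_event": add_event, "set_brightness": set_brightness}

def fake_llm_plan(request, history):
    """Stand-in for a model call: maps a request to a sequence of tool calls."""
    if "airport" in request:
        # If the planner misreads the flight time here, the calendar entry,
        # the reminder, and everything downstream inherit the error.
        return [("add_event", ("Pick up Mom at airport", "18:00")),
                ("set_brightness", (80,))]
    return []

def run_agent(request):
    history = []
    for tool_name, args in fake_llm_plan(request, history):
        result = TOOLS[tool_name](*args)      # act on the device
        history.append((tool_name, result))   # observe, feed back into the next step
    return history

if __name__ == "__main__":
    for step in run_agent("Remind me to pick up Mom at the airport and brighten my screen"):
        print(step)
```

The sketch also shows why these assistants need deep access to your data: the loop only works if the model can read and write the calendar, settings, or apps it is supposed to manage.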
[2]
Emerge's 2024 Story of the Year: The Race to Rule AI - Decrypt
OpenAI's rise to tech stardom reads like a Silicon Valley soap opera. The company that began 2024 in turmoil after Sam Altman's dramatic return has morphed from a cautious non-profit into an AI powerhouse worth $157 billion. With a $13 billion investment from Microsoft and a deal to power Apple iPhones, the company is on track to generate $11.6 billion in revenue. Game over for all the other suckers trying to elbow their way into the AI market? Hardly. The money flooding into AI continued to reach tsunami proportions throughout 2024, with billions of dollars going into funding dozens of worthy competitors around the world, ranging from China's Moonshot AI to Paris-based Mistral. During the last week of November alone, Anthropic and Elon Musk's xAI pocketed $4 billion and $5 billion, respectively. Investors who once saw OpenAI as the new big thing in tech are spreading their bets across a field of nimble, hungry competitors, and anything ending in "AI" could be the next frontrunner -- especially after delays from OpenAI, including Sora, its ballyhooed video generation tool (and, supposedly, its next GPT model). It seems like decades since ChatGPT burst onto the scene with its revolutionary chatbot, which users could communicate with as easily as talking to a friend. (The chatbot launched in November 2022 and added speech in September 2024.) Almost overnight it made Google search look antiquated, something that the olds use. These days, it's probably the first thing a Gen Z-er thinks about when someone says "AI." The company that sparked the AI revolution with ChatGPT pushed the boundaries of AI capabilities in 2024 after the boom of GPT-3.5, releasing its multimodal GPT-4o in May with an unprecedented 88.7% score on the MMLU benchmark. By September, its new o1 model -- which is supposed to handle complex reasoning -- raised the bar again, achieving 83.3% accuracy on International Mathematics Olympiad questions -- a massive leap from GPT-4o's 13.4%. With over 200 million weekly active users, OpenAI's influence ran so deep that Google, deep in the throes of the Innovator's Dilemma, suddenly began to fear for its $160 billion search business. But money and fame don't guarantee a happy ending, especially in tech. Anyone remember the BlackBerry? Google certainly isn't going the way of the BlackBerry any time soon. Besides, it's not like it was caught with its pants down. Gemini launched in December 2023, and two months later the search giant unveiled its Gemini Ultra foundational model; the Gemini family's capacity to process up to 2 million tokens of context makes OpenAI's GPT-4 look lightweight. AI is so integral to Google that CEO Sundar Pichai announced the company is moving from a mobile-first to an "AI-first" strategy. And not a second too soon: The switch to Gemini boosted Google One -- the plan whose AI Premium tier offers access to Gemini Advanced -- past 100 million subscriptions. Google is not just flexing technical muscle; ChatGPT is still widely used, but Google may be onto something. Its foundational model also powers NotebookLM, its retrieval-augmented generation (RAG) platform, which was first conceived to help people handle huge amounts of information spread across large numbers of files. The product didn't move the needle until an update changed the way people used RAG models: enter Google's podcast generator.
That feature alone was enough to boost the model's popularity and increase user engagement, with a pretty active Reddit community, interesting social experiments, and even some business applications to explore. "I think we've learned a lot in the last year; what is really resonating with people, what is really useful, how they're using it every day," Raiza Martin, a product manager at Google Labs, told The Independent. Meanwhile, OpenAI has given us a voice mode to talk to, still keeps its 128k-token context window, and, under legal threat, removed its horny Scarlett Johansson-like voice. The Gemini lineup has been key to Google's outstanding performance. Since January 2023, its stock price has doubled, reaching an all-time high on July 10, 2024. Smaller than Google, but probably as important in terms of its role in the boom of AI chatbots this year, is San Francisco-based Anthropic. Founded by former OpenAI researchers, Anthropic emerged as OpenAI's most formidable challenger in 2024, turning heads with explosive growth and deep-pocketed backers. The rivalry between ChatGPT and Claude, Anthropic's suite of large language models, is the equivalent of the Cold War in AI culture. When one company releases a feature, the other immediately strikes back. The two models are always competing for the top spot in the LLM Arena, and the community is always trying to decide which one is best. Anthropic's revenue grew over 1,000% this year after the launch of Claude 3.5 Sonnet in June, with a major share driven by third-party API users. Anthropic's rise mirrors the early days of OpenAI, but with an even steeper trajectory. And the investments keep flowing in. Amazon has poured $8 billion into the startup, while Google agreed to invest up to $2 billion. These cash infusions were also probably strategic bets from tech giants hedging against OpenAI's dominance, and they mirror a proxy war between cloud computing providers, with Microsoft supporting OpenAI to benefit Azure and Amazon supporting Claude to benefit AWS. Anthropic and OpenAI mine the gold, whereas Microsoft and Amazon basically sell the shovels. Anthropic's success is not just due to its models' quality. Overall, the company is pushing for a broader shift in the AI landscape. While OpenAI chased consumer popularity, Anthropic focused on specialized models that prioritized safety, betting that enterprise customers would pay premium prices for targeted solutions. In Europe, French startup Mistral AI raised eyebrows by securing $1 billion in funding and a $6 billion valuation in June. Its open-source models are matching GPT-4's capabilities at a fraction of the cost. To put this in perspective, according to calculations by Trending Topics, "Mistral AI is worth €105m per employee," making it the most valuable startup in European history. A recently upgraded chatbot called Le Chat is being positioned as Mistral's ChatGPT killer. It offers basically everything ChatGPT does -- for free. It provides good outputs, handles code, generates images, supports agents, and browses the web in real time. And Mistral is just getting started, with a planned expansion into the United States and a new office in the belly of the beast in Palo Alto, California. Chinese firms aren't just copying anymore -- they're innovating.
Even with a major embargo imposed by the United States in an attempt to stifle innovation, companies like Baidu, Alibaba, and Baichuan AI, backed by massive government support, are developing models tailored for emerging markets in Southeast Asia, the Middle East, and Africa. They're building a parallel AI ecosystem that could rival anything coming out of San Francisco. Huawei, for example, the massive state-backed telecom giant, released its own OS and is embedding it into its lineup of smartphones and home devices, creating a fully functional ecosystem. Another Chinese model, Yi Lightning, beats GPT-4o and Claude 3.5 Sonnet in the LLM Arena; the new DeepSeek model has come to compete against OpenAI's reasoning model o1; and Baidu's Ernie reached 100 million users in December 2023 before boosting its user base to over 200 million by April 2024. According to the World Economic Forum, China's AI market is estimated to top $61 billion next year, with VCs pouring over $120 billion into AI ecosystems. Believe it or not, Meta plays the role of the good guy in this story. While competitors such as OpenAI and Anthropic fought over market share with expensive, locked-down models, Mark Zuckerberg's company took a different route: open sourcing its technology. Llama 3.2, released in September, showcases just how far Meta's open-source strategy has come. The model processes both text and images and powers everything from augmented reality apps to visual search engines, and Meta is looking to work with U.S. government agencies to use its model for national security applications. And beyond government and corporations, the Llama-powered Meta AI chatbot is a pretty promising ChatGPT competitor. Recently expanded into dozens of countries, the chatbot can generate images of similar or arguably better quality than DALL-E 3 and even animate them (which ChatGPT cannot), search the web, handle coding tasks, and imagine scenes on the spot. Beyond that, Meta has other generative AI models for audio generation, video editing, segmentation, drawing animation, and more. With Llama 4 around the corner in 2025, promising even better handling of text, voice, and images, Meta is betting that open beats closed every time. The numbers back up Zuckerberg's gamble. Meta AI, the company's answer to ChatGPT, racked up 500 million monthly users, with India leading the charge. Over a million advertisers jumped on Meta's AI bandwagon in September alone, cranking out 15 million AI-generated ads. The company's revenue shot up 18.9% year-over-year to hit $40.6 billion in Q3, with ad impressions climbing 7% while prices rose 11%. All this AI goodwill doesn't come cheap. Meta is dumping $38 billion into capital expenditure this year, mostly on AI research and the hardware to run it. That includes cramming its data centers with 350,000 of Nvidia's prized H100 AI chips by year's end. But Zuckerberg is playing the long game. By partnering with cloud giants AWS, Google Cloud, and Microsoft Azure to host Llama models, Meta is building an ecosystem that could reshape how AI technology spreads -- and who profits from it. Not bad for a company that lost almost 75% of its value when it started to focus on the "metaverse" and has grown 600% since shifting to artificial intelligence as its key business model. Is OpenAI still king of the hill? Technically, maybe. At least it is the most recognizable -- and valuable -- AI startup on the scene. But the hill itself has changed.
The race isn't about raw power anymore -- it's about trust, accessibility, and real-world impact, and a few of these areas, notably trust and safety, are a bit murky right now for Sam Altman's unicorn. OpenAI's early lead has evolved into a complex web of specialized players, each carving out their own niches. Some focus on consumer applications, others on enterprise solutions, and a few brave ones tackle the fundamental research that could unlock AGI. For the immediate future, Sam Altman seems very confident OpenAI can reach AGI next year, which would put OpenAI on top of the hill for probably a long, long time if -- and this is as big an "if" as it gets -- it succeeds. However, other very respected and talented experts, like Meta's chief AI scientist Yann LeCun, believe such an achievement is still around 10 years away. Whether you decide to be optimistic or pessimistic will depend on who your favorite rock star is. That said, the real winner of 2024's AI race is the users. Competition drives innovation, but it also forces companies to address concerns about safety, privacy, and accessibility. This is why the Amodei siblings left OpenAI to found Anthropic and why Ilya Sutskever left to found Safe Superintelligence. This is why Huawei developed its own mobile OS and used domestic technology to build one of the best phones of the year, why developers come up with customized, better versions of the most popular AI models, and why AI as a technology has become a social phenomenon in the last two years. As these AI titans battle for supremacy, they're building tools that transform everything from how we work to how we create and communicate. And that 'one AI to rule them all'? Maybe that was the wrong question all along. In 2025's AI landscape, diversity will likely play an important role. And that might be exactly what we need.
[3]
5 A.I. Trends to Watch in 2025: Agentic A.I., 3D Models, Synthetic Data and More
What Big Tech and leading A.I. companies are spending billions building in 2025. In the first eight months of 2024, Microsoft (MSFT), Meta (META), Google (GOOGL) and Amazon (AMZN) collectively recorded a staggering $125 billion in A.I.-related capital expenditures (CapEx) and operating costs, according to a September JPMorgan report. The cumulative CapEx of these four tech giants alone is expected to soar past the $200 billion mark for the entirety of 2024. A.I. startups, meanwhile, received unprecedented amounts of funding from investors eager to cash in on the technology's lucrative potential. OpenAI is set to finish out 2024 as the most well-funded A.I. company, most recently valued at $157 billion. Its rival Anthropic is gearing up for new fundraising that could value it at $40 billion. Flush with cash, leading A.I. companies are now tasked with the challenge of proving to investors -- and the public -- that their pricey bets on the new technology will pay off. From an ongoing pivot into "agentic A.I." to emerging new scaling laws and wide-ranging explorations of A.I.'s myriad capabilities, here's a look at what 2025 will bring to the world of A.I.: Agentic A.I. will be "the next giant breakthrough." The buzzword refers to autonomous A.I. assistants able to complete tasks without human oversight. The potential of A.I. agents to enhance workplaces and everyday life quickly caught notice in Silicon Valley, with companies like Salesforce embracing agents as their next major product. Microsoft, too, has rolled out a slew of A.I. agents in recent months. In November, it unveiled several A.I. assistants customized for its Microsoft 365 suite, including an agent able to provide translation in nine different languages. OpenAI is also on the "agentic A.I." train, with an upcoming model expected to be able to perform tasks like booking travel and writing code. A.I. agents are "the thing that will feel like the next giant breakthrough," said Sam Altman, OpenAI's CEO, during a recent AMA on Reddit. The global market for A.I. agents is currently valued at more than $5 billion, according to the research firm MarketsandMarkets. By the end of the decade, this figure is expected to soar to $47 billion, driven in part by demand for agents amongst enterprise clients. Test-time compute could be a solution to A.I.'s training data crisis. One of the key components of A.I.'s success in recent years has been the mass amounts of data fed into A.I. models. But there's only a finite amount of text, images and videos on the internet. To avoid a plateau in the technology's development, A.I. companies are turning to alternative ways to train their models. One of the most promising solutions is test-time compute, where A.I. models improve by reasoning and taking longer to think about potential answers before responding -- a theory most recently demonstrated by OpenAI's o1 model. On an earnings call in November, Nvidia (NVDA) CEO Jensen Huang described OpenAI's new model as "one of the most exciting developments" in scaling and noted that "the longer it thinks, the better and higher-quality answer it produces." Huang isn't alone in his optimism.
Microsoft CEO Satya Nadella also pointed towards test-time compute as a new scaling law in November, while OpenAI co-founder Ilya Sutskever earlier this month highlighted it as the progression beyond A.I.'s pre-training era. Synthetic data is another promising solution. Another answer to A.I.'s data crisis is replacing traditional data with information generated by the technology itself. The market for synthetic data is expected to skyrocket to $2.1 billion by 2028, representing an over 450 percent hike from 2022, according to BCC Research. Altman hinted at the potential of synthetic data a year ago when discussing A.I.'s dwindling data supply, remarking in an interview that "as long as you can get over the synthetic data event horizon, where the model is smart enough to make good synthetic data, I think it should be all right." OpenAI and competitors like Anthropic, Meta, Microsoft and Google have all reportedly begun using synthetic data in some way to train and fine-tune models. In October, the A.I. startup Writer unveiled a new A.I. model trained almost entirely on A.I.-generated data. The approach allowed the company to cut significant costs in developing the model, which totaled a mere $700,000 in comparison to the millions doled out by other companies. OpenAI's GPT-4 model, for example, cost more than $100 million to train. "Large world models" will create 3D A.I. worlds. Until now, much of A.I.'s visual output has remained two-dimensional, something tech pioneers are looking to shift in the coming years. "Large world models" are an emerging form of A.I. that aims to build interactive three-dimensional scenes, advancing the worlds of movies, games and simulators. One of the largest players in this space is World Labs, a new startup established by Stanford A.I. pioneer Fei-Fei Li that raised $230 million earlier this year. The venture is looking to build large world models with "spatial intelligence," a form of intelligence that understands and interacts with the real world. To demonstrate this concept, Li has previously used the example of a cat reaching out to topple a glass of milk, and humans' ability to predict the consequence of this event and take action to prevent the glass from falling. At the beginning of December, Google DeepMind launched its own large world model in the form of Genie 2, which simulates virtual environments that will be used to train and evaluate A.I. agents. The area will likely be a key focus for the lab going forward, as evidenced by its recent hiring of Tim Brooks, a former OpenAI researcher who oversaw its video generator Sora. In an X post welcoming Brooks to his team, Google DeepMind CEO Demis Hassabis noted his excitement at "working together to make the long-standing dream of a world simulator a reality." A.I. search engines will reshape online search. Google has long had a seemingly untouchable dominance of the search market. But with the advent of A.I., a proliferation of A.I.-powered search engines is looking to shake the tech giant's foothold. Not that Google hasn't embraced the technology itself. In 2024, it rolled out AI Overviews, a feature that provides users with A.I.-generated summaries rather than links. The feature will attract more than 1 billion monthly users and is already "increasing overall search usage and user satisfaction," CEO Sundar Pichai told Wall Street analysts in October.
But Google will have to contend with an increasingly crowded industry for search tools, as companies like OpenAI and Microsoft expand into the arena with the help of A.I. Meta is also reportedly preparing to launch its own A.I.-powered search engine, while the startup Perplexity AI has emerged as an especially formidable player. Recently valued at $9 billion, its A.I. search tools already process around 20 million queries on a daily basis, up from 2.5 million at the beginning of 2024.
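The "thinking longer" idea behind test-time compute, described above, can be illustrated with a toy self-consistency sketch. This is a generic illustration of the scaling principle, not how o1 or any production model actually reasons: a deliberately noisy "solver" stands in for a single model sample, and spending more inference-time samples on the same question (then majority-voting the answers) raises accuracy even though the underlying model never changes.

```python
# Toy illustration of test-time scaling via self-consistency voting.
# Not any lab's actual method: a noisy solver answers a fixed arithmetic question,
# and extra inference-time compute (more samples per question) buys higher accuracy.
import random
from collections import Counter

def noisy_solver(a, b):
    """Stand-in for one model sample: right 60% of the time, off by one otherwise."""
    return a + b if random.random() < 0.6 else a + b + random.choice([-1, 1])

def answer(a, b, samples):
    """Sample the solver several times and return the majority-vote answer."""
    votes = Counter(noisy_solver(a, b) for _ in range(samples))
    return votes.most_common(1)[0][0]

def accuracy(samples, trials=2000):
    """Estimate how often the voted answer to 17 + 25 is correct."""
    return sum(answer(17, 25, samples) == 42 for _ in range(trials)) / trials

if __name__ == "__main__":
    random.seed(0)
    for samples in (1, 5, 25):
        print(f"{samples:>2} samples per question -> accuracy {accuracy(samples):.2f}")
```

The same trade-off explains why these approaches are expensive to serve: every extra sample or reasoning step is more inference compute spent on a single answer.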
[4]
The Future of AI Shouldn't Be Taken at Face Value
It costs a lot to build an AI company, which is why the most competitive ones are either existing tech giants with an abundance of cash to burn or start-ups that have raised billions of dollars largely from existing tech giants with an abundance of cash to burn. A product like ChatGPT was unusually expensive to build for two main reasons. One is constructing the model, a large language model, a process in which patterns and relationships are extracted from enormous amounts of data using massive clusters of processors and a lot of electricity. This is called training. The other is actively providing the service, allowing users to interact with the trained model, which also relies on access to or ownership of a lot of powerful computing hardware. This is called inference. After ChatGPT was released in 2022, money quickly poured into the industry -- and OpenAI -- based on the theory that training better versions of similar models would become much more expensive. This was true: Training costs for cutting-edge models have continued to climb ("GPT-4 used an estimated $78 million worth of compute to train, while Google's Gemini Ultra cost $191 million for compute," according to Stanford's AI Index Report for 2024). Meanwhile, training also got a lot more efficient. Building a "frontier" model might still be out of reach for all but the largest firms due to the sheer size of the training set, but training a fairly functional large language model -- or a model with similar capabilities to the frontier models of just a year ago -- has become relatively cheap. In the same period, though, inference has become much more affordable, meaning that deploying AI products once they've been built has gotten cheaper. The result was that companies trying to get users for their AI products were able, or at least tempted, to give those products away for free, either in the form of open access to chatbots like ChatGPT or Gemini, or just built into software that people already use. Plans to charge for access to AI tools were somewhat complicated by the fact that basic chatbots, summarization, text generation, and image-editing tools were suddenly and widely available for free; Apple Intelligence, for example, is able to handle a lot of inference on users' iPhones and Macs rather than in the cloud. These industry expectations -- high and rising training costs, falling inference costs, and downward price pressure -- set the direction of AI funding and development for the last two years. In 2024, though, AI development swerved in a major way. First, word started leaking from the big labs that straightforward LLM scaling wasn't producing the results they'd hoped for, leading some in the industry to worry that progress was approaching an unexpected and disastrous wall. AI companies needed something new. Soon, though, OpenAI and others got results from a new approach they'd been working on for a while: so-called "reasoning" models, starting with OpenAI o1, which, in the company's words "thinks before it answers," producing a "long internal chain of thought before responding to the user" -- in other words, doing something roughly analogous to running lots of internal queries in the process of answering one. This month, OpenAI reported that, in testing, its new o3 model, which is not available to the public, had jumped ahead in industry benchmarks; AI pioneer François Chollet, who created one of the benchmarks, described the model as "a significant breakthrough in getting AI to adapt to novel tasks." 
If this sounds like good news for OpenAI and the industry in general -- a clever way around a worrying obstacle that allows them to keep building more capable models -- that's because it is! But it also represents some new challenges. Training costs are still high and growing, but these reasoning models are also vastly more expensive at the inference phase, meaning that they're costly not just to create but to deploy. There were hints of what this might mean when OpenAI debuted its $200-a-month ChatGPT Pro plan in early December, and the cost of achieving high benchmark scores has crossed into the thousands of dollars. In the near term, this has implications for how and by whom leading-edge models might be used. A chatbot that racks up big charges and takes minutes to respond is going to have a fairly narrow set of customers, but if it can accomplish genuinely expensive work, it might be worth it -- it's a big departure from the high-volume, lower-value interactions most users are accustomed to having with chatbots, in the form of conversational chats or real-time assistance with programming. AI researchers expect techniques like this to become more efficient, making today's frontier capabilities available to more people at a lower cost. They're optimistic about this new form of scaling, although as was the case with pure LLMs, the limits of "test-time scaling" might not be apparent until AI firms start to hit them. It remains an exciting time to work in AI research, in other words, but it also remains an extremely expensive time to be in the business of AI: The needs and priorities and strategies might have been shuffled around, but the bottom line is that AI companies are going to be spending, and losing, a lot of money for the foreseeable future (OpenAI recently told investors its losses could balloon to $14 billion by 2026). This represents a particular problem for OpenAI, which became deeply entangled with Microsoft after raising billions of dollars from the company. CEO Sam Altman has announced a plan to complete the conversion of OpenAI into a for-profit entity -- the firm began as a nonprofit -- and is in a better position than ever to raise money from other investors, even if actual profits remain theoretical. But Microsoft, a vastly larger company, still retains the rights to use OpenAI's technology and acts as its primary infrastructure provider. It's also entitled, for a term, to 20 percent of the company's revenue. As OpenAI grows, and as its independent revenue climbs (the company should reach about $4 billion this year, albeit while operating at a major loss), this is becoming less tolerable to the company and its other investors. OpenAI's agreement does provide a way out: Microsoft loses access to OpenAI's technology if the company achieves AGI, or artificial general intelligence. This was always a bit of a strange feature of the arrangement, at least as represented to the outside world: The definition of AGI is hotly contested, and an arrangement in which OpenAI would be able to simply declare its own products so good and powerful that it had to exit its comprehensive agreement with Microsoft seemed like the sort of deal a competent tech giant wouldn't make. It turns out, according to a fascinating report in The Information, it didn't: Microsoft Chief Financial Officer Amy Hood has told her company's shareholders that Microsoft can use any technology OpenAI develops within the term of the latest deal between the companies.
That term currently lasts until 2030, said a person briefed on the terms. In addition, last year's agreement between Microsoft and OpenAI, which hasn't been disclosed, said AGI would be achieved only when OpenAI has developed systems that have the "capability" to generate the maximum total profits to which its earliest investors, including Microsoft, are entitled, according to documents OpenAI distributed to investors. Those profits total about $100 billion, the documents showed. This one detail explains an awful lot about what's going on with OpenAI -- why its feud with Microsoft keeps spilling into the public; why it's so aggressively pursuing a new corporate structure; and why it's raising so much money from other investors. It also offers some clues about why so many core employees and executives have left the company. In exchange for taking a multibillion-dollar risk on OpenAI before anyone else, Microsoft got the right to treat OpenAI like a subsidiary for the foreseeable future. Just as interesting, perhaps, is the mismatch between how AI firms talk about concepts like AGI and how they write them into legal and/or legally binding documents. At conferences, in official materials, and in interviews, people like Altman and Microsoft CEO Satya Nadella opine about machine intelligence, speculate about what it might be like to create and encounter "general" or humanlike intelligence in machines, and suggest that profound and unpredictable economic and social changes will follow. Behind closed doors, with lawyers in the room, they're less philosophical, and the prospect of AGI is rendered in simpler and perhaps more honest terms: It's when the software we currently refer to as "AI" starts making lots and lots of money for its creators.
[5]
2024 was a big year for Google and AI
We'll remember 2024 for AI and LLMs (large language models). While ChatGPT remains a major player, and others such as Meta's Llama and Claude evolve, nothing like Google Gemini exists. There have been bumps in the road, especially with Gemini's Imagen image-generating system. However, Google's integration across its Workspace apps has made it an indispensable productivity tool, not to mention the new capabilities it brings to Google Assistant and smart home devices. A sneak peek of Google's big AI moves in 2024: Gemini rebranded the capabilities of the former Bard and Duet AIs. Gemini rolled out across platforms with a dedicated mobile app for Android, integration with the Google app on iOS, and staged integration into Google Workspace. The company also introduced NotebookLM, a novel and useful tool that functions like an intelligent personal digital notebook. Google focused on deepening the integration of AI into its existing products. Android 15, for instance, saw AI optimize core OS functions, leading to improved battery life and performance. Moreover, the company continued to refine its AI-powered camera features on Pixel phones, making computational photography a hot trend to watch in 2025. 5. Gemini: The platform went from 0 to awesome. Google teased Gemini's potential to redefine the boundaries of human-machine interaction at its I/O event in May 2023. When it launched, it didn't disappoint. Gemini seamlessly integrates text, code, and images. These features immediately made it an efficiency-enhancing tool for learning and research, generating creative content with ever-improving AI collaboration, and getting rote tasks done with less effort. Its expansive context window, capable of synthesizing more information than any competitor, cemented its status as a leading AI. Gemini emerged as a viable ChatGPT alternative in 2024. The platform's advanced reasoning skills, allowing for more nuanced and human-like conversations, set a new benchmark. As a Gemini Advanced user, I typically receive useful and accurate responses that improve in their understanding of context and intent. My only gripe is with the platform's troubled Imagen image generator. 4. Image creation: Imagen stumbled, then made a lackluster return. Despite Gemini's groundbreaking advancements, its image generation component, Imagen, got off to a rocky start. Google suspended its ability to generate images of humans in February. It came back online a few months later as a middling solution. Early iterations of Imagen were plagued by inaccuracies, producing images that ranged from misleading to bizarre, and the company was accused of manifesting cultural biases. While Google addressed these issues, the damage was done. The initial negative perception lingered, and with subsequent updates, Imagen failed to impress. What was initially touted as a revolutionary image-generation tool became the most underwhelming component of Gemini. 3. NotebookLM: Google made AI personal. Google quietly unveiled another important AI innovation in 2024: NotebookLM.
This experimental project took a different approach, focusing on personalized AI experiences tailored to individual needs. NotebookLM trains its AI on the data you provide and organizes your links, images, videos, notes, and documents into a simple notebook-like interface. NotebookLM is faster than a human research assistant. In my experience, NotebookLM saves a lot of time and effort with rote research, information synthesis, and productivity tasks. It's equally adept at illuminating the big picture and managing minute details. NotebookLM summarizes meeting notes, generates outlines, and helps you study by creating quizzes from your documents. Its Audio Overviews feature allows it to create dynamic, natural-sounding virtual podcasts from your sources. By shifting the focus from a general-purpose AI to a personalized experience, Google will expand what AI means and its impact in 2025. 2. Android 15: The OS revved up on-device AI. With the release of Android 15, Google began driving AI into its mobile operating system. Android 15 leverages on-device AI to optimize essential functions, improving battery life and performance. This includes intelligent resource allocation that adapts to individual app usage patterns, dynamic settings adjustments, and efficient management of background processes. AI is transforming the user experience in Android 15. Features like predictive app launching anticipate user needs, smart text selection streamlines interactions, and personalized recommendations within system settings offer tailored guidance. These advancements showcase Google's all-in approach to AI and how far we've come in one year. 1. Photography: Google enhanced Pixel cameras with AI processing. We saw exciting advancements in mobile photography, primarily driven by AI processing. Google's Pixel phones, known for their camera quality, elevated their capabilities with AI-powered features. The Magic Eraser, introduced in 2021, became more precise in removing unwanted objects from photos. AI enhanced zoom capabilities by filling in details and reducing noise. Automatic editing suggestions offered new options for enhancing images with a single tap. This surge in AI-driven camera technology isn't limited to Google. Samsung made strides with its advanced scene recognition and image optimization tools. The competition between these manufacturers and others, like Apple, fueled rapid innovation. As a result, smartphone users can now capture, edit, and share better photos than before without using a computer-based image editor. 2025 will bring opportunities and risks. It was a big year for Google's AI endeavors, marked by innovative products and deep integration of AI across its vast ecosystem. Gemini staked its LLM turf, boasting an expansive context window (now 2 million tokens in Gemini 1.5 Pro) and impressive multimodal capabilities. This positioned Google as a leader in the rapidly evolving AI landscape, surpassing ChatGPT by certain measures. However, 2024 highlighted AI's complexities and potential pitfalls. Gemini's Imagen faced setbacks due to inaccuracies and biases. Meanwhile, Google's advancements in personalized AI with NotebookLM, on-device AI, and AI-powered photography showcased the potential for artificial intelligence to simplify and enhance our lives.
Responsible development must remain a foremost concern as Google continues its AI initiatives. Google's missteps have been forgivable gaffes, but its progress has been remarkable and is accelerating. Still, we must remain vigilant for bad actors who use AI for nefarious purposes. Deepfakes, AI robocalls, and other threats must be kept at bay by combining human vigilance and technological intervention from industry leaders such as Google.
[6]
Having AI act as a second pair of eyes is its best use-case
The first wave of AI features has been defined by summarization. Apple, Google, and Samsung all offer some version of notes, notifications, or email summaries, in some cases without the need to connect to the internet at all. That makes a good bit of sense: large language models are trained on mountains of text, so it might follow that they're able to condense that text efficiently. If you've used any of these features, you know that their quality is mixed at best, but they're ultimately laying the groundwork for something better. The next big wave of AI tools is focused on not just feeding AI text, but letting it deal with the vast wilderness of things happening on your screen. The implementations vary, but they all point to the same thing: a contextual AI that can act as a second pair of eyes on whatever you're doing. Here's why it's the sweet spot for generative AI, and why device makers are at a real advantage when it comes to offering these features to users. A second pair of eyes: Circle to Search, Pixel Screenshots, and Copilot Vision. It's not dependent on generative AI, but Google's Circle to Search feels like the first 2024 example I can think of where letting software see your screen came with advantages that outweighed the costs. Circle to Search, which debuted on the Samsung Galaxy S24 but is technically a part of Android now, is essentially a specialized version of reverse image search. Long-press your phone's navigation bar and the screen will freeze, letting you circle anything on your phone you want to learn more about. That could be a pair of shoes someone's wearing in a TikTok video, or text on a poster. Circle to Search can pull up information about all of them, helping you find a product you want to buy, define a term, or translate text you don't understand. As of December 2024, anything you've circled in Circle to Search can also be sent to Pixel Screenshots, a new app introduced alongside the Pixel 9 for cataloging screenshots. It uses AI to sort screenshots into different categories and, as part of a recent update, surfaces content from images as suggestions in Gboard. Those features, along with the general ability to just ask Gemini questions about what's on your screen (mostly focused on summarization, unless you're watching a video), gesture at what's possible when you give AI a view of what you're looking at. Microsoft has started to take those basic ideas even further in the Edge browser. Its new experimental Copilot Vision feature lets you talk to the AI assistant while you're browsing and have it answer questions about whatever you're looking at. The feature is limited and capable of producing errors in much the same way a normal text chat with Copilot can, but it represents what I think might be the sweet spot for these kinds of AI features. You can ask for basic recommendations that you could probably answer for yourself just by exploring a website more thoroughly, but also make more specific requests, even letting Copilot help you cheat on a round of GeoGuessr. The number of websites you can use Copilot Vision on is deliberately limited for now, which Microsoft says is part of the considerations for security and copyright it's making, but there are plans to expand.
Any data connected to what you actually say during a Copilot Vision session, or the contextual website information connected to those questions and requests, isn't saved after you've turned off Copilot Vision. It seems like an even more natural way of getting help than Circle to Search or Pixel Screenshots, and I wouldn't be surprised if it became the norm across all the major AI platforms. Or at least the ones integrated into operating systems or web browsers. Letting AI view your screen can have drawbacks: device makers are in a unique position to guarantee users' safety. The problem with all of these screen sharing features is that people are often looking at things they wouldn't want to share with AI. That's why Microsoft insists that Copilot Vision doesn't remember anything it "sees." The company was heavily criticized for privacy issues with Windows Recall, which, unlike Pixel Screenshots, captures images of your screen without your input to create a timeline of everything you've done on your computer. There were obvious problems with that idea -- an AI shouldn't capture a screenshot of your bank account or government ID -- and Microsoft had to completely overhaul how Recall works and stores screenshots to get it in a position to be actually released. Owning an operating system and the hardware it runs on gives you a unique advantage with these kinds of AI features, because you can have precise control of what these models have access to and when. That's a key element of Apple's privacy-focused approach to AI on the iPhone, and one of several reasons it hasn't released an updated version of Siri that can access your phone's screen and apps. An AI less focused on general knowledge is good. The large language models that power generative AI might be trained on a huge amount of data, but their ability to actually have a deep well of accurate knowledge is not guaranteed. They can come up with incomplete answers just as often as they can lie in response to a straightforward question. The strength of AI apps like Google's NotebookLM is that they make an AI model responsible for answering questions about a much smaller amount of information: whatever sources you upload yourself. Letting AI see your screen feels like the upper limit of that same kind of skill, where the limitation you're providing is whatever you're seeing. It's wider than a few PDFs or YouTube videos, but it's much narrower than expecting an AI to be an answer machine for all human knowledge. That seems like the right level for a useful AI to be operating at.
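The grounding idea behind NotebookLM-style tools can be sketched in a few lines of Python. This is a generic illustration, not Google's actual pipeline: the documents, the keyword-overlap retriever, and the call_llm stub are all invented for the example, with simple word matching standing in for the embedding search a real product would use. The point is only that the model is shown nothing but the user's own sources plus the question.

```python
# Generic sketch of grounding an assistant in user-supplied sources (RAG-style),
# not NotebookLM's implementation. A toy keyword-overlap retriever picks the most
# relevant chunks, and only those chunks go into the prompt.

SOURCES = {  # pretend these are the notes a user uploaded
    "meeting_notes.txt": "Q3 launch moved to October. Maria owns the beta signup page.",
    "trip_plan.txt": "Flight lands at 6 pm. Pick up Mom at the airport arrivals gate.",
}

def chunk(sources):
    """Split each document into sentence-sized chunks tagged with their origin."""
    return [(name, s.strip()) for name, text in sources.items()
            for s in text.split(".") if s.strip()]

def retrieve(question, chunks, k=2):
    """Rank chunks by naive keyword overlap with the question (a stand-in for embeddings)."""
    q_words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: -len(q_words & set(c[1].lower().split())))
    return ranked[:k]

def call_llm(prompt):
    """Hypothetical model call; a real system would send this prompt to an LLM API."""
    return f"[model would answer using only this context]\n{prompt}"

def answer(question):
    context = retrieve(question, chunk(SOURCES))
    prompt = "Answer using only these sources:\n"
    prompt += "\n".join(f"- ({name}) {text}" for name, text in context)
    prompt += f"\nQuestion: {question}"
    return call_llm(prompt)

if __name__ == "__main__":
    print(answer("Who owns the beta signup page?"))
```

Screen-aware features like Copilot Vision follow the same pattern, except the "sources" are whatever happens to be on your display at that moment, which is why the privacy controls discussed above matter so much.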
[7]
Galaxy AI's half-baked tools offer too little to sell me on new Samsung phones
Today's most hyped AI apps don't do much for me. Indiscriminate content scraping soured me on image generation, and I avoid large language models when possible. The underlying technology fascinates me, but I don't seek out problems for it to solve. Still, we can't avoid it. Smartphone sales have plateaued for years, so manufacturers need to keep consumers interested. Galaxy AI's attention-grabbing features often look impressive and can do remarkable things. I'm not here to disparage AI in general. AI's processing demands justify new devices that exceed the average consumer's needs. For example, the high-end Galaxy Tab S10+ hardly outperforms the Tab S9 FE+ in streaming movies and browsing social media. However, only the nearly $1,000 S10+ supports Samsung's nascent toolkit. I argue that Galaxy AI isn't useful enough to justify an expensive new toy that tackles everyday tasks a bit quicker than its half-price siblings. Galaxy AI features are still new and nebulous: great on paper, less so in real life. Experts and users generally agree that Galaxy AI offers less variety and utility than Google's offerings. While AI feature criticisms apply to multiple platforms, Samsung's practically carbon-copied new Galaxy phones fall in the crosshairs. Tools like live conversation translation and post-recording slow-motion video conversion show promise. They can make life better for everyday folks with ordinary mobile computing habits. With a few taps, they can connect us with neighbors or dramatize recordings of our dogs soaring for a ball. That is, if they work. Some AI features work great, and others, not so much. Consistency, or lack thereof, is the biggest complaint about today's much-talked-about software tricks. If an app has trouble transcribing a conversation and then struggles to translate it, what's it worth? Magically editing awkward limbs or removing strangers from a photo's foreground makes life easy until it still looks weird after six tries. With so much potential controversy surrounding generative AI, the gatekeeping of image creation and social media proofreading sometimes breaks features. 'AI-powered' remains a somewhat arbitrary distinction: as muddy and misleading a definition as ever. Stroll down social media lane, or poll some smartphone-savvy circles, about the most popular AI feature. One answer leads by an incredible margin: Circle to Search. A slide of the finger produces a link to whatever you circled. The perfect example of AI's usefulness, right? A screenshot snipping tool tied to image recognition and fed to a search engine doesn't encapsulate the impressive technology we call "AI." Google Lens is seven years old, and Circle to Search is the evolution of 2015's Now on Tap. None of this is new or worthy of an upgrade to a costly Galaxy flagship. AI has a point, and overhyped apps miss it. 2024's second Galaxy Unpacked event introduced the ability to draw a dinosaur on a beach. AI features inspire and depend on some groundbreaking techniques. However, focusing on the flashiest, most shocking results belies what it's often good for.
We shouldn't need Samsung to blow our minds with functions that a slim, battery-powered device has no business performing. That isn't AI's only contribution; its benefits offer considerable potential under the hood of everyday software. Circle to Search's effectiveness and popularity stem from the unprecedented technological support behind modern image recognition and language parsing. Counterintuitively, seemingly bog-standard tools not emblazoned with the AI label could benefit even more than apparently magical AI apps.

Automations like Samsung Notes' formatting, summarization, and spellchecking (some of the few regularly praised Galaxy AI-exclusive features) don't embody what we imagine "artificial intelligence" to be, but people consistently find them useful. Anything but flashy, Notes' tricks still use the advanced parallel processing methods and data-backed training popularized by the AI craze. AI makes a less-than-obvious difference industry-wide by influencing specialized microchips, complex programming techniques, and frameworks like Arm ISAs and their AI-focused extension libraries. Automatic note formatting may be mundane, but it helps, and it underscores the technology's usefulness. Then again, mundanity plays another big part.

Modern AI features aren't transformative
They often feel tacked on, gimmicky, and superfluous

(Image: a truly epic AI image with significant compression artifacts.)

When I read editor Will Sattelberg's Galaxy Z Fold 6 review, I assumed "sunglasses on dogs levels of gimmicks" was an idiom that had passed me by. Instead, Will was talking about his foray into Samsung's Sketch to Image. Like many, he experimented with Sketch to Image's novelty a few times and then forgot about it. AI is supposed to change our lives; Galaxy AI barely changes our phones. This may be blasphemy, but on-demand AI-generated wallpapers don't add much substance to the mobile experience. Diverse wallpaper apps and catalogs already provide more choices than you can wade through. We could (and might) spend an entire article roasting the banality of outsourcing text messages to a supercomputer.

Sometimes, AI features aren't there but somewhere else

Samsung's AI superiority claims only apply to some regions. Many countries and languages lack access to the most useful parts of the toolkit, and potential European Galaxy buyers see this often due to legislation curbing harmful AI use. If you don't know when you'll get Galaxy AI, don't pay a premium for a phone just because it can run it. On the other hand, phones' improving performance and efficiency allow for complex on-device operations. Galaxy AI can theoretically translate conversations, record transcripts, and perfect message grammar, even without internet access.

Still, a phone can't do everything. The cloud takes care of generative photo editing. You need to upload videos to convert them to slow motion. Summarizing texts and recordings only works online, and some translation features work better with internet access. Why fork over a month's rent for a device that can't organize your notes without a supercomputer's help?
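To make the split concrete, here is a minimal sketch of the on-device-versus-cloud divide described above. The feature names and the dispatcher are illustrative assumptions drawn from this article's description, not Samsung's actual implementation.

# Hypothetical sketch of the on-device vs. cloud split described above.
# Feature names and the dispatcher are illustrative; they are not Samsung APIs.
from enum import Enum, auto

class Backend(Enum):
    ON_DEVICE = auto()
    CLOUD = auto()

# Rough mapping based on the behavior described in the article.
FEATURE_BACKENDS = {
    "live_translation": Backend.ON_DEVICE,      # claimed to work offline
    "call_transcription": Backend.ON_DEVICE,
    "grammar_correction": Backend.ON_DEVICE,
    "generative_photo_edit": Backend.CLOUD,     # requires upload
    "slow_motion_conversion": Backend.CLOUD,
    "text_summarization": Backend.CLOUD,
}

def run_feature(name: str, has_network: bool) -> str:
    """Return a status string showing whether a feature can run right now."""
    backend = FEATURE_BACKENDS.get(name)
    if backend is None:
        return f"{name}: unknown feature"
    if backend is Backend.CLOUD and not has_network:
        return f"{name}: unavailable offline (cloud-only)"
    where = "on device" if backend is Backend.ON_DEVICE else "in the cloud"
    return f"{name}: runs {where}"

if __name__ == "__main__":
    for feature in FEATURE_BACKENDS:
        print(run_feature(feature, has_network=False))

Run offline, a dispatcher like this makes the complaint visible: roughly half of the toolkit silently disappears the moment you lose signal, which is worth knowing before paying flagship prices.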
AI-powered functionality could be so much more

The most eye-catching features feel too superficial to represent how AI might improve our enjoyment and workflows. Image generation (which rumors indicate Galaxy AI will soon implement) provides a fun distraction or saves time designing social media promotions. Still, it isn't life-changing, and it isn't driving me to shell out $1,500 on the hottest new foldable.

Where's the innovation?

In an October 2024 interview, Samsung's Head of Customer Experience mused on a future where AI adjusts settings before users need to do so manually. Most outlets ran with a sensationalist claim that AI would "replace the Settings menu," a concept rightfully deserving of the widespread ridicule it drew. Nobody wants half-baked prediction algorithms reconfiguring their phone.

That said, it isn't far from the brazen creativity that could spur a major reimagining of mobile devices with AI's help. The multimodal and agentic AI hysterias clogging the discourse could drive the reinvention of human-device interaction, if they can mitigate privacy and human-agency concerns.

The underdeveloped Rabbit R1 showed flashes of inspiration while failing to serve much purpose. Qualcomm, T-Mobile majority owner Telekom, and startup Brain AI demoed an app-eschewing phone at MWC 2024 that relied on a prototype AI agent to replace the interface as we know it. Still, those mock-ups are light years away from Galaxy AI's fun, but not industry-shaking, set of magic tricks. We won't reach unknown frontiers while wasting battery power turning sketches of stovepipe hats into pictures of actual pipes.

Galaxy AI isn't done innovating yet
Samsung's best is (hopefully) yet to come

Google easily beat Samsung to the punch: its world-leading data hoard and programming talent have produced Gemini tools that work more consistently than Galaxy AI. Still, Samsung won't give up. The One UI 7 beta introduced writing tools, call transcripts, and an attempt at unifying all your available data into granular suggestions for improving your daily life, such as, "Take a nap, then pack for your trip." Convenient, streamlined features improve the near-future smartphone experience, at least as much as a cat wearing glasses does. Meanwhile, advanced machine learning techniques push researchers toward complex goals like mapping the human brain. World-altering academic study benefits from AI's progression, and the consumer drives much of the hype and spending that fuels the machine. If Samsung dares to be different (in its features or its phones), Galaxy AI could become a selling point.

The engineers have their work cut out for them

You can't convince me that Galaxy AI does enough to warrant replacing a two- or three-year-old flagship with the Galaxy S25. Samsung's AI toolkit fails to meet its lofty goals reliably and needs several more doses of real-world utility. Features like AI notification and text summaries, slated to arrive with the S25 family, look promising but won't upend the industry. Samsung can also learn from Google as the two co-develop AR wearables that might leverage the Project Astra AI-powered assistant.
Upcoming LLM additions and agentic AI upgrades to Bixby could turn the long-suffering voice assistant into a legitimately helpful pocket-size personality. Samsung makes great Galaxy phones, but it needs to get creative and use Galaxy AI for something we haven't seen before. It may take a bit of bravery, but it could pay off and extend Samsung's long-running lead in the Android market.
[8]
Gemini is getting complicated
Last week, Google debuted Gemini 2.0. The new family of AI models that powers Google's chatbot of the same name comes with new capabilities, like the ability to directly access information from services like Google Search and to natively create images and audio to include in its responses. Google says its recent AI models are built for the "new agentic era" we're entering, in which AI can access the internet and use tools to get things done for users.

As of this week, Gemini Advanced subscribers have access to a handful of new models: Gemini 2.0 Flash Experimental, Gemini 2.0 Experimental Advanced, and Gemini 1.5 Pro with Deep Research. These join the existing options of standard 1.5 Pro (for "complex tasks") and 1.5 Flash (for "everyday help"). It checks out that paying subscribers would get the chance to try new features early. But for a product that's supposed to take some of the work out of intricate processes like in-depth research and, eventually, higher-stakes assignments like booking travel, Gemini is getting increasingly tricky to understand and use.

Welcome to Compiler, your weekly digest of Google's goings-on. I spend my days as Google Editor reading and writing about what Google's up to across Android, Pixel, and more, and sum it up right here in this column. This is the Google news you need to understand this week.

A model for every task

Gemini Advanced subscribers now have a total of five Gemini models to choose between. More complex workloads are more resource-intensive, so employing different models for different tasks makes sense. If a simpler Flash model can answer a given query just as well as a more complex Pro model can, running it through Flash instead of Pro saves a little computing power -- a growing concern in the AI space. But a drop-down menu that lets users manually choose between five different models for each query seems like an awfully obtuse way to manage Gemini's various capabilities, and learning the ins and outs of models with names like 1.5 Flash and 1.5 Pro with Deep Research is a big ask.

Gemini 1.5 Pro with Deep Research, for example, is the only one of the five that can carry out Gemini's Deep Research function, which collates information from dozens or even hundreds of sources to create detailed reports. Gemini 2.0 Experimental Advanced, the newer and generally better model, still can't do that. If you ask it to, it'll do something, but it won't tell you that your query would be better suited to 1.5 Pro with Deep Research.

Isn't AI supposed to simplify our lives? The appeal of natural-language AI interfaces, theoretically, is that you don't need to know how they work to use them. As opposed to a more traditional application, where you need to learn the nuances of the UI and where to find various functions to accomplish complicated tasks, with something like Gemini or ChatGPT, you shouldn't need specialized knowledge -- only a reasonably well-formed query. Layering on a menu of abstract models to choose from for each input (is this query everyday help or a complex task?) seems at odds with one of the most valuable characteristics of this type of application: approachability.

The option to manually pick which model your query runs through is a sensible perk for Advanced subscribers, but it shouldn't be a requirement. To make Gemini easier to use, I'd like to see a future version that decides which model is best suited for your query automatically, without manual oversight. As it stands, Gemini won't even let you know if you've used the wrong model for a given task. Isn't AI supposed to simplify our lives?
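For illustration, here is a minimal sketch of what that kind of automatic routing might look like, assuming a crude keyword heuristic and the current model names; it is not how Gemini actually assigns models.

# Minimal sketch of automatic model routing, under the assumption that a
# product could classify a query before sending it to a model. The tiers
# and the heuristic are illustrative, not Google's actual logic.

DEEP_RESEARCH_HINTS = ("research", "compare sources", "detailed report")
COMPLEX_HINTS = ("plan", "analyze", "write code", "step by step")

def route_query(query: str) -> str:
    """Pick a model tier for a query so the user never has to."""
    q = query.lower()
    if any(hint in q for hint in DEEP_RESEARCH_HINTS):
        return "1.5 Pro with Deep Research"   # multi-source report generation
    if any(hint in q for hint in COMPLEX_HINTS):
        return "2.0 Experimental Advanced"    # heavier reasoning
    return "2.0 Flash Experimental"           # cheap, fast everyday help

if __name__ == "__main__":
    for q in ("What's the weather like in Lagos?",
              "Write code to parse a CSV step by step",
              "Put together a detailed report comparing EV battery chemistries"):
        print(f"{q!r} -> {route_query(q)}")

In practice, the classification step would likely be handled by a small, cheap model rather than keyword matching, but the broader point stands: the routing decision belongs inside the product, not in a drop-down menu.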
Is Google Keep due for a glow-up?

Android 16 Developer Preview 2 packs an interesting change: it makes Google Keep a system application, meaning you can't uninstall it without root access. At first blush, that might seem like more of an inconvenience than anything, but it likely means Google has big plans for its note-taking app, including deeper system integrations -- the ability to launch the app from the lock screen on Pixel phones, for example.

I'm excited about the possibility. I've used Keep for quick notes out of convenience for years, but I've never really liked it much. Compared to other apps I've used for note-taking -- Evernote, Obsidian, Apple Notes -- Keep has always seemed a little barebones. You can search your notes and add labels, but there's no robust categorization; you can't create folders, and the app is still clinging to its original concept of notes as sticky-note-style cards. But if Keep does become a bigger focus for Google, picking up features like folders, some Gemini-powered AI categorization, and maybe a Quick Settings tile to open a new note on Android like Apple Notes has on iOS, I can see myself using it because I want to, and not just because it's the note-taking app I happen to have installed.

Meanwhile...

Google's Veo 2 video generator is looking wildly impressive. Google released a set of video clips from its latest Veo 2 video generator this week, and for the most part, it's very hard to tell the clips weren't made by human hands. Veo 2 apparently has a better understanding of things like anatomy and physics than the original Veo did, which lets it create clips with markedly less AI wonk and fewer hallucinations. You can sign up for a waitlist to try Veo 2 yourself at labs.google/videofx.

Google's new Whisk experiment is a tool for visual brainstorming. Whisk lets you generate images based on a user-defined "setting," "scene," and "style." For each aspect, you can either upload an existing image or enter a text prompt. You also have the option to refine output images with additional prompting. The results aren't generally top-shelf quality, but Google positions Whisk more as a tool for ideation than for creating ready-to-use imagery. You can try Whisk right now at labs.google/fx/tools/whisk.

Gemini's fact-checkers are reportedly weighing in on subjects they don't know about. According to reporting from TechCrunch, contract workers who rate Gemini's responses are no longer allowed to skip responses that fall outside their understanding, with guidance from Google reportedly reading, in part, "You should not skip prompts that require specialized domain knowledge." That's fairly troubling! Remember to keep double-checking information provided by AI before acting on it.
A comprehensive look at the AI landscape in 2024, highlighting key developments, challenges, and future trends in the rapidly evolving field.
As 2024 draws to a close, the artificial intelligence (AI) industry finds itself at a critical juncture. Despite the immense hype and billions of dollars poured into AI development, the technology has not fully lived up to the lofty promises made by tech giants and startups alike [1]. The year saw a mix of breakthroughs, setbacks, and a growing realization that the path to truly transformative AI is more complex than initially anticipated.

OpenAI continued to dominate headlines, evolving from a cautious non-profit to an AI powerhouse valued at $157 billion. With substantial backing from Microsoft and a deal to power Apple iPhones, the company is projected to generate $11.5 billion in revenue [2]. However, the AI landscape remains fiercely competitive, with companies like Anthropic, Google, and numerous startups vying for market share and technological superiority.

Google, facing potential disruption to its core search business, made significant strides with its Gemini Ultra foundational model, boasting a context window of 2 million tokens [2]. The company's pivot to an "AI-first" strategy underscores the transformative potential of AI across various sectors.

2024 saw notable progress in AI capabilities, particularly in the realm of "agentic AI" – autonomous assistants capable of completing tasks without human oversight. Companies like Microsoft, Salesforce, and OpenAI are heavily investing in this technology, with the global market for AI agents expected to reach $47 billion by the end of the decade [3].

However, the industry faced a significant challenge in the form of a looming data crisis. With a finite amount of high-quality training data available, AI companies are exploring alternative approaches such as test-time compute and synthetic data generation to continue improving their models [3].
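As a rough illustration of the synthetic-data idea, here is a minimal sketch in which a stronger "teacher" model writes practice question-and-answer pairs for training a smaller model. The teacher_model stub and seed topics are assumptions for the example, not any company's actual pipeline.

# Illustrative sketch of synthetic data generation: a stronger "teacher"
# model writes prompt/completion pairs that could later train a smaller model.
# teacher_model is a stand-in stub, not a real API.
import json
import random

def teacher_model(prompt: str) -> str:
    """Placeholder for a call to a large 'teacher' model."""
    return f"Synthetic answer to: {prompt}"

SEED_TOPICS = ["unit conversion", "email etiquette", "basic SQL", "travel planning"]

def generate_examples(n: int) -> list[dict]:
    examples, seen = [], set()
    for _ in range(n):
        topic = random.choice(SEED_TOPICS)
        question = f"Give a short practice question about {topic}."
        answer = teacher_model(question)
        if answer in seen:  # crude dedup; real pipelines filter far more aggressively
            continue
        seen.add(answer)
        examples.append({"prompt": question, "completion": answer})
    return examples

if __name__ == "__main__":
    print(json.dumps(generate_examples(3), indent=2))

The hard part, glossed over here, is quality control: filtering out low-quality or repetitive synthetic examples is what determines whether this approach actually stretches the supply of training data.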
A major breakthrough came in the form of "reasoning" models, exemplified by OpenAI's o1 and o3 models. These AI systems "think before they answer," producing long internal chains of thought before responding to users [4]. While this approach has shown promising results in benchmarks, it also introduces new challenges, particularly in terms of inference costs and deployment strategies.
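Conceptually, the pattern works something like the sketch below: spend extra tokens on a hidden reasoning pass, then produce the visible answer. The generate stub is a placeholder, not OpenAI's API; the point is that the hidden pass is where the added inference cost comes from.

# Rough sketch of the "think before you answer" pattern attributed to
# reasoning models: generate a hidden chain of thought first, then a final
# answer. generate() is a stand-in stub, not a real model API.

def generate(prompt: str) -> str:
    """Placeholder for a single model call."""
    return f"(model output for: {prompt[:40]}...)"

def answer_with_reasoning(question: str) -> tuple[str, int]:
    reasoning = generate(f"Think step by step about: {question}")
    final = generate(f"Question: {question}\nReasoning: {reasoning}\nFinal answer:")
    # Tokens spent on reasoning are computed (and billed) but never shown to
    # the user, which is where the higher inference cost comes from.
    hidden_tokens = len(reasoning.split())
    return final, hidden_tokens

if __name__ == "__main__":
    answer, hidden = answer_with_reasoning("Why is the sky blue?")
    print(answer)
    print(f"Hidden reasoning tokens (approx.): {hidden}")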
The year also saw advancements in AI-specific hardware and integration into existing products. Google's Android 15, for instance, leveraged on-device AI to optimize core OS functions, improving battery life and performance [5]. Additionally, AI continued to enhance computational photography capabilities in smartphones, particularly in Google's Pixel lineup.

As AI capabilities expanded, so did concerns about ethical implications and potential misuse. The recall of the Humane AI Pin's charging pack due to fire risks and privacy issues with Rabbit's R1 device highlighted the need for robust safety measures and responsible development practices [1]. Regulatory bodies worldwide began to take a closer look at AI technologies, signaling a potential shift towards more stringent oversight in the coming years.
As the industry moves into 2025, trends already visible this year, such as agentic AI, reasoning-focused models, and deeper on-device integration, are expected to shape the landscape. While the AI revolution may not have fully materialized as predicted in 2024, the groundwork laid this year sets the stage for potentially transformative developments in the near future [3][4][5].