Curated by THEOUTPOST
On Fri, 17 Jan, 8:04 AM UTC
2 Sources
[1]
Build or buy? Scaling your enterprise gen AI pipeline in 2025
Scaling adoption of generative tools has always been a challenge of balancing ambition with practicality, and in 2025 the stakes are higher than ever. Enterprises racing to adopt large language models (LLMs) are encountering a new reality: Scaling isn't just about deploying bigger models or investing in cutting-edge tools; it's about integrating AI in ways that transform operations, empower teams and optimize costs. Success hinges on more than technology. It requires a cultural and operational shift that aligns AI capabilities with business goals.

The scaling imperative: Why 2025 is different

As generative AI evolves from experimentation to enterprise-scale deployments, businesses are facing an inflection point. The excitement of early adoption has given way to the practical challenges of maintaining efficiency, managing costs and ensuring relevance in competitive markets. Scaling AI in 2025 means answering hard questions: How can businesses make generative tools impactful across departments? What infrastructure will support AI growth without bottlenecking resources? And, perhaps most importantly, how do teams adapt to AI-driven workflows?

Success hinges on three critical principles: identifying clear, high-value use cases; maintaining technological flexibility; and fostering a workforce equipped to adapt. Enterprises that succeed don't just adopt gen AI; they craft strategies that align the technology with business needs, continually reevaluating costs, performance and the cultural shifts required for sustained impact. This approach isn't just about deploying cutting-edge tools; it's about building operational resilience and scalability in an environment where technology and markets evolve at breakneck speed.

Companies like Wayfair and Expedia embody these lessons, showcasing how hybrid approaches to LLM adoption can transform operations. By blending external platforms with bespoke solutions, these businesses illustrate the power of balancing agility with precision, setting a model for others.

Combining customization with flexibility

The decision to build or buy gen AI tools is often portrayed as binary, but Wayfair and Expedia illustrate the advantages of a more nuanced strategy. Fiona Tan, Wayfair's CTO, underscores the value of balancing flexibility with specificity: Wayfair uses Google's Vertex AI for general applications while developing proprietary tools for niche requirements. Tan described the company's iterative approach, noting that smaller, cost-effective models often outperform larger, more expensive options at tagging product attributes such as fabric and furniture colors.

Similarly, Expedia employs a multi-vendor LLM proxy layer that allows seamless integration of various models. Rajesh Naidu, Expedia's senior vice president, describes the strategy as a way to remain agile while optimizing costs. "We are always opportunistic, looking at best-of-breed [models] where it makes sense, but we are also willing to build for our own domain," Naidu explains. This flexibility ensures the team can adapt to evolving business needs without being locked into a single vendor.

Such hybrid approaches recall the enterprise resource planning (ERP) evolution of the 1990s, when enterprises had to decide between adopting rigid, out-of-the-box solutions and heavily customizing systems to fit their workflows. Then, as now, the companies that succeeded recognized the value of blending external tools with tailored developments to address specific operational challenges.
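Neither company has published its routing internals, but the proxy pattern Expedia describes is straightforward to sketch. The Python snippet below is a minimal, hypothetical illustration (the Vendor and LLMProxy names are invented here, not Expedia's actual API): it keeps a per-task registry of vendor models sorted by cost and falls back to the next option when a call fails.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Vendor:
    name: str
    cost_per_1m_tokens: float          # USD; illustrative pricing only
    call: Callable[[str], str]         # sends a prompt, returns a completion

class LLMProxy:
    """Hypothetical multi-vendor proxy: route each task type to the cheapest
    capable vendor, falling back to pricier options if a call fails."""

    def __init__(self) -> None:
        self.registry: dict[str, list[Vendor]] = {}

    def register(self, task: str, vendor: Vendor) -> None:
        self.registry.setdefault(task, []).append(vendor)
        self.registry[task].sort(key=lambda v: v.cost_per_1m_tokens)

    def complete(self, task: str, prompt: str) -> str:
        for vendor in self.registry.get(task, []):
            try:
                return vendor.call(prompt)
            except Exception:
                continue                # try the next-cheapest vendor
        raise RuntimeError(f"no vendor available for task {task!r}")

# Stub callables stand in for real SDK calls (OpenAI, Bedrock, Vertex, etc.).
proxy = LLMProxy()
proxy.register("summarize", Vendor("small-model", 0.15, lambda p: "[small] ok"))
proxy.register("summarize", Vendor("frontier-model", 2.50, lambda p: "[frontier] ok"))
print(proxy.complete("summarize", "Customer call transcript ..."))
```

The key design point is that callers name a task, not a vendor, which is what prevents lock-in: swapping or adding a model becomes a registry change rather than an application change.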
Operational efficiency for core business functions

Both Wayfair and Expedia demonstrate that the real power of LLMs lies in targeted applications that deliver measurable impact. Wayfair uses generative AI to enrich its product catalog, enhancing metadata autonomously and accurately. This not only streamlines workflows but also improves search and customer recommendations.

Tan highlights another transformative application: leveraging LLMs to analyze outdated database structures. With the original system designers no longer available, gen AI enables Wayfair to mitigate technical debt and uncover new efficiencies in legacy systems.

Expedia has found success integrating gen AI across customer service and developer workflows. Naidu shares that a custom gen AI tool designed for call summarization ensures that "90% of travelers can get to an agent within 30 seconds," contributing to a significant improvement in customer satisfaction. Additionally, GitHub Copilot has been deployed enterprise-wide, accelerating code generation and debugging. These operational gains underscore the importance of aligning gen AI capabilities with clear, high-value business use cases.

The role of hardware in gen AI

The hardware considerations of scaling LLMs are often overlooked, but they play a crucial role in long-term sustainability. Both Wayfair and Expedia currently rely on cloud infrastructure to manage their gen AI workloads. Tan notes that Wayfair continues to assess the scalability of cloud providers like Google, while keeping an eye on the potential need for localized infrastructure to handle real-time applications more efficiently.

Expedia's approach also emphasizes flexibility. Hosted primarily on AWS, the company employs a proxy layer to dynamically route tasks to the most appropriate compute environment. This system balances performance with cost efficiency, ensuring that inference costs don't spiral out of control. Naidu highlights the importance of this adaptability as enterprise gen AI applications grow more complex and demand more processing power.

This focus on infrastructure reflects broader trends in enterprise computing, reminiscent of the shift from monolithic data centers to microservices architectures. As companies like Wayfair and Expedia scale their LLM capabilities, they showcase the importance of balancing cloud scalability with emerging options like edge computing and custom chips.

Training, governance and change management

Deploying LLMs isn't just a technological challenge; it's a cultural one. Both Wayfair and Expedia emphasize the importance of fostering organizational readiness to adopt and integrate gen AI tools. At Wayfair, comprehensive training ensures employees across departments can adapt to new workflows, especially in areas like customer service, where AI-generated responses require human oversight to match the company's voice and tone.

Expedia has taken governance a step further by establishing a Responsible AI Council to oversee all major gen AI-related decisions. This council ensures that deployments align with ethical guidelines and business objectives, fostering trust across the organization. Naidu underscores the significance of rethinking metrics to measure gen AI's effectiveness: Traditional KPIs often fall short, prompting Expedia to adopt precision and recall metrics that better align with business goals.
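The article doesn't detail how Expedia computes those metrics, but precision and recall are standard quantities, and a minimal sketch makes the idea concrete. Assume each AI output (say, a call summary) is judged correct or not against human review; the labels below are invented for illustration.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Precision: of the outputs the system marked good, how many really were.
    Recall: of the outputs that really were good, how many the system caught."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: automated judgments vs. human review on six call summaries.
pred = [True, True, False, True, False, False]
gold = [True, False, False, True, True, False]
print(precision_recall(pred, gold))   # -> (0.666..., 0.666...)
```

Unlike a single accuracy number, the pair exposes the trade-off that matters operationally: high precision with low recall means the tool is trustworthy but underused, while the reverse means it is prolific but needs heavy human review.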
These cultural adaptations are critical to gen AI's long-term success in enterprise settings. Technology alone cannot drive transformation; it requires a workforce equipped to leverage gen AI's capabilities and a governance structure that ensures responsible implementation.

Lessons for scaling success

The experiences of Wayfair and Expedia offer valuable lessons for any organization looking to scale LLMs effectively. Both companies demonstrate that success hinges on identifying clear business use cases, maintaining flexibility in technology choices and fostering a culture of adaptation. Their hybrid approaches provide a model for balancing innovation with efficiency, ensuring that gen AI investments deliver tangible results.

What makes scaling AI in 2025 an unprecedented challenge is the pace of technological and cultural change. The hybrid strategies, flexible infrastructures and strong data cultures that define successful AI deployments today will lay the groundwork for the next wave of innovation. Enterprises that build these foundations now won't just scale AI; they'll scale resilience, adaptability and competitive advantage.

Looking ahead, the challenges of inference costs, real-time capabilities and evolving infrastructure needs will continue to shape the enterprise gen AI landscape. As Naidu puts it: "Gen AI and LLMs are going to be a long-term investment for us and it has differentiated us in the travel space. We have to be mindful that this will require some conscious investment prioritization and understanding of use cases."
[2]
4 bold AI predictions for 2025
As we wrap up 2024, we can look back and acknowledge that artificial intelligence has made impressive, groundbreaking advances. At the current pace, predicting what kind of surprises 2025 has in store for AI is virtually impossible. But several trends paint a compelling picture of what enterprises can expect in the coming year and how they can prepare to take full advantage.

The plummeting costs of inference

In the past year, the costs of frontier models have steadily decreased. The price per million tokens of OpenAI's top-performing large language model (LLM) has fallen by a factor of more than 200 in the past two years.

One key factor driving down the price of inference is growing competition. For many enterprise applications, most frontier models will be suitable, which makes it easy to switch from one to another and shifts the competition to pricing. Improvements in accelerator chips and specialized inference hardware are also making it possible for AI labs to provide their models at lower costs.

To take advantage of this trend, enterprises should start experimenting with the most advanced LLMs and build application prototypes around them, even if the costs are currently high. The continued reduction in model prices means that many of these applications will soon be scalable. At the same time, the models' capabilities continue to improve, which means you can do a lot more with the same budget than you could in the past year.

The rise of large reasoning models

The release of OpenAI o1 has triggered a new wave of innovation in the LLM space. The trend of letting models "think" for longer and review their answers is making it possible for them to solve reasoning problems that were impossible with single inference calls. Even though OpenAI has not released o1's details, its impressive capabilities have triggered a new race in the AI space. There are now many open-source models that replicate o1's reasoning abilities and are extending the paradigm to new fields, such as answering open-ended questions.

Advances in o1-like models, sometimes referred to as large reasoning models (LRMs), can have two important implications for the future. First, given the immense number of tokens that LRMs must generate for their answers, hardware companies will be more incentivized to create specialized AI accelerators with higher token throughput. Second, LRMs can help address one of the important bottlenecks for the next generation of language models: high-quality training data. There are already reports that OpenAI is using o1 to generate training examples for its next generation of models. We can also expect LRMs to help spawn a new generation of small, specialized models trained on synthetic data for very specific tasks.

To take advantage of these developments, enterprises should allocate time and budget to experimenting with the possible applications of frontier LRMs. They should always test the limits of frontier models, and think about what kinds of applications would become possible if the next generation of models overcame those limitations. Combined with the ongoing reduction in inference costs, LRMs can unlock many new applications in the coming year.
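OpenAI hasn't disclosed how o1 works, but the general idea of inference-time scaling (spending more tokens to get better answers) can be sketched as a generic best-of-n loop: sample several candidate answers, have a critic score each one, and keep the best. The snippet below is a toy illustration with stubbed model calls, not o1's actual mechanism.

```python
import random

def solve_with_review(question: str, generate, critique, n: int = 8) -> str:
    """Toy inference-time scaling: sample n candidate answers, score each with
    a critic, and return the highest-rated one. Raising n spends more compute
    per query in exchange for a better chance of a correct answer."""
    candidates = [generate(question) for _ in range(n)]
    return max(candidates, key=lambda ans: critique(question, ans))

# Stubs in place of real model calls, for demonstration only; in practice both
# would be LLM requests, and the critic might check the answer step by step.
generate = lambda q: f"candidate answer {random.randint(1, 100)}"
critique = lambda q, a: random.random()   # a real critic returns a quality score
print(solve_with_review("What is 17 * 24?", generate, critique))
```

Note the cost coupling: because quality is bought with extra tokens, the falling inference prices described above directly determine how large an n is affordable.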
Transformer alternatives are picking up steam

The memory and compute bottlenecks of transformers, the main deep learning architecture used in LLMs, have given rise to a field of alternative models with linear complexity. The most popular of these architectures, the state-space model (SSM), has seen many advances in the past year. Other promising models include liquid neural networks (LNNs), which use new mathematical equations to do far more with far fewer artificial neurons and compute cycles.

In the past year, researchers and AI labs have released pure SSM models as well as hybrid models that combine the strengths of transformers and linear models. Although these models have yet to perform at the level of cutting-edge transformer-based models, they are catching up fast while already being orders of magnitude faster and more efficient. If progress in the field continues, many simpler LLM applications can be offloaded to these models and run on edge devices or local servers, where enterprises can use bespoke data without sending it to third parties.

Changes to scaling laws

The scaling laws of LLMs are constantly evolving. The release of GPT-3 in 2020 proved that scaling model size would continue to deliver impressive results and enable models to perform tasks for which they were not explicitly trained. In 2022, DeepMind released the Chinchilla paper, which set a new direction in data scaling laws. Chinchilla showed that by training a model on a dataset containing several times more tokens than the model has parameters, you can continue to gain improvements. This development enabled smaller models to compete with frontier models that have hundreds of billions of parameters.

Today, there is growing concern that both of those scaling laws are nearing their limits. Reports indicate that frontier labs are experiencing diminishing returns on training larger models. At the same time, training datasets have already grown to tens of trillions of tokens, and obtaining quality data is becoming increasingly difficult and costly. Meanwhile, LRMs are promising a new vector: inference-time scaling. Where model and dataset size fail, we might be able to break new ground by letting models run more inference cycles and fix their own mistakes.

As we enter 2025, the AI landscape continues to evolve in unexpected ways, with new architectures, reasoning capabilities and economic models reshaping what's possible. For enterprises willing to experiment and adapt, these trends represent not just technological advancement, but a fundamental shift in how we can harness AI to solve real-world problems.
An in-depth look at the challenges and opportunities facing enterprises as they scale their AI operations in 2025, including the build vs. buy dilemma, emerging AI technologies, and cost considerations.
As enterprises race to adopt and scale generative AI technologies, 2025 marks a critical juncture in the evolution of artificial intelligence implementation. The focus has shifted from mere experimentation to enterprise-scale deployments, presenting both challenges and opportunities for businesses across sectors 1.
Success in scaling AI operations hinges on three key principles: identifying clear, high-value business use cases; maintaining flexibility in technology choices; and fostering a workforce and culture equipped to adapt 1.
Companies like Wayfair and Expedia are leading the way, demonstrating how hybrid approaches to large language model (LLM) adoption can transform operations and set new standards for the industry 1.
The decision to build or buy AI tools is no longer binary. Wayfair's CTO, Fiona Tan, emphasizes the importance of balancing flexibility with specificity. The company utilizes Google's Vertex AI for general applications while developing proprietary tools for niche requirements. This approach allows for cost-effective solutions that often outperform larger, more expensive models in specific tasks 1.
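The article doesn't show Wayfair's prompts, but attribute tagging is a good example of a task where a small model can shine when the output is tightly constrained. The sketch below is hypothetical (the prompt, the tag_product helper and the stubbed model call are all invented for illustration): it pins the model to a fixed JSON schema so a compact, inexpensive model can be used reliably.

```python
import json

# Hypothetical prompt: constraining the output to a fixed schema is what lets
# a small, inexpensive model handle the task reliably.
PROMPT = (
    'Extract product attributes as JSON with exactly two keys, '
    '"fabric" and "color". Respond with JSON only.\n\nProduct: {title}'
)

def tag_product(title: str, call_model) -> dict:
    """call_model is any text-completion function, e.g. a small hosted model."""
    raw = call_model(PROMPT.format(title=title))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {"fabric": None, "color": None}   # send to a human review queue

# Stubbed model call for demonstration.
fake_model = lambda prompt: '{"fabric": "velvet", "color": "emerald green"}'
print(tag_product("Emerald green velvet loveseat", fake_model))
# -> {'fabric': 'velvet', 'color': 'emerald green'}
```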
Expedia employs a multi-vendor LLM proxy layer, enabling seamless integration of various models. This strategy, as described by Rajesh Naidu, Expedia's senior vice president, allows the company to remain agile while optimizing costs and adapting to evolving business needs 1.
Both Wayfair and Expedia demonstrate the power of LLMs in targeted applications that deliver measurable impact. Wayfair uses generative AI to enrich its product catalog and analyze outdated database structures, while Expedia has integrated AI across customer service and developer workflows, significantly improving customer satisfaction and accelerating code generation 1.
The role of hardware in scaling LLMs is often overlooked but crucial for long-term sustainability. Both Wayfair and Expedia currently rely on cloud infrastructure to manage their AI workloads, with an eye on potential future needs for localized infrastructure to handle real-time applications more efficiently 1.
As we look towards 2025, several bold predictions shape the future of AI:
Plummeting Inference Costs: The cost of using frontier models is expected to continue decreasing dramatically, driven by growing competition and improvements in accelerator chips 2.
Rise of Large Reasoning Models (LRMs): Following OpenAI's o1, a new wave of models capable of solving complex reasoning problems is emerging, potentially transforming various industries 2.
Transformer Alternatives: State-space models (SSMs) and liquid neural networks (LNNs) are gaining traction as more efficient alternatives to traditional transformer architectures, potentially enabling more AI applications to run on edge devices or local servers 2.
Evolving Scaling Laws: As traditional scaling approaches reach their limits, new vectors like inference-time scaling through LRMs promise to break new ground in AI capabilities 2.
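To make the scaling-law shift concrete: the Chinchilla paper's widely cited rule of thumb is roughly 20 training tokens per model parameter, with training compute commonly approximated as 6 × parameters × tokens. A quick sketch (the constants are approximations from the research literature, not exact figures):

```python
def chinchilla_optimal(params: float) -> tuple[float, float]:
    """Approximate compute-optimal training per the Chinchilla heuristic:
    ~20 tokens per parameter, training FLOPs ~= 6 * params * tokens."""
    tokens = 20 * params
    flops = 6 * params * tokens
    return tokens, flops

for n in (7e9, 70e9, 400e9):
    tokens, flops = chinchilla_optimal(n)
    print(f"{n/1e9:.0f}B params -> ~{tokens/1e12:.1f}T tokens, ~{flops:.1e} FLOPs")
# 7B -> ~0.1T tokens; 70B -> ~1.4T tokens; 400B -> ~8.0T tokens
```

The arithmetic shows why data is the wall: a compute-optimal 400B-parameter model already wants about 8 trillion training tokens, and with datasets at tens of trillions of tokens, the supply of quality text is running thin, which is exactly what makes inference-time scaling attractive as the next lever 2.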