For a long time, the conversation around Artificial Intelligence (AI) in businesses was dominated by one question: "How much will this cost?" The expense of powerful computing, vast datasets, and specialized talent often felt like a huge barrier. However, a new trend is emerging among the leaders in AI: the focus is shifting dramatically. Instead of worrying about the price tag, top companies are now prioritizing getting their AI out the door as quickly as possible, dealing with costs later. This change is driven by the need for speed, flexibility, and the sheer capacity to handle growing demands.
The idea that AI is too expensive is being debunked by companies using it at scale. Take Wonder, a food delivery and takeout company that uses AI for everything from suggesting what you might want to eat to planning the best routes for its delivery drivers. According to its CTO, James Chen, AI adds only a few cents to the cost of a single order. That figure is rising, but it remains tiny compared with the overall cost of running the business. Wonder's real concern is compute capacity: having enough headroom in its systems to handle all the orders, especially when demand surges.
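Chen's "a few cents per order" is consistent with current per-token LLM pricing. The sketch below is a back-of-envelope estimate; the call counts, token counts, and price are illustrative assumptions, not Wonder's reported figures.

```python
# Rough per-order AI cost (all figures are illustrative assumptions).
CALLS_PER_ORDER = 4          # e.g. recommendations, search, routing, support
TOKENS_PER_CALL = 2_000      # prompt + completion tokens, averaged per call
PRICE_PER_M_TOKENS = 3.00    # blended $ per million tokens

# Total tokens per order, priced at the per-million-token rate.
cost = CALLS_PER_ORDER * TOKENS_PER_CALL * PRICE_PER_M_TOKENS / 1e6
print(f"AI cost per order: ${cost:.3f}")  # $0.024 -- a few cents
```

Even if each assumption is off by 2-3x, the result stays in the cents range, which is why capacity, not price, is the binding constraint.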
Similarly, Recursion, a company working in biotech using AI to speed up drug discovery, has found a smart way to manage their AI needs. They use a mix of their own powerful computers (on-premises) and rented computing power from cloud providers. This gives them the flexibility to run quick experiments and also handle massive training jobs. The key takeaway from companies like these is that when AI is used in real-world, large-scale applications, the main challenges aren't about paying for it, but about how fast they can get it running and keep it running smoothly.
Wonder's journey illustrates a common realization. The company initially built its systems assuming it would always have effectively unlimited computing power on tap, which let it move at lightning speed. That assumption broke down as the business grew rapidly: Wonder began receiving signals from its cloud providers that capacity for compute (CPUs) and data storage was running short. This pushed the company to expand into new geographic regions sooner than expected, a move that, while good practice for reliability, came earlier than planned.
This experience highlights that building for unexpected growth is crucial. The demand for AI services can skyrocket, and companies need the infrastructure to keep up. This isn't just about having enough servers; it's about ensuring these systems can handle complex tasks without slowing down.
Recursion's approach offers another perspective on flexibility. Their CTO, Ben Mabey, explains that when they first started building their AI systems, cloud providers didn't offer many suitable options for their specific needs, especially for the large-scale training required in biotech. So, they invested in their own hardware. This "vindication moment," as Mabey calls it, came when they needed even more computing power and found that cloud providers couldn't deliver it quickly enough. They now use a combination of their own powerful computing clusters, including Nvidia's latest GPUs, and cloud services. This hybrid model allows them to run massive, data-intensive training jobs on their own hardware for significant cost savings, while using the cloud for shorter, less demanding tasks.
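The hybrid split Mabey describes, with long data-intensive training on owned hardware and shorter tasks in the cloud, can be sketched as a simple job-placement heuristic. The function and thresholds below are hypothetical illustrations, not Recursion's actual scheduling policy.

```python
def place_job(est_gpu_hours: float, dataset_tb: float,
              on_prem_queue_hours: float) -> str:
    """Decide where a training job runs (illustrative heuristic).

    est_gpu_hours       -- estimated total GPU-hours for the job
    dataset_tb          -- training data size in terabytes
    on_prem_queue_hours -- current wait time on the owned cluster
    """
    LONG_JOB = 500   # GPU-hours above which on-prem savings dominate
    BIG_DATA = 5.0   # TB above which moving data to the cloud is costly

    if est_gpu_hours >= LONG_JOB or dataset_tb >= BIG_DATA:
        # Massive, data-intensive training: owned cluster is far cheaper.
        return "on_prem"
    if on_prem_queue_hours > est_gpu_hours:
        # Short job that would wait longer than it runs: burst to cloud.
        return "cloud"
    return "on_prem"
```

The design point is that the decision is driven by job shape (duration and data gravity), not by which environment is nominally cheaper per hour.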
The fact that Recursion's older gaming GPUs are still useful today also challenges the myth that AI hardware becomes obsolete quickly. This adaptability in infrastructure is key to staying ahead.
Mabey provides a clear comparison: running large AI training jobs on-premises can be up to 10 times cheaper than using the cloud, and over a five-year period, it can be half the cost. However, for smaller storage needs or less demanding tasks, the cloud can be very cost-competitive. This dual approach—using on-premise for heavy lifting and the cloud for agility—is becoming a smart strategy for many organizations.
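Mabey's numbers (up to 10x cheaper per large training run, roughly half the total cost over five years) imply a simple break-even model: a large upfront hardware purchase amortized against a much lower hourly operating rate. The dollar figures below are illustrative assumptions chosen to match those ratios, not Recursion's actual costs.

```python
def cumulative_cost(years, upfront, hourly, gpu_hours_per_year):
    """Total cost of ownership after `years` at a given hourly rate."""
    return upfront + hourly * gpu_hours_per_year * years

# Illustrative rates (assumptions, not reported figures):
CLOUD_HOURLY = 4.00          # $/GPU-hour on demand
ONPREM_HOURLY = 0.40         # $/GPU-hour power + ops (~10x cheaper per run)
ONPREM_UPFRONT = 800_000     # cluster purchase
HOURS_PER_YEAR = 100_000     # sustained GPU-hours across the fleet

for y in (1, 3, 5):
    cloud = cumulative_cost(y, 0, CLOUD_HOURLY, HOURS_PER_YEAR)
    onprem = cumulative_cost(y, ONPREM_UPFRONT, ONPREM_HOURLY, HOURS_PER_YEAR)
    print(f"year {y}: cloud ${cloud:,.0f} vs on-prem ${onprem:,.0f}")
```

With these assumptions, cloud is cheaper in year one, on-prem breaks even after about two years of sustained use, and by year five on-prem totals half the cloud bill. The model also shows why the split depends on utilization: at low usage, the upfront cost never pays back, which is exactly where the cloud stays competitive.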
This insight is critical for businesses: building cost-effective AI often requires a long-term commitment. Mabey points to a psychological barrier he has observed among peers who avoid investing heavily in their own compute and end up paying on-demand rates indefinitely. That fear of "burning money" can stifle innovation, as teams ration their AI use to keep cloud bills down.
While raw compute cost is becoming less of a constraint, other economic factors are coming to the forefront. For Wonder, the challenge lies in budgeting for the unknown. When new, powerful AI models are released, companies feel compelled to use them to stay competitive, even if their exact cost isn't fully understood. This makes budgeting more of an "art than a science."
A significant hidden cost is the need to repeatedly send context to AI models. Chen explains that over half, and sometimes up to 80%, of the cost can come from re-sending the same information with every request to large language models. While efficiency should theoretically reduce costs per transaction, the desire to explore new AI applications means costs can still climb unpredictably.
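The effect Chen describes follows directly from how LLM APIs bill: every request is priced per token, and the full conversation context is sent, and billed, again on each turn. A rough per-turn estimate shows how re-sent context can reach the 80% share he cites. The token prices and counts below are illustrative assumptions, not any provider's actual rates.

```python
def context_share(context_tokens, new_input_tokens, output_tokens,
                  in_price_per_m=3.0, out_price_per_m=15.0):
    """Fraction of a request's cost spent re-sending prior context.

    Prices are $ per million tokens; defaults are illustrative assumptions.
    """
    ctx_cost = context_tokens * in_price_per_m / 1e6
    new_cost = new_input_tokens * in_price_per_m / 1e6
    out_cost = output_tokens * out_price_per_m / 1e6
    return ctx_cost / (ctx_cost + new_cost + out_cost)

# A turn later in a session: 10k tokens of history, 500 new in, 400 out.
share = context_share(10_000, 500, 400)
print(f"{share:.0%} of this turn's cost is re-sent context")  # 80%
```

Because the context grows with every turn while the new input stays small, this share climbs over a session, which is why the same workload can cost far more than its "useful" token count suggests.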
Wonder also envisions a future with personalized AI agents for each user, which would require creating many small, highly specialized "micro-models." While this offers incredible customization, the current cost of building and maintaining such models for millions of individuals is simply not feasible. This highlights that while we are moving beyond basic compute costs, the economics of advanced AI, like hyper-personalization, still present significant challenges.
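The infeasibility of per-user micro-models is ultimately arithmetic: even a small per-model cost, multiplied across millions of users, dwarfs a typical AI budget. The figures below are illustrative assumptions, not Wonder's numbers.

```python
# Back-of-envelope cost of one personalized micro-model per user.
# All figures are illustrative assumptions.
USERS = 5_000_000
TRAIN_COST = 2.00        # $ to fine-tune one small model
MONTHLY_UPKEEP = 0.50    # $ per model for storage and periodic refreshes

one_time = USERS * TRAIN_COST
annual = USERS * MONTHLY_UPKEEP * 12
print(f"one-time training: ${one_time:,.0f}")  # $10,000,000
print(f"yearly upkeep:     ${annual:,.0f}")    # $30,000,000
```

Even at two dollars per model, the recurring upkeep dominates, which is why hyper-personalization waits on either much cheaper fine-tuning or techniques that avoid a separate model per user.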
This shift from cost-consciousness to deployment-first signals a maturing AI landscape, with implications for businesses and for society alike.
For businesses, the new reality demands a strategic re-evaluation: prioritize speed of deployment, secure compute capacity ahead of demand, and weigh long-term hybrid infrastructure investments against open-ended pay-as-you-go cloud bills.
For society, this means we can expect to see AI's impact grow more rapidly across various sectors—from healthcare and finance to entertainment and everyday services. The focus on deployment means AI will become more integrated into our lives, potentially leading to more personalized experiences, faster services, and groundbreaking scientific discoveries. However, it also underscores the importance of responsible development and deployment to ensure these advancements benefit everyone.
The message is clear: the race for AI dominance is no longer about who can spend the least, but about who can build and deploy most effectively and rapidly. The companies that master speed, flexibility, and capacity will be the ones shaping the future of AI and its impact on our world.