Rushing into genAI? Prepare for budget blowouts and broken promises


Siobhan Kwiecien, director of HR Process Enablement at pharma and biotech company Thermo Fisher Scientific, has spent the past year working to embed generative AI (genAI) into her company’s internal chatbots. “And that’s been quite a road,” she said.

The consumption and pricing models for the technology were “all over the place,” and promises of “robust” AI solutions fell short, she said.

“So, we spent all this time going down this road, and then all of a sudden it’s like, hold on, that’s not doing exactly what we wanted to do,” Kwiecien said. “And I think that’s just the nature of where we are with some of these generative AI solutions.”

Kwiecien’s experience is common. While more than 80% of enterprises will leverage or deploy genAI by 2026, most if not all will run into staggering cost overruns and failed expectations. Gartner projects $644 billion in genAI-related spending for 2025, with cost management posing a key challenge. Handling that requires smart financial planning; IDC expects 75% of organizations to integrate cost containment measures into genAI projects by 2027.

Beyond managing unexpected costs, organizations face other challenges with deployments, including data risks, model inaccuracies, integration hurdles, talent shortages, ethical concerns, and high infrastructure demands. Unclear ROI, weak governance, and vendor lock-in further complicate the picture.

Fear of missing out leads to poor planning

The cost of genAI deployments can quickly leap into the millions of dollars, and in some rare cases past $1 billion — catching organizations completely off guard. While the technology can deliver value, without strong planning, governance, and cost controls it risks becoming an unpredictable investment as fees, training, integration, and infrastructure demands add up.

Additionally, data cleaning, labeling, and management are resource-intensive processes, yet essential. Without high-quality, reliable data, genAI models suffer due to a well-known principle: garbage in, garbage out.

“Especially in HR, the risk of hallucinations is tricky,” Kwiecien said. “When you have someone asking about a benefits policy, you don’t want it hallucinating on that and making people think that they’re getting something that they’re not.”

Training large language models (LLMs) — especially custom ones — is computationally intensive and expensive. Hiring or contracting AI engineers, data scientists, and MLOps specialists is also costly.

And yet, the fear of missing out can drive many genAI pilots and push companies to embrace trendy tech like OpenAI’s ChatGPT — often under executive pressure without proper cost planning, according to Brian Greenberg, CIO of consulting firm RHR International in Chicago. Organizations often rely on oversimplified spreadsheets to track costs, missing key expenses like fine-tuning, APIs, consultants, data prep, and governance, he said.

A solid cost model should incorporate both implementation and the reality that many genAI experiments will fail — and that’s okay, Greenberg said. “The biggest cost issues happen when organizations rush in without any guardrails, a clear end in mind, or validation of where AI actually adds value,” he said. “It’s akin to buying a Lamborghini to deliver pizzas; fast and fun, but wildly mismatched.”

RHR, Greenberg said, has taken a cautious, security-first approach to genAI. An internal task force leads ongoing testing with a strong focus on privacy, ethics, and enterprise controls. Moving at the right pace helps target key workflows, avoid shadow IT, and build trust in the workforce.

“It’s not about spending less, it’s about spending wisely,” Greenberg said.

Nate Suda, a senior research director at Gartner, agreed, saying organizations are jumping the gun because “nobody wants to be left on the platform as the train is pulling away.”

Many organizations create promising pilots but face challenges scaling them enterprise-wide. Gartner predicts that within two years, 40% of agentic AI and 30% of genAI projects will be terminated due to project failure. “We say cost is one of the greatest near-term threats to AI and genAI success. We’re seeing half of organizations pull back on initiatives, quite simply, because they got the cost model wrong,” Suda said.

Initial low costs from third-party genAI tools or platforms can escalate with scaling and licensing fees based on query-and-answer pricing models.

Defend, extend or upend? Which model to choose

GenAI or agentic AI is typically deployed by companies using three strategies: “Defend,” to maintain their competitive position; “Extend,” to expand capabilities; and “Upend,” to disrupt and capture new markets through innovation.

For example, defend — the most popular framework — could involve customer service chatbots that reduce churn by improving response times. An extend strategy could be used by a telecom company, for instance, to have genAI upsell personalized plans based on usage patterns. And an upend model could be the tack taken by an AI-first legal or medical startup challenging traditional service models.

In fact, many legal leaders say now is the time to scrap per-minute billing for outcome-based pricing; that’s because time no longer holds the same value since genAI tools can vastly speed up services.

Upend AI strategies are particularly risky, Suda said. Unlike past digital disruptors such as Netflix or Uber, an Upend AI strategy isn’t just helping one company lead — it’s shaking up entire industries all at once. “So, when we say to an executive, if you want to get near-term value creation from AI, it’s often a bit like playing the lottery,” Suda said. “It’s expensive, it’s time consuming, you’re probably not going to succeed, so this is not where you want to play.”

The costs for each AI strategy can vary dramatically, with estimates sometimes exceeding the initial projections five- or 10-fold. That’s especially true with “extend” strategies, where organizations start small and notch some quick successes — then see costs spiral out of control.


“Extend is where we are seeing the most value being created, but it’s where we’re also seeing the most unanticipated costs happening,” Suda said. “CEOs will ask, ‘How much does it cost us — like 10 bucks a month per user?’ And CEOs are like, ‘Fantastic, let’s give that to everybody, and then we’re going to see what happens.’”

The costs begin rising as organizations offer up genAI access to more employees, which commonly happens once initial value is discovered.

The main costs of doing genAI business

According to Gartner, excluding data transformation costs, a defend AI strategy using tools such as Microsoft Copilot, Google Gemini or DeepSeek costs about $500 per worker, while extend models range from $250,000 to $5 million and upend frameworks from $20 million to $250 million.

Microsoft 365 Copilot, Suda said, is a prime example of a per-user, per-month or per-year pricing model — and it’s typically fixed. “So, all you’re going to pay for is the number of users times the per-user license and maybe a little bit of integration,” he said.

But those so-called “fixed costs” are deceiving. Copilot costs, for example, involve far more than just the license fee of $500 per user, per year.
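As a rough illustration of how the license line item can understate the real bill, here is a minimal sketch; only the $500-per-user, per-year fee comes from the figures above, while the headcount, integration, training, and administration numbers are assumptions made up for the example.

```python
# Minimal sketch: why a "fixed" per-user license understates total cost.
# Only the $500 per-user, per-year license fee comes from the article;
# every other figure is an assumption for illustration.

users = 500
license_per_user_per_year = 500          # cited license fee
integration_one_time = 75_000            # assumed integration/configuration work
training_per_user = 60                   # assumed upskilling cost per employee
admin_and_governance_per_year = 40_000   # assumed ongoing admin, security, support

license_cost = users * license_per_user_per_year
year_one_total = (license_cost + integration_one_time
                  + users * training_per_user
                  + admin_and_governance_per_year)

print(f"License only:   ${license_cost:,}")     # $250,000
print(f"Year-one total: ${year_one_total:,}")   # $395,000 under these assumptions
```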

Like most tech projects, genAI initiatives come with a wide range of costs. The largest are often development-related, including engineers, data scientists, AI specialists, project managers, and the need to upskill workers; the latter is something many companies are now facing.

One problem that comes with a successful project is what Suda called the “hungry like a wolf” scenario, where users love the chatbot and can’t get enough of it. “One answer is not enough per interaction. So, a user is asking four questions each time and using it four times a day, not just once,” Suda said.

Companies also need to factor in infrastructure costs, cloud hosting services — from AWS, Microsoft, IBM and others — and the cost of GPUs and data tools. Vendors such as AWS and Microsoft offer cost-estimation frameworks, which outline scenarios from proofs of concept to full-scale deployments. Gartner Research also offers a cost modeling tool.

Additionally, the FinOps Foundation offers information around how to better estimate the cost of genAI workloads; how to forecast AI services costs; the effects of optimization on genAI forecasting; and comparisons between on-prem genAI FinOps versus Cloud FinOps.

A common mistake organizations often make is focusing too much on initial build and training, while underestimating ongoing costs such as inference and operations.

Another key factor is data — what data is being used, where it lives, and how to prep it. This part is often overlooked but critical to success.

Finally, teams often underestimate the human expertise required, or they get stuck deciding whether to build in-house or buy from providers. Even those with their own hardware can run into GPU and scalability challenges.

Every AI provider also has many different models, and the price difference for each one can be enormous. “We’re talking about tens of times to hundreds of times differences in price, depending on if you use this model, which is not so good, or this one which is the latest and greatest,” Suda said.


AI vendor pricing models can be volatile, too. Most AI providers measure usage by how many words or characters go into a query, or “prompt,” and how many come back out. In other words, they charge by input and output.

Tokens – the new pricing model

Like most providers, OpenAI uses the “token” as its unit of measure. One token equals about four characters, or three-fourths of an English word, so one word works out to roughly 1.33 tokens. OpenAI charges per million tokens used.

“Folks don’t often fully understand, and they’re not used to ingest and output,” said Chris Hennesey, enterprise finance strategist at AWS. “So how do you go about creating a cost model? Because that’s, I think, something that a lot of companies either don’t do or they don’t do correctly.”

Model prices vary widely. GPT-4.1 mini costs 80 cents per million input tokens and $3.20 per million output tokens, while GPT-4.1 charges $3 per million input tokens and $12 per million output tokens. And those costs can be higher for other models.
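To see how those per-token rates translate into a bill, here is a minimal sketch using the prices cited above; the workload (queries per day, prompt and answer lengths, workdays, user count) is an assumed example, not a benchmark.

```python
# Back-of-the-envelope token cost comparison.
# Prices are the per-million-token figures cited above; the workload is assumed.

PRICES = {                      # $ per 1M tokens: (input, output)
    "gpt-4.1-mini": (0.80, 3.20),
    "gpt-4.1":      (3.00, 12.00),
}

def monthly_cost(model, queries_per_day, words_in, words_out, workdays=22):
    # ~1.33 tokens per English word, per OpenAI's rule of thumb
    tokens_in = queries_per_day * workdays * words_in * 1.33
    tokens_out = queries_per_day * workdays * words_out * 1.33
    price_in, price_out = PRICES[model]
    return (tokens_in / 1e6) * price_in + (tokens_out / 1e6) * price_out

# Assumed workload: 4 queries a day, 200-word prompts, 500-word answers, 2,000 users
for model in PRICES:
    per_user = monthly_cost(model, queries_per_day=4, words_in=200, words_out=500)
    print(f"{model}: ${per_user:.2f} per user/month, ${per_user * 2000:,.0f} for 2,000 users")
```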

Suda recommends organizations experiment with cheaper models first to see if they can handle the job just as well as the pricier ones. “The biggest way to minimize cost: You may not need 50 employees using that app. You may only need to offer it to the five who will get the real value from it,” he said.

AI cost effectiveness also depends on three factors: input/output quality, iteration count, and AI-ready data — the last being crucial if you’re starting from scratch, Suda said.

“What you’re going to find on the input and output tokens is a dramatic difference in price,” Suda said. “The build cost is quite low to get into the game, but once you begin using it, the costs go up. Say you have 400 users to start. By year four, you may have 2,000 users.”

“So, what happens? You’re consuming more, and your costs go up four times by year four,” he said.

“GenAI is not like Google, but some organizations use it like Google — you go into it and ask a question and get an answer. That doesn’t really happen,” Suda continued. “You get an answer and often think, ‘That’s not quite what I wanted.’ And so that makes them want to ask another question. That can multiply your cost quickly.”
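A simple sketch of the dynamic Suda describes: when user counts and per-user consumption grow at the same time, spend compounds rather than scales linearly. The blended per-query cost and workday count below are assumptions for illustration.

```python
# Illustrative consumption-growth model based on the scenario Suda describes:
# 400 users growing to 2,000, with more sessions per day and more questions per session.
# The blended per-query cost and workday count are assumptions.

def annual_queries(users, sessions_per_day, questions_per_session, workdays=230):
    return users * sessions_per_day * questions_per_session * workdays

cost_per_query = 0.01   # assumed blended cost per question, in dollars

year1 = annual_queries(users=400,  sessions_per_day=1, questions_per_session=1)
year4 = annual_queries(users=2000, sessions_per_day=4, questions_per_session=4)

print(f"Year 1: {year1:,} queries, ~${year1 * cost_per_query:,.0f}")
print(f"Year 4: {year4:,} queries, ~${year4 * cost_per_query:,.0f}")
print(f"Consumption growth: {year4 / year1:.0f}x")   # user growth times usage growth
```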

The hidden costs that can add up

Thermo Fisher Scientific’s Kwiecien said one cost that hasn’t been considered involves testing. “Every time you ask a question and test it, that’s a cost,” she said. “I’m not just going to load that 500 times, because that will cost me every time.

“We need to test how often AI gives good answers to common questions like ‘What’s the recruiting process?’ or ‘Where’s my 401(k) info?’” she said. “But each test costs money, so we have to balance accuracy with cost and decide how many times to test to be confident in the results.”
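As a rough sketch of that trade-off, the figures below (number of test questions, repetitions per question, and blended cost per call) are assumptions for illustration, not Thermo Fisher’s numbers.

```python
# Illustrative evaluation-budget math: every test query consumes paid tokens,
# so accuracy confidence has to be balanced against cost.
# All figures are assumptions for illustration.

test_questions = 200      # e.g., common HR questions to validate
runs_per_question = 25    # repetitions to estimate how often answers are correct
cost_per_call = 0.02      # assumed blended token cost per test query, in dollars

calls = test_questions * runs_per_question
print(f"One full evaluation pass: {calls:,} calls, ~${calls * cost_per_call:,.2f}")
# Re-running the suite after every prompt or model change multiplies this again.
```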

Thermo Fisher currently uses a virtual chatbot from ServiceNow and hopes to make it more intuitive by adding a genAI layer. As a result, it’s eyeing genAI solutions from Microsoft, IBM and others.

Another cost can come with efforts to use genAI in hiring. Amy Ritter, vice president for Talent Acquisition at Thermo Fisher, said the company implemented a genAI-powered hiring app from Phenom to automate parts of its global manufacturing hiring platform. The company then had to invest in job preview videos to show candidates what it’s like to work at Thermo Fisher — covering the environment, required PPE, and key skills — since recruiters weren’t involved early in the process.

The cost of change management is also often overlooked, Ritter said. “We invested time and money visiting sites, engaging leaders, and building buy-in, which paid off with strong adoption at launch,” she said.

Injecting Phenom’s genAI into its HR hiring platform, however, netted big returns, Ritter said. It cut candidate screening time from 16 days to just 7 minutes. Along with automating interview scheduling, Thermo Fisher is cumulatively saving more than 8,000 hours a year in candidate screening and 12,000 hours in scheduling time, and it is filling roles 10% faster, Ritter said.

And there are infrastructure costs — the cost of building out, running and maintaining server farms, including managed service, is also often underestimated, according to AWS’s Hennesey. “One insurance customer had 200 [proofs of concept] running, but couldn’t articulate the expected value — most were just experiments. Our advice: clearly define the problem, align it with organizational goals, and measure expected returns,” he said.

Moving from pilot to production can also be a soft spot for costs, as can shifting from on-prem to the cloud; the latter means new services and pricing models that need to be understood and forecast.

AWS’s Bedrock, Microsoft’s Azure AI Studio, Google Cloud’s Vertex AI, IBM’s watsonx.ai and Cohere’s platform are all fully managed service offerings that allow AI developers to build apps using top foundation models via a single API — no infrastructure management needed. “You pay on a per model, on a per region basis,” Hennesey said. “And then you have to think about tokens.”

Making a “capacity commitment” to a vendor can cut costs. Instead of buying capacity “on demand,” organizations can commit to a specific amount of LLM capacity for a set period, whether one month or six months, and save up to 60%, Hennesey said.
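A minimal sketch of that trade-off, assuming a steady workload: the 60% figure is the upper bound cited above, while the monthly spend is an assumption for illustration.

```python
# On-demand vs. capacity-commitment comparison.
# The 60% savings is the upper bound Hennesey cites; the spend figure is assumed.

on_demand_monthly = 20_000    # assumed steady on-demand model spend, in dollars
commitment_discount = 0.60    # up to 60% savings for committed capacity
commitment_months = 6

on_demand_total = on_demand_monthly * commitment_months
committed_total = on_demand_total * (1 - commitment_discount)

print(f"On demand for {commitment_months} months: ${on_demand_total:,}")
print(f"With a capacity commitment:    ${committed_total:,.0f}")
# The catch: committed capacity is paid for whether or not the workload fully uses it.
```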

The bottom line: there’s still a lot of uncertainty around the cost of genAI projects because the technology is still in its early days — and still evolving.

“I feel like we’re not getting great answers, because people are unsure how it’s going to be used,” Kwiecien said. “And so it’s hard to understand what your usage may look like in the future, because we can’t tell how long it’s going to take people to flip to that.

“How fast are we going to get the solutions to really answer the way that we want it to answer?” she said.