AI Cost Analysis Guide

AI adoption and its associated costs are rising quickly: recent reports estimate global AI spending at $154 billion for 2023. Recent McKinsey research estimates that companies using generative AI (e.g., large language models), AI-powered IT operations solutions (a.k.a. AIOps), and other artificial intelligence technologies could add $2.6 trillion to $4.4 trillion in value annually.

However, organizations frequently underestimate the cost of deploying and operating artificial intelligence, which leads to budget overruns and disappointing AI outcomes. This AI cost analysis guide describes factors driving up costs for the primary AI solution types, discusses some trends influencing pricing in either direction, and offers advice for controlling AI spending.

Granica Crunch is a data compression service that can reduce AI data spending. Use our cloud cost savings calculator to see how Crunch can shrink your data expenses and improve the value of your AI investments.

AI cost analysis and guidance

While numerous factors and trends contribute to AI costs, they differ depending on the type of AI solution selected and how it’s deployed. The FinOps Foundation describes three basic AI deployment models:

Third-party vendors closed source services. Packaged, fully managed, turnkey AI products like Google Vertex AI and Microsoft/OpenAI’s ChatGPT Enterprise are easy to implement and generally deliver high-quality models, but they carry potential privacy risks and higher overall costs than open-source solutions. These solutions may run in the vendor’s cloud or the customer’s public cloud. This category includes many SaaS solutions with embedded AI, such as BMC Helix and PagerDuty.
Third-party hosted open-source services. Platforms like Lambda, Replicate, Anyscale, and HuggingGPT offer customizable, prebuilt, and fine-tuned models that provide greater privacy and security control but require more expertise and take longer to generate ROI. These solutions often run in the customer’s public cloud environment (e.g., AWS). There are also open-source AIOps and MLOps products, including Kubeflow and Flyte.
DIY on cloud providers AI-centric services/systems. Cloud service providers (CSPs) like AWS, Azure, and GCP offer AI infrastructure and tools for organizations to build their own AI models, providing full control over privacy, security, compliance, and cost management but requiring significant expertise and development time. Some companies opt to use solutions like Databricks, DataRobot, Alteryx, and Dataiku to build their own models within their public cloud rather than using the tools provided by the CSP.

Different factors drive the total costs for each deployment type, and industry trends continue to influence pricing in various ways.

AI cost drivers & pricing trends

AI Solution Type	Examples	Primary Cost Drivers	Pricing Trends
Third-party vendors closed source services	Google Vertex AI, Microsoft/OpenAI ChatGPT Enterprise, BMC Helix, PagerDuty	Application API SKU usage based on words of text or tokens in and out.	Prices are rising due to increased demand and increased internal running costs for vendors. The most recent versions of LLMs and plugins are 5-20X more expensive than older versions.
Third-party hosted open-source services	Lambda, Replicate, Anyscale, HuggingGPT, Kubeflow, Flyte	GPU and RAM capacity commitments and consumption.	Training costs are dropping thanks to pre-trained/pre-tuned models and software frameworks, but new CPU/GPU/TPU generations have higher unit rates compared to previous generations.
DIY on cloud providers AI-centric services/systems	AWS Bedrock, AWS SageMaker, Azure AI, GCP Vertex, Databricks, DataRobot, Alteryx, Dataiku	GPU and RAM capacity commitments and consumption; hiring, training, and development costs.	Managed middleware services and prebuilt model templates & patterns are reducing manpower, waste, and overall costs.

Third-party vendors closed source services

Third-party vendors offering closed solutions typically charge based on how many tokens (or words) are input and output for each application API SKU. Deployment and usage costs continue to trend upward, although commitment-based discounts can help.

AI cost drivers:

Pricing is based primarily on how many tokens or words are input and output
Dedicated capacity and longer commitment terms can provide discount opportunities
Organizations need fewer in-house AI experts, reducing operational costs

Pricing trends:

Increasing demand and running costs continue to drive up prices for vendors, who are passing the increases along to customers
Recent LLM and plugin versions cost 5-20X more than previous versions
Many providers, including OpenAI, now require pre-approval for requests with large usage and capacity requirements
Some vendors like Microsoft are updating list prices to offer better time-based discounts
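
To make the token-based pricing model concrete, the following Python sketch estimates a monthly bill from request volume and average token counts. The per-1,000-token rates, the request volume, and the flat commitment discount are hypothetical placeholders, not any vendor's actual prices; substitute the figures from your own contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical per-1,000-token rates in USD (not a real price list)."""
    input_per_1k: float
    output_per_1k: float


def monthly_token_cost(
    pricing: TokenPricing,
    requests_per_month: int,
    avg_input_tokens: float,
    avg_output_tokens: float,
    commitment_discount: float = 0.0,
) -> float:
    """Estimate monthly spend for a token-priced API.

    commitment_discount is an assumed flat fraction (0.2 means 20% off)
    standing in for whatever dedicated-capacity or term discount applies.
    """
    if not 0.0 <= commitment_discount < 1.0:
        raise ValueError("commitment_discount must be in [0, 1)")
    input_cost = requests_per_month * avg_input_tokens / 1000 * pricing.input_per_1k
    output_cost = requests_per_month * avg_output_tokens / 1000 * pricing.output_per_1k
    return (input_cost + output_cost) * (1.0 - commitment_discount)


# Illustrative numbers only: compare an older and a newer model tier.
older_model = TokenPricing(input_per_1k=0.001, output_per_1k=0.002)
newer_model = TokenPricing(input_per_1k=0.01, output_per_1k=0.03)

for label, pricing in (("older", older_model), ("newer", newer_model)):
    cost = monthly_token_cost(pricing, 100_000, 500, 200)
    print(f"{label} model: ${cost:,.2f} per month")
```

Because input and output tokens are usually priced separately, trimming prompts and capping response length affect the bill independently, which is why the sketch keeps the two terms apart.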

Third-party hosted open-source services

Open-source AI solution providers usually charge for infrastructure GPU/RAM capacity and consumption. Prices are stabilizing due to more cost-effective training and fine-tuning using pre-existing models and templates, as well as narrow-task model capabilities (like customer service bots, recommendation engines, and IT operations monitoring), though new processor generations have higher unit rates than before.

AI cost drivers:

Primary cost driver is infrastructure GPU/RAM capacity and consumption
Requires more in-house technical skills to customize and operate
Organizations must manage computational resources internally
Vendors charge added fees for model training, data processing/storage, and additional SKUs

Pricing trends:

Vendors are training, fine-tuning, and building proofs-of-concept (POCs) more cost-effectively, passing savings on to the customer
Providers focusing on narrow-task AI user stories are charging less than bigger vendors
Infrastructure unit prices are stabilizing for small and medium deployments
New CPU/GPU/TPU generations have higher per-unit rates than previous generations

DIY on cloud providers AI-centric services/systems

Like hosted open-source services, many DIY cloud AI platforms charge for infrastructure GPU/RAM capacity and consumption, though the factors driving those costs up and down are different. Other DIY solutions like Databricks and Dataiku offer a mix of feature-based and consumption-based pricing.

Some of the biggest cost drivers are the time and resources required to develop, train, and maintain self-built models. New cloud resource management tools, as well as prebuilt model templates and patterns, are helping organizations control DIY AI costs.

AI cost drivers:

Infrastructure GPU/RAM capacity and consumption
Depending on the platform, organizations may have to manage some aspects of AI model performance, security, data, and computational resources themselves
Requires significant in-house AI expertise and other technical resources
Organizations monitor and manage SKUs for hardware, software, modeling, training, APIs, and other licenses

Pricing trends:

Tools like Tanzu CloudHealth, Granica Crunch, and NetApp are helping organizations optimize compute resources, data management, and other cloud costs
Prebuilt model templates and patterns, as well as partnerships/integrations, are streamlining model deployment, reducing effort, waste, and overall costs
Reserved Instances, Savings Plans, and Committed Use Discounts help organizations with forecasting and provide some cost savings
Spot Instances and batch processing also help decrease unit costs and total costs
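
The trade-off behind commitment-based discounts can be sketched with simple arithmetic. The example below assumes a simplified billing model in which committed GPU hours are paid for whether or not they are used, and any usage beyond the commitment is billed at the on-demand rate. The hourly rates are hypothetical, and real Reserved Instance, Savings Plan, and Committed Use Discount terms vary by provider.

```python
def monthly_compute_cost(
    used_hours: float,
    committed_hours: float,
    on_demand_rate: float,
    committed_rate: float,
) -> float:
    """Monthly cost under the simplified model described above.

    Committed hours are always billed at committed_rate; hours used beyond
    the commitment are billed at on_demand_rate.
    """
    if min(used_hours, committed_hours, on_demand_rate, committed_rate) < 0:
        raise ValueError("all inputs must be non-negative")
    overflow_hours = max(0.0, used_hours - committed_hours)
    return committed_hours * committed_rate + overflow_hours * on_demand_rate


def break_even_hours(
    committed_hours: float, on_demand_rate: float, committed_rate: float
) -> float:
    """Usage level at which the commitment costs the same as pure on-demand."""
    if on_demand_rate <= 0:
        raise ValueError("on_demand_rate must be positive")
    return committed_hours * committed_rate / on_demand_rate


# Hypothetical rates: $4.00/hour on demand, $2.80/hour committed.
for used in (200, 500):
    with_commitment = monthly_compute_cost(used, 400, 4.0, 2.8)
    on_demand_only = monthly_compute_cost(used, 0, 4.0, 2.8)
    print(f"{used} h used: committed ${with_commitment:,.2f}, "
          f"on-demand ${on_demand_only:,.2f}")
```

Under these assumptions a 400-hour commitment only pays off once actual usage passes the break-even point, so a commitment sized well above realistic demand can cost more than paying on demand.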

How to control AI spending

Many organizations make the mistake of approaching AI cost optimization from a purely financial or technical perspective. It’s also important to consider factors like vendor and contract management, security and compliance, and operational monitoring and optimization.

Ensure adequate budget constraints and forecasting

When creating budgets, consider the total costs of deployment and operation, including details like data storage and processing, time required for in-house resources to get up to speed, and the potential for real-world models to have higher running costs than POCs. Use cloud cost management software that provides real-time alerts when models approach or exceed budget thresholds.
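
As an illustration of the kind of threshold check such software performs, here is a minimal Python sketch. It is not any product's actual logic: the 80% warning ratio and the straight-line projection of month-end spend are assumptions chosen for simplicity.

```python
def budget_status(
    spend_to_date: float,
    monthly_budget: float,
    day_of_month: int,
    days_in_month: int,
    warn_ratio: float = 0.8,
) -> tuple[str, float]:
    """Classify spend against a budget and project month-end spend.

    Returns (status, projected_spend), where status is "ok", "warning",
    or "exceeded". The projection assumes spend continues at the average
    daily rate seen so far, which is a deliberate simplification.
    """
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")

    projected = spend_to_date / day_of_month * days_in_month

    if spend_to_date >= monthly_budget:
        status = "exceeded"
    elif spend_to_date >= warn_ratio * monthly_budget or projected >= monthly_budget:
        status = "warning"
    else:
        status = "ok"
    return status, projected


# Hypothetical figures: $4,000 spent by day 10 of a 30-day month.
status, projected = budget_status(4_000, 10_000, 10, 30)
print(f"status={status}, projected month-end spend=${projected:,.2f}")
```

Warning on the projection as well as on actual spend gives teams time to react before the budget is breached, which matters when a production model runs hotter than its POC.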

Optimize technology and infrastructure costs

Organizations should choose an AI solution type that fits their budget and does not require more AI expertise than their current or planned in-house team can handle. It’s important to right-size AI workloads to prevent unnecessary GPU consumption. This can be done manually, but there’s a healthy marketplace of vendor-specific and third-party tools that can automatically right-size resources for real-time cost management. In addition, it’s crucial to manage cloud data storage costs by compressing training data and deleting unattached storage volumes, either manually or with an automated tool.
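
A manual right-sizing pass can be as simple as picking the cheapest instance that still fits the workload's observed peak. The Python sketch below assumes GPU memory is the only sizing constraint and uses a made-up instance catalog; real right-sizing tools also weigh compute throughput, interconnect, and availability.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    """Hypothetical catalog entry; names and rates are illustrative only."""
    name: str
    gpu_memory_gb: float
    hourly_rate: float


def right_size(
    candidates: Sequence[InstanceType],
    peak_gpu_memory_gb: float,
    headroom: float = 0.2,
) -> Optional[InstanceType]:
    """Return the cheapest instance whose GPU memory covers peak usage
    plus a safety headroom, or None if nothing in the catalog fits."""
    required = peak_gpu_memory_gb * (1.0 + headroom)
    fitting = [c for c in candidates if c.gpu_memory_gb >= required]
    if not fitting:
        return None
    return min(fitting, key=lambda c: c.hourly_rate)


catalog = [
    InstanceType("small", gpu_memory_gb=16, hourly_rate=1.0),
    InstanceType("medium", gpu_memory_gb=24, hourly_rate=1.6),
    InstanceType("large", gpu_memory_gb=80, hourly_rate=4.0),
]

choice = right_size(catalog, peak_gpu_memory_gb=18)
print(choice.name if choice else "no instance fits")
```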

Manage the vendor relationship

Continuously monitor current vendors for new discounts or price changes and, when possible, renegotiate the contract for a better deal. However, it’s important to avoid committing to too much for too long just to get those discounts, because long commitments can make it harder to adopt future advancements or shift strategy.

Prevent privacy, security, and compliance issues

Avoid regulatory fines, reputational damage, and IP theft by protecting AI data with comprehensive data privacy policies and controls. Train all staff on writing prompts safely without inputting any sensitive or regulated information. Use PII data discovery and data masking tools, real-time input and output validation, AI firewalls, and other data privacy solutions designed for generative AI to mitigate the risk of data leaks and AI attacks.

Continuously monitor and optimize

Controlling AI costs is an ongoing process, so it’s important to monitor continuously to identify new inefficiencies or opportunities for additional savings. Establish cost and consumption thresholds and use resource utilization monitoring and cost allocation tools that send automatic alerts when those thresholds are approached or exceeded. Example metrics to monitor include percentage resource/token utilization, percentage of unused resources/tokens, and percentage of commitment-based discount waste.
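
The example metrics above reduce to a few ratios. The following Python sketch shows one way to compute them; the function name, input units, and sample figures are hypothetical, and the inputs could be GPU hours, tokens, or any other metered resource.

```python
def utilization_metrics(
    provisioned: float,
    used: float,
    committed: float,
    committed_used: float,
) -> dict[str, float]:
    """Compute utilization, unused share, and commitment waste as percentages.

    provisioned / used: capacity paid for versus capacity actually consumed.
    committed / committed_used: capacity covered by a commitment-based
    discount versus how much of that commitment was actually consumed.
    """
    if provisioned <= 0 or committed <= 0:
        raise ValueError("provisioned and committed must be positive")
    if not 0 <= used <= provisioned or not 0 <= committed_used <= committed:
        raise ValueError("used amounts must lie between 0 and the total")

    utilization_pct = 100.0 * used / provisioned
    return {
        "utilization_pct": utilization_pct,
        "unused_pct": 100.0 - utilization_pct,
        "commitment_waste_pct": 100.0 * (committed - committed_used) / committed,
    }


# Hypothetical month: 1,000 GPU hours provisioned, 650 used;
# 800 hours committed, 600 of them consumed.
for name, value in utilization_metrics(1_000, 650, 800, 600).items():
    print(f"{name}: {value:.1f}%")
```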

In addition, establish performance metrics for AI model accuracy, latency, and time to response to help determine when to increase (or reallocate) computational resources. Monitoring model performance helps ensure that cost-saving measures don’t negatively affect AI outcomes.

Managing AI costs with Granica

The ultimate goal of AI cost analysis is to achieve a balance between saving money and improving AI outcomes. Deploying the Granica Crunch data lakehouse-native compression service supports that goal by helping data and AI teams shrink the physical size of their columnar data files (such as Apache Parquet) by up to 60%, thus reducing their monthly cloud storage costs by the same percentage.
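
The storage arithmetic is straightforward: when storage is priced linearly per terabyte, shrinking files by a given fraction shrinks the at-rest bill by the same fraction. The Python sketch below works through it using the 60% upper bound quoted above; the data volume and the per-terabyte price are hypothetical, and actual reduction depends on the data.

```python
def monthly_storage_savings(
    stored_tb: float,
    price_per_tb_month: float,
    size_reduction: float,
) -> tuple[float, float, float]:
    """Return (cost_before, cost_after, savings) in the price's currency.

    size_reduction is the fraction by which file size shrinks (0.6 = 60%).
    Assumes linear per-TB pricing with no tiering or minimum charges.
    """
    if stored_tb < 0 or price_per_tb_month < 0:
        raise ValueError("volume and price must be non-negative")
    if not 0.0 <= size_reduction < 1.0:
        raise ValueError("size_reduction must be in [0, 1)")
    cost_before = stored_tb * price_per_tb_month
    cost_after = cost_before * (1.0 - size_reduction)
    return cost_before, cost_after, cost_before - cost_after


# Hypothetical: 500 TB of columnar data at $23 per TB-month, best-case 60% reduction.
before, after, saved = monthly_storage_savings(500, 23.0, 0.6)
print(f"before ${before:,.2f}, after ${after:,.2f}, saved ${saved:,.2f} per month")
```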

The resulting smaller physical files not only reduce at-rest costs, they also reduce the cost (and time) to transfer data across cloud regions, addressing AI-related compute scarcity, compliance, disaster recovery, and other use cases requiring bulk data transfers. Even better, smaller files speed query performance and reduce the data loading time when training models, leading to faster and more cost-effective AI development.

Also, Granica Screen is a highly compute-efficient data discovery and masking solution that makes data safe for use with AI, from training to inferencing. It delivers 5-10X lower cloud infrastructure costs (with state-of-the-art discovery accuracy) than traditional tools.

You could use Screen to process the same volume of data as those other tools and enjoy the cost savings. However, AI models improve by training on more information, so a better strategy is to spend the same amount of money to protect and train on 5-10X more data.

This is how Granica drives business value and AI outcomes, rather than just reducing costs.

Explore an interactive demo to see how Granica Crunch can help you increase the ROI of your AI investment, or watch a demo of Granica Screen in action.
