AI adoption and its associated costs are both rising: recent reports estimate global AI spending at $154 billion for 2023. Recent McKinsey research estimates that generative AI (e.g., large language models), AI-powered IT operations solutions (a.k.a. AIOps), and other artificial intelligence technologies could add $2.6 trillion to $4.4 trillion in annual value for the companies that use them.
However, organizations frequently underestimate the cost of deploying and operating artificial intelligence, which leads to budget overruns and disappointing AI outcomes. This AI cost analysis guide describes factors driving up costs for the primary AI solution types, discusses some trends influencing pricing in either direction, and offers advice for controlling AI spending.
Granica Crunch is a data compression service that can reduce AI data spending. Use our cloud cost savings calculator to see how Crunch can shrink your data expenses and improve the value of your AI investments.
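Numerous factors and trends contribute to AI costs, and they differ depending on the type of AI solution selected and how it’s deployed. The FinOps Foundation describes three basic AI deployment models: closed solutions from third-party vendors, open-source solutions hosted by third-party providers, and do-it-yourself (DIY) builds on cloud AI platforms.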
Different factors drive the total costs for each deployment type, and industry trends continue to influence pricing in various ways.
| AI Solution Type | Examples | Primary Cost Drivers | Pricing Trends |
| --- | --- | --- | --- |
| Third-party closed solutions | Google Vertex AI, Microsoft/OpenAI ChatGPT Enterprise, BMC Helix, PagerDuty | Application API SKU usage based on words of text or tokens in and out. | Prices are rising due to increased demand and increased internal running costs for vendors. The most recent versions of LLMs and plugins are 5-20X more expensive than older versions. |
| Open-source solutions | Lambda, Replicate, Anyscale, HuggingGPT, Kubeflow, Flyte | GPU and RAM capacity commitments and consumption. | Training costs are dropping thanks to pre-trained/pre-tuned models and software frameworks, but new CPU/GPU/TPU generations have higher unit rates than previous generations. |
| DIY cloud AI platforms | AWS Bedrock, AWS SageMaker, Azure AI, GCP Vertex, Databricks, DataRobot, Alteryx, Dataiku | GPU and RAM capacity commitments and consumption; hiring, training, and development costs. | Managed middleware services and prebuilt model templates and patterns are reducing staffing needs, waste, and overall costs. |
Third-party vendors offering closed solutions typically charge based on how many tokens (or words) are input and output for each application API SKU. Deployment and usage costs continue to trend upward, although commitment-based discounts can help.
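To make token-based pricing concrete, here is a minimal sketch of how a monthly API bill might be estimated from average token counts. The `TokenRates` class, the function name, and every rate and volume below are hypothetical placeholders, not real vendor prices; substitute the figures from your own contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Hypothetical per-1,000-token prices in USD (not real vendor rates)."""
    input_per_1k: float
    output_per_1k: float


def estimate_monthly_cost(rates, requests_per_month, avg_input_tokens, avg_output_tokens):
    """Estimate monthly API spend from average token counts per request."""
    input_cost = requests_per_month * avg_input_tokens / 1000 * rates.input_per_1k
    output_cost = requests_per_month * avg_output_tokens / 1000 * rates.output_per_1k
    return input_cost + output_cost


# Placeholder rates chosen only to keep the arithmetic easy to follow.
older_model = TokenRates(input_per_1k=0.001, output_per_1k=0.002)
newer_model = TokenRates(input_per_1k=0.01, output_per_1k=0.03)

# Same assumed workload for both: 100,000 requests, 500 tokens in, 250 tokens out.
older = estimate_monthly_cost(older_model, 100_000, 500, 250)
newer = estimate_monthly_cost(newer_model, 100_000, 500, 250)
print(f"older model: ${older:,.2f}/month")
print(f"newer model: ${newer:,.2f}/month")
```

Running the same workload through two rate cards like this is a quick way to see how much a model upgrade would change the bill before committing to it.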
Open-source AI solution providers usually charge for infrastructure GPU/RAM capacity and consumption. Prices are stabilizing due to more cost-effective training and fine-tuning using pre-existing models and templates, as well as narrow-task model capabilities (like customer service bots, recommendation engines, and IT operations monitoring), though new processor generations have higher unit rates than before.
Like open-source solution providers, many DIY cloud AI platforms charge for infrastructure GPU/RAM capacity and consumption, though the factors driving those costs up and down are different. Other DIY solutions like Databricks and Dataiku offer a mix of feature-based and consumption-based pricing.
Some of the biggest cost drivers are the time and resources required to develop, train, and maintain self-built models. New cloud resource management tools, as well as prebuilt model templates and patterns, are helping organizations control DIY AI costs.
Many organizations make the mistake of approaching AI cost optimization from a purely financial or technical perspective. It’s also important to consider factors like vendor and contract management, security and compliance, and operational monitoring and optimization.
When creating budgets, consider the total costs of deployment and operation, including details like data storage and processing, time required for in-house resources to get up to speed, and the potential for real-world models to have higher running costs than POCs. Use cloud cost management software that provides real-time alerts when models approach or exceed budget thresholds.
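As a rough illustration of the alerting logic described above, the sketch below extrapolates month-to-date spend to a month-end figure and classifies it against a budget. The function names, the 80% warning ratio, and the dollar amounts are assumptions made for the example; commercial cost management tools use richer forecasting than this linear run rate.

```python
def projected_month_end_spend(spend_to_date, day_of_month, days_in_month):
    """Linearly extrapolate month-to-date spend to a full-month figure."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")
    return spend_to_date / day_of_month * days_in_month


def budget_status(spend, monthly_budget, warn_ratio=0.8):
    """Classify spend as 'ok', 'approaching', or 'exceeded' against a budget."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    ratio = spend / monthly_budget
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= warn_ratio:
        return "approaching"
    return "ok"


# Hypothetical figures: $6,000 spent by day 10 of a 30-day month, $20,000 budget.
spend_so_far = 6_000.0
projected = projected_month_end_spend(spend_so_far, day_of_month=10, days_in_month=30)
print("actual spend:", budget_status(spend_so_far, 20_000.0))
print("projected spend:", budget_status(projected, 20_000.0))
```

Checking the projection as well as the actual figure surfaces a likely overrun while there is still time in the month to act on it.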
Organizations should choose an AI solution type that fits their budget and requires no more AI expertise than their current or planned in-house team can handle. It’s important to right-size AI workloads to prevent unnecessary GPU consumption. This can be done manually, but there’s a healthy marketplace of vendor-specific and third-party tools that can automatically right-size resources for real-time cost management. In addition, it’s crucial to manage cloud data storage costs by compressing training data and deleting unattached storage volumes, either manually or with an automated tool.
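Manual right-sizing can start from a very simple rule. The sketch below assumes you already collect fleet-wide GPU utilization samples as fractions between 0 and 1, and it suggests a GPU count that would bring peak utilization close to a target. The function name, the 70% target, and the sample values are illustrative assumptions, and a real decision would also account for memory, burst patterns, and scheduling constraints.

```python
import math


def recommend_gpu_count(current_gpus, utilization_samples, target_utilization=0.7):
    """Suggest a GPU count that would put peak utilization near the target.

    utilization_samples are fleet-wide fractions in [0, 1] observed on
    current_gpus GPUs.
    """
    if current_gpus < 1:
        raise ValueError("current_gpus must be at least 1")
    if not utilization_samples:
        raise ValueError("at least one utilization sample is required")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    peak = max(utilization_samples)
    # Round before taking the ceiling so floating-point noise cannot add a GPU.
    needed = math.ceil(round(current_gpus * peak / target_utilization, 6))
    return max(1, needed)


# Hypothetical fleet: 8 GPUs that never exceed 35% utilization.
print(recommend_gpu_count(8, [0.20, 0.35, 0.30]))
```

Sizing against the peak rather than the average is a deliberately conservative choice: it trades some savings for headroom during the busiest observed period.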
Continuously monitor current vendors for new discounts or price changes and, when possible, renegotiate the contract for a better deal. Avoid committing to too much for too long just to get those discounts, however, because long commitments can make it harder to adopt future advancements or shift strategy.
Avoid regulatory fines, reputational damage, and IP theft by protecting AI data with comprehensive data privacy policies and controls. Train all staff on writing prompts safely without inputting any sensitive or regulated information. Use PII data discovery and data masking tools, real-time input and output validation, AI firewalls, and other data privacy solutions designed for generative AI to mitigate the risk of data leaks and AI attacks.
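To show what input masking looks like at its simplest, the sketch below replaces email addresses and US Social Security number patterns in a prompt before it is sent to a model. The two regular expressions and the `mask_prompt` helper are assumptions made for illustration; they catch only these two obvious formats and are not a substitute for a dedicated PII discovery and masking tool.

```python
import re

# Deliberately simple patterns: they cover common formats only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_prompt(text):
    """Replace email addresses and SSN-like strings with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text


prompt = "Contact jane.doe@example.com, SSN 123-45-6789."
print(mask_prompt(prompt))
```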
Controlling AI costs is an ongoing process, so it’s important to monitor continuously to identify new inefficiencies or opportunities for additional savings. Establish cost and consumption thresholds and use resource utilization monitoring and cost allocation tools that send automatic alerts when those thresholds are approached or exceeded. Example metrics to monitor include percentage resource/token utilization, percentage of unused resources/tokens, and percentage of commitment-based discount waste.
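The example metrics above reduce to a few ratios. The following sketch computes them from four inputs; the function name, parameter names, and sample figures are hypothetical, and the units (GPU-hours, tokens, dollars) only need to be consistent within each pair.

```python
def utilization_metrics(provisioned, used, committed_spend, used_commitment):
    """Return utilization, unused, and commitment-waste percentages."""
    if provisioned <= 0 or committed_spend <= 0:
        raise ValueError("provisioned and committed_spend must be positive")
    return {
        "utilization_pct": 100 * used / provisioned,
        "unused_pct": 100 * (provisioned - used) / provisioned,
        "commitment_waste_pct": 100 * (committed_spend - used_commitment) / committed_spend,
    }


# Hypothetical month: 1,000 GPU-hours provisioned, 650 used;
# $10,000 committed, $8,000 of that commitment actually consumed.
metrics = utilization_metrics(1_000, 650, 10_000, 8_000)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```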
In addition, establish performance metrics for AI model accuracy, latency, and time to response to help determine when to increase (or reallocate) computational resources. Monitoring model performance helps ensure that cost-saving measures don’t negatively affect AI outcomes.
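One way to turn those performance metrics into a decision is a simple guardrail check, sketched below. It computes a nearest-rank 95th-percentile latency and compares it, along with accuracy, against limits. The function names, limits, and sample data are assumptions for illustration only.

```python
def p95_nearest_rank(values):
    """Return the 95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def performance_breaches(latencies_ms, accuracy, max_p95_ms, min_accuracy):
    """List which guardrails are breached; an empty list means all are met."""
    breaches = []
    if p95_nearest_rank(latencies_ms) > max_p95_ms:
        breaches.append("latency")
    if accuracy < min_accuracy:
        breaches.append("accuracy")
    return breaches


# Hypothetical sample: 18 fast requests and two slow ones.
latencies = [100] * 18 + [400, 900]
print(performance_breaches(latencies, accuracy=0.93, max_p95_ms=300, min_accuracy=0.90))
```

Running a check like this after each cost-saving change gives an early signal when a reduction in resources has started to hurt response times or model quality.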
The ultimate goal of AI cost analysis is to balance saving money with improving AI outcomes. The Granica Crunch data lakehouse-native compression service supports that goal by helping data and AI teams shrink the physical size of their columnar data files (such as Apache Parquet) by up to 60%, which reduces the monthly cloud storage costs for those files by the same percentage.
The resulting smaller files not only reduce at-rest costs but also reduce the cost (and time) to transfer data across cloud regions, which helps with AI-related compute scarcity, compliance, disaster recovery, and other use cases that require bulk data transfers. Even better, smaller files speed up query performance and reduce data loading time when training models, leading to faster and more cost-effective AI development.
Also, Granica Screen is a highly compute-efficient data discovery and masking solution that makes data safe for use with AI, from training to inferencing. It delivers 5-10X lower cloud infrastructure costs (with state-of-the-art discovery accuracy) than traditional tools.
You could use Screen to process the same volume of data as those other tools and enjoy the cost savings. However, AI models improve by training on more information, so a better strategy is to spend the same amount of money to protect and train on 5-10X more data.
This is how Granica drives business value and AI outcomes, rather than just reducing costs.
Explore an interactive demo to see how Granica Crunch can help you increase the ROI of your AI investment, or watch a demo of Granica Screen in action.