
Cloud Cost Management Platform

Compact data is AI-ready data

Granica Crunch is the world’s first cloud cost management solution purpose-built to help data engineers lower the cost of their lakehouse data. It goes beyond FinOps observability to automatically remediate data-related inefficiency, lowering cloud storage costs by up to 60% and speeding queries by up to 56%.

The Data Cost Management Situation

The Problem

It’s hard to control cloud costs for ever-growing data while experimenting with AI. Even with low-cost cloud object storage backing your data lakehouse, a single petabyte of Parquet data carries an at-rest cost of nearly $300k per year (at typical standard-tier object storage rates of roughly $0.023/GB-month).

The Alternatives

Tiering into colder, less expensive cloud storage classes doesn't work for analytical and training data, because that data is accessed too regularly. Manually optimizing the compression and encoding of Parquet to lower cloud costs is complex and risky.

The Ideal Solution

What data teams need is a way to automatically optimize compression of Parquet to lower cloud costs and speed access, analogous to how query optimizers increase query performance. We created Granica Crunch to solve this problem.

Cloud Cost Management Use Cases

Granica Crunch's lakehouse-native compression optimization has three key use cases, each of which delivers significant, quantifiable value.
Lower costs to store & move data

60%

Crunch shrinks the physical size of the columnar files in your cloud data lakehouse, lowering at-rest storage and data transfer costs by up to 60% for large-scale analytical, AI and ML data sets.

Faster cross-region replication

60%

Smaller files also make cloud data transfers and replication up to 60% faster, addressing AI-related compute scarcity, compliance, disaster recovery and other use cases demanding bulk data transfers.

Faster processing

56%

Smaller files are also faster files: they accelerate any data pipeline, query, or process bottlenecked by network or I/O bandwidth, running up to 56% faster based on TPC-DS benchmarks.

Get Big Cloud Storage Savings & Guaranteed ROI

The more data you have, the more savings Granica Crunch can deliver. With a 25 PB cloud data lakehouse you'll likely see ~$11M in at-rest savings alone (over 3 years) given typical optimized compression rates. Data transfer savings and query acceleration provide additional value on top. We're so confident you'll save with Crunch that we guarantee a minimum 100% ROI on your investment.

1 Petabyte Data Lakehouse (Parquet)

                                              Year 1    Year 2    Year 3    Total
Data lakehouse storage costs before Granica   $0.26M    $0.34M    $0.45M    $1.05M
Data lakehouse storage costs after Granica    $0.15M    $0.20M    $0.26M    $0.60M
Gross savings generated by Granica Crunch     $0.11M    $0.15M    $0.19M    $0.45M

* Assumes 30% annual data growth and a 43% compression rate on top of Snappy-compressed Parquet.

25 Petabyte Data Lakehouse (Parquet)

                                              Year 1    Year 2    Year 3    Total
Data lakehouse storage costs before Granica   $6.6M     $8.6M     $11.2M    $26M
Data lakehouse storage costs after Granica    $3.8M     $4.9M     $6.4M     $15M
Gross savings generated by Granica Crunch     $2.8M     $3.7M     $4.8M     $11M
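
To make the arithmetic behind these tables explicit, here is a minimal Python sketch built on the footnoted assumptions (30% annual data growth, 43% size reduction on top of Snappy-compressed Parquet). The ~$0.26M year-one baseline for 1 PB is taken from the table itself, and small differences from the printed figures are rounding.

    # Minimal sketch of the savings model behind the tables above.
    # Assumptions (from the footnote): 30% annual data growth, 43% size
    # reduction; the $0.26M year-one baseline for 1 PB comes from the table.
    def savings_table(year1_cost_musd, growth=0.30, reduction=0.43, years=3):
        """Yield (before, after, savings) in $M for each year."""
        cost = year1_cost_musd
        for _ in range(years):
            yield cost, cost * (1 - reduction), cost * reduction
            cost *= 1 + growth  # storage cost grows with the data

    for before, after, saved in savings_table(0.26):
        print(f"before ${before:.2f}M  after ${after:.2f}M  saved ${saved:.2f}M")
    # before $0.26M  after $0.15M  saved $0.11M
    # before $0.34M  after $0.19M  saved $0.15M
    # before $0.44M  after $0.25M  saved $0.19M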


Estimate your cloud cost savings

Slash the cloud storage costs for your Parquet data lakehouse files. Use our simple 2-input calculator to get a preliminary estimate.

Empower Data Engineers to Optimize Spend

Granica Crunch isn't your typical cloud cost management solution. It moves beyond FinOps observability to automatically mitigate data-related inefficiency — data engineers simply set it and forget it. Crunch continuously optimizes the compression of columnar data lakehouse formats such as Apache Parquet to control cloud costs, speed processing, and improve developer/data engineer productivity and efficiency.

Take a product tour

Read Our Latest White Paper: "Building Trust, Impact and Efficiency into Traditional and Generative AI"

Flatten Your Cloud Cost Curve

Granica’s cloud cost management platform helps reduce your unit economic cost to store lakehouse data. Capture recurring savings relative to your pre-Granica baseline each and every month, not just once. Ten petabytes of existing columnar data typically translates into annual gross savings of ~$1.2M, scaling as your data lakehouse grows. Crunch helps you meet your most crucial cloud cost optimization KPIs for both cost and performance.

  •   Lower your cloud storage unit economics
  •   Get ~$1.2M annual savings per 10 PB
  •   Speed data processing by up to 56%
Gartner®: A Journey Guide to AI-Ready Data

What’s truly meant by the term AI-ready data, and what are the best approaches to successfully deliver AI-ready data aligned to AI use cases? Download Gartner's latest research report to learn how AI efforts are evolving the data management requirements for organizations, compliments of Granica.


Related Articles

Top Cloud Cost Management Tools: 2024 Review

Cloud Cost Optimization: The Ultimate Guide

Google Cloud Cost Optimization Guide

Frequently Asked Questions:

How much cloud cost reduction does Granica Crunch typically deliver?

Granica Crunch typically reduces columnar data-related cloud storage costs by 15-60%, depending on your data. In TPC-DS benchmark tests using Snappy-compressed Parquet as the input data, Crunch delivered an average 43% file-size reduction. That reduction rate translates directly into gross annual savings. If you’re storing 10 petabytes of AI data in Amazon S3 or Google Cloud Storage, that translates into >$1.2M per year (growing as your data grows).

How does Granica Crunch compress and optimize columnar files?

Granica Crunch uses a proprietary compression control system that leverages the columnar nature of modern analytics formats like Parquet and ORC to achieve high compression ratios through underlying OSS compression algorithms such as zstd. The compression control system combines techniques like run-length encoding, dictionary encoding, and delta encoding to optimally compress data on a per-column basis.
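
As a simplified illustration of those per-column knobs, the open-source pyarrow library exposes similar controls when writing Parquet. This sketch shows dictionary encoding applied selectively and zstd as the underlying codec; it is illustrative only, not Granica's control system.

    # Illustrative only: per-column Parquet tuning with pyarrow, not
    # Granica's actual implementation.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({
        "country": ["US", "US", "DE", "US"],    # low cardinality: dictionary-friendly
        "latency_ms": [12.5, 13.1, 9.8, 12.7],  # numeric: the codec does the work
    })

    pq.write_table(
        table,
        "events.parquet",
        compression={"country": "zstd", "latency_ms": "zstd"},  # per-column codec
        use_dictionary=["country"],  # dictionary-encode only the repetitive column
    )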

Does Granica Crunch affect query performance and read/write latency?

Granica Crunch is NOT in the read or write path. It works in the background to adaptively optimize the OSS compression and encoding of your columnar lakehouse files stored in Amazon S3 and/or Google Cloud Storage. Apache Spark and Trino-based applications continue accessing the newly optimized files normally. Query performance actually improves, since there are fewer physical bits to read and transfer from object storage to the query engine. TPC-DS benchmarks show an improvement of 2-56% depending on the query.


Is Granica Crunch compatible with my existing data formats, applications, and queries?

Yes, your production applications continue to read and write data as they normally would. Granica Crunch is fully compatible with Apache Spark and Trino query engines utilizing Apache Parquet, including any BI tools querying those engines. Granica-optimized tables can be queried using standard SQL without any modification.
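
For illustration, a PySpark job reading Granica-optimized files looks exactly like one reading any other Parquet data; the bucket path, view name, and columns below are hypothetical.

    # Hypothetical Spark job: nothing Granica-specific appears in the code,
    # because the optimized files are still standard Parquet on object storage.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("reads-unchanged").getOrCreate()

    # Same path before and after Crunch optimizes the underlying files.
    events = spark.read.parquet("s3://my-bucket/lakehouse/events/")
    events.createOrReplaceTempView("events")

    # Standard SQL, no modification required.
    spark.sql("SELECT country, avg(latency_ms) FROM events GROUP BY country").show()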

We plan to enhance Granica Crunch to support other major query engines such as Presto, Hive, BigQuery, EMR, and Databricks; to add support for other columnar formats such as ORC; and to integrate with popular data catalogs and schema registries.

How does Granica charge for Crunch?

Granica Crunch is licensed and priced based on the volume of compressed data stored per month, measured using uncompressed source bytes. Granica charges a minimum monthly commitment, which covers crunching up to 1 PB of source data. Thereafter, each incremental terabyte of source data crunched is billed at a per-TB rate. Please contact us for a custom quote based on your environment.
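
To illustrate the shape of that model in code: Granica quotes actual rates per environment, so the commitment and per-TB figures below are invented placeholders, not real pricing.

    # Hypothetical illustration of the pricing structure described above.
    # COMMIT_USD and RATE_USD_PER_TB are made-up placeholders, not real rates.
    COMMIT_USD = 10_000    # minimum monthly commitment, covers up to 1 PB
    INCLUDED_TB = 1_024    # 1 PB of uncompressed source data
    RATE_USD_PER_TB = 8.0  # placeholder per-TB rate beyond the commitment

    def monthly_bill(source_tb):
        """Bill on uncompressed source bytes, per the model above."""
        overage_tb = max(0.0, source_tb - INCLUDED_TB)
        return COMMIT_USD + overage_tb * RATE_USD_PER_TB

    print(monthly_bill(800))   # under 1 PB: just the commitment -> 10000.0
    print(monthly_bill(1500))  # ~1.5 PB: commitment plus 476 TB overage -> 13808.0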

Additional questions? (like "How does it work?")

Check out our detailed Crunch FAQ and documentation.

Optimize. Save. Accelerate.

Turn the inefficiency in petabyte-scale lakehouse data into improved ROI on AI. Granica’s cloud cost management platform optimizes cloud storage spend while boosting pipeline performance and developer/OPEX efficiency.