Get answers to common questions.
What is Granica?
The Granica AI Efficiency Platform (or just "Granica" for short) is a suite of API services powering AI efficiency for data-hungry organizations. It increases the efficiency and utility of your AI-related data, and thus of your downstream AI pipeline stages, freeing up significant money, resources, and time that you can reinvest to improve AI performance and outcomes. Simply put, Granica increases the ROI of AI. For more, check out the Granica overview.
Does my data leave my environment?
No, it does not. Granica uses a self-managed, private deployment model in your public cloud environment, and only control plane data and telemetry metrics are shared with Granica corporate systems and employees.
What is the Granica deployment model?
Granica is deployed as managed software that runs in your AWS or GCP cloud account. It consists of a lightweight control and data plane that is deployed together with the first Granica product you choose to use and that provides shared infrastructure and services to all subsequently deployed Granica AI efficiency products. This platform/product packaging simplifies and streamlines the architecture, making it easy for you to quickly and cost-effectively enable additional Granica products as your needs dictate. For more, see the Granica architecture.
How do I integrate Granica into my environment?
Granica is a developer-first platform designed to be consumed as an API by AI applications that work with cloud object storage such as Amazon S3 and Google Cloud Storage (GCS). Granica normalizes on the S3 protocol and is thus simple to consume. You integrate Granica by making a (typically) single-line code change so that your application interacts with your buckets via the Granica API instead of your cloud vendor’s vanilla S3/GCS SDK. From then on your bucket operations (GETs, PUTs, DELETEs, etc.) are handled by Granica, which can then apply AI efficiency services such as data reduction (with Granica Crunch) and privacy (with Granica Screen). Refer to the Granica API support details, as well as how to test an integration in the context of Granica Crunch.
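As a rough illustration of what a single-line change can look like, the sketch below uses Python and boto3, whose `client` call accepts an `endpoint_url` parameter. The endpoint value and the helper names are hypothetical placeholders, not real Granica values; use the endpoint and instructions from your own deployment.

```python
from typing import Optional

# Hypothetical placeholder: substitute the endpoint your Granica deployment exposes.
GRANICA_ENDPOINT = "https://granica.example.internal"


def s3_client_kwargs(endpoint_url: Optional[str] = None) -> dict:
    """Build keyword arguments for boto3.client().

    With no endpoint, requests go to the cloud vendor's default S3 endpoint;
    with one, the same S3 calls are sent to that endpoint instead.
    """
    kwargs = {"service_name": "s3"}
    if endpoint_url:
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_client(endpoint_url: Optional[str] = None):
    """Create an S3 client; boto3 is imported lazily so the helper above has no dependencies."""
    import boto3

    return boto3.client(**s3_client_kwargs(endpoint_url))


# Before: s3 = make_client()
# After:  s3 = make_client(GRANICA_ENDPOINT)   # the one-line change
# The rest of the application (get_object, put_object, ...) stays as it was.
```

Keeping the endpoint in one place means the switch can be reverted by changing the same line back.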
How do you ensure my deployment is a success?
Your Granica instance generates usage, system health and performance telemetry data which is automatically shared with Granica engineering and support teams to enable predictive analysis, alerting, troubleshooting, and overall success. Telemetry data is preserved in a cloud storage bucket unique to each customer and deployment, entirely separate from customer data. A subset of this telemetry data is also available for you to view at any time via both the CLI and a graphical dashboard. No customer data is ever collected or analyzed.
How do I "undo" a Granica deployment?
What is Granica Crunch?
Granica Crunch is the data reduction service for enterprise AI. It is consumed as an API by applications directly working with AI-related data in S3/GCS. It reduces the storage cost associated with such data without archival and/or deletion, and thus frees up resources to improve model performance and overall AI outcomes.
How much cost reduction does Crunch typically deliver?
Crunch slashes your AI training-related cloud storage costs, typically by 25-60% depending on your data types and access patterns. It does this by losslessly reducing the aggregate size of your data (e.g. by compression) and by minimizing operational costs (e.g. by batching PUTs). If you’re storing 10 petabytes of AI data in Amazon S3 or Google GCS, that translates into savings of >$1.3M per year (growing as your data grows), assuming a 50% data reduction rate. For more, see how Crunch helps you find savings to improve AI.
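The arithmetic behind a figure of this size can be sketched as follows. The flat price of $0.021 per GB-month and the binary unit conversion are assumptions made for illustration, not Granica's published inputs; the sketch also covers at-rest cost only and ignores access and infrastructure costs.

```python
# Assumptions (not from Granica): a flat at-rest price of $0.021 per GB-month
# and binary units (1 PB = 1024**2 GB). Check your provider's current pricing.
PRICE_PER_GB_MONTH = 0.021
GB_PER_PB = 1024 ** 2


def annual_at_rest_savings(petabytes: float, reduction_rate: float) -> float:
    """Yearly at-rest storage cost avoided by shrinking the data by reduction_rate."""
    baseline_per_year = petabytes * GB_PER_PB * PRICE_PER_GB_MONTH * 12
    return baseline_per_year * reduction_rate


savings = annual_at_rest_savings(petabytes=10, reduction_rate=0.50)
print(f"${savings:,.0f} per year")
```

Under these assumptions the result is a little over $1.3M per year, in line with the figure quoted above.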
How does Crunch compare to alternatives?
AI data is hot data, and **Crunch** delivers the savings you need without the trade-offs:
- With Crunch, your appdev teams stay focused on delighting their end users and customers, without the complexity, risk, and performance trade-offs associated with DIY compression.
- With Crunch, all your data stays fast and cost-effective, enabling you to maximize its use and value without the asynchronous access complexity and performance trade-offs associated with archival and tiering approaches.
How is Crunch able to deeply reduce data at large scale, even if it was already reduced?
Crunch performs data reduction at a byte-granular level, even for arbitrarily large data sets. Byte-level techniques have always been the gold standard for data reduction; however, these techniques break down at TB-scale, let alone PB-scale. As a result, current state-of-the-art reduction approaches must use much larger granularity, at the kilobyte or even megabyte level, and thus leave the vast majority of the potential data reduction untapped.
Our team of research scientists and engineers has invented an entirely new, patented family of mathematically proven algorithms that implement byte-level data reduction for unstructured data, with high performance at petabyte scale. Our data reduction rates also increase as the scale of data grows, which means that the more data you crunch, the more savings you get per byte.
Does Crunch affect read or write latency?
Granica Crunch is extremely fast, adding only ~50ms of latency to reads and ~100ms of latency to writes, and delivering sustained throughput of up to 1.5 GB/s (not Gbps) per node for transparent access to your data. For more see how Granica provides elastic scaling to support and maintain pipeline performance.
How do I securely share buckets with Granica?
Follow this short bucket-sharing guide to grant Granica read-only access to a bucket and make it accessible to us. Securely sharing buckets with us can facilitate the process of estimating savings and/or integration with your apps. For example, you can create and share server access logs and inventory reports to help us pre-validate your application integration. You may also want to share a sample of your AI data so we can optimize our data reduction algorithms for your specific environment.
How do you adjust Crunch pricing based on the value I actually receive?
Granica uses an outcome-based pricing model. When you use Crunch it generates savings (net of any infrastructure costs to generate those savings) by increasing the efficiency of your AI data. We then split the savings - you pay us a small % (called our "Crunch Savings Share") which is billed to you each month, and keep the majority remainder (called your "Bottom Line Savings"). There are zero additional costs. This means Crunch doesn't actually "cost" anything - it "makes" us both money by eliminating inefficiencies in your data.
By way of contrast, you do not pay based on storage consumed (as storage companies typically require) nor based on compute consumed (as analytics companies typically require). If Crunch does not reduce the cost of your cloud storage bill then your bill is $0. Because our outcome-based pricing model is essentially risk free there is no need to justify additional budget or funding. In addition, any future efficiency enhancements translate directly to additional bottom line savings for you. Check out our pricing for more detail.
How do you calculate “outcome-based savings” for Crunch?
Here’s how we calculate outcome-based savings (or just "savings") in an AWS environment, each month:
- We measure your baseline S3 costs including both storage at-rest and access costs (e.g. GET/PUT) without Crunch. For example, for at-rest costs we use the original unreduced data volume (in bytes) and apply published S3 pricing.
- We measure and subtract your actual S3 costs (both at-rest and access) with Crunch. For example, for at-rest costs we use the reduced data volume (in bytes) and apply published S3 pricing.
- We measure and subtract the (typically small) infrastructure costs incurred by Granica (including Crunch) while running in your environment. This includes one-time costs to reduce data as well as costs to hydrate data upon access, all via underlying cloud primitives: for example EC2, EKS, and spot instance costs, plus storage and access costs for metadata, reduced data, etc.
- The result, or outcome, is your monthly savings (i.e. cost reduction).
You then pay us a small % of these savings and keep the majority remainder. Importantly, once a given volume of data is reduced, it keeps generating savings each and every month, perpetually (or until you choose to delete it). For more, check out our pricing details.
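The monthly calculation above can be sketched as a small function. The field names, the 25% share rate, and the dollar amounts are hypothetical, chosen only to make the arithmetic visible. Flooring savings at zero is an assumption that mirrors the statement that your bill is $0 when Crunch does not reduce your costs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MonthlyCosts:
    baseline_storage: float  # at-rest + access cost without Crunch (USD)
    actual_storage: float    # at-rest + access cost with Crunch (USD)
    infrastructure: float    # cost of running Granica in your environment (USD)


def monthly_savings(c: MonthlyCosts) -> float:
    """Savings = baseline - actual - infrastructure, floored at zero (assumption)."""
    return max(0.0, c.baseline_storage - c.actual_storage - c.infrastructure)


def split_savings(savings: float, share_rate: float) -> Tuple[float, float]:
    """Return (savings share billed to you, bottom-line savings you keep)."""
    share = savings * share_rate
    return share, savings - share


# Hypothetical month: all numbers are illustrative, including the 25% share rate.
costs = MonthlyCosts(baseline_storage=100_000.0, actual_storage=55_000.0, infrastructure=5_000.0)
savings = monthly_savings(costs)
share, kept = split_savings(savings, share_rate=0.25)
print(savings, share, kept)
```

Because infrastructure costs are subtracted before the split, the share is computed on net savings rather than on the gross storage reduction.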
How do I "undo" crunching of my data, and/or the entire Granica deployment?
What is Granica Screen?
Granica Screen is the data privacy service for enterprise AI. It is consumed as an API by applications directly working with AI-related data in S3/GCS. It identifies, protects, and monitors all sensitive information in semi- and unstructured text data to:
- improve data security posture and prevent breaches
- unlock data for key business needs
- streamline data security and privacy monitoring
- enable regulatory compliance.
Granica Screen is built to enable privacy-enhanced computing.
How does Screen compare to alternatives?
Unlike traditional DLP and data privacy solutions that use a side-scanning approach, Granica Screen plugs inline into your applications, detecting and protecting new, incoming sensitive data before it is ever persisted into your cloud object store. Our classification engine also provides both high precision (to mitigate false positives) and high recall (to mitigate false negatives), thus dramatically improving your data security posture and mitigating breach risk. Screen also works with your existing data, providing you with comprehensive data privacy coverage. Finally, Screen is highly compute-efficient, lowering the cost to scan data by 10x and enabling you to scan and protect all your data, not just a sample.
Can I use Screen and Crunch together on the same data?
Yes, both products are built on the same Granica platform and are fully compatible with one another. You can maximize your AI efficiency benefits by using both together.
How do I get access to Screen?
Apply to our [Early Access program](https://pages.granica.ai/request-access-to-granica-screen) and our Screen product team will get in touch with you.
Shared Granica infrastructure
How do you validate system and data integrity?
Granica (and thus API services such as Crunch) implements system and data integrity at many levels to ensure your data is always safe and intact.
- All crunched/source data (i.e. data processed by Crunch) is automatically backed up.
- The Crunch Index database is stored in multiple locations, both in cloud storage and on each node.
- Integrity checks (via hashing) occur when an object is crunched as well as before any object clean-up is performed.
- Frequent full integrity checks run on metadata as well as on a random set of objects (including hydration).
- In the event of any integrity failure, the Granica team is alerted, crunching is paused, and Crunch goes into read-only mode, with new writes temporarily stored in vanilla form while our team investigates. When the integrity issue is resolved, crunching resumes.
For additional detail see Data Integrity.
Side Note: Our Integrity functions actually have more lines of code than our data reduction functions, i.e. your data integrity is always our first priority.
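The hash-based check described above (verify at crunch time and again before clean-up) follows a common pattern, sketched below. Here `zlib` stands in for the reduction step purely for illustration and is not Granica's algorithm. The function names are hypothetical.

```python
import hashlib
import zlib
from typing import Tuple


def sha256(data: bytes) -> str:
    """Hex digest used as the object's integrity fingerprint."""
    return hashlib.sha256(data).hexdigest()


def reduce_with_check(original: bytes) -> Tuple[bytes, str]:
    """Reduce an object (zlib as a stand-in) and record the hash of the original."""
    return zlib.compress(original), sha256(original)


def safe_to_clean_up(reduced: bytes, expected_hash: str) -> bool:
    """Hydrate the reduced object and confirm it matches the original's hash.

    Only when this returns True would it be safe to remove the source copy.
    """
    try:
        hydrated = zlib.decompress(reduced)
    except zlib.error:
        return False
    return sha256(hydrated) == expected_hash


data = b"training-sample " * 1000
reduced, digest = reduce_with_check(data)
print(safe_to_clean_up(reduced, digest))
```

The key design point is that the comparison is made against the hash of the original bytes, so any corruption in the reduced form is caught before the source is removed.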
How do you ensure your API services are highly available?
Granica provides HA via the native capabilities of Kubernetes clusters in AWS (via EKS) and Google Cloud (via GKE), configured to use multiple availability zones. Both EKS and GKE provide an SLA of 99.95% availability. For additional detail, see High Availability.
How do you achieve “infinite” scale?
Granica automatically scales out additional nodes (beyond the 2-node minimum for HA) to dynamically handle arbitrarily large loads and maintain performance for enabled API services such as Crunch. Scaling is also completely elastic, so as load decreases Granica automatically shuts down unneeded nodes. This minimizes operational costs and maximizes your savings from Crunch. For additional detail see Elastic Scaling.
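For capacity planning, the relationship between load and node count can be approximated as below. This is a back-of-the-envelope sketch and not Granica's actual autoscaling policy, which is not described here. It assumes each node sustains the "up to 1.5 GB/s" figure quoted for Crunch and applies the 2-node HA minimum.

```python
import math

MIN_NODES = 2            # HA minimum stated above
PER_NODE_GB_PER_S = 1.5  # assumed sustained throughput per node (GB/s, not Gbps)


def nodes_needed(load_gb_per_s: float) -> int:
    """Estimate the node count for a given aggregate load, never below the HA minimum."""
    return max(MIN_NODES, math.ceil(load_gb_per_s / PER_NODE_GB_PER_S))


for load in (0.5, 3.0, 10.0):
    print(load, nodes_needed(load))
```

Real throughput per node depends on object sizes and access patterns, so treat the per-node figure as an upper bound.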