Use Case: Data Cleaning

Learn how Screen generates secure datasets

The problem

Many valuable datasets are difficult or risky to use because they contain sensitive data. For example, it is often inappropriate to give internal teams broad access to this data, to use it to train machine learning models, or to share it with external vendors or APIs.

This is a major barrier to unlocking data for business needs. Even if you know that a dataset contains sensitive data, whether through sensitive data discovery or otherwise, it remains difficult to clean the data and generate secure datasets unless you can identify sensitive data accurately and at scale, and then remove or obfuscate it.

How we help

Granica Screen provides integrated tools to automatically redact, remove, or otherwise obfuscate sensitive data detected during the sensitive data discovery process. This process creates a secure copy of the original data with the desired transformations applied, which can then be used broadly with a reduced risk of sensitive data exposure.

Screen can be configured to apply a variety of transformations to sensitive data based on the type of sensitive data identified. See the configuration reference for more details.
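To make the idea of type-specific transformations concrete, here is a minimal Python sketch. It is not Screen's API or configuration format: the `Finding` type, the `POLICY` mapping, and the three transformation functions are hypothetical names chosen for illustration, and the sketch assumes a detector has already produced non-overlapping character spans.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A detected sensitive span (hypothetical shape, not Screen's output format)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    kind: str   # e.g. "EMAIL", "PHONE"


def redact(value: str, kind: str) -> str:
    # Replace the value entirely with a placeholder naming its type.
    return f"[{kind}]"


def mask(value: str, kind: str) -> str:
    # Keep the last four characters and mask everything before them.
    return "*" * max(len(value) - 4, 0) + value[-4:]


def hash_value(value: str, kind: str) -> str:
    # Deterministic token, so equal inputs still match after cleaning.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


# Hypothetical per-type policy: which transformation applies to which type.
POLICY = {"EMAIL": hash_value, "PHONE": mask}


def clean(text: str, findings: list[Finding]) -> str:
    """Return a copy of text with each finding transformed per POLICY.

    Assumes findings do not overlap.
    """
    out = text
    # Work right to left so earlier offsets stay valid as lengths change.
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        transform = POLICY.get(f.kind, redact)  # unknown types are fully redacted
        out = out[:f.start] + transform(text[f.start:f.end], f.kind) + out[f.end:]
    return out


text = "Call 555-867-5309 or write to ana@example.com"
phone = text.index("555-867-5309")
email = text.index("ana@example.com")
cleaned = clean(text, [Finding(phone, phone + 12, "PHONE"),
                       Finding(email, email + 15, "EMAIL")])
# The phone number becomes "********5309"; the email becomes a 12-character hex token.
```

The original string is left untouched and a transformed copy is returned, mirroring the "secure copy" behavior described above. An unsalted, truncated hash is used here only to keep the sketch short; a production system would more likely use a keyed hash or tokenization service.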

Why we're the best solution

Best-in-class accuracy

Highly accurate sensitive data detection is the key to successfully cleaning a dataset. High recall is required because undetected sensitive data cannot be obfuscated and will remain in cleartext in the "cleaned" data. High precision is also vital so that non-sensitive data is minimally disturbed, preserving the value of the dataset.
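As a reminder of what these two metrics measure, the following Python sketch computes them for a set of detected spans against a labeled ground truth. It assumes exact-match scoring on `(start, end)` spans, which is one common convention and not necessarily how Screen's benchmarks are scored; the function name and the example spans are illustrative.

```python
def precision_recall(predicted: set, actual: set) -> tuple[float, float]:
    """Score detections against ground truth using exact span matches."""
    true_positives = len(predicted & actual)
    # Precision: share of detections that were truly sensitive.
    precision = true_positives / len(predicted) if predicted else 1.0
    # Recall: share of truly sensitive spans that were detected.
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Five truly sensitive spans in a document (illustrative offsets).
actual = {(0, 5), (10, 20), (30, 42), (50, 58), (70, 81)}
# The detector finds three of them plus one non-sensitive span.
predicted = {(0, 5), (10, 20), (30, 42), (60, 66)}

precision, recall = precision_recall(predicted, actual)
# precision == 0.75: one non-sensitive value would be altered needlessly.
# recall == 0.6: two sensitive values would remain in cleartext.
```

In a cleaning workflow, the recall shortfall is the security risk (data that leaks through), while the precision shortfall is the utility cost (data that is damaged without need).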

Granica Screen provides demonstrably superior classification accuracy in terms of both recall and precision. We benchmark our performance on a variety of synthetic datasets, such as data generated by the Presidio Research library, as well as real datasets across a range of file types and industries.

Simple integration

Because data cleaning depends on the output of sensitive data detection, Granica Screen's integrated approach is the simplest way to deploy data cleaning. Alternative approaches that don't offer transformation options require maintaining separate infrastructure and connectors, increasing both the management burden and the risk of error.

Scalable to petabytes of data and billions of objects

Granica Screen is built on top of Granica's data processing platform, which serves customers storing petabytes of data and billions of objects. Many vendors cannot handle both detection and transformation of data at scale, but Granica's platform has been built to efficiently handle large volumes of data.

See also