Configuring Granica Screen

Once you've installed the Granica platform with Granica Screen enabled, you can start monitoring and protecting your data. Configuration is managed through the Granica CLI.

1. Identify data to be protected by Granica Screen

Granica Screen supports scanning existing data stored in Amazon S3 or Google Cloud Storage (GCS). The first step is to identify the buckets or data of interest, which might be all buckets within your organization. The data of interest can then be configured for scanning in the Granica policy.

The following file types are currently supported. Unsupported files are skipped and do not affect the scanning process.

| File Type | Extensions | Scan Method | Available Now |
| --- | --- | --- | --- |
| Big Data | .parquet, .snappy.parquet | Structured Parsing | Yes |
| Comma/tab separated | .csv, .tsv | Structured Parsing | Yes |
| Text | .json, .txt, .html, etc. | Intelligent Parsing | Yes |
| Email | .eml | Intelligent Parsing | Yes |
| Archived/Compressed | .gz, .zip | Decompress and Parse | Yes |
| Image | .jpeg, .png, .tiff | OCR | Coming soon |
| Document | .pdf, .doc, .xlsx, .pptx | Intelligent Parsing | Coming soon |
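To preview which objects in a bucket listing would be scanned and which would be skipped, the table above can be turned into a small lookup. This is a minimal sketch, not part of the Granica CLI: the function name `scan_method_for` and the dictionary are hypothetical, only the file types marked as available are included, and the "etc." in the Text row is left unexpanded.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension-to-scan-method mapping copied from the table above.
# Only rows marked "Yes" under "Available Now" are included.
SCAN_METHODS = {
    ".parquet": "Structured Parsing",   # also covers .snappy.parquet
    ".csv": "Structured Parsing",
    ".tsv": "Structured Parsing",
    ".json": "Intelligent Parsing",
    ".txt": "Intelligent Parsing",
    ".html": "Intelligent Parsing",
    ".eml": "Intelligent Parsing",
    ".gz": "Decompress and Parse",
    ".zip": "Decompress and Parse",
}


def scan_method_for(object_key: str) -> Optional[str]:
    """Return the scan method for an object key, or None if it would be skipped."""
    # Only the final extension is inspected, so "a.snappy.parquet" -> ".parquet".
    suffix = PurePosixPath(object_key).suffix.lower()
    return SCAN_METHODS.get(suffix)


if __name__ == "__main__":
    for key in ["logs/events.snappy.parquet", "exports/users.csv", "scans/id-card.jpeg"]:
        print(key, "->", scan_method_for(key) or "skipped")
```

Matching on the final extension only is a simplification; how Granica Screen itself decides the file type is not described here.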

2. Specify types of sensitive data to identify

Within the Granica policy, the set of sensitive data to identify can be configured.

Currently, the following types of sensitive data are supported by standard classifiers. Custom classifiers can also be specified in addition to these, and Granica is continuously adding support for additional types of sensitive data.

| Classification Type | Description |
| --- | --- |
| Phone Number | A telephone number. Supports phone number formats from around the world. |
| Email Address | An email address. |
| Social Security Number | A US social security number, excluding invalid social security numbers. |
| Credit Card / Debit Card Number | Supports payment cards issued by most vendors globally. |
| Vehicle Identification Number | Unique code used by the automotive industry to identify an individual motor vehicle. |
| ABA Routing Number | American Bankers Association identifier for a specific financial institution to facilitate payment processing. |
| US Driver License Number | Alphanumeric identifier for a driver license. Supports variations in format across the US. |
| Passport Number | Alphanumeric identifier for a passport. Supports variations in format around the world. |
| Street Address | A street address. Includes subtypes US address, UK address, Dutch address, Swedish address, and German address. |
| IP Address | Numerical label identifying a host on a network. Supports IPv4 and IPv6. |
| Password | A cleartext password. |
| Person Name | A person's name, which can include first names, middle names or initials, and last names. |
| Tracking Link | A link from a shipping carrier for delivery tracking, which may link to a public page with a name, address, or other information. |
| Latitude/Longitude | A pair of coordinates identifying a location, with at least 3 decimal places of precision. |
| Device ID | Identifier for a specific mobile device, commonly used for advertising. Examples include Apple IDFA ID, Android AAID/GAID, and MAID (Mobile Ad ID). |
| User ID | Alphanumeric unique user identifier. Supports common unique ID formats. |
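To make one of these definitions concrete, the Latitude/Longitude rule (a coordinate pair with at least 3 decimal places) can be approximated with a regular expression plus a range check. This is only an illustrative sketch and not how Granica's classifiers are implemented: the pattern, the comma-separated layout, and the function name `find_coordinates` are all assumptions.

```python
import re

# Hypothetical pattern: two signed decimal numbers separated by a comma,
# each with at least 3 digits after the decimal point.
LAT_LONG = re.compile(
    r"(?<![\d.])(-?\d{1,2}\.\d{3,})\s*,\s*(-?\d{1,3}\.\d{3,})(?![\d.])"
)


def find_coordinates(text: str) -> list[tuple[int, int]]:
    """Return (offset, length) for each plausible coordinate pair in text."""
    hits = []
    for match in LAT_LONG.finditer(text):
        lat, lon = float(match.group(1)), float(match.group(2))
        # Discard number pairs that fall outside valid coordinate ranges.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            hits.append((match.start(), len(match.group(0))))
    return hits


if __name__ == "__main__":
    print(find_coordinates("Pickup at 37.7749, -122.4194 today"))
    print(find_coordinates("Roughly 37.7, -122.4"))  # too little precision
```

The (offset, length) pairs mirror the idea of reporting where in an unstructured file a match was found; a production classifier would need to handle many more coordinate notations.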

3. Specify report format and location

After the data is scanned, Granica Screen generates a report that lists each instance of sensitive data identified. The format and location of this report can be customized within the Granica policy as follows.

| Setting | Options |
| --- | --- |
| Output format | json, csv, parquet |
| Output compression | none, gzip, snappy (parquet only) |
| Output location | An AWS S3 or GCS location. If unspecified, a bucket will automatically be created. |
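The one constraint in these options is that snappy compression applies to parquet output only. A pre-flight check along the following lines could catch an invalid combination before a policy is applied. This is a sketch under assumptions: the function name is hypothetical, it does not reflect the Granica policy syntax, and it assumes that none and gzip are valid with every output format.

```python
SUPPORTED_FORMATS = {"json", "csv", "parquet"}
SUPPORTED_COMPRESSION = {"none", "gzip", "snappy"}


def is_valid_output_setting(output_format: str, compression: str = "none") -> bool:
    """Check a format/compression pair against the options listed above."""
    if output_format not in SUPPORTED_FORMATS:
        return False
    if compression not in SUPPORTED_COMPRESSION:
        return False
    # snappy is listed as "parquet only".
    if compression == "snappy" and output_format != "parquet":
        return False
    return True


if __name__ == "__main__":
    print(is_valid_output_setting("parquet", "snappy"))
    print(is_valid_output_setting("csv", "snappy"))
```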

The generated report includes the following information for each instance of sensitive data:

| Field | Type | Description |
| --- | --- | --- |
| n | bigint | Index of result within result file |
| obj_key | string | The cloud object containing this instance of sensitive data |
| classification_type | string | The type of sensitive data identified |
| offset | bigint | The offset location within an unstructured file |
| classified_size | bigint | The length of the result within an unstructured file |
| row | bigint | The row number of a result within a tabular file |
| col | bigint | The column number of a result within a tabular file |
| column_name | bigint | The column name of a result within a tabular file, when available |
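Once a report has been loaded into memory, these fields make it straightforward to summarize findings. The sketch below assumes the report rows are already available as Python dictionaries keyed by the field names above; how the file is loaded depends on the chosen output format. The helper names, object keys, and classification labels in the sample rows are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows; tabular results carry row/col,
# unstructured results carry offset/classified_size.
SAMPLE_ROWS = [
    {"n": 0, "obj_key": "exports/users.csv", "classification_type": "Email Address", "row": 12, "col": 3},
    {"n": 1, "obj_key": "exports/users.csv", "classification_type": "Phone Number", "row": 12, "col": 4},
    {"n": 2, "obj_key": "notes/ticket.txt", "classification_type": "Email Address", "offset": 8, "classified_size": 16},
]


def count_by_type(rows):
    """Count how many instances of each classification type were reported."""
    return Counter(row["classification_type"] for row in rows)


def types_per_object(rows):
    """Map each cloud object to the set of classification types found in it."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["obj_key"]].add(row["classification_type"])
    return dict(grouped)


if __name__ == "__main__":
    print(count_by_type(SAMPLE_ROWS))
    print(types_per_object(SAMPLE_ROWS))
```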

4. Specify the redacted output format

In addition to generating a detection report, Granica Screen can directly redact sensitive data from a file and create a sanitized copy of the data at a separately configured cloud location. Appropriately redacted data can then be used in broader contexts to enable additional use cases while managing privacy risk.

A variety of redaction formats are supported, along with additional customization options.

| Transformation Type | Description |
| --- | --- |
| Redaction | Removal of sensitive data without replacement |
| Replacement | Replacement of sensitive data with a fixed value |
| Size-preserving replacement | Replacement of sensitive data with a value of equal length, e.g. XXXX |
| Replace with sensitive data type | Replacement of sensitive data with a label identifying the type of sensitive data, e.g. [EMAIL] |
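The four transformation types can be illustrated on a plain string, given the offset and length of each finding. This is a minimal sketch of the behavior described in the table, not Granica's implementation: the `Finding` type, the mode names, and the default fixed value are hypothetical, offsets are assumed to be character offsets, and findings are assumed not to overlap.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    offset: int               # where the sensitive value starts
    classified_size: int      # length of the sensitive value
    classification_type: str  # label used by the "type_label" mode


def transform(text: str, findings: list, mode: str, fixed_value: str = "REDACTED") -> str:
    """Apply one transformation type to every finding in text."""
    # Work from the last finding to the first so earlier offsets stay valid
    # even when a replacement changes the length of the text.
    for f in sorted(findings, key=lambda item: item.offset, reverse=True):
        end = f.offset + f.classified_size
        if mode == "redaction":
            replacement = ""
        elif mode == "replacement":
            replacement = fixed_value
        elif mode == "size_preserving":
            replacement = "X" * f.classified_size
        elif mode == "type_label":
            replacement = f"[{f.classification_type}]"
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text[:f.offset] + replacement + text[end:]
    return text


if __name__ == "__main__":
    sample = "Contact jane@example.com or 555-0100."
    found = [
        Finding(sample.index("jane"), len("jane@example.com"), "EMAIL"),
        Finding(sample.index("555"), len("555-0100"), "PHONE"),
    ]
    for mode in ("redaction", "replacement", "size_preserving", "type_label"):
        print(mode, "->", transform(sample, found, mode))
```

Only the size-preserving mode keeps the length of the text unchanged, which matters when downstream consumers rely on fixed-width fields or byte positions.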

Support for pseudonymization, encryption, and more is coming soon.
