Configuring Granica Screen
Once you've installed the Granica platform with Granica Screen enabled, you can start monitoring and protecting your data. Configuration is managed through the Granica CLI.
1. Identify data to be protected by Granica Screen
Granica Screen supports scanning existing data stored in Amazon S3 or Google Cloud Storage (GCS). The first step is to identify the buckets or data of interest, which might be every bucket in your organization. The data of interest can then be configured for scanning in the Granica policy.
Currently, the following file types are supported. Unsupported files are skipped and do not affect the scanning process.
|File Type||Extensions||Scan Method||Available Now|
|Big Data||.parquet, .snappy.parquet||Structured Parsing||Yes|
|Comma/tab separated||.csv, .tsv||Structured Parsing||Yes|
|Text||.json, .txt, .html, etc.||Intelligent Parsing||Yes|
|Archived/Compressed||.gz, .zip||Decompress and Parse||Yes|
|Image||.jpeg, .png, .tiff||OCR||Coming soon|
|Document||.pdf, .doc, .xlsx, .pptx||Intelligent Parsing||Coming soon|
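To make the skip behavior concrete, here is a minimal sketch of how object keys could be mapped to a scan method by extension. It only mirrors the "Available Now" rows of the table above and is not Granica's implementation: the names `SCAN_METHODS` and `scan_method_for` are hypothetical, and the Text row's "etc." means the real list of text extensions is longer than the three shown here.

```python
from typing import Optional

# Extension-to-scan-method mapping copied from the "Available Now" rows.
# ".snappy.parquet" is covered by the ".parquet" suffix.
# Hypothetical helper for illustration; not part of the Granica CLI.
SCAN_METHODS = {
    ".parquet": "Structured Parsing",
    ".csv": "Structured Parsing",
    ".tsv": "Structured Parsing",
    ".json": "Intelligent Parsing",
    ".txt": "Intelligent Parsing",
    ".html": "Intelligent Parsing",
    ".gz": "Decompress and Parse",
    ".zip": "Decompress and Parse",
}


def scan_method_for(obj_key: str) -> Optional[str]:
    """Return the scan method for an object key, or None if it would be skipped."""
    key = obj_key.lower()
    for extension, method in SCAN_METHODS.items():
        if key.endswith(extension):
            return method
    return None  # unsupported type: skipped without failing the scan
```

A key such as `dump/data.csv.gz` resolves by its outermost extension (`.gz`), so it would be decompressed first and then parsed.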
2. Specify types of sensitive data to identify
The Granica policy specifies which types of sensitive data to identify.
Currently, the following types of sensitive data are supported by standard classifiers. Custom classifiers can be specified in addition to these, and Granica is continuously adding support for more types of sensitive data.
|Data Type||Description|
|Phone Number||A telephone number. Supports phone number formats from around the world.|
|Email Address||An email address.|
|Social Security Number||A US social security number, excluding invalid social security numbers.|
|Credit Card / Debit Card Number||Supports payment cards issued by most vendors globally.|
|Vehicle Identification Number||Unique code used by the automotive industry to identify an individual motor vehicle.|
|ABA Routing Number||American Bankers Association identifier for a specific financial institution, used to facilitate payment processing.|
|US Driver License Number||Alphanumeric identifier for a driver license. Supports variations in format across the US.|
|Passport Number||Alphanumeric identifier for a passport. Supports variations in format around the world.|
|Street Address||A street address. Includes subtypes for US, UK, Dutch, Swedish, and German addresses.|
|IP Address||Numerical label identifying a host on a network. Supports IPv4 and IPv6.|
|Password||A cleartext password.|
|Person Name||A person's name, which can include first names, middle names or initials, and last names.|
|Tracking Link||A link from a shipping carrier for delivery tracking, which may lead to a public page with a name, address, or other information.|
|Latitude/Longitude||A pair of coordinates identifying a location, with at least 3 decimal places of precision.|
|Device ID||Identifier for a specific mobile device, commonly used for advertising. Examples include Apple IDFA, Android AAID/GAID, and MAID (Mobile Ad ID).|
|User ID||Alphanumeric unique user identifier. Supports common unique ID formats.|
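As an illustration of what "excluding invalid social security numbers" can mean in practice, the sketch below applies the publicly documented structural rules for US SSNs: the area number is never 000, 666, or 900 to 999, the group number is never 00, and the serial number is never 0000. How Granica's classifier actually validates candidates is not described here, so treat `is_plausible_ssn` as a hypothetical stand-in rather than the product's logic.

```python
import re

# Accepts "123-45-6789" or "123456789"; hyphens are optional in this sketch.
SSN_PATTERN = re.compile(r"(\d{3})-?(\d{2})-?(\d{4})")


def is_plausible_ssn(candidate: str) -> bool:
    """Return True if the string is structurally valid as a US SSN."""
    match = SSN_PATTERN.fullmatch(candidate.strip())
    if match is None:
        return False
    area, group, serial = match.groups()
    # Area numbers 000, 666, and 900-999 are never issued.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never issued.
    if group == "00" or serial == "0000":
        return False
    return True
```

A structural check like this only filters out impossible values. It cannot confirm that a number was ever issued to a person.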
3. Specify report format and location
After the data is scanned, Granica Screen generates a report listing each instance of sensitive data identified. The format and location of this report can be customized within the Granica policy as follows.
|Setting||Supported Values|
|Output format||json, csv, parquet|
|Output compression||none, gzip, snappy (parquet only)|
|Output location||An Amazon S3 or GCS location. If unspecified, a bucket is created automatically.|
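The constraints in this table, in particular that snappy compression applies only to parquet output, can be checked before a policy is submitted. The sketch below is one way to do that in Python. The `ReportOutput` class and its field names are hypothetical and do not reflect the actual Granica policy schema, and the `s3://` and `gs://` URI prefixes are an assumption about how a location would be written.

```python
from dataclasses import dataclass
from typing import Optional

FORMATS = {"json", "csv", "parquet"}
COMPRESSIONS = {"none", "gzip", "snappy"}


@dataclass(frozen=True)
class ReportOutput:
    """Hypothetical container for the three report settings in the table."""

    format: str
    compression: str
    location: Optional[str] = None  # None: a bucket is created automatically

    def __post_init__(self) -> None:
        if self.format not in FORMATS:
            raise ValueError(f"unsupported output format: {self.format!r}")
        if self.compression not in COMPRESSIONS:
            raise ValueError(f"unsupported compression: {self.compression!r}")
        # snappy is listed as parquet only
        if self.compression == "snappy" and self.format != "parquet":
            raise ValueError("snappy compression requires parquet output")
        # Assumption: locations are written as s3:// or gs:// URIs.
        if self.location is not None and not self.location.startswith(
            ("s3://", "gs://")
        ):
            raise ValueError("location must be an S3 or GCS URI")
```

For example, `ReportOutput("parquet", "snappy")` is accepted, while `ReportOutput("json", "snappy")` raises a `ValueError`.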
The generated report includes the following information for each instance of sensitive data:
|Field||Type||Description|
|n||bigint||Index of result within result file|
|obj_key||string||The cloud object containing this instance of sensitive data|
|classification_type||string||The type of sensitive data identified|
|offset||bigint||The offset location within an unstructured file|
|classified_size||bigint||The length of the result within an unstructured file|
|row||bigint||The row number of a result within a tabular file|
|col||bigint||The column number of a result within a tabular file|
|column_name||string||The column name of a result within a tabular file, when available|
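The fields above split into two groups: `offset` and `classified_size` locate a result in an unstructured file, while `row`, `col`, and `column_name` locate it in a tabular file. The sketch below shows one way a consumer might load a JSON report and recover the matched text. It rests on several assumptions that are not confirmed here: that the JSON report holds one object per line, that fields not relevant to a file type are omitted, and that `offset` is a zero-based character position. The `Finding`, `parse_findings`, and `matched_text` names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    """One report record; field names follow the table above."""

    n: int
    obj_key: str
    classification_type: str
    offset: Optional[int] = None           # unstructured files only
    classified_size: Optional[int] = None  # unstructured files only
    row: Optional[int] = None              # tabular files only
    col: Optional[int] = None              # tabular files only
    column_name: Optional[str] = None      # tabular files, when available

    @property
    def is_tabular(self) -> bool:
        return self.row is not None


def parse_findings(json_lines: str) -> List[Finding]:
    """Parse a report, assuming one JSON object per non-empty line."""
    return [
        Finding(**json.loads(line))
        for line in json_lines.splitlines()
        if line.strip()
    ]


def matched_text(content: str, finding: Finding) -> str:
    """Slice the flagged span out of an unstructured file's content."""
    if finding.offset is None or finding.classified_size is None:
        raise ValueError("finding has no offset; it refers to a tabular file")
    # Assumption: offset is a zero-based character index into the content.
    return content[finding.offset : finding.offset + finding.classified_size]
```

If offsets turn out to be byte positions rather than character positions, the slice should be taken on the raw bytes before decoding.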
4. Specify the redacted output format
In addition to generating a detection report, Granica Screen can directly redact sensitive data from a file and create a sanitized copy of the data at a separately configured cloud location. Appropriately redacted data can then be used in broader contexts to enable additional use cases while managing privacy risk.
A variety of redaction formats are supported, along with additional customization options.
|Format||Description|
|Redaction||Removal of sensitive data without replacement|
|Replacement||Replacement of sensitive data with a fixed value|
|Size-preserving replacement||Replacement of sensitive data with a value of equal length, e.g. XXXX|
|Replace with sensitive data type||Replacement of sensitive data with a label identifying the type of sensitive data, e.g. [EMAIL]|
Support for pseudonymization, encryption, and more is coming soon.
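The four currently supported redaction formats can be sketched on a plain string as follows. This is an illustration of the behaviors described in the table, not Granica's implementation: the `redact` function, its mode names, and the default replacement values are hypothetical, and the spans are assumed to be non-overlapping, zero-based character ranges.

```python
from typing import List, Tuple

# (offset, size, classification type) for each detected instance
Span = Tuple[int, int, str]


def redact(
    text: str,
    spans: List[Span],
    mode: str,
    fixed_value: str = "REDACTED",
    fill_char: str = "X",
) -> str:
    """Apply one redaction format to every span in the text."""
    result = text
    # Work from the last span to the first so earlier offsets stay valid
    # even when a replacement changes the length of the text.
    for offset, size, kind in sorted(spans, reverse=True):
        if mode == "redaction":
            replacement = ""                # remove without replacement
        elif mode == "replacement":
            replacement = fixed_value       # same fixed value every time
        elif mode == "size_preserving":
            replacement = fill_char * size  # equal length, e.g. XXXX
        elif mode == "type_label":
            replacement = f"[{kind}]"       # e.g. [EMAIL]
        else:
            raise ValueError(f"unknown redaction mode: {mode!r}")
        result = result[:offset] + replacement + result[offset + size :]
    return result
```

Size-preserving replacement is the only mode that keeps later offsets unchanged, which matters when fixed-width layouts or downstream offsets must remain valid in the sanitized copy.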