CLI for Fidescls
Fidescls: PII Detection and Classification
A part of the greater Fides ecosystem.
:zap: Overview
Fidescls (/fee-dhez classify/, from Fidēs, Latin for trust and reliability) is an open-source, extensible machine learning classification engine. Fidescls uses the Fides toolset (Fidesctl, Fidesops, and Fideslang) to help detect and label potential sources of personally identifiable information (PII) in your records and databases.
:rocket: Quick Start
Requirements
- Docker 12+
- Python 3.8+
- Make
Getting Started
- Ensure that the required tools are installed and Docker is running, and clone this repository.
- From the project's root directory, run the following command:
make api
This starts an instance of the API server so that you can begin making requests.
- Make a POST request to the classify endpoint:
localhost:8765/text/classify
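Before sending the first request, it can help to confirm that the server started by make api is accepting connections. The following is a minimal sketch using only the Python standard library; the helper name api_is_listening is hypothetical, and the host and port are taken from the endpoint above.

```python
import socket


def api_is_listening(host: str = "localhost", port: int = 8765, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_listening())
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the classify endpoint itself is healthy.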
Sample Payload - Content Classification
{
"content": {
"data": [
"sample@aol.com",
"(555) 555-5555",
"4242-4242-4242-4242"
],
"method_params": {
"decision_method": "pass-through"
}
}
}
| field | description |
|---|---|
| data | A string, or list of strings, representing the data to be processed. |
| decision_method | A value of pass-through returns the higher-level PII classifications to which your data belongs. |
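The content classification payload above could be sent from Python roughly as follows. This is a sketch under stated assumptions, not the project's own client: the http:// scheme and the JSON Content-Type header are assumptions, and the function names build_content_request and classify are hypothetical. Only the standard library is used.

```python
import json
import urllib.request

# Endpoint from the Getting Started steps; the http:// scheme is an assumption.
CLASSIFY_URL = "http://localhost:8765/text/classify"


def build_content_request(data, decision_method="pass-through", url=CLASSIFY_URL):
    """Build (but do not send) a POST request for content classification."""
    payload = {
        "content": {
            "data": data,
            "method_params": {"decision_method": decision_method},
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(request, timeout=10.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_content_request(
    ["sample@aol.com", "(555) 555-5555", "4242-4242-4242-4242"]
)
# With the API server running, send it with: result = classify(request)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.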
Successful Response:
{
"content": [
{
"input": "sample@aol.com",
"labels": [
{
"label": "EMAIL_ADDRESS",
"score": 1.0,
"position_start": 0,
"position_end": 14
},
{
"label": "DOMAIN_NAME",
"score": 1.0,
"position_start": 7,
"position_end": 14
}
]
},
{
"input": "(555) 555-5555",
"labels": [
{
"label": "PHONE_NUMBER",
"score": 0.4,
"position_start": 0,
"position_end": 14
}
]
},
{
"input": "4242-4242-4242-4242",
"labels": [
{
"label": "CREDIT_CARD",
"score": 1.0,
"position_start": 0,
"position_end": 19
}
]
}
]
}
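One way to consume a response of this shape is to slice each input with the reported positions and drop low-scoring labels. The sketch below assumes that position_end is exclusive, as the sample suggests (characters 7 to 14 of "sample@aol.com" give "aol.com"). The helper name matched_spans and the 0.5 threshold are illustrative choices, not part of Fidescls.

```python
def matched_spans(response, min_score=0.0):
    """Yield (input, label, matched_text, score) for labels at or above min_score."""
    for item in response["content"]:
        text = item["input"]
        for label in item["labels"]:
            if label["score"] < min_score:
                continue
            # Assumption: position_end is exclusive, like a Python slice.
            start, end = label["position_start"], label["position_end"]
            yield text, label["label"], text[start:end], label["score"]


# Excerpt of the sample response above.
sample = {
    "content": [
        {
            "input": "sample@aol.com",
            "labels": [
                {"label": "EMAIL_ADDRESS", "score": 1.0,
                 "position_start": 0, "position_end": 14},
                {"label": "DOMAIN_NAME", "score": 1.0,
                 "position_start": 7, "position_end": 14},
            ],
        },
        {
            "input": "(555) 555-5555",
            "labels": [
                {"label": "PHONE_NUMBER", "score": 0.4,
                 "position_start": 0, "position_end": 14},
            ],
        },
    ]
}

for row in matched_spans(sample, min_score=0.5):
    print(row)
```

With a threshold of 0.5, the PHONE_NUMBER label (score 0.4) is filtered out, so an appropriate cutoff depends on how many false positives you can tolerate.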
Sample Payload - Context Classification
{
"context": {
"data": [
"email_address",
"phone_num",
"credit_card"
],
"method": "similarity",
"method_params": {
"possible_targets": [
"user.derived.identifiable.device.ip_address",
"user.provided.identifiable.financial.account_number",
"user.provided.identifiable.contact.email",
"user.provided.identifiable.contact.phone_number",
"account.contact.street",
"account.contact.city",
"account.contact.state",
"account.contact.country",
"account.contact.postal_code"
],
"top_n": 2
}
}
}
| field | description |
|---|---|
| data | A string, or list of strings, representing the data to be processed. |
| possible_targets | A list of potential Data Categories to classify your data into. |
| top_n | The number of closest results to return. |
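A context classification payload can be assembled programmatically from the fields described above. The sketch below is illustrative only: build_context_payload is a hypothetical helper, and the top_n check is a client-side sanity check of my own rather than documented API behavior.

```python
import json


def build_context_payload(data, possible_targets, top_n=2, method="similarity"):
    """Assemble a context classification payload as a plain dict."""
    if top_n < 1:
        # Client-side sanity check (an assumption, not a documented API rule).
        raise ValueError("top_n must be at least 1")
    return {
        "context": {
            "data": data,  # a string or a list of strings
            "method": method,
            "method_params": {
                "possible_targets": list(possible_targets),
                "top_n": top_n,
            },
        }
    }


payload = build_context_payload(
    ["email_address", "phone_num", "credit_card"],
    [
        "user.provided.identifiable.contact.email",
        "user.provided.identifiable.contact.phone_number",
        "user.provided.identifiable.financial.account_number",
        "account.contact.postal_code",
    ],
    top_n=2,
)
print(json.dumps(payload, indent=2))
```

The resulting dict can be serialized with json.dumps and posted to the same classify endpoint as the content payload.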
Successful Response:
{
"context": [
{
"input": "email_address",
"labels": [
{
"label": "user.provided.identifiable.contact.email",
"score": 0.791374585498101,
"position_start": null,
"position_end": null
},
{
"label": "account.contact.postal_code",
"score": 0.7402522077965934,
"position_start": null,
"position_end": null
}
]
},
{
"input": "phone_num",
"labels": [
{
"label": "user.provided.identifiable.contact.phone_number",
"score": 0.5770164988785474,
"position_start": null,
"position_end": null
},
{
"label": "account.contact.postal_code",
"score": 0.44817613132976103,
"position_start": null,
"position_end": null
}
]
},
{
"input": "credit_card",
"labels": [
{
"label": "user.provided.identifiable.financial.account_number",
"score": 0.5742921242220389,
"position_start": null,
"position_end": null
},
{
"label": "account.contact.postal_code",
"score": 0.5587338672966902,
"position_start": null,
"position_end": null
}
]
}
]
}
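Because similarity scores for different targets can be close together, it may be useful to look at the gap between the best label and the runner-up for each input. The following sketch does that for a response of the shape shown above; the helper name best_labels is hypothetical, and treating a small gap as a signal for manual review is a suggestion, not Fidescls behavior.

```python
def best_labels(response):
    """Map each input to (best_label, best_score, gap_to_runner_up)."""
    best = {}
    for item in response["context"]:
        ranked = sorted(item["labels"], key=lambda label: label["score"], reverse=True)
        if not ranked:
            continue
        # With a single label there is no runner-up, so the gap is the full score.
        runner_up = ranked[1]["score"] if len(ranked) > 1 else 0.0
        top = ranked[0]
        best[item["input"]] = (top["label"], top["score"], top["score"] - runner_up)
    return best


# Excerpt of the sample response above (position fields omitted for brevity).
sample = {
    "context": [
        {
            "input": "email_address",
            "labels": [
                {"label": "user.provided.identifiable.contact.email",
                 "score": 0.791374585498101},
                {"label": "account.contact.postal_code",
                 "score": 0.7402522077965934},
            ],
        },
        {
            "input": "credit_card",
            "labels": [
                {"label": "user.provided.identifiable.financial.account_number",
                 "score": 0.5742921242220389},
                {"label": "account.contact.postal_code",
                 "score": 0.5587338672966902},
            ],
        },
    ]
}

for name, (label, score, gap) in best_labels(sample).items():
    print(f"{name}: {label} (score={score:.3f}, gap={gap:.3f})")
```

In the sample, credit_card is matched to the financial account number category by a margin of under 0.02, which is the kind of result a reviewer may want to double-check.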
To learn more about the difference between Context and Content Classification, see the Classifiers Guide.
You've now successfully begun classifying PII!
:book: Learn More
The Fides core team is committed to providing a variety of documentation to help you get started using Fidescls. All interactions within the Fides community are governed by the Fides Code of Conduct.
Documentation
For more information on getting started with Fidescls and the Fides ecosystem of open source projects, check out our documentation:
- Documentation: https://ethyca.github.io/fidescls/
- Guides: https://ethyca.github.io/fidescls/guides/classifiers/
Support
Join the conversation on Slack and Twitter!
:balance_scale: License
The Fides ecosystem of tools (Fidescls, Fidesops, and Fidesctl) is licensed under the Apache Software License Version 2.0. Fides tools are built on Fideslang, the Fides language specification, which is licensed under CC BY 4.0.
Fides is created and sponsored by Ethyca: a developer tools company building the trust infrastructure of the internet. If you have questions or need assistance getting started, let us know at fides@ethyca.com!