Personally Identifiable Information (PII)

Legal Technical

Information that identifies, relates to, or could reasonably be linked to a particular individual. GDPR refers to the same concept as “personal data”; U.S. statutes call it “PII,” “personal information,” or specific subcategories such as PHI (health information under HIPAA) and NPI (financial information under GLBA). Conventionally, PII is divided into two classes, (1) direct PII and (2) indirect PII, and the distinction has real operational consequences.

Direct PII (also called direct identifiers) is information that singles out an individual on its own. Examples: full legal name, Social Security number, taxpayer identification number, passport or driver’s license number, account number, email address, telephone number, biometric template (fingerprint, face, iris, voiceprint), full-face photograph, and any other unique identifier assigned to a single person.

Indirect PII (also called indirect identifiers or quasi-identifiers) is information that does not identify a person on its own but can do so when combined with other attributes. Examples: ZIP code, birth date, gender, race or ethnicity, employer, job title, marital status, education level, IP address, device identifier, browser fingerprint, transaction patterns. A well-known result (Sweeney, 2000) estimated that 87% of the U.S. population is likely to be uniquely identifiable from 5-digit ZIP code, full birth date, and gender alone, all of which are indirect PII.

The distinction matters because a process that strips only direct identifiers will often leave records re-identifiable from the indirect ones. Definitions also vary in scope at the edges: IP addresses and device identifiers are personal data under GDPR and CCPA but are not always treated as PII under U.S. sectoral and state laws.
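The re-identification risk from indirect identifiers can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records are invented, and the field names and the helpers `strip_direct` and `unique_fraction` are hypothetical rather than part of any standard or tool. It removes the direct identifiers from each record, then measures what share of the remaining records is still unique on the combination of ZIP code, birth date, and gender.

```python
from collections import Counter

# Invented toy records; names, SSNs, and all other values are fictional.
records = [
    {"name": "Alice Adams", "ssn": "000-00-0001", "zip": "02138",
     "birth_date": "1961-07-31", "gender": "F", "diagnosis": "asthma"},
    {"name": "Bob Brown", "ssn": "000-00-0002", "zip": "02138",
     "birth_date": "1975-03-12", "gender": "M", "diagnosis": "diabetes"},
    {"name": "Carl Clark", "ssn": "000-00-0003", "zip": "02138",
     "birth_date": "1975-03-12", "gender": "M", "diagnosis": "migraine"},
    {"name": "Dana Diaz", "ssn": "000-00-0004", "zip": "02139",
     "birth_date": "1988-11-02", "gender": "F", "diagnosis": "eczema"},
    {"name": "Evan Ellis", "ssn": "000-00-0005", "zip": "02140",
     "birth_date": "1990-01-20", "gender": "M", "diagnosis": "anemia"},
]

# Assumed classification of fields for this example only.
DIRECT_IDENTIFIERS = {"name", "ssn"}
QUASI_IDENTIFIERS = ("zip", "birth_date", "gender")


def strip_direct(record):
    """Return a copy of the record without its direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def unique_fraction(rows, quasi=QUASI_IDENTIFIERS):
    """Fraction of rows whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(row[q] for q in quasi) for row in rows]
    counts = Counter(keys)
    unique_rows = sum(1 for key in keys if counts[key] == 1)
    return unique_rows / len(rows)


stripped = [strip_direct(r) for r in records]
fraction = unique_fraction(stripped)
print(f"{fraction:.0%} of stripped records are unique on {QUASI_IDENTIFIERS}")
```

In this toy data, three of the five stripped records remain unique on the three indirect identifiers, so anyone holding an outside list that pairs names with the same attributes could link those records back to individuals. Coarsening the indirect fields (for example, keeping only the birth year or a ZIP prefix) would lower that share, at the cost of less precise data.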
