What Does Redacted Mean?
To redact a document means to remove or obscure sensitive information from it before sharing or publishing. In a redacted document, parts of the text are hidden, usually with a black bar, deletion, or replacement, to protect personal data, classified information, or material that cannot legally be released. The word comes from the Latin redactus, meaning “edited” or “brought back,” and is used today in courts, regulatory agencies, healthcare organizations, and any institution that handles confidential information.
Redacted vs. masked, deleted, anonymized
These words get used interchangeably all the time. They mean different things, and confusing them can leak data you thought you had hidden.
| Term | What it means |
|---|---|
| Redacted | The sensitive content has been removed or hidden, but the surrounding document and context remain visible. |
| Masked | A value has been hidden but partially preserved. A Social Security number shown as XXX-XX-1234 is masked, not redacted. |
| Deleted | The content is gone entirely, often with no indication that anything was removed. |
| Anonymized | Identifiers are stripped so the data cannot be linked back to an individual, while the rest of the document remains usable. See anonymization. |
| Classified | The document is restricted under government secrecy rules, but not necessarily redacted yet. |
The distinctions are not academic. A redacted FOIA response shows you the document with parts blacked out. An anonymized dataset strips identity but leaves the rest of the record usable. These are different operations with different compliance implications. The formal framework that organizes them under HIPAA and GDPR is called de-identification.
Why documents get redacted
Any document that leaves the room it was written in usually has parts that need to disappear first. Five reasons cover most cases:
- Legal discovery and litigation. Court filings frequently contain personal information (Social Security numbers, account numbers, minors’ names) that must be redacted under court rules before public filing.
- Freedom of Information Act (FOIA) requests. U.S. federal agencies must release records on request, but FOIA’s nine exemptions allow them to withhold information on grounds such as national security, personal privacy, law enforcement, and deliberative process.
- Healthcare and HIPAA compliance. Medical records shared for research, audits, or insurance review must have protected health information removed before disclosure.
- Contracts and corporate disclosure. Commercial contracts shared with investors, regulators, or the public (through SEC filings) often have pricing, customer lists, or trade secrets redacted.
- Government and intelligence documents. Classified documents released publicly, declassified for historical research, or produced under congressional inquiry typically have national-security material redacted.
Famous examples of redacted documents
Most readers have seen at least one of these:
- The Mueller Report (2019) was released with substantial redactions on grounds including grand jury material, ongoing investigations, and personal privacy.
- The Epstein documents released by federal courts in 2024 contained redactions to protect victim identities and uncharged third parties.
- The 9/11 Commission Report included pages redacted under national security classifications, with some sections later declassified and released years afterward.
- The Snowden NSA disclosures were redacted by press outlets before publication to omit operational details.
- SEC EDGAR filings routinely include redacted exhibits where companies have negotiated “confidential treatment” for commercially sensitive material.
Together the examples cover the main triggers: personal privacy, national security, protection of victims and third parties, and commercial confidentiality.
What gets redacted
The most common categories of redacted content are:
- Direct personal identifiers: Social Security numbers, taxpayer IDs, full names, account numbers, addresses, phone numbers, dates of birth
- Protected health information: Medical record numbers, diagnoses, treatment dates, billing information
- Financial information: Bank account numbers, credit card numbers, internal financial statements
- Minors’ information: Names, schools, and parental information for individuals under 18
- Trade secrets: Pricing, customer lists, technical methods, internal communications
- National security information: Classified content under Executive Order 13526
- Witness and victim identities: Especially in criminal proceedings and witness protection contexts
- Sources and methods: Intelligence sources, undercover operations, confidential informants
This list maps closely to what privacy law calls personally identifiable information (PII), the broader category that GDPR, CCPA, HIPAA, and similar statutes regulate.
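As a rough illustration of how a tool might find a few of the identifier types listed above, here is a minimal pattern-matching sketch. The patterns, the `PATTERNS` table, and the `find_pii` helper are assumptions made for this example; they cover only U.S.-style formats and are far narrower than what a production detector would use.

```python
import re

# Illustrative patterns only (assumed for this sketch). Real detectors
# handle many more formats and use surrounding context, not just shape.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "DATE": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

def find_pii(text):
    """Return (label, start, end, value) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])

sample = "Jane Doe, born 04/12/1985, SSN 123-45-6789, phone 555-867-5309."
for hit in find_pii(sample):
    print(hit)
```

Note that the name "Jane Doe" is not caught. Names, addresses, and diagnoses have no fixed shape, so shape-based matching alone misses them.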
The four ways modern tools handle sensitive content
“Redact” is one of four distinct ways an automated PII detection tool can deal with sensitive content in a document. Each has a different visual result and a different use case.
- Redact. Solid black bars permanently cover the text, and the underlying content is removed from the file. Right for documents being released to a third party, filed in court, or otherwise leaving your control. Once committed, the redacted text cannot be recovered.
- Replace. The sensitive value is swapped with a numbered placeholder like `<<PERSON1>>` or `<<SSN3>>`. A legend page maps the placeholders back to the originals so the document can be re-identified later if needed. Useful for sharing with external AI tools, vendors, or reviewers who do not need the underlying values but may need to understand relationships between entities.
- Highlight. A semi-transparent colored overlay marks every detected entity without altering the document. Nothing is hidden, nothing is removed. Useful as a QA step: review every detection, confirm or reject each one, then commit a final Redact or Replace.
- Mask. The value is replaced character by character while preserving its shape. `Jane Doe` becomes `J*** D**`. A Social Security number `123-45-6789` becomes `XXX-XX-6789`. Common in audit logs and customer service screens where a downstream system needs to see that a name or card number is present without seeing the value itself.
These four are user-facing methods, not file-level operations. Underneath, all of them involve some combination of bounding-box detection, text-layer manipulation, and byte-level export. The technical detail matters because shortcuts on the file-level export are what cause the most spectacular redaction failures.
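To make the difference between the methods concrete, here is a minimal plain-text sketch of Redact, Replace, and Mask applied to Social Security numbers. It is an illustration under stated assumptions, not how any particular tool works: the function names, the `[REDACTED]` marker, and the regex are all made up for this example, and Highlight is omitted because it does not change the content.

```python
import re

# Assumed pattern for this sketch: U.S. SSNs written as 123-45-6789.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text):
    """Remove each value, leaving a fixed marker that reveals nothing about it."""
    return SSN.sub("[REDACTED]", text)

def replace(text):
    """Swap each value for a numbered placeholder; return the text and a legend."""
    by_value = {}

    def _placeholder(match):
        value = match.group()
        if value not in by_value:
            by_value[value] = f"<<SSN{len(by_value) + 1}>>"
        return by_value[value]

    new_text = SSN.sub(_placeholder, text)
    legend = {placeholder: value for value, placeholder in by_value.items()}
    return new_text, legend

def mask(text):
    """Keep the shape and the last four digits: 123-45-6789 -> XXX-XX-6789."""
    return SSN.sub(lambda m: "XXX-XX-" + m.group()[-4:], text)

sample = "Primary SSN 123-45-6789, spouse SSN 987-65-4321, primary again 123-45-6789."
print(redact(sample))
replaced, legend = replace(sample)
print(replaced)
print(legend)
print(mask(sample))
```

In this sketch a repeated value gets the same placeholder each time, which is what lets a reviewer follow relationships between entities without seeing the values. The legend is the only thing that makes Replace reversible, so it needs the same protection as the original document.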
The mistake that’s leaked classified documents for years
Federal agencies, law firms, and major news organizations have all been embarrassed by the same bug: they “redacted” a PDF by drawing black rectangles on top of the text, then published the file. Readers selected the text underneath the rectangle, copy-pasted it into another document, and read every word.
The DOJ, the Pentagon, the NYPD, and several international law firms have all shipped this exact mistake. Once the file is out, it cannot be un-leaked.
The fix is simple in principle: the sensitive content must be removed from the file itself, not just hidden from view. Doing that reliably across thousands of pages is where the difficulty starts.
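The failure is easier to see in a toy model. The sketch below is not a real PDF parser. It assumes a hypothetical `Page` with a text layer and a list of overlay rectangles, and shows that drawing a box changes what a viewer displays but not what a text extractor returns.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    """Toy page: a text layer plus rectangles drawn over character ranges."""
    text: str
    overlays: list = field(default_factory=list)  # (start, end) ranges

    def render(self):
        """What a viewer shows: overlaid characters appear as black blocks."""
        chars = list(self.text)
        for start, end in self.overlays:
            chars[start:end] = "█" * (end - start)
        return "".join(chars)

    def extract_text(self):
        """What copy-paste or a text extractor returns: the raw text layer."""
        return self.text

def overlay_redact(page, start, end):
    """The broken approach: draw a box and leave the text layer untouched."""
    return Page(page.text, page.overlays + [(start, end)])

def true_redact(page, start, end):
    """The fix: remove the characters from the text layer itself."""
    blocks = "█" * (end - start)
    return Page(page.text[:start] + blocks + page.text[end:], list(page.overlays))

secret = "AGENT-7"
page = Page(f"Source: {secret} met the contact.")
start = page.text.index(secret)
end = start + len(secret)

bad = overlay_redact(page, start, end)
good = true_redact(page, start, end)

print(bad.render() == good.render())   # True: identical on screen
print(secret in bad.extract_text())    # True: the overlay leaks
print(secret in good.extract_text())   # False: the content is gone
```

The two outputs look the same when rendered, which is why the mistake survives visual review. Only a check against the extracted content tells them apart.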
Is redaction reversible?
Properly redacted content should not be reversible. The original text must be removed from the file, not merely hidden. But improper redaction (visual overlays, low-resolution image edits, surviving metadata) can be reversed by anyone with the right tools, sometimes years later.
Professional redaction software treats redaction as a permanent, file-level transformation. The redacted document is exported as a new file with the sensitive content stripped at the byte level. The original is kept separately if it’s needed for later reference or re-identification.
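One way to check an export before it leaves your control is to search the new file's extracted text and metadata for the values that were supposed to be removed. The sketch below assumes you already have the text layer as a string and the metadata as a dictionary from whatever extraction library you use; `find_survivors` and the sample values are hypothetical.

```python
def find_survivors(secrets, text_layer, metadata):
    """Return (secret, location) pairs for every value that survived export."""
    survivors = []
    for secret in secrets:
        needle = secret.lower()
        if needle in text_layer.lower():
            survivors.append((secret, "text layer"))
        for key, value in metadata.items():
            if needle in str(value).lower():
                survivors.append((secret, f"metadata:{key}"))
    return survivors

# Hypothetical export: the body was redacted, but the title field was not.
secrets = ["Jane Doe", "123-45-6789"]
exported_text = "Claimant [REDACTED], SSN [REDACTED], filed on 3 March."
exported_metadata = {"Title": "Claim form - Jane Doe", "Author": "Records Office"}

for secret, location in find_survivors(secrets, exported_text, exported_metadata):
    print(f"LEAK: {secret!r} still present in {location}")
```

A plain substring search like this only catches exact matches. It will not find content left in embedded images, attachments, or revision history, so treat it as one check among several.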
Why automated redaction has become standard
Manual redaction takes about 15 to 30 minutes per contract. It does not scale to a 5,000-page discovery production, a quarterly compliance audit, or a hospital that needs to share thousands of records with researchers.
Modern redaction tools use AI to scan a document for dozens of sensitive entity types (names, account numbers, medical record numbers, addresses, dates of birth) and apply consistent redactions in seconds. Accuracy is not perfect, so professional workflows pair automated detection with human review.
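A common way to pair automated detection with human review is to apply high-confidence detections automatically and queue the rest for a person. The sketch below assumes the detector returns a score between 0 and 1 for each hit. The `Detection` type, the `triage` helper, and the 0.95 threshold are illustrative choices, not values taken from any specific product.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # entity type, e.g. "SSN" or "PERSON"
    value: str         # the detected text
    confidence: float  # detector score between 0 and 1 (assumed)

def triage(detections, auto_threshold=0.95):
    """Split detections into auto-applied and needs-human-review lists."""
    auto, review = [], []
    for detection in detections:
        if detection.confidence >= auto_threshold:
            auto.append(detection)
        else:
            review.append(detection)
    return auto, review

detections = [
    Detection("SSN", "123-45-6789", 0.99),
    Detection("PERSON", "Jane Doe", 0.97),
    Detection("PERSON", "Chase", 0.61),  # a surname or a bank? needs a human
]

auto, review = triage(detections)
print("auto-redact:", [d.value for d in auto])
print("send to reviewer:", [d.value for d in review])
```

Routing by confidence only covers what the detector found. Entities it missed entirely never enter either list, so a reviewer still needs to read the pages themselves.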
The harder problem is keeping the underlying data on the redactor’s own machine. Cloud-based redaction services upload your sensitive documents to someone else’s server to process them, which is the kind of data movement many privacy regulations restrict or attach conditions to. Desktop redaction tools that run on-device avoid that transfer. The document never leaves the building.
If you handle documents that contain PII at any volume, this is the workflow worth knowing about. See the full redaction entry for the technical breakdown, or read the companion field manual on privacy regulations for AI for guidance on which laws apply to which workflows.