Features

Stop redacting documents by hand. Let AI do the heavy lifting.

PII Anomalyzer uses dual AI models to detect PII, paired with a manual draw-to-redact toolbar — so you catch everything, miss nothing, and never upload a file to do it.

Detection

Context-aware AI, not just regex

PII Anomalyzer runs two independent AI models on every scan and combines their results. The dual-model approach catches what single-model tools miss, and because the models read surrounding context rather than matching fixed patterns, "Jordan" is recognized as a person in one sentence and a country in the next.
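
How the two result sets are combined is not described here. As a rough illustration only, one plausible approach is to take the union of both models' spans and, where two spans overlap, keep the one with the higher confidence. The `Span` type and `merge_detections` function below are hypothetical names for this sketch, not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One detection: character offsets (end exclusive), label, confidence."""
    start: int
    end: int
    label: str
    score: float


def merge_detections(model_a: list[Span], model_b: list[Span]) -> list[Span]:
    """Hypothetical merge rule (an assumption, not the product's algorithm):
    union both lists; when two spans overlap, keep the higher-scoring one."""
    merged: list[Span] = []
    # Sort by start offset; for equal starts, the higher score comes first.
    for span in sorted(model_a + model_b, key=lambda s: (s.start, -s.score)):
        if merged and span.start < merged[-1].end:
            # Overlaps the last kept span: keep whichever scored higher.
            if span.score > merged[-1].score:
                merged[-1] = span
            continue
        merged.append(span)
    return merged


text = "Jordan flew to Jordan."
model_a = [Span(0, 6, "PERSON", 0.9)]
model_b = [Span(0, 6, "LOCATION", 0.6), Span(15, 21, "LOCATION", 0.8)]
for span in merge_detections(model_a, model_b):
    print(text[span.start:span.end], span.label)
# Jordan PERSON
# Jordan LOCATION
```

In this toy input the first model is more confident about the first "Jordan", while only the second model finds the second one, so the merged result contains both.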

  • 54+ entity types detected automatically
  • Dual-model detection for higher accuracy
  • Pattern recognizers for US, UK, AU, and NZ postal formats
  • Surface form propagation — detect once, match everywhere
  • Form field PII detection for fillable PDFs
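
"Detect once, match everywhere" can be pictured as follows: once a string has been detected as PII in one place, every other whole-word occurrence of the same string is tagged with the same entity type. This is a minimal sketch under that assumption; `propagate_surface_forms` is a hypothetical helper, and the product's actual matching rules may differ.

```python
import re


def propagate_surface_forms(text: str, detections: list[tuple[int, int, str]]):
    """Hypothetical sketch: detections are (start, end, label) tuples.
    Every other whole-word occurrence of a detected string gets the same label."""
    spans = set(detections)
    for start, end, label in detections:
        form = text[start:end]
        # Lookarounds stop "Jane Doe" from matching inside "Jane Doering".
        pattern = re.compile(r"(?<!\w)" + re.escape(form) + r"(?!\w)")
        for match in pattern.finditer(text):
            spans.add((match.start(), match.end(), label))
    return sorted(spans)


text = "Jane Doe signed. Later, Jane Doe left with Jane Doering."
found = propagate_surface_forms(text, [(0, 8, "PERSON")])
print([text[s:e] for s, e, _ in found])  # ['Jane Doe', 'Jane Doe']
```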

Entity Categories

Identity & Personal

PERSON, EMAIL, PHONE, ADDRESS, URL, IP_ADDRESS, USERNAME...

Government & Legal IDs

SSN, DRIVER_LICENSE, PASSPORT, NATIONAL_ID, VISA_NUMBER...

Financial

CREDIT_CARD, BANK_NUMBER, IBAN, CURRENCY, TAX_ID, CRYPTO...

Health & Insurance

HEALTH_INSURANCE_ID, MEDICAL_CONDITION, MEDICATION, NHS...

Travel & Logistics

RESERVATION, FLIGHT_NUMBER, TRAIN_TICKET, SERIAL_NUMBER...

Supported Formats

PDF

Native text, forms, scanned

DOCX

Word documents

XLSX

Excel spreadsheets

XLS / XLSM / XLSB

Legacy & macro-enabled

Plain Text

Paste or load directly

Scanned Docs

Built-in OCR (multiple engines)

Document Mode

Import any document, see results side by side

Drop in a PDF, Word document, or Excel spreadsheet. PII Anomalyzer auto-detects the document type — native text, fillable form, or scanned image — and applies the right processing pipeline. A side-by-side viewer shows the original next to the de-identified version with native vector PDF rendering.

  • Auto-detects PDF type: native, form, or scanned
  • Non-PDF formats converted automatically
  • Side-by-side original vs. de-identified view
  • Coordinate-aware text processing
  • Built-in OCR for scanned documents

De-identification

Four ways to protect sensitive data

Choose the right method for your workflow — from permanent removal to non-destructive annotation.

Redact

Black boxes permanently cover PII. The underlying text is removed from the PDF layer — gone for good.

Destructive

Replace

Numbered identifiers like <<PERSON1>> replace the original text. A legend page is appended for reference.

Destructive
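
To make the Replace method concrete, here is a small sketch of how numbered identifiers and a legend could be built from a list of detected spans. It assumes that repeated mentions of the same text share one identifier; `replace_with_identifiers` is a hypothetical name, and the real numbering scheme may differ.

```python
def replace_with_identifiers(text: str, spans: list[tuple[int, int, str]]):
    """Hypothetical sketch: spans are non-overlapping (start, end, label).
    Returns the rewritten text and a legend mapping identifier -> original."""
    identifiers: dict[tuple[str, str], str] = {}  # (label, original) -> <<LABELn>>
    counters: dict[str, int] = {}                 # label -> last number used
    pieces: list[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        original = text[start:end]
        key = (label, original)
        if key not in identifiers:
            counters[label] = counters.get(label, 0) + 1
            identifiers[key] = f"<<{label}{counters[label]}>>"
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(identifiers[key])
        cursor = end
    pieces.append(text[cursor:])
    legend = {ident: original for (_, original), ident in identifiers.items()}
    return "".join(pieces), legend


text = "Jane Doe met John Roe. Jane Doe left."
spans = [(0, 8, "PERSON"), (13, 21, "PERSON"), (23, 31, "PERSON")]
new_text, legend = replace_with_identifiers(text, spans)
print(new_text)  # <<PERSON1>> met <<PERSON2>>. <<PERSON1>> left.
print(legend)    # {'<<PERSON1>>': 'Jane Doe', '<<PERSON2>>': 'John Roe'}
```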

Highlight

Semi-transparent overlays in entity-specific colors mark detected PII for review — non-destructive.

Non-destructive

Mask

Character-level replacement preserving text length. "Jane Doe" becomes "J*** D**" — readable shape, hidden content.

Destructive
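
The "Jane Doe" example can be reproduced with a few lines of code. This sketch assumes the rule is "keep the first character of each word, star the rest, leave spaces and punctuation alone"; `mask_value` is a hypothetical helper, and the product may treat digits or punctuation differently.

```python
import re


def mask_value(value: str, mask_char: str = "*") -> str:
    """Hypothetical sketch: keep each word's first character and replace
    the remaining characters, so the overall length is unchanged."""
    def mask_word(match: re.Match) -> str:
        word = match.group(0)
        return word[0] + mask_char * (len(word) - 1)

    return re.sub(r"\w+", mask_word, value)


print(mask_value("Jane Doe"))  # J*** D**
```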

Draw-to-Redact

Manual precision for what AI misses

Handwritten text, logos, signatures, and visual PII in images — some content needs a human eye. Draw redaction rectangles directly on the PDF, resize with drag handles, and nudge with arrow keys for pixel-perfect placement.

  • Select, Draw, Delete, and Clear All toolbar modes
  • Click-and-drag to create redaction boxes on any page
  • Resize by dragging edges, nudge with arrow keys
  • Always renders as solid black — explicit "remove this" intent
  • Persists between de-identification runs for iterative refinement
  • Zoom-independent — coordinates stored in PDF points
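
Storing coordinates in PDF points (1 point = 1/72 inch) is what makes a drawn box independent of zoom: the on-screen rectangle is divided by the current pixels-per-point scale before it is saved. The sketch below illustrates the arithmetic only. `Rect`, `screen_to_points`, and the 96 DPI default are assumptions for the example, and it ignores details such as the PDF y-axis running bottom-up.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # PDF user space unit: 1 point = 1/72 inch


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float


def pixels_per_point(zoom: float, dpi: float = 96.0) -> float:
    """Screen pixels that one PDF point occupies at the given zoom (assumed model)."""
    return zoom * dpi / POINTS_PER_INCH


def screen_to_points(rect: Rect, zoom: float, dpi: float = 96.0) -> Rect:
    """Convert a rectangle drawn in screen pixels to PDF points for storage."""
    scale = pixels_per_point(zoom, dpi)
    return Rect(rect.x0 / scale, rect.y0 / scale, rect.x1 / scale, rect.y1 / scale)


def points_to_screen(rect: Rect, zoom: float, dpi: float = 96.0) -> Rect:
    """Convert a stored rectangle back to screen pixels at the current zoom."""
    scale = pixels_per_point(zoom, dpi)
    return Rect(rect.x0 * scale, rect.y0 * scale, rect.x1 * scale, rect.y1 * scale)


# The same region drawn at 100% and at 200% zoom stores (almost exactly)
# the same rectangle in points, up to floating-point rounding.
at_100 = screen_to_points(Rect(96, 192, 288, 384), zoom=1.0)
at_200 = screen_to_points(Rect(192, 384, 576, 768), zoom=2.0)
print(at_100, at_200)
```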

Toolbar

Select

Click to select, drag to move

Draw

Click and drag to create redaction boxes

Delete

Remove selected rectangle

Clear All

Remove all rectangles

Batch Processing

Process multiple documents at once

Queue multiple documents for processing in a single workflow. Each document gets its own de-identified output, modified PDF, and results table, and the results are combined into a single exportable dataset.

Re-identification & Export

De-identify, share safely, translate back

Export a de-identified document using the Replace method, share it with AI tools or external reviewers, then use re-identification mode to translate their responses back to the original context. You can also use re-identification mode to verify redactions before final release. Export de-identified text, results tables (XLSX), and modified PDFs.

.txt De-identified text
.xlsx Results table with entity type, confidence, method
.pdf Modified PDF with redactions applied
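
Conceptually, translating a response back is a lookup: each identifier such as <<PERSON1>> in the returned text is swapped for the original value recorded in the legend. The sketch below shows that idea under the assumption that the legend is available as a simple mapping; `reidentify` is a hypothetical function, not the product's interface.

```python
import re


def reidentify(text: str, legend: dict[str, str]) -> str:
    """Hypothetical sketch: replace identifiers like <<PERSON1>> with the
    originals from the legend; identifiers not in the legend are left as-is."""
    return re.sub(
        r"<<[A-Z_]+\d+>>",
        lambda match: legend.get(match.group(0), match.group(0)),
        text,
    )


legend = {"<<PERSON1>>": "Jane Doe", "<<EMAIL1>>": "jane@example.com"}
reply = "Please contact <<PERSON1>> at <<EMAIL1>> about <<PERSON2>>."
print(reidentify(reply, legend))
# Please contact Jane Doe at jane@example.com about <<PERSON2>>.
```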

Privacy by Design

100% offline. Zero exceptions.

Every AI model, every algorithm, every computation runs on your machine. No API calls, no telemetry, no analytics, no data transmission of any kind. There is no server to breach because there is no server.

No internet required

Works completely offline after installation

All models bundled

NLP and OCR engines ship with the app — nothing to download separately

No telemetry

Zero analytics, zero tracking, zero data collection

See it in action

Download the free trial and start detecting PII in your documents today.

7-day free trial · $249/year · Windows & macOS