Word Document Anonymizer

Remove PII from Microsoft Word and LibreOffice documents

📘 DOCX (Microsoft Word)
📄 ODT (LibreOffice/OpenDocument)

LogScrub processes Word documents by extracting text, anonymizing PII, and repackaging the document with sanitized content. The original formatting, styles, and structure are preserved.

📸 Screenshot placeholder: Document preview with side-by-side original and anonymized content

How It Works

  1. Extract: The document is unpacked, and text is extracted from all paragraphs, tables, headers, and footers.
  2. Analyze: 90+ PII detection patterns scan the extracted text for sensitive data.
  3. Anonymize: Each detected PII value is replaced with a consistent placeholder.
  4. Repackage: The modified text is placed back into the original document structure.
  5. Download: You get a new document with the PII removed and the formatting intact.
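
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not LogScrub's actual implementation: it assumes a DOCX file is a ZIP archive whose visible text sits in `<w:t>` elements inside `word/document.xml` and the header/footer parts, that those parts are UTF-8 encoded and use the conventional `w:` prefix, and it detects only email addresses. The function name `scrub_docx` and the patterns are hypothetical.

```python
import io
import re
import zipfile

# Assumption: text-bearing parts follow the usual DOCX layout.
TEXT_PARTS = re.compile(r"word/(?:document|header\d*|footer\d*)\.xml")
# Assumption: the WordprocessingML namespace uses the conventional "w" prefix.
W_T = re.compile(r"(<w:t(?:\s[^>]*)?>)([^<]*)(</w:t>)")
# A deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def scrub_docx(data: bytes) -> bytes:
    """Return a copy of a DOCX (as bytes) with email addresses replaced."""
    seen = {}  # original value -> placeholder

    def placeholder(match):
        value = match.group(0)
        if value not in seen:
            seen[value] = f"[EMAIL-{len(seen) + 1}]"
        return seen[value]

    def scrub_run(match):
        # Rewrite only the text between <w:t> and </w:t>; keep the tags as-is.
        opening, text, closing = match.groups()
        return opening + EMAIL.sub(placeholder, text) + closing

    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            payload = src.read(item.filename)
            if TEXT_PARTS.fullmatch(item.filename):
                xml = payload.decode("utf-8")
                payload = W_T.sub(scrub_run, xml).encode("utf-8")
            # Every other part (styles, images, relationships) is copied unchanged.
            dst.writestr(item, payload)
    return out.getvalue()
```

Because only the text inside each `<w:t>` element is rewritten, the surrounding markup that carries formatting is left untouched. One limitation of this sketch: Word can split a single value across several text runs, and a value split that way would not be matched here.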

Before & After Examples

Employment Contract

Legal Document Text
Before
EMPLOYMENT AGREEMENT

This agreement is entered into between
ACME Corporation and John Michael Smith
(SSN: 123-45-6789).

Employee Contact Information:
Email: john.smith@personal-email.com
Phone: (555) 123-4567
Address: 742 Evergreen Terrace
         Springfield, IL 62701
After
EMPLOYMENT AGREEMENT

This agreement is entered into between
ACME Corporation and [NAME-1]
(SSN: [SSN-1]).

Employee Contact Information:
Email: [EMAIL-1]
Phone: [PHONE-1]
Address: [ADDRESS-1]
         [CITY-1], [STATE-1] [ZIP-1]

Meeting Notes

Internal Document
Before
Meeting Notes - January 15, 2024
Attendees: Sarah Johnson, Mike Chen

Action Items:
• Sarah to email client at
  client@bigcorp.com by Friday
• Mike to call vendor at 415-555-0199
• Review server logs from 192.168.1.50
After
Meeting Notes - January 15, 2024
Attendees: [NAME-1], [NAME-2]

Action Items:
• [NAME-1] to email client at
  [EMAIL-1] by Friday
• [NAME-2] to call vendor at [PHONE-1]
• Review server logs from [IP-1]

What Gets Preserved

✓ Preserved

✓ Anonymized

Document Metadata

Word documents often contain hidden metadata that can include sensitive information.

LogScrub focuses on visible text content. For complete metadata removal, consider using your word processor's built-in "Inspect Document" feature alongside LogScrub.
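
To see what metadata a DOCX file carries before you share it, you can inspect it yourself. The sketch below is not part of LogScrub. It assumes the standard DOCX layout, where core properties such as the author are stored in `docProps/core.xml`; the helper name `read_core_properties` and the choice of fields are illustrative.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespaces used by the core properties part of an Office Open XML package.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
}
# Illustrative selection of fields that commonly hold personal names.
FIELDS = {"creator": "dc:creator", "last_modified_by": "cp:lastModifiedBy"}


def read_core_properties(data: bytes) -> dict:
    """Return selected core properties from a DOCX given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for name, path in FIELDS.items():
        node = root.find(path, NS)
        if node is not None and node.text:
            found[name] = node.text
    return found
```

If this returns a name you would not want to disclose, run the word processor's document inspection feature before distributing the file.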

Tables and Structured Content

LogScrub processes text within tables, maintaining cell structure:

Table Content
Before
| Name        | Email             | Phone        |
|-------------|-------------------|--------------|
| Alice Brown | alice@corp.com    | 555-111-2222 |
| Bob Green   | bob.g@company.net | 555-333-4444 |
After
| Name     | Email      | Phone      |
|----------|------------|------------|
| [NAME-1] | [EMAIL-1]  | [PHONE-1]  |
| [NAME-2] | [EMAIL-2]  | [PHONE-2]  |

Features

Live Preview

See the original document rendered alongside the extracted/anonymized text before downloading.

Consistency Mode

Identical values receive the same placeholder throughout the document. If "John Smith" appears in both the header and the body, both occurrences become [NAME-1].
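
One way to get this behavior is to keep a lookup table from each detected value to its placeholder and reuse it across every part of the document. The sketch below illustrates the idea under stated assumptions: the `Anonymizer` class and the two simplified patterns are hypothetical, and name detection (which needs more than a regex) is left out.

```python
import re
from collections import defaultdict

# Simplified illustrative patterns, not LogScrub's detection rules.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\(?\b\d{3}\)?[ -]?\d{3}-\d{4}\b"),
}


class Anonymizer:
    """Replace detected values, reusing one placeholder per distinct value."""

    def __init__(self, patterns):
        self.patterns = patterns
        self.mapping = {}                  # (label, value) -> placeholder
        self.counters = defaultdict(int)   # label -> placeholders issued so far

    def _placeholder(self, label, value):
        key = (label, value)
        if key not in self.mapping:
            self.counters[label] += 1
            self.mapping[key] = f"[{label}-{self.counters[label]}]"
        return self.mapping[key]

    def scrub(self, text):
        for label, pattern in self.patterns.items():
            text = pattern.sub(
                lambda m, label=label: self._placeholder(label, m.group(0)), text
            )
        return text


anon = Anonymizer(PATTERNS)
print(anon.scrub("Contact: client@bigcorp.com"))
# prints: Contact: [EMAIL-1]
print(anon.scrub("Email client@bigcorp.com or vendor@example.org"))
# prints: Email [EMAIL-1] or [EMAIL-2]
```

Because the same `Anonymizer` instance handles the header and the body, a value seen in one is mapped to the same placeholder in the other.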

Custom Patterns

Add your own regex patterns for company-specific identifiers, project codes, or internal reference numbers.
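
As an illustration of what such patterns might look like, the sketch below compiles a set of user-supplied expressions, rejects invalid ones, and replaces each match with a numbered placeholder. The function `scrub_custom`, the labels, and the identifier formats are all hypothetical and do not reflect LogScrub's configuration format.

```python
import re


def scrub_custom(text, raw_patterns):
    """Replace matches of user-supplied patterns with [LABEL-n] placeholders."""
    seen = {}  # (label, matched value) -> placeholder
    for label, expr in raw_patterns.items():
        try:
            pattern = re.compile(expr)
        except re.error as exc:
            # Surface a bad expression early instead of silently skipping it.
            raise ValueError(f"invalid pattern for {label}: {exc}") from exc

        def replace(match, label=label):
            key = (label, match.group(0))
            if key not in seen:
                count = sum(1 for known, _ in seen if known == label)
                seen[key] = f"[{label}-{count + 1}]"
            return seen[key]

        text = pattern.sub(replace, text)
    return text


custom = {
    "PROJECT": r"\bPRJ-\d{4}-\d{4}\b",  # hypothetical project code format
    "EMPLOYEE": r"\bEMP\d{5}\b",        # hypothetical employee ID format
}
print(scrub_custom("PRJ-2024-0042 is owned by EMP00123; see PRJ-2024-0042.", custom))
# prints: [PROJECT-1] is owned by [EMPLOYEE-1]; see [PROJECT-1].
```

Anchoring each expression with `\b` keeps a pattern from matching inside a longer identifier, which is a common source of false positives with short codes.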

Supported Formats

Note: Legacy .doc format is not supported. Please save as .docx first.

Ready to anonymize your documents?

Drop your DOCX or ODT file into LogScrub to get started.
