LogScrub Features
Free, browser-based PII removal and data anonymization for multiple file formats
LogScrub helps you remove personally identifiable information (PII) and sensitive data from files before sharing them. Everything runs in your browser — no data is ever uploaded to any server.
Supported File Formats
PCAP Files (Network)
Anonymize network captures: IP addresses, MAC addresses, DNS queries, HTTP headers, TLS SNI, and more.
Text & Log Files
Scrub log files with 95+ built-in patterns for IPs, emails, phone numbers, API keys, and custom patterns.
JSON Files
Context-aware detection of secrets in JSON logs. Finds sensitive values by key names like "password" or "token".
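As a rough illustration of key-name detection, the sketch below walks a parsed JSON value and replaces anything stored under a sensitive-looking key. The key fragments, the `[REDACTED]` placeholder, and the `scrub_json` name are assumptions made for this example, not LogScrub's actual rules.

```python
import json

# Hypothetical key fragments; LogScrub's real key list is not documented here.
SENSITIVE_KEY_PARTS = ("password", "token", "secret", "api_key")


def scrub_json(value, placeholder="[REDACTED]"):
    """Return a copy of `value` with values under sensitive-looking keys replaced."""
    if isinstance(value, dict):
        scrubbed = {}
        for key, item in value.items():
            if any(part in key.lower() for part in SENSITIVE_KEY_PARTS):
                # The key name alone marks the value as sensitive.
                scrubbed[key] = placeholder
            else:
                scrubbed[key] = scrub_json(item, placeholder)
        return scrubbed
    if isinstance(value, list):
        return [scrub_json(item, placeholder) for item in value]
    return value  # numbers, booleans, null, and non-sensitive strings pass through


record = json.loads('{"user": "alice", "auth": {"access_token": "abc123", "ttl": 60}}')
print(json.dumps(scrub_json(record)))
```

Matching on the key rather than the value catches secrets that have no recognizable shape, such as short passwords, which a value-only regex would miss.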
CSV Files
Sanitize tabular data with column-aware processing. Perfect for database exports and analytics data.
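Column-aware processing can be pictured as follows: instead of scanning every cell with every pattern, whole columns are replaced by name. This is a minimal sketch using Python's standard `csv` module; the function name, the column choice, and the placeholder are illustrative assumptions.

```python
import csv
import io


def scrub_csv(text, sensitive_columns, placeholder="[REDACTED]"):
    """Replace every value in the named columns and leave other columns untouched."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in sensitive_columns:
            if column in row:
                row[column] = placeholder
        writer.writerow(row)
    return out.getvalue()


sample = "id,email,plan\n1,alice@example.com,pro\n2,bob@example.com,free\n"
print(scrub_csv(sample, {"email"}))
```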
SQL Dumps (Database)
Anonymize PostgreSQL, MySQL, and SQLite dumps. Preserves SQL structure while scrubbing INSERT values.
Email Headers (Analysis)
Visualize mail routing paths, parse spam reports, check TLS encryption, and anonymize sensitive headers.
PDF Documents
Extract and anonymize text from PDF files while preserving document structure.
Word Documents (DOCX/ODT)
Scrub Microsoft Word and LibreOffice documents. Removes PII from document text and metadata.
Spreadsheets (XLSX/ODS)
Anonymize Excel and LibreOffice Calc spreadsheets. Process multiple sheets with cell-level precision.
GPS & Fitness (GPX/FIT)
Anonymize GPS tracks from fitness apps and devices. Strip heart rate, timestamps, and other personal data.
Pattern Matching Rules
LogScrub uses a Rust/WebAssembly engine with 95+ built-in regex patterns to detect sensitive data. All patterns are validated and optimized for accuracy and performance.
Detection Categories
Contact Information
Email addresses, phone numbers (US, UK, International), and usernames.
Network Identifiers
IPv4, IPv6, MAC addresses, hostnames, URLs, and domain names.
Financial Data
Credit cards (with Luhn validation), IBANs, UK sort codes, bank accounts, and cryptocurrency addresses.
Identity Documents
US Social Security numbers (SSN), UK NHS numbers, UK National Insurance numbers, passports, and international IDs (AU, IN, SG, ES).
API Keys & Secrets
AWS, Stripe, GitHub, OpenAI, Anthropic, Slack tokens, JWTs, and generic secrets.
Dates & Times
ISO dates, US/UK formats, timestamps, and Unix epochs.
Validation
Many patterns include checksum validation to reduce false positives:
- Credit Cards - Luhn algorithm validation
- IBANs - MOD-97 checksum
- UK NHS Numbers - NHS checksum algorithm
- UK National Insurance - Format and prefix validation
- Australian TFN - Check digit validation
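To show why checksum validation cuts false positives, here is a sketch of three of the checks listed above, written from the public definitions of each algorithm. The function names are illustrative, and this is not LogScrub's Rust implementation.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check for payment card numbers."""
    compact = number.replace(" ", "").replace("-", "")
    if len(compact) < 2 or not (compact.isascii() and compact.isdigit()):
        return False
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for index, ch in enumerate(reversed(compact)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """MOD-97 check: move the first four characters to the end, map letters
    to 10..35, and require the resulting integer to leave remainder 1."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    rearranged = compact[4:] + compact[:4]
    as_digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(as_digits) % 97 == 1


def nhs_valid(number: str) -> bool:
    """NHS number check: weight the first nine digits 10..2, then mod 11."""
    compact = number.replace(" ", "")
    if len(compact) != 10 or not (compact.isascii() and compact.isdigit()):
        return False
    total = sum(int(d) * w for d, w in zip(compact[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check value of 10 means the number can never be valid.
    return check != 10 and check == int(compact[9])


print(luhn_valid("4111 1111 1111 1111"))          # True
print(iban_valid("GB82 WEST 1234 5698 7654 32"))  # True
print(nhs_valid("943 476 5919"))                  # True
```

A 16-digit order number matches a card-number regex, but only about one in ten random digit strings passes Luhn, so running the checksum after the regex match discards most such lookalikes.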
Custom Rules
Add your own detection rules with full regex support:
- Custom Regex - Define patterns with standard regex syntax
- Plain Text - Match exact strings (useful for specific identifiers)
- Presets - Save and load rule configurations
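The three rule options above could be modeled roughly as shown below. The `Rule` class, the bracketed placeholder format, and the JSON preset layout are assumptions made for this sketch; LogScrub's own rule format is not specified here.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class Rule:
    name: str
    pattern: str
    is_regex: bool = True  # False means the pattern is matched as exact text

    def compile(self):
        # Plain-text rules are escaped so characters like "." and "(" match literally.
        source = self.pattern if self.is_regex else re.escape(self.pattern)
        return re.compile(source)


def apply_rules(text, rules):
    """Replace every match of every rule with a [NAME] placeholder."""
    for rule in rules:
        replacement = f"[{rule.name}]"
        text = rule.compile().sub(lambda _match: replacement, text)
    return text


def save_preset(rules):
    """Serialize rules to a JSON string (hypothetical preset format)."""
    return json.dumps([asdict(rule) for rule in rules])


def load_preset(data):
    return [Rule(**item) for item in json.loads(data)]


rules = [
    Rule("TICKET", r"\bTCK-\d{6}\b"),                         # custom regex
    Rule("HOST", "db01.internal (primary)", is_regex=False),  # plain text
]
print(apply_rules("TCK-004211 failed on db01.internal (primary)", rules))
```

Escaping plain-text rules matters: without it, the `.` in `db01.internal` would match any character and the parentheses would be read as a regex group.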
ML Name Detection
Beyond pattern matching, LogScrub offers optional machine learning detection using a pre-trained Named Entity Recognition (NER) model that runs entirely in your browser.
All ML processing happens locally. No data is sent to any server.
- Detect names that aren't in email or username formats.
- Identify cities, countries, and place names.
- Find company and organization names.
Technology Stack
- Library: Transformers.js by Hugging Face
- Model: BERT-based Named Entity Recognition (NER)
- Runtime: ONNX format executed via WebAssembly
- Caching: Model downloaded once, cached in IndexedDB
Available Models
- DistilBERT NER (~250 MB) - Fast, good accuracy, recommended
- BERT Base NER (~420 MB) - Best accuracy, slower
- BERT Base NER (uncased) (~420 MB) - Case-insensitive matching
ML detection complements pattern matching — use both together for comprehensive PII detection.
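One way to picture using both detectors together is to merge their character spans before redacting. The sketch below assumes NER results arrive as dicts with `entity_group`, `start`, and `end` keys (the shape Hugging Face token-classification pipelines return when aggregation is enabled). The entity list is hard-coded rather than produced by a model, and the email regex and overlap policy are simplified illustrations.

```python
import re

# Deliberately simplified email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def regex_spans(text):
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL.finditer(text)]


def ner_spans(entities):
    return [(e["start"], e["end"], e["entity_group"]) for e in entities]


def redact(text, spans):
    """Replace each span with [LABEL]; on overlap, the earlier (then longer) span wins."""
    accepted = []
    last_end = 0
    for start, end, label in sorted(spans, key=lambda s: (s[0], -(s[1] - s[0]))):
        if start >= last_end:
            accepted.append((start, end, label))
            last_end = end
    pieces = []
    cursor = 0
    for start, end, label in accepted:
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Maria Lopez (maria@example.com) flew to Lisbon."
# Stand-in for model output; a real run would come from the NER model.
entities = [
    {"entity_group": "PER", "start": 0, "end": 11},
    {"entity_group": "LOC", "start": 40, "end": 46},
]
print(redact(text, regex_spans(text) + ner_spans(entities)))
# [PER] ([EMAIL]) flew to [LOC].
```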
Key Features
- All processing happens locally. Your data never leaves your device.
- Pre-built rules for IPs, emails, phone numbers, SSNs, credit cards, API keys, and more.
- The same value always maps to the same placeholder, preserving data relationships.
- Add your own patterns for company-specific identifiers or data formats.
- Download the original→replacement mapping for reverse lookups when needed.
- Trim log files to a specific time window. Set a custom range or pick a duration preset to focus on the relevant period.
- The Rust/WebAssembly engine processes files at near-native speed.
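Consistent replacement and the downloadable mapping can be sketched with a single dictionary that is consulted before a new placeholder is issued. The `Pseudonymizer` class, the `[IP-n]` placeholder format, and the simplified IPv4 regex are assumptions for illustration only.

```python
import json
import re

# Simplified IPv4 pattern; it does not reject octets above 255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Pseudonymizer:
    def __init__(self):
        self.mapping = {}   # original value -> placeholder
        self.counters = {}  # label -> number of placeholders issued so far

    def placeholder(self, label, original):
        # Reuse the existing placeholder so repeated values stay linked.
        if original not in self.mapping:
            count = self.counters.get(label, 0) + 1
            self.counters[label] = count
            self.mapping[original] = f"[{label}-{count}]"
        return self.mapping[original]

    def scrub(self, text):
        return IPV4.sub(lambda m: self.placeholder("IP", m.group()), text)

    def export_mapping(self):
        """Serialize the original-to-replacement mapping for later reverse lookups."""
        return json.dumps(self.mapping, indent=2)


scrubber = Pseudonymizer()
print(scrubber.scrub("10.0.0.5 -> 10.0.0.9\nretry from 10.0.0.5"))
print(scrubber.export_mapping())
```

Because both occurrences of `10.0.0.5` become `[IP-1]`, a reader can still follow one host through the log without learning its address. The exported mapping reverses the substitution, so it should be handled as carefully as the original data.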
Common Use Cases
- Support Ticket Attachments — Remove customer PII before sharing logs with vendors
- Bug Reports — Sanitize network captures and logs before posting to issue trackers
- Security Audits — Anonymize data for third-party security assessments
- GDPR/Privacy Compliance — Redact personal data before data processing
- Training Data Preparation — Remove PII from datasets used for ML/AI training
- Documentation Examples — Create sanitized examples from real production data
Ready to anonymize your data?
No installation required. Just open LogScrub and drop your files.