Pattern Matching Rules

High-performance Rust/WebAssembly engine with comprehensive regex patterns for detecting sensitive data in any text format.

95+ Built-in Patterns

Detection Categories

📧 Contact Information

  • Email addresses
  • Email Message-IDs
  • Phone numbers (US format)
  • Phone numbers (UK format)
  • Phone numbers (International, with +)
  • Phone numbers (International, no +; E.164 validated)

🌐 Network Identifiers

  • IPv4 addresses
  • IPv6 addresses
  • MAC addresses
  • Hostnames
  • URLs (with credential detection)
  • URL parameters

💳 Financial Data

  • Credit cards (Luhn validated)
  • IBANs (MOD-97 validated)
  • ICCIDs (SIM cards; Luhn validated)
  • UK Sort Codes
  • UK Bank Accounts
  • Bitcoin addresses
  • Ethereum addresses
  • Money amounts

🪪 Identity (US)

  • Social Security Numbers (format validated)
  • ITINs (Individual Tax ID)
  • Passport numbers
  • Driver's license numbers

🇬🇧 Identity (UK)

  • NHS Numbers (checksum validated)
  • National Insurance Numbers (format validated)
  • UK Sort Codes
  • UK Bank Accounts
  • UK Postcodes

🌍 Identity (International)

  • Australian TFN (checksum validated)
  • Canadian SIN (Luhn validated)
  • India PAN
  • Singapore NRIC (checksum validated)
  • Spain NIF/NIE (checksum validated)
  • VIN (Vehicle ID; check digit validated)

🔑 API Keys & Tokens

  • AWS Access Keys
  • AWS Secret Keys
  • Stripe API Keys
  • GitHub Tokens
  • OpenAI API Keys
  • Anthropic API Keys
  • xAI API Keys
  • Cerebras API Keys
  • Slack Tokens
  • NPM Tokens
  • SendGrid Keys
  • Twilio Keys
  • GCP API Keys
  • Bearer Tokens
  • JWTs

🔐 Secrets

  • Generic secrets (key=value)
  • High entropy strings
  • Private keys (PEM format)
  • Basic auth credentials
  • Database connection strings
  • Session IDs
  • URL credentials
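
High-entropy detection generally means flagging long tokens whose characters look random, as generated keys do. How LogScrub scores entropy is not stated here, so the following is only a minimal sketch of the usual approach (Shannon entropy per character). The function names, the minimum length of 20, and the threshold of 4.0 bits are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_high_entropy(token: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    """Hypothetical rule: long enough AND random-looking.
    min_len and threshold are assumed values, not LogScrub settings."""
    return len(token) >= min_len and shannon_entropy(token) >= threshold


made_up_token = "q7Zp2LxV9tRb4KmW8sYd"   # invented for this example
repeated_text = "aaaaaaaaaaaaaaaaaaaaaaaa"
```

A token with many distinct characters scores well above an ordinary word such as "password", which is why a length floor plus an entropy threshold is a common way to keep false positives down.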

📅 Dates & Times

  • ISO dates (YYYY-MM-DD)
  • US dates (MM/DD/YYYY)
  • UK dates (DD/MM/YYYY)
  • Times (HH:MM:SS)
  • ISO datetimes
  • Unix timestamps
  • Common Log Format

📂 File Paths

  • Unix paths (/home/user/...)
  • Windows paths (C:\Users\...)

📞 VoIP / SIP

  • SIP usernames & display names
  • SIP URIs & contacts
  • SIP Call-IDs & branch params
  • SIP User-Agent & Via headers
  • SIP realm, nonce, response

📍 Location

  • GPS coordinates
  • UK postcodes
  • US ZIP codes

#️⃣ Hashes & IDs

  • UUIDs
  • MD5 hashes
  • SHA1 hashes
  • SHA256 hashes
  • Docker container IDs

Validation

Many patterns include checksum or format validation to reduce false positives:

Pattern | Validation Method | Description
Credit Cards | Luhn Algorithm | Validates the check digit using mod 10
IBANs | MOD-97 | ISO 7064 modular arithmetic check
UK NHS Numbers | NHS Checksum | Weighted sum with mod 11 check digit
UK NINO | Format Validation | Valid prefix letters and suffix
Australian TFN | Check Digit | Weighted sum validation
Singapore NRIC | Check Letter | Modular arithmetic with letter mapping
Spain NIF/NIE | Check Letter | Mod 23 with letter table
Canadian SIN | Luhn Algorithm | Mod 10 check digit validation
VIN | Check Digit | Position 9 transliteration check (ISO 3779)
ICCID | Luhn Algorithm | Validates check digit, verifies 89 prefix
Phone (Intl, no +) | E.164 Validation | Validates country code and expected digit count
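
Several of the checks above are public, well-documented algorithms. The sketch below is not LogScrub's code (the engine is Rust/WebAssembly); it is a minimal Python illustration of three of them, with function names chosen for this example. It shows why a checksum removes most random digit strings that merely have the right shape.

```python
import re


def _ascii_digits(text: str) -> list:
    """Keep only ASCII digits, dropping spaces and dashes."""
    return [int(c) for c in text if "0" <= c <= "9"]


def luhn_valid(number: str) -> bool:
    """Luhn (mod 10) check: credit cards, ICCIDs, Canadian SINs."""
    digits = _ascii_digits(number)
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """MOD-97 check: move the first four characters to the end,
    map letters to numbers (A=10 ... Z=35), expect remainder 1."""
    s = re.sub(r"\s+", "", iban).upper()
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]
    numeric = "".join(str(int(c, 36)) for c in rearranged)
    return int(numeric) % 97 == 1


def nhs_valid(number: str) -> bool:
    """NHS number check: weights 10..2 on the first nine digits, mod 11."""
    digits = _ascii_digits(number)
    if len(digits) != 10:
        return False
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed value of 10 can never match a single digit: invalid.
    return check != 10 and check == digits[9]
```

For example, `luhn_valid("4111 1111 1111 1111")` accepts the widely used test card number, while changing its last digit to 2 makes the check fail. Per-country length rules (for IBANs) and prefix rules (such as the ICCID 89 prefix) would be separate checks on top of these.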

Application-Specific Patterns

LogScrub includes patterns for common log formats:

  • Email Server Logs
  • VoIP / SIP
  • SQL Dumps

Custom Rules

Create your own detection patterns:

Custom Regex Rules
Define patterns using standard regex syntax. LogScrub uses the Rust regex engine, which supports most common regex features, including groups, alternation, and character classes.

Adding a Custom Rule

  1. Click + Custom Rule in the sidebar
  2. Enter a descriptive label
  3. Write your regex pattern
  4. Choose a replacement strategy
  5. Save the rule
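
Before saving a rule in step 3, it can help to try the pattern against a few sample lines. The sketch below does that with Python's `re` module, which is a different engine from the Rust one LogScrub uses, so it only stands in for simple patterns. The employee-ID format and all names are hypothetical. The pattern deliberately sticks to groups, alternation, and character classes; if the engine is the Rust `regex` crate (an assumption), lookaround and backreferences would not be available.

```python
import re

# Hypothetical custom rule: internal IDs such as "EMP-004217" or "CTR-123456".
EMPLOYEE_ID = re.compile(r"\b(?:EMP|CTR)-[0-9]{6}\b")


def find_matches(pattern, lines):
    """Return every match of the pattern across the sample lines."""
    return [m.group(0) for line in lines for m in pattern.finditer(line)]


sample_lines = [
    "2024-01-05 user EMP-004217 logged in",
    "contractor CTR-123456 paired with EMP-12345",  # second ID is too short
    "ticket EMP-1234567 is not an employee ID",     # too long
]

matches = find_matches(EMPLOYEE_ID, sample_lines)
```

Including near misses (too short, too long) in the samples is a quick way to confirm that the pattern is neither too loose nor too strict before it runs against real logs.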

Plain Text Patterns

For exact string matching without regex complexity:

  1. Click + Plain Text
  2. Enter the exact text to match
  3. The text will be matched literally (no regex interpretation)
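
The difference between a regex rule and a plain text rule shows up as soon as the text contains characters such as `.`, which a regex treats as "any character". The following is a small Python illustration of that difference, not LogScrub's implementation; the function names and the sample line are made up.

```python
import re


def find_as_regex(pattern: str, text: str) -> list:
    """Treat the input as a regex: '.' matches any character."""
    return re.findall(pattern, text)


def find_literally(needle: str, text: str) -> list:
    """Treat the input as plain text by escaping regex metacharacters."""
    return re.findall(re.escape(needle), text)


line = "host=10.0.0.1 build=10x0y0z1"
regex_hits = find_as_regex("10.0.0.1", line)     # also hits "10x0y0z1"
literal_hits = find_literally("10.0.0.1", line)  # only the real address
```

Entered as a regex, `10.0.0.1` also matches the unrelated build string; entered as plain text, it matches only the address itself.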

Replacement Strategies

Strategy | Example Output | Use Case
Label | [EMAIL-1] | Clear identification of data type
Redact | ████████ | Complete visual redaction
Fake | john.smith@example.com | Realistic-looking fake data
Fake (Country) | +447291635804 | Fake data preserving country codes and TLDs
Template | USER_{n} | Custom format with variables
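
To make the Label, Redact, and Template rows concrete, here is a minimal Python sketch. It is not LogScrub's implementation: the email regex is simplified, the function names are invented, and two behaviours are assumptions rather than documented facts: that a repeated value reuses the same number, and that Redact emits a fixed run of eight block characters.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def make_numbered_replacer(template: str):
    """Build a replacer that fills {n} with a per-value counter.
    Assumption: the same original value always gets the same number."""
    seen = {}

    def replace(match) -> str:
        value = match.group(0)
        if value not in seen:
            seen[value] = len(seen) + 1
        return template.replace("{n}", str(seen[value]))

    return replace


def redact(match) -> str:
    """Assumption: fixed-width blocks, so the original length is not leaked."""
    return "\u2588" * 8


log = "alice@example.com wrote to bob@example.com; alice@example.com replied"

labelled = EMAIL.sub(make_numbered_replacer("[EMAIL-{n}]"), log)
templated = EMAIL.sub(make_numbered_replacer("USER_{n}"), log)
redacted = EMAIL.sub(redact, log)
```

With numbering that is stable per value, `labelled` keeps the conversation readable ("[EMAIL-1] wrote to [EMAIL-2]; [EMAIL-1] replied") without exposing either address.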

