
Master Search Lists in Aid4Mail

Harness the power of batch filtering with search lists. Learn how to create optimized keyword lists, use advanced operators, and maximize performance for large-scale email investigations.

  • Term format: 1 per line
  • Regex support: PCRE2
  • Processing: top-down
  • Speed: 10x faster

1. Introduction

Search lists are powerful tools in Aid4Mail that enable batch filtering using multiple keywords, patterns, and expressions. They're essential for large-scale investigations where you need to search for hundreds or thousands of terms efficiently.

When to Use Search Lists

  • Multiple keyword investigations
  • Pattern-based searches
  • Reusable filter sets
  • Compliance searches

Key Advantage

Aid4Mail processes search lists from top to bottom, stopping at the first match. This "first-match-wins" approach allows for sophisticated prioritization strategies that can dramatically improve search performance.
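
The first-match-wins behavior described above can be illustrated with a minimal Python sketch. This is not Aid4Mail's implementation: the function name `first_match` is hypothetical, and plain case-insensitive substring matching stands in for the real term matching.

```python
from typing import Optional, Sequence, Tuple


def first_match(terms: Sequence[str], text: str) -> Tuple[Optional[str], int]:
    """Return (first matching term, number of terms checked).

    Assumption: plain case-insensitive substring matching stands in for
    Aid4Mail's real term matching, which this sketch does not reproduce.
    """
    haystack = text.lower()
    for checked, term in enumerate(terms, start=1):
        if term.lower() in haystack:
            return term, checked  # stop here: remaining terms are skipped
    return None, len(terms)  # no match: every term had to be checked


terms = ["confidential", "proprietary", "trade secret", "insider trading"]
email = "Please keep this proprietary design confidential."
print(first_match(terms, email))
print(first_match(list(reversed(terms)), email))
```

With the original order, the email matches on the first term. With the order reversed, two terms are checked and rejected before a match is found, which is the cost that good ordering avoids.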

2. Basic Rules

Search lists follow specific formatting rules to ensure proper parsing and optimal performance.

File Format

  • Plain text file (.txt extension)
  • UTF-8 encoding recommended
  • One search term per line
  • Blank lines allowed for organization

What NOT to Include

  • No Boolean operators (AND, OR, NOT)
  • No search operators in the list
  • No comments or annotations
  • No quotes unless searching for them literally

Example Search List File

confidential
proprietary
trade secret
insider trading

corrupt*
fraud*
embezzle~

money<+3>laundering
{[R]=\b(classified|restricted)\b}

Notice the organization with blank lines between related terms. This improves readability without affecting functionality.
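
Before loading a list, it can help to check it against the rules above. The following Python sketch is one possible pre-flight check, not an Aid4Mail feature. The function name `check_search_list` and the heuristics are assumptions: uppercase AND/OR/NOT as whole words, lines starting with "# " or "//" as likely comments, and terms wrapped in double quotes.

```python
import re
from typing import List, Tuple

# Heuristic (assumption): uppercase Boolean words used as operators.
BOOLEAN_WORD = re.compile(r"\b(AND|OR|NOT)\b")


def check_search_list(text: str) -> Tuple[List[str], List[str]]:
    """Return (terms, warnings) for the contents of a search list file."""
    terms: List[str] = []
    warnings: List[str] = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed for organization
        terms.append(line)
        is_regex = line.startswith("{[R]=")
        if not is_regex and BOOLEAN_WORD.search(line):
            warnings.append(f"line {number}: Boolean operator in {line!r}")
        if line.startswith("# ") or line.startswith("//"):
            warnings.append(f"line {number}: looks like a comment: {line!r}")
        if len(line) > 1 and line[0] == line[-1] == '"':
            warnings.append(f"line {number}: quoted term: {line!r}")
    return terms, warnings


sample = 'confidential\n\nfraud AND bribery\n# fraud terms\n"trade secret"\ncorrupt*\n'
terms, warnings = check_search_list(sample)
for warning in warnings:
    print(warning)
```

For a real file, the text could be read with `pathlib.Path(path).read_text(encoding="utf-8")`, matching the UTF-8 recommendation. The warnings are only hints: a quoted line, for example, is valid when the quotes are meant literally.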

3. Wildcards and Operators

Search lists support the same powerful wildcards and operators as Aid4Mail's regular filtering, including PCRE2 regular expressions.

Fast Performance

Plain Text (Fastest)

Especially fast with 20+ characters

intellectual property theft

Asterisk (*)

Zero or more characters

corrupt*

Question Mark (?)

Exactly one character

organi?e

Hash (#)

Zero or one non-alphanumeric character

don#t
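
To make the three simple wildcards concrete, here is a rough Python translation into regular expressions. It is only an approximation for experimenting with sample words: the helper `wildcard_to_regex` is hypothetical, and it assumes that `*` and `?` stay within a single word, which the text above does not state.

```python
import re


def wildcard_to_regex(term: str) -> "re.Pattern[str]":
    """Translate *, ? and # into an approximate regular expression."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")  # zero or more characters (assumed: word characters)
        elif ch == "?":
            parts.append(r"\w")  # exactly one character (assumed: a word character)
        elif ch == "#":
            parts.append(r"[^A-Za-z0-9]?")  # zero or one non-alphanumeric character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE)


for term, word in [("corrupt*", "corruption"), ("organi?e", "organise"),
                   ("don#t", "don't"), ("don#t", "donut")]:
    matched = wildcard_to_regex(term).fullmatch(word) is not None
    print(f"{term!r} vs {word!r}: {matched}")
```

The last pair shows why `#` suits apostrophes: it accepts "don't" and "dont" but not "donut", because a letter is not a non-alphanumeric character.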

Medium Performance

~

Stemming

Finds word variations using a dictionary

steal~ → steal, stole, stolen

Slower Performance (Use Sparingly)

<+n>

Ordered proximity

money<+3>laundering

<n>

Any order proximity

insider<5>trading

<.>

Same sentence

confidential<.>project

<*>

Same paragraph

trade<*>secret

{[R]=pattern}

Complex regex

{[R]=\b[A-Z]{2,4}-\d{4,6}\b}

Warning: Limit proximity operators to 2-3 per search term. Excessive use significantly impacts performance.
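
The difference between ordered and any-order proximity can be sketched in Python. The function `near` is hypothetical, and it assumes that n is the maximum distance in word positions between the two terms. Aid4Mail's exact counting rule is not given here, so treat the boundary cases as illustrative only.

```python
import re
from typing import List


def _positions(words: List[str], target: str) -> List[int]:
    """Indexes at which the target word occurs."""
    return [i for i, w in enumerate(words) if w == target]


def near(text: str, first: str, second: str, n: int, ordered: bool = False) -> bool:
    """True if `first` and `second` occur within n word positions of each other.

    With ordered=True, `second` must come after `first` (like <+n>);
    otherwise either order is accepted (like <n>).
    """
    words = re.findall(r"\w+", text.lower())
    for a in _positions(words, first.lower()):
        for b in _positions(words, second.lower()):
            gap = b - a
            if ordered and 0 < gap <= n:
                return True
            if not ordered and 0 < abs(gap) <= n:
                return True
    return False


print(near("money laundering scheme", "money", "laundering", 3, ordered=True))
print(near("laundering of money", "money", "laundering", 3, ordered=True))
print(near("laundering of money", "money", "laundering", 3))
```

Every proximity check compares pairs of word positions, which is more work than a single substring scan. That gives some intuition for why these operators sit in the slower group.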

4. Performance Optimization

Understanding how Aid4Mail processes search lists is crucial for creating high-performance filters that can handle millions of emails efficiently.

Primary Performance Rule

Aid4Mail processes the list from top to bottom. As soon as a match is found, remaining terms are skipped. This means term ordering directly impacts performance.

  1. Most likely matches → Place at the TOP
  2. Least likely matches → Place at the BOTTOM
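
The effect of ordering can be estimated with a little arithmetic. The Python sketch below computes the expected number of terms evaluated per email under first-match-wins, assuming each term matches independently with a known probability. The function name and the probabilities are made up for illustration.

```python
from typing import Sequence


def expected_checks(match_probabilities: Sequence[float]) -> float:
    """Expected number of terms evaluated per email under first-match-wins.

    Assumption: each term matches independently with the given probability.
    """
    expected = 0.0
    still_unmatched = 1.0  # probability that no earlier term matched
    for p in match_probabilities:
        expected += still_unmatched  # this term runs only if nothing matched before it
        still_unmatched *= 1.0 - p
    return expected


likely_first = [0.6, 0.1, 0.01]  # hypothetical match rates, most likely term on top
likely_last = list(reversed(likely_first))
print(round(expected_checks(likely_first), 3))
print(round(expected_checks(likely_last), 3))
```

With these made-up rates, the list with the most likely term on top evaluates about 1.76 terms per email, against about 2.88 when the order is reversed. The gap widens as lists grow longer.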

Optimization Technique

Group related plain terms into a single regex for better efficiency:

❌ Inefficient:

inappropriate
unwelcome
unwanted

✅ Efficient:

{[R]=\b(inappropriate|unwelcome|unwanted)\b}
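
For long lists, the combined line can be generated rather than typed by hand. This Python sketch builds the same kind of whole-word alternation. The helper name `combine_terms` is hypothetical, and `re.escape` follows Python's escaping rules, which should be checked against PCRE2 for terms containing unusual characters.

```python
import re
from typing import Iterable


def combine_terms(terms: Iterable[str]) -> str:
    """Build one whole-word alternation line from plain terms."""
    escaped = [re.escape(t.strip()) for t in terms if t.strip()]
    if not escaped:
        raise ValueError("no terms to combine")
    return r"{[R]=\b(" + "|".join(escaped) + r")\b}"


print(combine_terms(["inappropriate", "unwelcome", "unwanted"]))
# prints {[R]=\b(inappropriate|unwelcome|unwanted)\b}
```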

Performance Impact

  • 🚀 10x faster with proper ordering
  • 📉 90% reduction in processing time
  • Instant matches for common terms

5. Search List Ordering

Proper ordering is the key to maximizing search list performance. Follow these prioritization rules for optimal results.

Ordering Priority (Top to Bottom)

  1. Most Common Terms: terms appearing in the majority of emails
     Examples: meeting, report, update, project
  2. Industry/Context Specific: terms common in your investigation domain
     Examples: confidential, proprietary, contract
  3. Simple Wildcards: basic pattern matching
     Examples: corrupt*, fraud*, *@company.com
  4. Stemming Terms: word variations
     Examples: steal~, discriminate~, harass~
  5. Proximity Searches: terms near each other
     Examples: insider<5>trading, money<+3>laundering
  6. Complex Regex: resource-intensive patterns
     Example: {[R]=\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b}
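
The syntax-based tiers of this ordering (plain terms, wildcards, stemming, proximity, regex) can be applied automatically. The Python sketch below classifies each term by its syntax and sorts by that tier. The function names and tier numbers are hypothetical, and how often a term occurs cannot be inferred from its syntax.

```python
import re
from typing import List

# Proximity operators as written in this guide: <n>, <+n>, <.>, <*>
PROXIMITY = re.compile(r"<(\+?\d+|\.|\*)>")


def cost_tier(term: str) -> int:
    """Rough cost tier: 0 plain, 1 wildcard, 2 stemming, 3 proximity, 4 regex."""
    if term.startswith("{[R]="):
        return 4
    if PROXIMITY.search(term):
        return 3
    if term.endswith("~"):
        return 2
    if any(ch in term for ch in "*?#"):
        return 1
    return 0


def order_by_cost(terms: List[str]) -> List[str]:
    # sorted() is stable, so terms keep their relative order within a tier
    return sorted(terms, key=cost_tier)


unordered = [r"{[R]=\b[A-Z]{2,4}-\d{4,6}\b}", "money<+3>laundering",
             "steal~", "corrupt*", "confidential"]
for term in order_by_cost(unordered):
    print(term)
```

Because the sort is stable, plain terms stay in the order you gave them. You still decide by hand which common and domain-specific terms come first.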

6. Best Practices

Follow these proven practices to create effective, maintainable, and high-performance search lists.

Do's

  • Use # for apostrophes (don#t)
  • Group related terms with blank lines
  • Combine similar plain terms into regex
  • Test with sample data before production
  • Use stemming for language variations

Don'ts

  • Don't use overly broad terms alone
  • Don't include redundant variations
  • Don't use more than 2-3 proximity operators per term
  • Don't mix AND/OR/NOT in the list
  • Don't forget to test edge cases

7. Advanced Techniques

Leverage regular expressions and advanced patterns for sophisticated filtering scenarios.

Email Address Pattern

{[R]=\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b}

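Matches most common email address formats. The character classes list only uppercase letters, so the pattern relies on case-insensitive matching. Useful for finding communications with external parties.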

Phone Number Pattern

{[R]=\b(\+?1[-.]?)?\(?[0-9]{3}\)?[-.]?[0-9]{3}[-.]?[0-9]{4}\b}

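Matches US phone numbers written with hyphens, dots, or no separators, with an optional leading 1 country code, such as 555-123-4567 or 1-555.123.4567. Numbers separated by spaces, such as (555) 123-4567, are not matched.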

Case/Reference Numbers

{[R]=\b[A-Z]{2,4}-\d{4,8}\b}

Matches patterns like CASE-2024001, REF-123456. Adjust the ranges for your specific format.
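
Patterns like these are worth trying on sample text before they go into a list. The Python sketch below runs the three patterns above with Python's `re` module. That module is not PCRE2, although these particular patterns use only syntax the two share. The `re.IGNORECASE` flag on the email pattern is an assumption about how matching is configured, and the sample strings are invented.

```python
import re

EMAIL = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b", re.IGNORECASE)
PHONE = re.compile(r"\b(\+?1[-.]?)?\(?[0-9]{3}\)?[-.]?[0-9]{3}[-.]?[0-9]{4}\b")
CASE_REF = re.compile(r"\b[A-Z]{2,4}-\d{4,8}\b")


def hits(pattern, text):
    """Return the full text of every match (not just the captured groups)."""
    return [m.group(0) for m in pattern.finditer(text)]


print(hits(EMAIL, "Contact jane.doe@example.com today"))
print(hits(PHONE, "Call 555-123-4567 or (555) 123-4567"))
print(hits(CASE_REF, "See CASE-2024001 and REF-123456"))
```

The phone test returns only the hyphenated number. The pattern has no allowance for a space after the area code, so the second number is missed. Sample testing is meant to surface exactly this kind of gap.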

8. Real-World Examples

Complete search list examples for common investigation scenarios, optimized for performance.

Financial Fraud Investigation

The groups below, separated by blank lines, are in this order: common business terms (high frequency), financial terms (medium frequency), fraud indicators (simple wildcards), stemming terms, proximity searches (slower), and complex patterns (slowest). The group names are not part of the file, because search lists cannot contain comments.

{[R]=\b(meeting|report|update|invoice|payment)\b}

{[R]=\b(account|transfer|wire|deposit|withdrawal)\b}
data retention policy
audit trail

embezzl*
fraud*
launder*
misappropriat*

steal~
manipulate~
falsify~

money<+3>laundering
offshore<5>account
insider<.>trading

{[R]=\$[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})?}
{[R]=\b[A-Z]{2,4}-\d{6,8}\b}

Performance Note: This list places common terms first and progresses to more complex patterns, so most emails match an early, cheap term and the slower proximity and regex terms run only on the remainder.

HR Investigation

The groups below, separated by blank lines, are in this order: common workplace terms, HR-specific terms, potential issues (simple wildcards), stemming terms, and context searches. The group names are not part of the file, because search lists cannot contain comments.

{[R]=\b(employee|staff|team|office|department)\b}

{[R]=\b(performance|review|feedback|evaluation)\b}
human resources
personnel file

discriminat*
harass*
hostile*
retaliat*

intimidate~
threaten~
bully~

inappropriate<.>behavior
hostile<+3>environment
sexual<5>harassment

Tip: For HR investigations, consider creating separate lists for different violation types to improve precision and reduce false positives.
