Introduction
Aid4Mail’s powerful filtering capabilities are designed to meet the exacting needs of professionals in email forensics and eDiscovery. This guide provides comprehensive coverage of Aid4Mail’s filter syntax.
Purpose of Filtering
Example Query
Date>=2023 AND Type:Personal AND ("project alpha" OR "confidential acquisition")
This filter finds all personal emails dated 2023 or later that mention "project alpha" or "confidential acquisition".
Search Terms
Search terms are the foundation of Aid4Mail’s filtering capabilities, allowing you to pinpoint specific content within large email datasets.
Case Sensitivity
Searches are case-insensitive by default. "EVIDENCE" also matches "evidence" and "Evidence".
Whole Words
Matches whole words only. "car" won’t match "cargo" or "scar".
Exact Phrases
Use double quotes for exact phrases: "intellectual property".
Combining Terms
Multiple terms use AND implicitly. Add Boolean operators for complex queries.
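The four rules above can be illustrated with a minimal Python sketch. It is not Aid4Mail's implementation; the function names `matches_term` and `matches_all` are hypothetical, and the regular expressions only approximate the described behavior (case-insensitive, whole-word, quoted phrases, implicit AND).

```python
import re

def matches_term(term: str, text: str) -> bool:
    """Case-insensitive, whole-word match; a quoted term is an exact phrase."""
    phrase = term.strip('"')
    # Escape each word and allow any run of whitespace between phrase words.
    words = [re.escape(w) for w in phrase.split()]
    # The lookarounds reject matches embedded in a longer word ("car" in "cargo").
    pattern = r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

def matches_all(terms: list[str], text: str) -> bool:
    """Implicit AND: every term must match somewhere in the text."""
    return all(matches_term(t, text) for t in terms)

print(matches_term("car", "The cargo ship left with a scar"))   # False
print(matches_term("EVIDENCE", "New evidence was found"))       # True
print(matches_all(["confidential", '"trade secret"'],
                  "Confidential: trade secret list"))           # True
```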
Forensics Examples
| Query | Purpose |
| --- | --- |
| confidential | Simple keyword search |
| "trade secret" | Exact phrase match |
| Date:2022 AND (embezzle* OR fraud*) AND (account* OR financ*) | Complex financial investigation query |
Wildcards
Wildcards represent unknown or variable parts of search terms, essential for handling variations in spelling and unknown data.
General Wildcards
| Wildcard | Name | Matches | Example |
| --- | --- | --- | --- |
| * | Asterisk | Zero or more characters | corrup* → corrupt, corruption |
| ? | Question Mark | Exactly one character | saniti?e → sanitise, sanitize |
| # | Hash | Zero or one non-alphanumeric | data#breach → data breach, databreach |
| ~ | Tilde | Stemming (word variations) | steal~ → steal, stole, stolen |
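To make the wildcard semantics concrete, here is a rough Python sketch that translates the `*`, `?`, and `#` wildcards listed above into regular expressions. The mapping is an assumption made for illustration, not Aid4Mail's actual matching engine; in particular, `*` and `?` are restricted to word characters here so that matches stay within one word. The `~` stemming wildcard is dictionary-based and is not modeled. `wildcard_to_regex` and `find_matches` are hypothetical helper names.

```python
import re

# Assumed regex equivalents for three of the wildcards described above.
WILDCARD_REGEX = {
    "*": r"\w*",     # zero or more characters (kept inside one word here)
    "?": r"\w",      # exactly one character
    "#": r"[\W_]?",  # zero or one non-alphanumeric character
}

def wildcard_to_regex(term: str) -> re.Pattern:
    """Build a case-insensitive, whole-word regex from a wildcard term."""
    parts = [WILDCARD_REGEX.get(ch, re.escape(ch)) for ch in term]
    return re.compile(r"(?<!\w)" + "".join(parts) + r"(?!\w)", re.IGNORECASE)

def find_matches(term: str, text: str) -> list[str]:
    """Return every substring of text that the wildcard term matches."""
    return [m.group(0) for m in wildcard_to_regex(term).finditer(text)]

print(find_matches("corrup*", "Corrupt officials and corruption"))
# ['Corrupt', 'corruption']
print(find_matches("saniti?e", "sanitise or sanitize"))
# ['sanitise', 'sanitize']
print(find_matches("data#breach", "a data breach, a databreach"))
# ['data breach', 'databreach']
```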
Proximity Wildcards
| Wildcard | Name | Matches |
| --- | --- | --- |
| <n> | Word Distance | Within n words |
| <.> | Same Sentence | In same sentence |
| <*> | Same Paragraph | In same paragraph |
Boolean Operators
Combine or exclude search terms to create precise queries for pinpointing relevant emails within large datasets.
| Operator | Meaning |
| --- | --- |
| AND | All terms required |
| OR | Any term matches |
| NOT | Exclude terms |
| XOR | Either but not both |
Order of Precedence
Aid4Mail evaluates operators in the following order, from highest to lowest precedence:
- Parentheses ( )
- NOT
- AND
- XOR
- OR
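The effect of this precedence order can be seen in a small Python evaluator. This is a sketch only: it assumes each search term has already been reduced to true or false for a given email, and the `evaluate` function is hypothetical rather than part of Aid4Mail. Each parsing level handles one operator, so NOT binds tightest and OR binds loosest.

```python
import re

def evaluate(query: str, facts: dict[str, bool]) -> bool:
    """Evaluate a Boolean query over pre-computed term results."""
    tokens = re.findall(r"[()]|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            take()  # consume the closing ")"
            return value
        return facts[tok]

    def parse_not():  # binds tightest after parentheses
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_xor():
        value = parse_and()
        while peek() == "XOR":
            take()
            rhs = parse_and()
            value = value != rhs
        return value

    def parse_or():  # binds loosest
        value = parse_xor()
        while peek() == "OR":
            take()
            rhs = parse_xor()
            value = value or rhs
        return value

    return parse_or()

facts = {"a": True, "b": False, "c": False}
print(evaluate("a OR b AND c", facts))    # True: read as a OR (b AND c)
print(evaluate("(a OR b) AND c", facts))  # False: parentheses override
```

When in doubt about how a query will be grouped, adding explicit parentheses removes the ambiguity.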
Proximity Searching
Find words or phrases that appear near each other in an email. This is crucial for identifying relevant conversations and context in investigations.
Proximity Operators
| Operator | Name | Matches | Example |
| --- | --- | --- | --- |
| <n> | Numeric Proximity | Within n words of each other | bribe<5>official |
| <.> | Same Sentence | Terms in the same sentence | insider<.>trading |
| <*> | Same Paragraph | Terms in the same paragraph | confidential<*>agreement |
Forensics Example
(trade<5>secret) AND (steal<.>proprietary)
Finds emails in which "trade" appears within five words of "secret" and, in the same email, "steal" appears in the same sentence as "proprietary".
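The three proximity operators can be approximated in Python as follows. This is an illustrative sketch, not Aid4Mail's algorithm: it assumes "within n words" means the word positions differ by at most n, splits sentences naively on `.`, `!`, and `?`, and treats blank lines as paragraph breaks. All function names are hypothetical.

```python
import re

def words(text: str) -> list[str]:
    """Lowercased word tokens."""
    return re.findall(r"\w+", text.lower())

def within_n_words(a: str, b: str, n: int, text: str) -> bool:
    """Sketch of a<n>b: the two terms occur at most n word positions apart."""
    toks = words(text)
    pos_a = [i for i, t in enumerate(toks) if t == a.lower()]
    pos_b = [i for i, t in enumerate(toks) if t == b.lower()]
    return any(i != j and abs(i - j) <= n for i in pos_a for j in pos_b)

def co_occur(a: str, b: str, chunks: list[str]) -> bool:
    """True if some chunk contains both terms as whole words."""
    return any({a.lower(), b.lower()} <= set(words(c)) for c in chunks)

def same_sentence(a: str, b: str, text: str) -> bool:
    """Sketch of a<.>b using a naive sentence splitter."""
    return co_occur(a, b, re.split(r"(?<=[.!?])\s+", text))

def same_paragraph(a: str, b: str, text: str) -> bool:
    """Sketch of a<*>b: paragraphs are separated by blank lines."""
    return co_occur(a, b, re.split(r"\n\s*\n", text))

email = ("He tried to bribe a senior government official. "
         "Later they discussed insider deals.\n\n"
         "Trading resumed on Monday.")
print(within_n_words("bribe", "official", 5, email))  # True (4 words apart)
print(same_sentence("insider", "trading", email))     # False
print(same_paragraph("bribe", "insider", email))      # True
```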
Searching by Email Type
The Type: filter supports deduplication, finding deleted emails, and focusing on personal communications.
Deduplication
Eliminate duplicate emails to streamline review and improve efficiency.
Exclude duplicates:
NOT Type:Duplicate
Shorthand:
-Type:Duplicate
Unpurged Mail (Deleted)
Search deleted emails that haven’t been permanently removed, which is crucial for forensic investigations.
Type:Unpurged
Include only deleted emails
Forensics Example:
Type:Unpurged AND Date>=2023-01-01 AND From:suspect@company.com
Personal Mail
Focus on person-to-person communications, excluding newsletters, marketing emails, and automated notifications.
Type:Personal
Include only personal emails
eDiscovery Example:
Date>=2023 AND Type:Personal AND (confidential OR proprietary)
Tokenization and Stemming
Advanced linguistic processing techniques that enhance search capabilities by handling variations in language use, spelling, and word forms.
Tokenization
Recognizes and matches similar characters and whole words within text.
Character Matching:
Handles punctuation, diacritical marks, ligatures, typographic quotes
Examples:
Acme Corp → "Acme Corp", "ACME CORP"
naïve → "naive", "naïve"
Stemming
Finds words with the same root using dictionary-based processing.
Usage:
Use the ~ wildcard at the end of a word
Examples:
embezzle~ → embezzle, embezzled, embezzling
steal~ → steal, stole, stolen
Best Practices
- Use with Caution: These features may introduce false positives
- Document Settings: Record tokenization and stemming settings for defensibility
- Language Consideration: Use appropriate dictionaries for investigation languages
- Combine Techniques: Use with proximity searches and Boolean operators
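The character matching described under Tokenization (case, diacritical marks, ligatures, typographic quotes) can be approximated with Unicode normalization. The sketch below uses Python's standard `unicodedata` module and is only an illustration of the idea; it does not reflect how Aid4Mail tokenizes text internally, and the `normalize` function is a hypothetical name. Dictionary-based stemming is not modeled.

```python
import unicodedata

# Map typographic quotes to their plain ASCII equivalents.
TYPOGRAPHIC = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
})

def normalize(text: str) -> str:
    """Fold case, accents, ligatures, and curly quotes for comparison."""
    # NFKD expands ligatures and separates base letters from accents.
    decomposed = unicodedata.normalize("NFKD", text.translate(TYPOGRAPHIC))
    # Drop the combining accent marks left behind by the decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

print(normalize("naïve") == normalize("naive"))          # True
print(normalize("ACME CORP") == normalize("Acme Corp"))  # True
print(normalize("\ufb01nance"))                          # finance
```

Comparing normalized forms of both the search term and the text means "naïve" and "naive" are treated as the same token.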
Best Practices and Tips
Following these best practices will help you create more effective, defensible, and efficient searches in Aid4Mail.
Optimizing Filter Performance
Query Structure Order (Most to Least Efficient):
- Date restrictions (Date:, Received:, NewerThan:)
- Folder restrictions (In:, Label:, FolderName:)
- Metadata filters (Is:Replied, Type:Personal, -Type:Duplicate)
- Header field searches (From:, To:, Subject:)
- Message body searches (Header:, SenderMessage:, Message:)
- Attachment content searches (place last)
Optimized Example:
Date>=2023-01-01 AND Type:Personal AND NOT Type:Duplicate AND From:executive@company.com AND (confidential<10>project)
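When filters are assembled by a script, the ordering above can be applied automatically. The following Python sketch is a hypothetical helper, not an Aid4Mail feature: it tags each clause with a category, sorts by the efficiency ranking listed above, and joins the clauses with AND. The category names and the `build_filter` function are assumptions made for illustration.

```python
# Lower rank = cheaper filter, placed earlier (order taken from the list above).
CLAUSE_RANK = {
    "date": 0,
    "folder": 1,
    "metadata": 2,
    "header": 3,
    "body": 4,
    "attachment": 5,
}

def build_filter(clauses: list[tuple[str, str]]) -> str:
    """Join (category, expression) pairs with AND, cheapest category first."""
    # sorted() is stable, so clauses in the same category keep their order.
    ordered = sorted(clauses, key=lambda clause: CLAUSE_RANK[clause[0]])
    return " AND ".join(expr for _, expr in ordered)

query = build_filter([
    ("body", "(confidential<10>project)"),
    ("header", "From:executive@company.com"),
    ("metadata", "Type:Personal"),
    ("date", "Date>=2023-01-01"),
    ("metadata", "NOT Type:Duplicate"),
])
print(query)
# Date>=2023-01-01 AND Type:Personal AND NOT Type:Duplicate AND From:executive@company.com AND (confidential<10>project)
```

Any clause that contains OR should be wrapped in parentheses by the caller before it is passed in, so that joining with AND does not change its meaning.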
Forensic Considerations
Documentation
- Record all search parameters
- Document modifications to search lists
- Note stemming dictionary changes
- Maintain chain of custody
Quality Control
- Sample results for accuracy
- Check for false positives/negatives
- Ensure reproducibility
- Include deleted content when relevant
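One possible way to support the documentation and quality-control points above is to log every search in a structured record and to draw review samples with a fixed seed so they can be reproduced later. The Python sketch below works on the investigator's side and does not call Aid4Mail; the record fields, the function names, and the idea of identifying results by ID strings are all assumptions made for illustration.

```python
import hashlib
import json
import random
from datetime import datetime, timezone

def search_record(query: str, settings: dict) -> dict:
    """Build a log entry capturing the exact filter and its settings."""
    return {
        "query": query,
        "settings": settings,
        # The hash lets a reviewer confirm the logged query was not altered.
        "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

def reproducible_sample(item_ids: list[str], k: int, seed: int) -> list[str]:
    """Pick k items for manual review; the same seed yields the same sample."""
    rng = random.Random(seed)
    # Sorting first makes the result independent of the input order.
    return rng.sample(sorted(item_ids), min(k, len(item_ids)))

record = search_record(
    "Type:Unpurged AND Date>=2023-01-01 AND From:suspect@company.com",
    {"stemming": False},
)
print(json.dumps(record, indent=2))

ids = [f"msg-{n:04d}" for n in range(500)]
print(reproducible_sample(ids, 5, seed=42) == reproducible_sample(ids, 5, seed=42))  # True
```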
Common Pitfalls to Avoid
Search Issues:
- Overly broad searches
- Ignoring context
- Overlooking variations
- Assuming completeness
Content Issues:
- Neglecting non-text content
- Missing regional spelling differences
- Overlooking common typos
- Ignoring abbreviations