Fake thread with suspicious indicators

Severity: medium
Type: rule
Source: github.com/sublime-security/sublime-rules

Detects fake message threads that carry suspicious indicators. Such messages can lead to BEC, credential phishing, and other undesirable outcomes.

Threat classification

Sublime's own taxonomy (not MITRE ATT&CK).

Category	Values
Attack types	BEC/Fraud, Credential Phishing, Spam
Tactics and techniques	Evasion, Social engineering

Event coverage

Message attribute
attachments (collection)
body
body.current_thread
body.html
body.links (collection)
body.plain
body.previous_threads (collection)
headers (collection)
headers.auth_summary
headers.domains (collection)
headers.hops (collection)
headers.return_path
recipients
recipients.to (collection)
sender
sender.email
subject
type

Rule body MQL

type.inbound
// fake thread check
and (length(headers.references) == 0 or headers.in_reply_to is null)
and (
  subject.is_reply
  or subject.is_forward
  // fake thread, but no indication in the subject line
  // current_thread pulls the recent thread, but the full body contains the fake "original" email
  or (
    not (subject.is_reply or subject.is_forward)
    and any([body.current_thread.text, body.html.display_text, body.plain.raw],
            3 of (
              strings.icontains(., "from:"),
              strings.icontains(., "to:"),
              strings.icontains(., "sent:"),
              strings.icontains(., "date:"),
              strings.icontains(., "cc:"),
              strings.icontains(., "subject:")
            )
    )
    and length(body.current_thread.text) + 100 < length(coalesce(body.html.display_text,
                                                                 body.plain.raw
                                                        )
    )
  )
)

// negating bouncebacks
and not any(attachments,
            .content_type in ("message/delivery-status", "message/rfc822")
)
// negating Google Calendar invites
and (
  (
    headers.return_path.domain.domain is not null
    and headers.return_path.domain.domain != 'calendar-server.bounces.google.com'
  )
  or headers.return_path.domain.domain is null
)
// not mimecast secure message from internal source
and not (
  strings.istarts_with(headers.message_id, '<Mimecast.')
  and strings.iends_with(headers.message_id, '.mimecast.lan>')
  and headers.hops[0].received.server.raw == "relay.mimecast.com"
  and strings.icontains(headers.hops[0].received.source.raw, 'mimecast.lan')
)

// and not solicited
and not profile.by_sender().solicited
and 4 of (
  // language attempting to engage
  (
    any(ml.nlu_classifier(body.current_thread.text).entities,
        .name == "request"
    )
    and any(ml.nlu_classifier(body.current_thread.text).entities,
            .name == "financial"
    )
  ),

  // invoicing language
  (
    any(ml.nlu_classifier(body.current_thread.text).tags, .name == "invoice")
    or any(ml.nlu_classifier(body.current_thread.text).entities,
           .text == "invoice"
    )
  ),

  // urgency request
  any(ml.nlu_classifier(body.current_thread.text).entities, .name == "urgency"),

  // cred_theft detection
  any(ml.nlu_classifier(body.current_thread.text).intents,
      .name == "cred_theft" and .confidence in~ ("medium", "high")
  ),

  // commonly abused sender TLD
  strings.ilike(sender.email.domain.tld, "*.jp"),

  // headers traverse abused TLD
  any(headers.domains, strings.ilike(.tld, "*.jp")),

  // known suspicious pattern in the URL path
  any(body.links, regex.match(.href_url.path, '\/[a-z]{3}\d[a-z]')),

  // link display text is in all caps
  any(body.links, regex.match(.display_text, '[A-Z ]+')),

  // link display text contains invisible characters (U+200F)
  any(body.links, strings.contains(.display_text, "\u{200F}")),

  // Low reputation link with display text ending in a document extension
  any(body.links,
      .href_url.domain.root_domain not in $tranco_1m
      and .href_url.domain.valid
      and .href_url.domain.root_domain not in $org_domains
      and .href_url.domain.root_domain not in $high_trust_sender_root_domains
      and (
        any($file_extensions_macros, strings.ends_with(..display_text, .))
        or strings.ends_with(.display_text, 'pdf')
      )
  ),

  // display name contains an email
  regex.contains(sender.display_name, '[a-z0-9]+@[a-z]+'),

  // Sender domain is empty
  sender.email.domain.domain == "",

  // sender domain matches no body domains
  all(body.links,
      .href_url.domain.root_domain != sender.email.domain.root_domain
  ),

  // body contains name of VIP
  (
    any($org_vips, strings.icontains(body.html.inner_text, .display_name))
    or any($org_vips, strings.icontains(body.plain.raw, .display_name))
  ),

  // new body domain
  any(body.links, network.whois(.href_url.domain).days_old < 30),

  // new sender domain
  network.whois(sender.email.domain).days_old < 30,

  // new sender
  profile.by_sender().days_known < 7,

  // excessive whitespace
  (
    regex.icontains(body.html.raw, '((<br\s*/?>\s*){20,}|\n{20,})')
    or regex.icontains(body.html.raw, '(<p[^>]*>\s*<br\s*/?>\s*</p>\s*){30,}')
    or regex.icontains(body.html.raw,
                       '(<p class=".*?"><span style=".*?"><o:p>&nbsp;</o:p></span></p>\s*){30,}'
    )
    or regex.icontains(body.html.raw, '(<p>&nbsp;</p>\s*){7,}')
    or regex.icontains(body.html.raw, '(<p>&nbsp;</p><br>\s*){7,}')
    or regex.icontains(body.html.raw, '(<p[^>]*>\s*&nbsp;<br>\s*</p>\s*){5,}')
    or regex.icontains(body.html.raw, '(<p[^>]*>&nbsp;</p>\s*){7,}')
  ),

  // body contains recipient SLD
  any(recipients.to,
      strings.icontains(body.current_thread.text, .email.domain.sld)
  ),
  (
    // bec
    any(ml.nlu_classifier(body.current_thread.text).intents,
        .name == "bec" and .confidence != "low"
    )
    // previous thread contains matching domain but mismatch local part in current thread
    and any(body.previous_threads,
            .sender.email.domain.root_domain == recipients.to[0].email.domain.root_domain
            and .sender.email.email != recipients.to[0].email.email
    )
  )
)

// negate highly trusted sender domains unless they fail DMARC authentication
and (
  (
    sender.email.domain.root_domain in $high_trust_sender_root_domains
    and not headers.auth_summary.dmarc.pass
  )
  or sender.email.domain.root_domain not in $high_trust_sender_root_domains
)
and not profile.by_sender().any_messages_benign
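
The fake thread check at the top of the rule can be sketched outside MQL. The Python below is a minimal illustration and not Sublime's implementation: the function name `looks_like_fake_thread` and its parameters are hypothetical stand-ins for the MQL fields, and MQL's null handling is only approximated with `None`.

```python
# Substrings that suggest a quoted "original" email is embedded in the body.
MARKERS = ("from:", "to:", "sent:", "date:", "cc:", "subject:")


def looks_like_fake_thread(references, in_reply_to, is_reply, is_forward,
                           current_thread, display_text=None, plain_raw=None):
    """Hypothetical stand-in for the rule's fake thread gate."""
    # A genuine reply normally carries References and In-Reply-To headers.
    missing_thread_headers = len(references) == 0 or in_reply_to is None
    if not missing_thread_headers:
        return False

    # The subject line claims a reply or a forward.
    if is_reply or is_forward:
        return True

    # No claim in the subject: look for at least 3 of the 6 markers
    # (case-insensitive) in any available view of the body.
    views = [t for t in (current_thread, display_text, plain_raw) if t is not None]
    has_markers = any(
        sum(marker in text.lower() for marker in MARKERS) >= 3
        for text in views
    )

    # coalesce(display_text, plain_raw): first value that is not None.
    full_body = display_text if display_text is not None else plain_raw
    if full_body is None:
        return False

    # The full body must be noticeably longer than the current thread.
    return has_markers and len(current_thread) + 100 < len(full_body)
```

Under these assumptions, a message with empty threading headers, a plain subject, and a body that embeds a quoted "From: / Sent: / To: / Subject:" block well below a short current thread passes the gate, while a message with populated `References` and `In-Reply-To` headers does not.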

Detection logic

Scope: inbound message.

inbound message
any of:
- length(headers.references) is 0
- headers.in_reply_to is missing
any of:
- subject.is_reply
- subject.is_forward
- all of:
  - none of:
    - subject.is_reply
    - subject.is_forward
  - any of [body.current_thread.text, body.html.display_text, body.plain.raw] where:
    - at least 3 of 6: . contains (case-insensitive)
      - from:
      - to:
      - sent:
      - date:
      - cc:
      - subject:
  - length(body.current_thread.text) + 100 < length(coalesce(body.html.display_text, body.plain.raw))
not:
- any of attachments where:
  - .content_type in ('message/delivery-status', 'message/rfc822')
any of:
- all of:
  - headers.return_path.domain.domain is set
  - headers.return_path.domain.domain is not 'calendar-server.bounces.google.com'
- headers.return_path.domain.domain is missing
not:
- all of:
  - headers.message_id starts with '<Mimecast.'
  - headers.message_id ends with '.mimecast.lan>'
  - headers.hops[0].received.server.raw is 'relay.mimecast.com'
  - headers.hops[0].received.source.raw contains 'mimecast.lan'
not:
- profile.by_sender().solicited
at least 4 of:
- all of:
  - any of ml.nlu_classifier(body.current_thread.text).entities where:
    - .name is 'request'
  - any of ml.nlu_classifier(body.current_thread.text).entities where:
    - .name is 'financial'
- any of:
  - any of ml.nlu_classifier(body.current_thread.text).tags where:
    - .name is 'invoice'
  - any of ml.nlu_classifier(body.current_thread.text).entities where:
    - .text is 'invoice'
- any of ml.nlu_classifier(body.current_thread.text).entities where:
  - .name is 'urgency'
- any of ml.nlu_classifier(body.current_thread.text).intents where all hold:
  - .name is 'cred_theft'
  - .confidence in ('medium', 'high')
- sender.email.domain.tld matches '*.jp'
- any of headers.domains where:
  - .tld matches '*.jp'
- any of body.links where:
  - .href_url.path matches '\/[a-z]{3}\d[a-z]'
- any of body.links where:
  - .display_text matches '[A-Z ]+'
- any of body.links where:
  - .display_text contains '\u{200F}'
- any of body.links where all hold:
  - .href_url.domain.root_domain not in $tranco_1m
  - .href_url.domain.valid
  - .href_url.domain.root_domain not in $org_domains
  - .href_url.domain.root_domain not in $high_trust_sender_root_domains
  - any of:
    - any of $file_extensions_macros where:
      - the link's .display_text ends with the list entry
    - .display_text ends with 'pdf'
- sender.display_name contains a match for '[a-z0-9]+@[a-z]+'
- sender.email.domain.domain is ''
- all of body.links where:
  - .href_url.domain.root_domain is not sender.email.domain.root_domain
- any of:
  - any of $org_vips where:
    - body.html.inner_text contains .display_name (case-insensitive)
  - any of $org_vips where:
    - body.plain.raw contains .display_name (case-insensitive)
- any of body.links where:
  - network.whois(.href_url.domain).days_old < 30
- network.whois(sender.email.domain).days_old < 30
- profile.by_sender().days_known < 7
- body.html.raw contains a match for any of 7 patterns (case-insensitive):
  - ((<br\s*/?>\s*){20,}|\n{20,})
  - (<p[^>]*>\s*<br\s*/?>\s*</p>\s*){30,}
  - (<p class=".*?"><span style=".*?"><o:p>&nbsp;</o:p></span></p>\s*){30,}
  - (<p>&nbsp;</p>\s*){7,}
  - (<p>&nbsp;</p><br>\s*){7,}
  - (<p[^>]*>\s*&nbsp;<br>\s*</p>\s*){5,}
  - (<p[^>]*>&nbsp;</p>\s*){7,}
- any of recipients.to where:
  - body.current_thread.text contains .email.domain.sld (case-insensitive)
- all of:
  - any of ml.nlu_classifier(body.current_thread.text).intents where all hold:
    - .name is 'bec'
    - .confidence is not 'low'
  - any of body.previous_threads where all hold:
    - .sender.email.domain.root_domain is recipients.to[0].email.domain.root_domain
    - .sender.email.email is not recipients.to[0].email.email
any of:
- all of:
  - sender.email.domain.root_domain in $high_trust_sender_root_domains
  - not:
    - headers.auth_summary.dmarc.pass
- sender.email.domain.root_domain not in $high_trust_sender_root_domains
not:
- profile.by_sender().any_messages_benign
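
The "at least 4 of" threshold and the high-trust sender negation reduce to a small amount of boolean logic. The sketch below is an illustration in Python rather than the rule engine's behavior: `at_least`, `passes_trust_gate`, the indicator names, and the `example.com` / `example.net` domains are all hypothetical, and a missing DMARC result is not modeled.

```python
def at_least(n, signals):
    """True when at least n of the boolean signals hold, like MQL's `n of (...)`."""
    return sum(1 for s in signals if s) >= n


def passes_trust_gate(root_domain, dmarc_pass, high_trust_domains):
    """High-trust sender domains are exempt unless DMARC fails.

    Equivalent to: (in list and not dmarc_pass) or (not in list).
    """
    return root_domain not in high_trust_domains or not dmarc_pass


# Hypothetical indicator results for a single message.
indicators = {
    "request_and_financial": True,
    "urgency": True,
    "new_sender": True,
    "links_off_sender_domain": True,
    "excessive_whitespace": False,
}
high_trust = {"example.com"}  # placeholder, not the real reference list

suspicious = at_least(4, indicators.values())
flagged = suspicious and passes_trust_gate("example.net", True, high_trust)
```

Because any four indicators suffice, no single weak signal (for example a link with all-caps display text) is enough to satisfy this part of the rule.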

Inspects: attachments[].content_type, body.current_thread.text, body.html.display_text, body.html.inner_text, body.html.raw, body.links, body.links[].display_text, body.links[].href_url.domain, body.links[].href_url.domain.root_domain, body.links[].href_url.domain.valid, body.links[].href_url.path, body.plain.raw, body.previous_threads, body.previous_threads[].sender.email.domain.root_domain, body.previous_threads[].sender.email.email, headers.auth_summary.dmarc.pass, headers.domains, headers.domains[].tld, headers.hops[0].received.server.raw, headers.hops[0].received.source.raw, headers.in_reply_to, headers.message_id, headers.references, headers.return_path.domain.domain, recipients.to, recipients.to[0].email.domain.root_domain, recipients.to[0].email.email, recipients.to[].email.domain.sld, sender.display_name, sender.email.domain, sender.email.domain.domain, sender.email.domain.root_domain, sender.email.domain.tld, subject.is_forward, subject.is_reply, type.inbound.

Sensors: ml.nlu_classifier, network.whois, profile.by_sender, regex.contains, regex.icontains, regex.match, strings.contains, strings.ends_with, strings.icontains, strings.iends_with, strings.ilike, strings.istarts_with.

Reference lists: $file_extensions_macros, $high_trust_sender_root_domains, $org_domains, $org_vips, $tranco_1m.

Indicators matched (35)

Field	Match	Value
`strings.icontains`	substring	`from:`
`strings.icontains`	substring	`to:`
`strings.icontains`	substring	`sent:`
`strings.icontains`	substring	`date:`
`strings.icontains`	substring	`cc:`
`strings.icontains`	substring	`subject:`
`attachments[].content_type`	member	`message/delivery-status`
`attachments[].content_type`	member	`message/rfc822`
`strings.istarts_with`	prefix	`<Mimecast.`
`strings.iends_with`	suffix	`.mimecast.lan>`
`headers.hops[0].received.server.raw`	equals	`relay.mimecast.com`
`strings.icontains`	substring	`mimecast.lan`
`ml.nlu_classifier(body.current_thread.text).entities[].name`	equals	`request`
`ml.nlu_classifier(body.current_thread.text).entities[].name`	equals	`financial`
`ml.nlu_classifier(body.current_thread.text).tags[].name`	equals	`invoice`
`ml.nlu_classifier(body.current_thread.text).entities[].text`	equals	`invoice`
`ml.nlu_classifier(body.current_thread.text).entities[].name`	equals	`urgency`
`ml.nlu_classifier(body.current_thread.text).intents[].name`	equals	`cred_theft`
`ml.nlu_classifier(body.current_thread.text).intents[].confidence`	member	`medium`
`ml.nlu_classifier(body.current_thread.text).intents[].confidence`	member	`high`
`strings.ilike`	substring	`*.jp`
`regex.match`	regex	`\/[a-z]{3}\d[a-z]`
`regex.match`	regex	`[A-Z ]+`
`strings.contains`	substring	`\u{200F}`
`strings.ends_with`	suffix	`pdf`
`regex.contains`	regex	`[a-z0-9]+@[a-z]+`
`sender.email.domain.domain`	equals	`""`
`regex.icontains`	regex	`((<br\s*/?>\s*){20,}|\n{20,})`
`regex.icontains`	regex	`(<p[^>]*>\s*<br\s*/?>\s*</p>\s*){30,}`
`regex.icontains`	regex	`(<p class=".*?"><span style=".*?"><o:p>&nbsp;</o:p></span></p>\s*){30,}`
`regex.icontains`	regex	`(<p>&nbsp;</p>\s*){7,}`
`regex.icontains`	regex	`(<p>&nbsp;</p><br>\s*){7,}`
`regex.icontains`	regex	`(<p[^>]*>\s*&nbsp;<br>\s*</p>\s*){5,}`
`regex.icontains`	regex	`(<p[^>]*>&nbsp;</p>\s*){7,}`
`ml.nlu_classifier(body.current_thread.text).intents[].name`	equals	`bec`
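
The excessive-whitespace indicators are ordinary regular expressions, so they can be tried against sample HTML outside the rule engine. The sketch below uses Python's `re` module with a subset of the patterns from the rule body. It assumes `regex.icontains` behaves like a case-insensitive search anywhere in the string, and the name `has_excessive_whitespace` is hypothetical.

```python
import re

# A subset of the rule's excessive-whitespace patterns, copied from the rule body.
WHITESPACE_PATTERNS = [
    r"((<br\s*/?>\s*){20,}|\n{20,})",
    r"(<p[^>]*>\s*<br\s*/?>\s*</p>\s*){30,}",
    r"(<p>&nbsp;</p>\s*){7,}",
    r"(<p[^>]*>&nbsp;</p>\s*){7,}",
]

# Case-insensitive, to approximate regex.icontains (an assumption).
_COMPILED = [re.compile(p, re.IGNORECASE) for p in WHITESPACE_PATTERNS]


def has_excessive_whitespace(html_raw):
    """Return True if any pattern matches somewhere in the raw HTML."""
    return any(p.search(html_raw) is not None for p in _COMPILED)


# Seven consecutive empty paragraphs meet the {7,} threshold.
padded = "<p>Please review.</p>" + "<p>&nbsp;</p>\n" * 7 + "<p>From: ...</p>"
print(has_excessive_whitespace(padded))
```

Padding of this kind pushes the fabricated "original" message far below the visible reply, which is one reason the rule treats long runs of empty elements as a signal.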
