Link: Multistage landing - Scribd document

Severity: medium
Type: rule
Source: github.com/sublime-security/sublime-rules

Detects inbound messages that link to exactly one Scribd document whose embedded links are suspicious, particularly links that reference Microsoft services or rely on evasion techniques such as CAPTCHAs and redirects to well-known domains. The rule analyzes both the rendered Scribd page and the destinations of its embedded links for suspicious patterns and redirects.

Threat classification

Sublime's own taxonomy (not MITRE ATT&CK).

Category | Values
Attack types | Credential Phishing
Tactics and techniques | Evasion, Social engineering, Impersonation: Brand, Free file host

Rule body MQL

type.inbound
// only one link to Scribd
and length(distinct(filter(body.links,
                           .href_url.domain.root_domain in ("scribd.com")
                           and strings.istarts_with(.href_url.path, "/document")
                    ),
                    .href_url.url
           )
) == 1
and any(body.links,
        .href_url.domain.root_domain == "scribd.com"
        and strings.istarts_with(.href_url.path, "/document")
        and (
          // target the embedded links via XPath
          any(html.xpath(ml.link_analysis(.).final_dom,
                         '//a[@class="ll"]/@href'
              ).nodes,
              strings.parse_url(.raw).domain.tld in $suspicious_tlds
              or strings.parse_url(.raw).domain.domain in $free_subdomain_hosts
              or strings.parse_url(.raw).domain.root_domain in $free_subdomain_hosts
              // observed pattern in credential theft URLs
              or strings.ilike(strings.parse_url(.raw).path,
                               "*o365*",
                               "*office365*",
                               "*microsoft*"
              )
              // observed pattern in credential theft URLs
              or strings.ilike(strings.parse_url(.raw).query_params,
                               "*o365*",
                               "*office365*",
                               "*microsoft*"
              )
              // observed pattern in credential theft URLs
              or any(beta.scan_base64(strings.parse_url(.raw).query_params),
                     strings.ilike(., "*o365*", "*office365*", "*microsoft*")
              )
              or ml.link_analysis(strings.parse_url(.raw), mode="aggressive").credphish.disposition == "phishing"
              or ml.link_analysis(strings.parse_url(.raw), mode="aggressive").credphish.contains_captcha
              or strings.icontains(ml.link_analysis(strings.parse_url(.raw),
                                                    mode="aggressive"
                                   ).final_dom.display_text,
                                   "I'm Human"
              )
              // bails out to a well-known domain, seen in evasion attempts
              or (
                length(ml.link_analysis(strings.parse_url(.raw),
                                        mode="aggressive"
                       ).redirect_history
                ) > 0
                and ml.link_analysis(strings.parse_url(.raw), mode="aggressive").effective_url.domain.root_domain in $tranco_10k
              )
          )
          // credential theft language on the main Scribd page
          or any(ml.nlu_classifier(beta.ocr(ml.link_analysis(.,
                                                             mode="aggressive"
                                            ).screenshot
                                   ).text
                 ).intents,
                 .name == "cred_theft" and .confidence != "low"
          )
        )
)
// negate highly trusted sender domains unless they fail DMARC authentication
and (
  (
    sender.email.domain.root_domain in $high_trust_sender_root_domains
    and not headers.auth_summary.dmarc.pass
  )
  or sender.email.domain.root_domain not in $high_trust_sender_root_domains
)
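
The final clause of the rule body keeps high-trust senders in scope only when they fail DMARC. The Python sketch below mirrors that boolean logic outside of MQL. The function name, the `HIGH_TRUST` set, and the treatment of a missing DMARC result as "did not pass" are assumptions made for illustration; they are not part of the rule or of Sublime's implementation.

```python
from typing import Optional, Set


def passes_sender_gate(
    sender_root_domain: str,
    dmarc_pass: Optional[bool],
    high_trust_root_domains: Set[str],
) -> bool:
    """Return True when the sender clause would let the rule keep matching."""
    if sender_root_domain in high_trust_root_domains:
        # High-trust senders stay in scope only when DMARC did not pass.
        # Assumption: a missing DMARC result (None) counts as "did not pass".
        return not dmarc_pass
    # Any sender outside the high-trust list stays in scope.
    return True


# Hypothetical stand-in for the $high_trust_sender_root_domains list.
HIGH_TRUST = {"example-trusted.com"}

print(passes_sender_gate("example-trusted.com", True, HIGH_TRUST))
print(passes_sender_gate("example-trusted.com", False, HIGH_TRUST))
print(passes_sender_gate("other.example", True, HIGH_TRUST))
```

Under ordinary boolean logic the two branches reduce to "not (high-trust and DMARC pass)"; the rule spells out both branches explicitly.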

Detection logic

Scope: inbound message.

  1. inbound message
  2. exactly one distinct .href_url.url among body.links where .href_url.domain.root_domain in ('scribd.com') and .href_url.path starts with '/document'
  3. any of body.links where all hold:
    • .href_url.domain.root_domain is 'scribd.com'
    • .href_url.path starts with '/document'
    • any of:
      • any of html.xpath(ml.link_analysis(.).final_dom, '//a[@class="ll"]/@href').nodes where any holds:
        • strings.parse_url(.raw).domain.tld in $suspicious_tlds
        • strings.parse_url(.raw).domain.domain in $free_subdomain_hosts
        • strings.parse_url(.raw).domain.root_domain in $free_subdomain_hosts
        • strings.parse_url(.raw).path matches any of 3 patterns
          • *o365*
          • *office365*
          • *microsoft*
        • strings.parse_url(.raw).query_params matches any of 3 patterns
          • *o365*
          • *office365*
          • *microsoft*
        • any of beta.scan_base64(...) where:
          • . matches any of 3 patterns
            • *o365*
            • *office365*
            • *microsoft*
        • ml.link_analysis(strings.parse_url(.raw), mode='aggressive').credphish.disposition is 'phishing'
        • ml.link_analysis(strings.parse_url(.raw), mode='aggressive').credphish.contains_captcha
        • ml.link_analysis(strings.parse_url(.raw), mode='aggressive').final_dom.display_text contains "I'm Human"
        • all of:
          • length(ml.link_analysis(strings.parse_url(.raw), mode='aggressive').redirect_history) > 0
          • ml.link_analysis(strings.parse_url(.raw), mode='aggressive').effective_url.domain.root_domain in $tranco_10k
      • any of ml.nlu_classifier(beta.ocr(ml.link_analysis(., mode='aggressive').screenshot).text).intents where all hold:
        • .name is 'cred_theft'
        • .confidence is not 'low'
  4. any of:
    • all of:
      • sender.email.domain.root_domain in $high_trust_sender_root_domains
      • not:
        • headers.auth_summary.dmarc.pass
    • sender.email.domain.root_domain not in $high_trust_sender_root_domains
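
Step 2 of the list above requires exactly one distinct Scribd document URL in the message body. A minimal Python sketch of that count follows. The helper names are hypothetical, and the host check is a simplification: it treats scribd.com and any subdomain of it as matching the root domain rather than consulting a public suffix list.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_scribd_document_link(url: str) -> bool:
    """True for URLs on scribd.com whose path starts with /document."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Simplified root-domain check (assumption): exact host or any subdomain.
    on_scribd = host == "scribd.com" or host.endswith(".scribd.com")
    # Case-insensitive prefix test, in the spirit of strings.istarts_with.
    return on_scribd and parts.path.lower().startswith("/document")


def has_single_scribd_document_link(urls: Iterable[str]) -> bool:
    """True when the links contain exactly one distinct Scribd document URL."""
    distinct = {u for u in urls if is_scribd_document_link(u)}
    return len(distinct) == 1


links = [
    "https://www.scribd.com/document/123/Invoice",
    "https://www.scribd.com/document/123/Invoice",  # duplicate, counted once
    "https://example.com/unsubscribe",
]
print(has_single_scribd_document_link(links))
```

Repeating the same URL does not change the count, while a second, different Scribd document link makes the condition fail.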

Inspects: body.links, body.links[].href_url.domain.root_domain, body.links[].href_url.path, headers.auth_summary.dmarc.pass, sender.email.domain.root_domain, type.inbound.
Sensors: beta.ocr, beta.scan_base64, html.xpath, ml.link_analysis, ml.nlu_classifier, strings.icontains, strings.ilike, strings.istarts_with, strings.parse_url.
Reference lists: $free_subdomain_hosts, $high_trust_sender_root_domains, $suspicious_tlds, $tranco_10k.
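
Among the reference lists, $tranco_10k backs the "bail out to a well-known domain" check: an embedded link that redirects at least once and ends on a popular domain is treated as an evasion signal. The sketch below models that condition in Python. `LinkAnalysisResult` is a hypothetical, trimmed-down stand-in for a link analysis result, and `POPULAR` is placeholder data rather than the real list.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LinkAnalysisResult:
    """Hypothetical stand-in holding only the two fields this check reads."""
    effective_root_domain: str
    redirect_history: List[str] = field(default_factory=list)


def bails_out_to_popular_domain(
    result: LinkAnalysisResult, popular_root_domains: Set[str]
) -> bool:
    """True when the link redirected at least once and landed on a popular domain."""
    redirected = len(result.redirect_history) > 0
    return redirected and result.effective_root_domain in popular_root_domains


# Placeholder for the $tranco_10k reference list.
POPULAR = {"wikipedia.org"}

redirected = LinkAnalysisResult("wikipedia.org", ["https://landing.example/x"])
direct = LinkAnalysisResult("wikipedia.org")
print(bails_out_to_popular_domain(redirected, POPULAR))
print(bails_out_to_popular_domain(direct, POPULAR))
```

A direct link to a popular domain does not trigger the check; only a redirect chain that ends there does.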

Indicators matched (8)

Field | Match | Value
body.links[].href_url.domain.root_domain | member | scribd.com
strings.istarts_with | prefix | /document
body.links[].href_url.domain.root_domain | equals | scribd.com
strings.ilike | substring | *o365*
strings.ilike | substring | *office365*
strings.ilike | substring | *microsoft*
strings.icontains | substring | I'm Human
ml.nlu_classifier(beta.ocr(ml.link_analysis(body.links[], mode='aggressive').screenshot).text).intents[].name | equals | cred_theft
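
The three wildcard indicators are applied to an embedded link's path, to its query parameters, and to base64-decoded content found in the query parameters. The Python sketch below approximates those checks with the standard library. It is only an illustration: `ilike` imitates case-insensitive wildcard matching with `fnmatch`, and `decoded_base64_values` is a rough stand-in for beta.scan_base64 that simply tries to decode each query parameter value, which is an assumption about behavior rather than a description of the real function.

```python
import base64
import binascii
from fnmatch import fnmatchcase
from typing import Iterator, Sequence
from urllib.parse import parse_qsl, urlsplit

PATTERNS = ("*o365*", "*office365*", "*microsoft*")


def ilike(value: str, patterns: Sequence[str] = PATTERNS) -> bool:
    """Case-insensitive wildcard match against any of the patterns."""
    lowered = value.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


def decoded_base64_values(query: str) -> Iterator[str]:
    """Yield text decoded from query parameter values that are valid base64."""
    for _, value in parse_qsl(query, keep_blank_values=True):
        padded = value + "=" * (-len(value) % 4)  # restore stripped padding
        try:
            yield base64.b64decode(padded, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue  # not base64, or not text once decoded


def url_has_microsoft_keyword(url: str) -> bool:
    """Check path, raw query string, and base64-decoded query values."""
    parts = urlsplit(url)
    return (
        ilike(parts.path)
        or ilike(parts.query)
        or any(ilike(text) for text in decoded_base64_values(parts.query))
    )


token = base64.b64encode(b"microsoft").decode("ascii")
print(url_has_microsoft_keyword("https://example.test/Office365/signin"))
print(url_has_microsoft_keyword("https://example.test/login?d=" + token))
print(url_has_microsoft_keyword("https://example.test/docs?id=42"))
```

The second URL contains no keyword in plain text; it matches only after the `d` parameter is decoded, which is the case the base64 scan is meant to catch.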