Attachment: Office file with credential phishing URLs

Severity: medium
Type: rule
Source: github.com/sublime-security/sublime-rules

Detects Office documents whose embedded URLs lead to credential phishing pages. The rule first filters out the standard XML namespace and schema URLs found in legitimate Office documents, keeps only attachments with one to three remaining URLs, and then runs machine learning link analysis on those URLs.

Threat classification

Sublime's own taxonomy (not MITRE ATT&CK).

Category	Values
Attack types	Credential Phishing
Tactics and techniques	Evasion, Social engineering

Event coverage

Message attribute
attachments (collection)
type

Rule body MQL

type.inbound
// Filter to Office documents that contain 1-3 non-schema URLs
and any(filter(attachments,
               // Only check Office documents that can contain macros/embedded content
               .file_extension in $file_extensions_macros

               // Count URLs after filtering out common XML namespace/schema URLs
               and 0 < sum(map(map(file.explode(.),
                                   // Filter out standard XML namespace URLs that appear in all Office docs
                                   filter(.scan.url.urls,
                                          // Exclude OpenXML format schemas
                                          .domain.domain not in (
                                            'schemas.openxmlformats.org',
                                            'schemas.microsoft.com',
                                            'www.w3.org'
                                          )
                                          // Additional Microsoft domain exclusion
                                          and not .domain.domain in (
                                            'microsoft.com',
                                            'wps.cn' // WPS Office is a China-based alternative to MS Office; its domain appears in the namespaces of documents it creates
                                          )
                                          // Exclude Dublin Core persistent URLs (metadata schemas)
                                          and not (
                                            .domain.domain == 'purl.org'
                                            and strings.starts_with(.path,
                                                                    '/dc/'
                                            )
                                          )
                                          // Exclude Dublin Core XML schemas
                                          and not (
                                            .domain.domain == "dublincore.org"
                                            and strings.starts_with(.path,
                                                                    '/schemas/xmls/'
                                            )
                                          )
                                   )
                               ),
                               // Count URLs in each exploded file component
                               length(.)
                           )
               ) <= 3 // Only process attachments with 3 or fewer non-schema URLs
        ),
        // For the filtered Office documents, check for malicious URLs
        any(file.explode(.),
            any(
                // Apply the same URL filtering to remove XML namespace noise
                filter(.scan.url.urls,
                       .domain.domain not in (
                         'schemas.openxmlformats.org',
                         'schemas.microsoft.com',
                         'www.w3.org'
                       )
                       and not .domain.domain in (
                         'microsoft.com',
                         'wps.cn' // WPS Office is a China-based alternative to MS Office; its domain appears in the namespaces of documents it creates
                       )
                       and not (
                         .domain.domain == 'purl.org'
                         and strings.starts_with(.path, '/dc/')
                       )
                       and not (
                         .domain.domain == "dublincore.org"
                         and strings.starts_with(.path, '/schemas/xmls/')
                       )
                ),
                // Run link analysis on the filtered URLs to detect phishing
                ml.link_analysis(.).credphish.disposition == "phishing"
                // confidence is only returned when a brand is detected; if it is absent, treat the check as true
                // this ensures that when there is a brand, the confidence is high,
                // and still allows a match when no confidence is returned
                and coalesce(ml.link_analysis(.).credphish.confidence == "high",
                             true
                )
                and not (
                  ml.link_analysis(.).credphish.brand.name is not null
                  and ml.link_analysis(.).credphish.brand.name == "GoDaddy"
                  and strings.icontains(ml.link_analysis(.).final_dom.inner_text,
                                        'is parked free, courtesy of GoDaddy.com.'
                  )
                  and strings.icontains(ml.link_analysis(.).final_dom.inner_text,
                                        'Get This Domain'
                  )
                )
            )
        )
)
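
The URL filtering and count gate in the rule above can be modelled outside MQL. The following is a minimal Python sketch, not the rule engine's implementation: it assumes `.domain.domain` corresponds to the URL's full hostname, and the names `is_candidate` and `passes_count_gate` are hypothetical helpers introduced only for illustration.

```python
from urllib.parse import urlparse

# Hosts excluded outright (mirrors the two `in (...)` lists in the rule).
EXCLUDED_HOSTS = {
    "schemas.openxmlformats.org",
    "schemas.microsoft.com",
    "www.w3.org",
    "microsoft.com",
    "wps.cn",
}

# Host -> path prefix pairs excluded (Dublin Core metadata schemas).
EXCLUDED_PREFIXES = {
    "purl.org": "/dc/",
    "dublincore.org": "/schemas/xmls/",
}


def is_candidate(url: str) -> bool:
    """Return True if the URL survives the namespace/schema exclusions."""
    parsed = urlparse(url)
    # Assumption: exact match on the full hostname, like `.domain.domain`.
    host = (parsed.hostname or "").lower()
    if host in EXCLUDED_HOSTS:
        return False
    prefix = EXCLUDED_PREFIXES.get(host)
    if prefix is not None and parsed.path.startswith(prefix):
        return False
    return True


def passes_count_gate(urls_per_component: list[list[str]]) -> bool:
    """Sum surviving URLs across exploded components; require 1 to 3."""
    total = sum(
        sum(1 for url in urls if is_candidate(url))
        for urls in urls_per_component
    )
    return 0 < total <= 3
```

Each inner list stands for the URLs found in one component returned by `file.explode`. An attachment with only namespace URLs yields a total of 0 and is skipped, as is one with more than three remaining URLs.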

Detection logic

Scope: inbound message.

inbound message
any of filter(attachments) where:
- any of file.explode(.) where:
  - any of filter(.scan.url.urls) where all hold:
    - ml.link_analysis(.).credphish.disposition is 'phishing'
    - coalesce(ml.link_analysis(.).credphish.confidence == 'high', true)
    - not all of:
      - ml.link_analysis(.).credphish.brand.name is set
      - ml.link_analysis(.).credphish.brand.name is 'GoDaddy'
      - ml.link_analysis(.).final_dom.inner_text contains 'is parked free, courtesy of GoDaddy.com.'
      - ml.link_analysis(.).final_dom.inner_text contains 'Get This Domain'

Inspects: attachments[].file_extension, type.inbound. Sensors: file.explode, ml.link_analysis, strings.icontains, strings.starts_with. Reference lists: $file_extensions_macros.
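
The per-URL verdict conditions listed above can be sketched in Python to make the `coalesce` default and the GoDaddy parked-page exclusion explicit. This is an illustrative model only: `LinkVerdict` is a hypothetical stand-in for the fields the rule reads from `ml.link_analysis`, with `None` representing a value that was not returned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkVerdict:
    """Hypothetical container for the link-analysis fields the rule reads."""
    disposition: str
    confidence: Optional[str] = None   # None models "confidence not returned"
    brand_name: Optional[str] = None   # None models "no brand detected"
    final_dom_text: str = ""


def is_godaddy_parked(verdict: LinkVerdict) -> bool:
    """True when the final page looks like a GoDaddy parked-domain page."""
    text = verdict.final_dom_text.lower()  # case-insensitive, like icontains
    return (
        verdict.brand_name == "GoDaddy"
        and "is parked free, courtesy of godaddy.com." in text
        and "get this domain" in text
    )


def url_matches(verdict: LinkVerdict) -> bool:
    """Apply the three per-URL conditions from the detection logic."""
    if verdict.disposition != "phishing":
        return False
    # coalesce(confidence == "high", true): a missing confidence counts as true.
    confidence_ok = True if verdict.confidence is None else verdict.confidence == "high"
    return confidence_ok and not is_godaddy_parked(verdict)
```

Under this model, a phishing disposition with no confidence value still matches, while a phishing disposition with an explicit non-high confidence does not.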

Indicators matched (18)

Field	Match	Value
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	member	`schemas.openxmlformats.org`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	member	`schemas.microsoft.com`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	member	`www.w3.org`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	member	`microsoft.com`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	member	`wps.cn`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	equals	`purl.org`
`strings.starts_with`	prefix	`/dc/`
`file.explode(attachments[])[].scan.url.urls[].domain.domain`	equals	`dublincore.org`
`strings.starts_with`	prefix	`/schemas/xmls/`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	member	`schemas.openxmlformats.org`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	member	`schemas.microsoft.com`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	member	`www.w3.org`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	member	`microsoft.com`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	member	`wps.cn`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	equals	`purl.org`
`file.explode(filter(attachments)[])[].scan.url.urls[].domain.domain`	equals	`dublincore.org`
`strings.icontains`	substring	`is parked free, courtesy of GoDaddy.com.`
`strings.icontains`	substring	`Get This Domain`
