Command Line Homoglyphs - Windows (Sysmon)
Threat actors may mount homoglyph attacks by substituting characters in process names or commands with visually similar Unicode characters to impersonate legitimate commands or messages (e.g., using Cyrillic “а” instead of Latin “a”). This tactic is often used to evade string-based detection and to confuse analysts during investigation. This use case detects Windows processes whose command line contains Unicode characters from commonly abused homoglyph ranges: Cyrillic (U+0400 to U+04FF), Greek and Coptic (U+0370 to U+03FF), and full-width Latin letters and digits.
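To illustrate why a plain string match misses a homoglyph, the following Python sketch compares a Latin command name with a look-alike that contains Cyrillic “а” (U+0430). The sample strings and the helper name `describe_non_ascii` are hypothetical and are not part of the rule.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


latin = "whoami"
spoofed = "who\u0430mi"  # Cyrillic small letter a in place of Latin "a"

# The two strings render alike but compare as different.
print(latin == spoofed)
# A naive substring check for the Latin spelling misses the spoofed name.
print("whoami" in spoofed)
# Listing the non-ASCII characters exposes the substitution.
print(describe_non_ascii(spoofed))
```

Inspecting code points in this way is one option an analyst might use when a command line looks correct on screen but does not match an expected string.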
MITRE ATT&CK coverage
| Tactic | Techniques |
|---|---|
| Stealth | T1027.010 Obfuscated Files or Information: Command Obfuscation |
References
- https://app.any.run/tasks/c044f84a-fd44-47ab-b53f-976debf96e63
- https://www.zdnet.com/article/magecart-group-uses-homoglyph-attacks-to-fool-you-into-visiting-malicious-websites/
- https://www.meshsecurity.io/blog/homoglyph-attacks-understanding-and-mitigating-the-threat
- https://www.bitdefender.com/en-us/blog/businessinsights/homograph-phishing-attacks-when-user-awareness-is-not-enough
Event coverage
| Provider | Event | Title |
|---|---|---|
| Sysmon | Event ID 1 | Process creation |
Rule body yaml
id: '44644.87462'
title: Command Line Homoglyphs - Windows
description: Threat actors may use homoglyph attacks by substituting characters in
  process names or commands with visually similar Unicode symbols to impersonate legitimate
  commands or messages (e.g., using Cyrillic “а” instead of Latin “a”). This tactic
  is often used to evade string-based detection and confuse analysts during investigation.
  This use case detects Windows processes containing Unicode characters from commonly
  abused homoglyph ranges, including Cyrillic extended, Greek extended, and full-width
  Latin letters and digits.
logic_format: Splunk
logic: '`get_endpoint_data` `get_endpoint_data_sysmon` (TERM(EventCode=1) OR "<EventID>1<")
  | regex process="[Ѐ-ӿͰ-Ͽａ-ｚＡ-Ｚ０-９]" | table _time, host, user, process, process_name,
  parent_process_name | bin span=1s | stats values(*) as * by _time, host '
techniques:
- defense-evasion:obfuscated files or information:command obfuscation
technique_id:
- T1027.010
data_category:
- Windows Sysmon
references:
- https://app.any.run/tasks/c044f84a-fd44-47ab-b53f-976debf96e63
- https://www.zdnet.com/article/magecart-group-uses-homoglyph-attacks-to-fool-you-into-visiting-malicious-websites/
- https://www.meshsecurity.io/blog/homoglyph-attacks-understanding-and-mitigating-the-threat
- https://www.bitdefender.com/en-us/blog/businessinsights/homograph-phishing-attacks-when-user-awareness-is-not-enough
Stages and Predicates
Stage 1: search
`get_endpoint_data` `get_endpoint_data_sysmon` (TERM(EventCode=1) OR "<EventID>1<")
Stage 2: regex
| regex process="[Ѐ-ӿͰ-Ͽａ-ｚＡ-Ｚ０-９]"
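As a rough illustration of what this stage keeps, the Python sketch below builds an equivalent character class from explicit code point ranges. It assumes the class is meant to cover Cyrillic (U+0400 to U+04FF), Greek and Coptic (U+0370 to U+03FF), and the full-width forms of a-z, A-Z and 0-9, as the rule description states. The name `has_homoglyph` and the sample command lines are hypothetical.

```python
import re

# Ranges written as escapes so the code points are unambiguous:
#   U+0400-U+04FF  Cyrillic
#   U+0370-U+03FF  Greek and Coptic
#   U+FF41-U+FF5A, U+FF21-U+FF3A, U+FF10-U+FF19  full-width a-z, A-Z, 0-9 (assumed)
HOMOGLYPH_CLASS = re.compile(
    "[\u0400-\u04FF\u0370-\u03FF\uFF41-\uFF5A\uFF21-\uFF3A\uFF10-\uFF19]"
)


def has_homoglyph(command_line):
    """Return True if the command line contains any character in the class."""
    return HOMOGLYPH_CLASS.search(command_line) is not None


samples = [
    "powershell.exe -nop -w hidden",                # plain ASCII
    "p\u043ewershell.exe -nop",                     # Cyrillic "о" replaces Latin "o"
    "cmd.exe /c \uFF57\uFF48\uFF4F\uFF41\uFF4D\uFF49",  # full-width "whoami"
]

for sample in samples:
    print(has_homoglyph(sample), repr(sample))
```

A single character from any of the ranges is enough to match, so command lines that legitimately contain Cyrillic or Greek text (for example, localized paths or user names) would also be kept by a filter of this kind.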
Stage 3: table
| table _time, host, user, process, process_name, parent_process_name
Stage 4: bucket
| bin span=1s
Stage 5: stats
| stats values(*) as * by _time, host
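Stages 4 and 5 together collapse matching events into one row per host per second. The Python sketch below approximates that behavior under stated assumptions: `_time` is an epoch timestamp in seconds, every field holds a single string, and distinct values are returned sorted (similar to, but not guaranteed identical to, Splunk's `values()` ordering). The function name `bucket_and_merge` and the sample events are hypothetical.

```python
from collections import defaultdict


def bucket_and_merge(events):
    """Group events by (1-second bucket, host) and collect distinct values per field."""
    grouped = defaultdict(lambda: defaultdict(set))
    for event in events:
        # Truncate the timestamp to the whole second, like `bin span=1s`.
        key = (int(event["_time"]), event["host"])
        for field, value in event.items():
            if field not in ("_time", "host"):
                grouped[key][field].add(value)
    # Turn each set into a sorted list of distinct values, like `values(*) as *`.
    return {
        key: {field: sorted(values) for field, values in fields.items()}
        for key, fields in grouped.items()
    }


events = [
    {"_time": 1700000000.2, "host": "ws01", "user": "alice",
     "process": "p\u043ewershell.exe -nop", "process_name": "powershell.exe"},
    {"_time": 1700000000.9, "host": "ws01", "user": "alice",
     "process": "cmd.exe /c \uFF57hoami", "process_name": "cmd.exe"},
    {"_time": 1700000001.1, "host": "ws01", "user": "alice",
     "process": "cmd.exe /c \uFF57hoami", "process_name": "cmd.exe"},
]

for key, fields in bucket_and_merge(events).items():
    print(key, fields)
```

With this grouping, two suspicious processes started on the same host within the same second appear as a single row with multi-valued fields, which reduces row count but means the pairing between a given `process` and its `parent_process_name` is no longer explicit.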
Indicators
Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high numbers point to widely used, community-vetted indicators. A blank or a 1 means the indicator is specific to this rule.
Search terms
Bare-string tokens in the SPL search body. Splunk matches each token against _raw (the untyped raw event text) anywhere it appears, not against a specific field. These don't surface in the Indicators table because they aren't predicates on a known field.
| Stage | Term |
|---|---|
| 1 | TERM(EventCode=1) |
| 1 | "<EventID>1<" |