
Command Line Homoglyphs - Windows (Sysmon)

Group by: _time, host
Source: github.com/anvilogic-forge/armory

Threat actors may mount homoglyph attacks by replacing characters in process names or commands with visually similar Unicode characters to impersonate legitimate commands or messages (e.g., using Cyrillic “а” instead of Latin “a”). This tactic is often used to evade string-based detection and to confuse analysts during investigation. This use case detects Windows processes whose command line contains Unicode characters from commonly abused homoglyph ranges, including Cyrillic extended, Greek extended, and full-width Latin letters and digits.
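To make the Cyrillic/Latin example concrete, here is a minimal Python sketch that shows why two names that look the same on screen fail a string comparison. The name `c\u0430lc.exe` and the helper `describe_non_ascii` are hypothetical and only serve as an illustration; they are not part of the rule.

```python
import unicodedata

LATIN_NAME = "calc.exe"
# Hypothetical spoofed name: the second character is Cyrillic U+0430, not Latin U+0061.
SPOOFED_NAME = "c\u0430lc.exe"


def describe_non_ascii(text):
    """Return (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]


print(LATIN_NAME == SPOOFED_NAME)        # False
print(describe_non_ascii(SPOOFED_NAME))  # [(1, 'U+0430', 'CYRILLIC SMALL LETTER A')]
```

A string-based detection keyed on the Latin spelling would miss the spoofed name, which is the gap this use case targets.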

MITRE ATT&CK coverage

References

Event coverage

Provider | Event | Title
Sysmon | Event ID 1 | Process creation

Rule body (YAML)

id: '44644.87462'
title: Command Line Homoglyphs - Windows
description: Threat actors may use homoglyph attacks by substituting characters in
  process names or commands with visually similar Unicode symbols to impersonate legitimate
  commands or messages (e.g., using Cyrillic “а” instead of Latin “a”). This tactic
  is often used to evade string-based detection and confuse analysts during investigation.
  This use case detects Windows processes containing Unicode characters from commonly
  abused homoglyph ranges, including Cyrillic extended, Greek extended, and full-width
  Latin letters and digits.
logic_format: Splunk
logic: '`get_endpoint_data` `get_endpoint_data_sysmon` (TERM(EventCode=1) OR "<EventID>1<")
  | regex process="[Ѐ-ӿͰ-Ͽａ-ｚＡ-Ｚ０-９]" | table _time, host, user, process, process_name,
  parent_process_name | bin span=1s | stats values(*) as * by _time, host '
techniques:
- defense-evasion:obfuscated files or information:command obfuscation
technique_id:
- T1027.010
data_category:
- Windows Sysmon
references:
- https://app.any.run/tasks/c044f84a-fd44-47ab-b53f-976debf96e63
- https://www.zdnet.com/article/magecart-group-uses-homoglyph-attacks-to-fool-you-into-visiting-malicious-websites/
- https://www.meshsecurity.io/blog/homoglyph-attacks-understanding-and-mitigating-the-threat
- https://www.bitdefender.com/en-us/blog/businessinsights/homograph-phishing-attacks-when-user-awareness-is-not-enough

Stages and Predicates

Stage 1: search

`get_endpoint_data` `get_endpoint_data_sysmon` (TERM(EventCode=1) OR "<EventID>1<")

Stage 2: regex

| regex process="[Ѐ-ӿͰ-Ͽａ-ｚＡ-Ｚ０-９]"
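The character class in this stage can be approximated outside Splunk for testing. The sketch below is an illustration, not the rule's implementation: it uses Python's `re` rather than Splunk's PCRE engine, writes the ranges as explicit code points, and assumes the third group of ranges is the full-width Latin letters and digits named in the rule description. The function name `homoglyph_hits` is hypothetical.

```python
import re

# Ranges, written as code points to avoid look-alike confusion in the source:
#   U+0400-U+04FF  Cyrillic
#   U+0370-U+03FF  Greek and Coptic
#   U+FF41-U+FF5A  full-width a-z (assumed from the rule description)
#   U+FF21-U+FF3A  full-width A-Z (assumed from the rule description)
#   U+FF10-U+FF19  full-width 0-9 (assumed from the rule description)
HOMOGLYPH_RE = re.compile(
    "[\u0400-\u04ff\u0370-\u03ff\uff41-\uff5a\uff21-\uff3a\uff10-\uff19]"
)


def homoglyph_hits(process):
    """Return the code points in `process` that fall inside the ranges above."""
    return [f"U+{ord(ch):04X}" for ch in HOMOGLYPH_RE.findall(process)]


print(homoglyph_hits("cmd.exe /c whoami"))  # []
print(homoglyph_hits("c\u0430lc.exe"))      # ['U+0430']
```

Plain ASCII command lines produce no hits, so only events carrying at least one character from these ranges would pass the stage. Legitimate Cyrillic or Greek text in paths and arguments also matches, so hits on hosts that use those languages may need tuning.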

Stage 3: table

| table _time, host, user, process, process_name, parent_process_name

Stage 4: bucket

| bin span=1s

Stage 5: stats

| stats values(*) as * by _time, host
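Stages 4 and 5 together collapse all matching events from the same host within the same second into one row. A rough Python sketch of that grouping follows, assuming `_time` is in epoch seconds. The function `bin_and_collect` and the sample events are hypothetical. Like `values()`, it keeps distinct values in sorted order.

```python
from collections import defaultdict


def bin_and_collect(events, span=1):
    """Group events by (bucketed _time, host) and collect sorted distinct field values."""
    groups = defaultdict(lambda: defaultdict(set))
    for ev in events:
        # Equivalent of `bin span=1s`: floor the timestamp to the bucket start.
        bucket = int(ev["_time"] // span) * span
        key = (bucket, ev["host"])
        for field, value in ev.items():
            if field not in ("_time", "host") and value is not None:
                groups[key][field].add(value)
    # Equivalent of `stats values(*) as * by _time, host`.
    return {
        key: {field: sorted(vals) for field, vals in fields.items()}
        for key, fields in groups.items()
    }


# Hypothetical sample events for illustration only.
EVENTS = [
    {"_time": 1700000000.2, "host": "ws01", "user": "alice", "process_name": "c\u0430lc.exe"},
    {"_time": 1700000000.9, "host": "ws01", "user": "alice", "process_name": "\u0441md.exe"},
    {"_time": 1700000001.1, "host": "ws01", "user": "bob", "process_name": "c\u0430lc.exe"},
]

for key, fields in bin_and_collect(EVENTS).items():
    print(key, fields)
```

The first two events share a bucket and become one row with two `process_name` values; the third lands in the next second and stays separate.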

Indicators

Each row is a field, operator, and value that the rule matches. The corpus column counts how many rules in the catalog look for the same combination: high counts point to widely used, community-vetted indicators, while a blank or a count of 1 means the indicator is specific to this rule.

Field | Kind | Values
EventCode | eq | 1 (corpus 237: splunk 224, kusto 13)
process | regex_match | "[Ѐ-ӿͰ-Ͽａ-ｚＡ-Ｚ０-９]" (corpus 2: splunk 2)

Search terms

Bare-string tokens in the SPL search body. Splunk matches each token against _raw (the untyped raw event text) anywhere it appears, not against a specific field. These don't surface in the Indicators table because they aren't predicates on a known field.

Stage | Term
1 | TERM
1 | "<EventID>1<"