Potential PowerShell Obfuscation via High Special Character Proportion

Status: production
Kind: building block (feeds higher-level correlation rules; not a standalone alert)
Severity: low
Time window: 9m
Author: Elastic
Source: github.com/elastic/detection-rules

Identifies PowerShell script block content with an unusually high proportion of non-alphanumeric characters, often produced by encoding, string mangling, or dynamic code generation. Attackers use special-character-heavy obfuscation to conceal payloads and to hinder static analysis and AMSI-based scanning.

MITRE ATT&CK coverage

Tactic	Techniques
Execution	`T1059.001` Command and Scripting Interpreter: PowerShell
Stealth	`T1027.010` Obfuscated Files or Information: Command Obfuscation, `T1140` Deobfuscate/Decode Files or Information

Event coverage

Provider	Event	Title
PowerShell	Event ID 4104	Creating Scriptblock text (MessageNumber of MessageTotal).

Rule body (elastic)

[metadata]
bypass_bbr_timing = true
creation_date = "2025/04/16"
integration = ["windows"]
maturity = "production"
updated_date = "2026/03/24"

[rule]
author = ["Elastic"]
building_block_type = "default"
description = """
Identifies PowerShell script block content with an unusually high proportion of non-alphanumeric characters, often
produced by encoding, string mangling, or dynamic code generation. Attackers use special-character heavy obfuscation to
conceal payloads and hinder static analysis and AMSI.
"""
from = "now-9m"
language = "esql"
license = "Elastic License v2"
name = "Potential PowerShell Obfuscation via High Special Character Proportion"
note = """## Triage and analysis

> **Disclaimer**:
> This guide was created by humans with the assistance of generative AI. While its contents have been manually curated to include the most valuable information, always validate assumptions and adjust procedures to match your internal runbooks and incident triage and response policies.

### Investigating Potential PowerShell Obfuscation via High Special Character Proportion

This rule alerts on PowerShell Script Block Logging content where non-alphanumeric characters make up an unusually large share of the script text. This pattern is frequently produced by encoding, aggressive escaping, string mangling, or dynamic code generation intended to hinder inspection.

The alert includes scoring and script-shape fields that help distinguish benign embedded data from suspicious deobfuscation/execution logic. Treat the script block text as potentially untrusted content and focus on reconstructing the full script and identifying downstream behavior through correlation.

#### Key alert fields to review

- `user.name`, `user.domain`, `user.id`: Account execution context for correlation, prioritization, and scoping.
- `host.name`, `host.id`: Host execution context for correlation, prioritization, and scoping.
- `file.path`, `file.directory`, `file.name`: File-origin context when the script block is sourced from an on-disk file.
- `powershell.file.script_block_text`: Script block content that matched the detection logic.
- `powershell.file.script_block_id`, `powershell.sequence`, `powershell.total`: Script block metadata to pivot to other fragments or reconstruct full script content when split across multiple events.
- `Esql.script_block_tmp`: Transformed script block where detection patterns replace original content with a marker to support scoring/counting and quickly spot match locations.
- `Esql.script_block_ratio`: Proportion of the script block's characters that match the alert's target character set, divided by total script length (0-1).
- `Esql.script_block_pattern_count`: Count of matches for the detection pattern(s) observed in the script block content.
- `powershell.file.script_block_entropy_bits`: Shannon entropy of the script block. Higher values may indicate obfuscation.
- `powershell.file.script_block_surprisal_stdev`: Standard deviation of surprisal across the script block. Low values indicate uniform randomness. High values indicate mixed patterns and variability.
- `powershell.file.script_block_unique_symbols`: Count of distinct characters present in the script block.
- `powershell.file.script_block_length`: Script block length (size) context.

#### Possible investigation steps

- Establish scope and execution context:
  - Review `@timestamp` to set an investigation window and identify preceding/follow-on activity.
  - Identify the affected `host.id`/`host.name` and the executing `user.id`/`user.name`/`user.domain`.
  - Determine whether the user/host pairing is expected for PowerShell usage in your environment and whether the account is privileged or widely used.

- Interpret the alert scoring and "shape" signals:
  - Review `Esql.script_block_ratio` and `Esql.script_block_pattern_count` alongside `powershell.file.script_block_length` to understand how extreme the special-character density is.
  - Use `Esql.script_block_tmp` to quickly assess whether the special characters cluster in a single region (for example, one embedded blob) or are distributed throughout the script (for example, pervasive mangling).
  - Use `powershell.file.script_block_entropy_bits`, `powershell.file.script_block_unique_symbols`, and `powershell.file.script_block_surprisal_stdev` to guide prioritization:
    - High entropy with many unique symbols commonly aligns with encoded/compressed/encrypted blobs embedded in the script.
    - High special-character ratio with lower entropy can align with heavy escaping, string concatenation, or code generation.
    - Low surprisal variability (low standard deviation) can indicate uniformly random-looking content typical of dense encoding.

- Reconstruct the full script block before making a determination:
  - Pivot on `powershell.file.script_block_id` to collect all fragments for the script block.
  - Reassemble fragments in order using `powershell.sequence` and confirm completeness using `powershell.total`.
  - If fragments are missing, treat the visible content as incomplete and continue collection/scoping before concluding intent.

- Analyze `powershell.file.script_block_text` for intent and technique:
  - Identify whether the content is primarily data (for example, long opaque strings) or executable logic (functions, control flow, and invocation).
  - Look for common deobfuscation and dynamic execution patterns, such as:
    - Decoding/decompression routines (for example, Base64 decoding, byte/char transformations, compression streams).
    - Dynamic invocation or staged execution (for example, `Invoke-Expression`/`IEX`, reflection, `.Invoke()`, `Add-Type`).
    - Retrieval of remote content or secondary payloads (for example, web request/client usage, download-and-execute flow).
  - If the script includes clear indicators (domains, URLs, IP addresses, file paths, file names), capture them from the script text for pivoting and scoping.

- Determine provenance and expectedness using available file context:
  - If `file.path`/`file.directory`/`file.name` are present, assess whether the location and naming align with known administrative scripts or approved automation.
  - If file fields are absent, treat the activity as potentially interactive or in-memory and prioritize identifying what initiated PowerShell through adjacent host telemetry.

- Correlate with adjacent telemetry (as available in your environment) to confirm impact:
  - Process activity on the same host and time window to determine the initiating process and whether PowerShell was launched indirectly.
  - Network activity to identify outbound connections consistent with download, staging, or command-and-control behavior.
  - File and registry activity to identify dropped artifacts or persistence-related changes.
  - Authentication activity for the same `user.id` to identify suspicious logons or lateral movement preceding the script execution.

- Expand scope across the environment:
  - Search for additional script block events on the same `host.id` and `user.id` near the alert time to identify staged execution (for example, decoding in one block, execution in another).
  - Hunt for similar script content using stable substrings from `powershell.file.script_block_text`, and group by `file.name` or `file.path` when present to identify reuse.

### False positive analysis

- False positives are most likely when scripts embed or manipulate large non-code payloads (for example, serialized objects, structured data, certificates, or compressed content) or when tooling auto-generates scripts with heavy escaping and templating.
- Validate benign hypotheses by confirming a consistent execution pattern over time (recurring `host.id` and `user.id`), expected provenance (`file.path`/`file.name` when present), and script content that performs known administrative functions rather than decoding and executing newly generated code.
- Treat unexpected execution context (new user/host pairing) combined with high entropy and opaque content as higher risk, even if the script text does not immediately reveal its final payload.

### Response and remediation

- If malicious or suspicious activity is confirmed:
  - Contain the affected host to prevent additional execution and lateral movement.
  - Preserve evidence from the alert, including `powershell.file.script_block_text`, reconstructed fragments (if applicable), `powershell.file.script_block_id`, and the scoring fields (`Esql.script_block_ratio`, `Esql.script_block_pattern_count`, `powershell.file.script_block_entropy_bits`).
  - Identify and remediate follow-on behavior discovered during correlation (downloaded payloads, dropped files, persistence changes, or suspicious network destinations).
  - Scope the intrusion by searching for similar script content across the environment using stable substrings from `powershell.file.script_block_text`, and by pivoting on `user.id`, `host.id`, `file.path`, and `file.name`.
  - If account compromise is suspected, reset credentials for the affected user and review other recent activity for that `user.id`.

- If the activity is determined benign:
  - Document the responsible script or workflow, expected execution context, and typical frequency.
  - Monitor for deviations such as new hosts, new users, or materially different script content that may indicate abuse of a legitimate mechanism.
"""
risk_score = 21
rule_id = "f9753455-8d55-4ad8-b70a-e07b6f18deea"
setup = """## Setup

PowerShell Script Block Logging must be enabled to generate the events used by this rule (e.g., 4104).
Setup instructions: https://ela.st/powershell-logging-setup
"""
severity = "low"
tags = [
    "Domain: Endpoint",
    "OS: Windows",
    "Use Case: Threat Detection",
    "Tactic: Defense Evasion",
    "Data Source: PowerShell Logs",
    "Rule Type: BBR",
    "Resources: Investigation Guide",
]
timestamp_override = "event.ingested"
type = "esql"

query = '''
from logs-windows.powershell_operational* metadata _id, _version, _index
| where event.code == "4104"

// Filter out smaller scripts that are unlikely to implement obfuscation using the patterns we are looking for
| eval Esql.script_block_length = length(powershell.file.script_block_text)
| where Esql.script_block_length > 1000

// replace the patterns we are looking for with the 🔥 emoji to enable counting them
// The emoji is used because it's unlikely to appear in scripts and has a consistent character length of 1
// Excludes spaces, #, = and - as they are heavily used in scripts for formatting
| eval Esql.script_block_tmp = replace(powershell.file.script_block_text, """[^0-9A-Za-z\s#=-]""", "🔥")

// count how many patterns were detected by calculating the number of 🔥 characters inserted
| eval Esql.script_block_pattern_count = Esql.script_block_length - length(replace(Esql.script_block_tmp, "🔥", ""))

// Calculate the ratio of special characters to total length
| eval Esql.script_block_ratio = Esql.script_block_pattern_count::double / Esql.script_block_length::double

// keep the fields relevant to the query, although this is not needed as the alert is populated using _id
| keep
    Esql.script_block_pattern_count,
    Esql.script_block_length,
    Esql.script_block_ratio,
    Esql.script_block_tmp,
    powershell.file.*,
    file.path,
    file.directory,
    powershell.sequence,
    powershell.total,
    _id,
    _version,
    _index,
    host.name,
    host.id,
    agent.id,
    user.id

// Filter for scripts with high special character ratio
| where Esql.script_block_ratio > 0.35

// Exclude Noisy Patterns
| where not file.directory like "C:\\\\ProgramData\\\\Microsoft\\\\Windows Defender Advanced Threat Protection\\\\DataCollection\\\\*"
  // ESQL requires this condition, otherwise it only returns matches where file.directory exists.
  or file.directory IS NULL
'''


[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1027"
name = "Obfuscated Files or Information"
reference = "https://attack.mitre.org/techniques/T1027/"

[[rule.threat.technique.subtechnique]]
id = "T1027.010"
name = "Command Obfuscation"
reference = "https://attack.mitre.org/techniques/T1027/010/"

[[rule.threat.technique]]
id = "T1140"
name = "Deobfuscate/Decode Files or Information"
reference = "https://attack.mitre.org/techniques/T1140/"

[rule.threat.tactic]
id = "TA0005"
name = "Defense Evasion"
reference = "https://attack.mitre.org/tactics/TA0005/"

[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1059"
name = "Command and Scripting Interpreter"
reference = "https://attack.mitre.org/techniques/T1059/"

[[rule.threat.technique.subtechnique]]
id = "T1059.001"
name = "PowerShell"
reference = "https://attack.mitre.org/techniques/T1059/001/"

[rule.threat.tactic]
id = "TA0002"
name = "Execution"
reference = "https://attack.mitre.org/tactics/TA0002/"
[rule.investigation_fields]
field_names = [
    "@timestamp",
    "user.name",
    "user.id",
    "user.domain",
    "powershell.file.script_block_text",
    "powershell.file.script_block_id",
    "powershell.sequence",
    "powershell.total",
    "file.path",
    "file.directory",
    "file.name",
    "process.pid",
    "host.name",
    "host.id",
    "powershell.file.script_block_length"
]

Stages and Predicates

Stage 1: `from`

from logs-windows.powershell_operational* metadata _id, _version, _index

Stage 2: `where`

| where event.code == "4104"

Stage 3: `eval`

| eval Esql.script_block_length = length(powershell.file.script_block_text)

Stage 4: `where`

| where Esql.script_block_length > 1000

Stage 5: `eval`

| eval Esql.script_block_tmp = replace(powershell.file.script_block_text, """[^0-9A-Za-z\s#=-]""", "🔥")

Stage 6: `eval`

| eval Esql.script_block_pattern_count = Esql.script_block_length - length(replace(Esql.script_block_tmp, "🔥", ""))
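
Stages 5 and 6 can be mirrored outside Elasticsearch to check a sample by hand. The following is a minimal Python sketch, not the rule's implementation: the function name `count_special` is hypothetical, and `re.ASCII` is an assumption made so that `\s` covers only ASCII whitespace (the ES|QL regex engine may classify some characters differently).

```python
import re

# Same character class as the rule: anything that is not an ASCII letter,
# digit, whitespace, '#', '=' or '-'. re.ASCII is an assumption of this sketch.
SPECIAL = re.compile(r"[^0-9A-Za-z\s#=-]", re.ASCII)
MARKER = "\U0001F525"  # the fire emoji, a single character used as the marker


def count_special(script_text: str) -> tuple[str, int]:
    """Return (marked_text, special_char_count) for one script block."""
    # Stage 5: swap every special character for the marker. Each match is one
    # character and so is the marker, so the overall length does not change.
    marked = SPECIAL.sub(MARKER, script_text)
    # Stage 6: strip the markers; the length lost equals the number of matches.
    count = len(script_text) - len(marked.replace(MARKER, ""))
    return marked, count


marked, count = count_special("$a = 'x' + [char]0x41")
print(marked)
print(count)
```

A marker that already occurs in the input does not skew the count in this sketch, because the marker itself falls outside the allowed character set and is therefore matched and counted like any other special character.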

Stage 7: `eval`

| eval Esql.script_block_ratio = Esql.script_block_pattern_count::double / Esql.script_block_length::double

Stage 8: `keep`

| keep
    Esql.script_block_pattern_count,
    Esql.script_block_length,
    Esql.script_block_ratio,
    Esql.script_block_tmp,
    powershell.file.*,
    file.path,
    file.directory,
    powershell.sequence,
    powershell.total,
    _id,
    _version,
    _index,
    host.name,
    host.id,
    agent.id,
    user.id

Stage 9: `where`

| where Esql.script_block_ratio > 0.35
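
To get a feel for what the length and ratio thresholds admit, the filters from Stages 4 and 9 can be approximated in Python. This is a sketch under assumptions: it counts regex matches directly instead of going through the marker substitution (the result should be the same), `re.ASCII` is assumed to approximate the whitespace handling of ES|QL, and both sample strings are made up for illustration rather than taken from real telemetry.

```python
import re

SPECIAL = re.compile(r"[^0-9A-Za-z\s#=-]", re.ASCII)
MIN_LENGTH = 1000       # Stage 4: script length must exceed this
RATIO_THRESHOLD = 0.35  # Stage 9: special-character ratio must exceed this


def special_char_ratio(script_text: str) -> float:
    """Share of characters outside the rule's allowed set (0.0 to 1.0)."""
    if not script_text:
        # Guard added for this sketch; the rule never divides by zero because
        # Stage 4 already requires a length above 1000.
        return 0.0
    return len(SPECIAL.findall(script_text)) / len(script_text)


def passes_thresholds(script_text: str) -> bool:
    """True when the text clears both the length and the ratio filter."""
    return (len(script_text) > MIN_LENGTH
            and special_char_ratio(script_text) > RATIO_THRESHOLD)


# Illustrative inputs only (hypothetical, not real script blocks).
plain = "Get-ChildItem -Path C:\\Temp -Recurse | Sort-Object Length\n" * 30
mangled = "${;}=+$();${=}=${;};${+}=++${;};" * 40

for name, text in (("plain", plain), ("mangled", mangled)):
    print(name, len(text), round(special_char_ratio(text), 2), passes_thresholds(text))
```

Both comparisons are strict, so a script of exactly 1000 characters or a ratio of exactly 0.35 would not pass.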

Stage 10: `where`

| where not file.directory like "C:\\\\ProgramData\\\\Microsoft\\\\Windows Defender Advanced Threat Protection\\\\DataCollection\\\\*"
  or file.directory IS NULL
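
The `IS NULL` clause matters because of null semantics: when `file.directory` is missing, the `NOT ... LIKE` comparison evaluates to null rather than true, and `where` keeps only rows whose condition is true. The Python sketch below models that three-valued behavior with `None`. It rests on assumptions: the escaped pattern is read as a literal directory prefix followed by a wildcard, the comparison is treated as case-sensitive, and the function names and sample paths are hypothetical.

```python
from typing import Optional

# Assumed decoded form of the rule's LIKE pattern: a literal directory prefix
# followed by a trailing wildcard.
EXCLUDED_PREFIX = (
    "C:\\ProgramData\\Microsoft\\Windows Defender Advanced Threat Protection"
    "\\DataCollection\\"
)


def not_like_excluded(directory: Optional[str]) -> Optional[bool]:
    """Three-valued `NOT file.directory LIKE ...`; None stands in for null."""
    if directory is None:
        return None  # a comparison against null yields null, not true
    return not directory.startswith(EXCLUDED_PREFIX)


def keep_event(directory: Optional[str]) -> bool:
    """Stage 10: keep the row only when the whole condition is true."""
    # Models `<not like> OR file.directory IS NULL`. Without the second
    # operand, a null directory would leave the condition null and the row
    # would be dropped.
    return not_like_excluded(directory) is True or directory is None


print(keep_event(None))                           # no file context
print(keep_event("C:\\Users\\alice\\Documents"))  # hypothetical path
print(keep_event(EXCLUDED_PREFIX + "example"))    # hypothetical excluded path
```

In this model, events without file context (for example, interactive or in-memory execution) stay in scope, while only the one noisy directory is excluded.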

Indicators

Each row is a field, operator, and value that the rule matches. The corpus count is the number of other rules in the catalog that look for the same combination: a high number points to a widely used, community-vetted indicator, while a blank or 1 means the indicator is specific to this rule.

Field	Kind	Values	Corpus
`Esql.script_block_length`	gt	`1000`	3 (elastic 3)
`Esql.script_block_ratio`	gt	`0.35`	
`file.directory`	is_null	(no value, null check)	

Output fields

Fields the rule emits when it matches. For this ES|QL rule, they are the columns retained by the `keep` stage (Stage 8).

Field	Source
`Esql.script_block_pattern_count`	`KEEP Esql.script_block_pattern_count`
`Esql.script_block_length`	`KEEP Esql.script_block_length`
`Esql.script_block_ratio`	`KEEP Esql.script_block_ratio`
`Esql.script_block_tmp`	`KEEP Esql.script_block_tmp`
`powershell.file.*`	`KEEP powershell.file.*`
`file.path`	`KEEP file.path`
`file.directory`	`KEEP file.directory`
`powershell.sequence`	`KEEP powershell.sequence`
`powershell.total`	`KEEP powershell.total`
`_id`	`KEEP _id`
`_version`	`KEEP _version`
`_index`	`KEEP _index`
`host.name`	`KEEP host.name`
`host.id`	`KEEP host.id`
`agent.id`	`KEEP agent.id`
`user.id`	`KEEP user.id`
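
The investigation guide in the rule body recommends reassembling a fragmented script block with `powershell.file.script_block_id`, `powershell.sequence`, and `powershell.total` before judging intent. A minimal Python sketch of that step follows. It assumes events have already been fetched as flat dictionaries keyed by dotted field names (real documents are nested), that sequence numbers start at 1, and that all events come from a single host; the `reassemble` helper is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Optional


def reassemble(events: Iterable[dict]) -> dict[str, Optional[str]]:
    """Map each script_block_id to its full text, or None if fragments are missing."""
    fragments: dict[str, dict[int, str]] = defaultdict(dict)
    totals: dict[str, int] = {}
    for event in events:
        block_id = event["powershell.file.script_block_id"]
        sequence = event["powershell.sequence"]
        fragments[block_id][sequence] = event["powershell.file.script_block_text"]
        totals[block_id] = event["powershell.total"]

    result: dict[str, Optional[str]] = {}
    for block_id, parts in fragments.items():
        expected = range(1, totals[block_id] + 1)  # assumes 1-based numbering
        if all(seq in parts for seq in expected):
            # Join fragments in sequence order, regardless of arrival order.
            result[block_id] = "".join(parts[seq] for seq in expected)
        else:
            # Incomplete: treat the visible content as partial evidence only.
            result[block_id] = None
    return result


def _event(block_id: str, seq: int, total: int, text: str) -> dict:
    """Build a flat sample event (illustrative values only)."""
    return {
        "powershell.file.script_block_id": block_id,
        "powershell.sequence": seq,
        "powershell.total": total,
        "powershell.file.script_block_text": text,
    }


sample = [
    _event("block-a", 2, 3, "World "),
    _event("block-a", 1, 3, "Hello "),
    _event("block-a", 3, 3, "again"),
    _event("block-b", 1, 2, "only the first half"),
]
print(reassemble(sample))
```

Returning `None` for an incomplete block keeps partial content from being mistaken for the whole script, in line with the guide's advice to continue collection when fragments are missing.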
