Kubernetes Process with Resource Ratio Anomalies
The following analytic detects anomalous changes in resource utilization ratios for processes running on a Kubernetes node. It leverages process metrics collected via an OTEL collector and hostmetrics receiver, analyzed through Splunk Observability Cloud. The detection uses a lookup table containing average and standard deviation values for various resource ratios (e.g., CPU:memory, CPU:disk operations). Significant deviations from these baselines may indicate compromised processes, malicious activity, or misconfigurations. If confirmed malicious, this could signify a security breach, allowing attackers to manipulate workloads, potentially leading to data exfiltration or service disruption.
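The ratio-and-threshold idea described above can be sketched in Python. This is an illustration only, not the rule's implementation: the function names `safe_ratio`, `resource_ratios`, and `exceeds_baseline` are hypothetical, the sample numbers are made up, and treating a zero denominator as a ratio of 0 is an assumption of the sketch. The five ratios and the four-standard-deviation threshold are taken from the search itself.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    # Assumption of this sketch: an undefined ratio (zero denominator) is treated as 0.
    return numerator / denominator if denominator else 0.0


def resource_ratios(metrics: dict) -> dict:
    """Compute the five ratios the search derives from averaged process metrics."""
    cpu = metrics["process.cpu.utilization"]
    mem = metrics["process.memory.utilization"]
    disk = metrics["process.disk.operations"]
    threads = metrics["process.threads"]
    return {
        "cpu:mem": safe_ratio(cpu, mem),
        "cpu:disk": safe_ratio(cpu, disk),
        "mem:disk": safe_ratio(mem, disk),
        "cpu:threads": safe_ratio(cpu, threads),
        "disk:threads": safe_ratio(disk, threads),
    }


def exceeds_baseline(value: float, avg: float, stdev: float, n_sigma: float = 4.0) -> bool:
    """True when value is more than n_sigma standard deviations above the average."""
    return value > avg + n_sigma * stdev


# Made-up sample values for one process in one 10-second window.
sample = {
    "process.cpu.utilization": 8.0,
    "process.memory.utilization": 2.0,
    "process.disk.operations": 0.0,
    "process.threads": 4.0,
}
ratios = resource_ratios(sample)
print(ratios["cpu:mem"])  # 4.0
print(exceeds_baseline(ratios["cpu:mem"], avg=1.0, stdev=0.5))  # True: 4.0 > 1.0 + 4 * 0.5
```

A ratio that sits exactly on the threshold is not flagged, because the comparison is strict.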
MITRE ATT&CK coverage
| Tactic | Techniques |
|---|---|
| Execution | T1204 User Execution |
Rule body splunk
name: Kubernetes Process with Resource Ratio Anomalies
id: 0d42b295-0f1f-4183-b75e-377975f47c65
version: 9
creation_date: '2024-01-10'
modification_date: '2026-05-13'
author: Matthew Moore, Splunk
status: experimental
type: Anomaly
description: The following analytic detects anomalous changes in resource utilization ratios for processes running on a Kubernetes node. It leverages process metrics collected via an OTEL collector and hostmetrics receiver, analyzed through Splunk Observability Cloud. The detection uses a lookup table containing average and standard deviation values for various resource ratios (e.g., CPU:memory, CPU:disk operations). Significant deviations from these baselines may indicate compromised processes, malicious activity, or misconfigurations. If confirmed malicious, this could signify a security breach, allowing attackers to manipulate workloads, potentially leading to data exfiltration or service disruption.
data_source: []
search: "| mstats avg(process.*) as process.* where `kubernetes_metrics` by host.name k8s.cluster.name k8s.node.name process.executable.name span=10s | eval cpu:mem = 'process.cpu.utilization'/'process.memory.utilization' | eval cpu:disk = 'process.cpu.utilization'/'process.disk.operations' | eval mem:disk = 'process.memory.utilization'/'process.disk.operations' | eval cpu:threads = 'process.cpu.utilization'/'process.threads' | eval disk:threads = 'process.disk.operations'/'process.threads' | eval key = 'k8s.cluster.name' + \":\" + 'host.name' + \":\" + 'process.executable.name' | lookup k8s_process_resource_ratio_baseline key | fillnull | eval anomalies = \"\" | foreach stdev_* [ eval anomalies =if( '<<MATCHSTR>>' > ('avg_<<MATCHSTR>>' + 4 * 'stdev_<<MATCHSTR>>'), anomalies + \"<<MATCHSTR>> ratio higher than average by \" + tostring(round(('<<MATCHSTR>>' - 'avg_<<MATCHSTR>>')/'stdev_<<MATCHSTR>>' ,2)) + \" Standard Deviations. <<MATCHSTR>>=\" + tostring('<<MATCHSTR>>') + \" avg_<<MATCHSTR>>=\" + tostring('avg_<<MATCHSTR>>') + \" 'stdev_<<MATCHSTR>>'=\" + tostring('stdev_<<MATCHSTR>>') + \", \" , anomalies) ] | eval anomalies = replace(anomalies, \",\\s$\", \"\") | where anomalies!=\"\" | stats count values(anomalies) as anomalies by host.name k8s.cluster.name k8s.node.name process.executable.name | where count > 5 | rename host.name as host | `kubernetes_process_with_resource_ratio_anomalies_filter`"
how_to_implement: "To implement this detection, follow these steps:\n* Deploy the OpenTelemetry Collector (OTEL) to your Kubernetes cluster.\n* Enable the hostmetrics/process receiver in the OTEL configuration.\n* Ensure that the process metrics, specifically process.cpu.utilization and process.memory.utilization, are enabled.\n* Install the Splunk Infrastructure Monitoring (SIM) add-on. (ref: https://splunkbase.splunk.com/app/5247)\n * Configure the SIM add-on with your Observability Cloud Organization ID and Access Token.\n* Set up the SIM modular input to ingest Process Metrics. Name this input \"sim_process_metrics_to_metrics_index\".\n* In the SIM configuration, set the Organization ID to your Observability Cloud Organization ID.\n* Set the Signal Flow Program to the following: data('process.threads').publish(label='A'); data('process.cpu.utilization').publish(label='B'); data('process.cpu.time').publish(label='C'); data('process.disk.io').publish(label='D'); data('process.memory.usage').publish(label='E'); data('process.memory.virtual').publish(label='F'); data('process.memory.utilization').publish(label='G'); data('process.cpu.utilization').publish(label='H'); data('process.disk.operations').publish(label='I'); data('process.handles').publish(label='J'); data('process.threads').publish(label='K')\n* Set the Metric Resolution to 10000.\n * Leave all other settings at their default values.\n* Run the baseline search \"Baseline Of Kubernetes Process Resource Ratio\" to populate the lookup used by this detection."
known_false_positives: No false positives have been identified at this time.
references:
- https://github.com/signalfx/splunk-otel-collector-chart
intermediate_findings:
entities:
- field: host
type: system
score: 20
message: Kubernetes Process with Resource Ratio Anomalies on host $host$
analytic_story:
- Abnormal Kubernetes Behavior using Splunk Infrastructure Monitoring
asset_type: Kubernetes
mitre_attack_id:
- T1204
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
category: cloud
security_domain: network
baselines:
- Baseline Of Kubernetes Process Resource Ratio
Stages and Predicates
Stage 1: search
| mstats avg(process.*) as process.* where `kubernetes_metrics` by host.name k8s.cluster.name k8s.node.name process.executable.name span=10s
Stage 2: eval
| eval cpu:mem = 'process.cpu.utilization'/'process.memory.utilization'
Stage 3: eval
| eval cpu:disk = 'process.cpu.utilization'/'process.disk.operations'
Stage 4: eval
| eval mem:disk = 'process.memory.utilization'/'process.disk.operations'
Stage 5: eval
| eval cpu:threads = 'process.cpu.utilization'/'process.threads'
Stage 6: eval
| eval disk:threads = 'process.disk.operations'/'process.threads'
Stage 7: eval
| eval key = 'k8s.cluster.name' + ":" + 'host.name' + ":" + 'process.executable.name'
Stage 8: lookup
| lookup k8s_process_resource_ratio_baseline key
Stage 9: fillnull
| fillnull
Stage 10: eval
| eval anomalies = ""
Stage 11: search
| foreach stdev_* [ eval anomalies =if( '<<MATCHSTR>>' > ('avg_<<MATCHSTR>>' + 4 * 'stdev_<<MATCHSTR>>'), anomalies + "<<MATCHSTR>> ratio higher than average by " + tostring(round(('<<MATCHSTR>>' - 'avg_<<MATCHSTR>>')/'stdev_<<MATCHSTR>>' ,2)) + " Standard Deviations. <<MATCHSTR>>=" + tostring('<<MATCHSTR>>') + " avg_<<MATCHSTR>>=" + tostring('avg_<<MATCHSTR>>') + " 'stdev_<<MATCHSTR>>'=" + tostring('stdev_<<MATCHSTR>>') + ", " , anomalies) ]
Stage 12: eval
| eval anomalies = replace(anomalies, ",\s$", "")
Stage 13: where
| where anomalies!=""
Stage 14: stats
| stats count values(anomalies) as anomalies by host.name k8s.cluster.name k8s.node.name process.executable.name
Stage 15: where
| where count > 5
Stage 16: rename
| rename host.name as host
Stage 17: search
| `kubernetes_process_with_resource_ratio_anomalies_filter`
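Stages 7 through 15 amount to a baseline lookup, a per-window check, and a persistence filter. The following Python sketch mirrors that flow under stated assumptions: each input record is one 10-second window for one process with its ratios already computed, and the record layout, the shape of the `baseline` dictionary, and the function names are all hypothetical. The sketch skips ratios that have no baseline entry, which may differ from how the SPL behaves after `fillnull`.

```python
from collections import defaultdict

N_SIGMA = 4        # multiplier applied to the standard deviation in the foreach stage
MIN_WINDOWS = 5    # the rule keeps groups where count > 5


def baseline_key(cluster: str, host: str, executable: str) -> str:
    """Build the lookup key in the same order as the rule: cluster:host:executable."""
    return f"{cluster}:{host}:{executable}"


def find_anomalous_processes(windows, baseline):
    """Count windows with at least one ratio above avg + N_SIGMA * stdev.

    windows:  iterable of dicts with 'cluster', 'host', 'node', 'executable', 'ratios'
    baseline: dict mapping key -> {ratio_name: (avg, stdev)}  (hypothetical shape)
    Returns {(host, cluster, node, executable): count} for groups with count > MIN_WINDOWS.
    """
    counts = defaultdict(int)
    for w in windows:
        stats = baseline.get(baseline_key(w["cluster"], w["host"], w["executable"]), {})
        flagged = [
            name
            for name, value in w["ratios"].items()
            if name in stats and value > stats[name][0] + N_SIGMA * stats[name][1]
        ]
        if flagged:
            counts[(w["host"], w["cluster"], w["node"], w["executable"])] += 1
    return {group: n for group, n in counts.items() if n > MIN_WINDOWS}


# Made-up data: one process whose cpu:mem ratio is far above its baseline.
baseline = {"prod:node-1:miner": {"cpu:mem": (1.0, 0.5)}}
window = {
    "cluster": "prod", "host": "node-1", "node": "node-1",
    "executable": "miner", "ratios": {"cpu:mem": 10.0},
}
print(find_anomalous_processes([window] * 6, baseline))  # reported: 6 anomalous windows
print(find_anomalous_processes([window] * 5, baseline))  # {}: 5 is not greater than 5
```

Requiring more than five anomalous windows means a single brief spike inside the search time range is not enough to produce a finding.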
Indicators
Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high numbers point to widely-used, community-vetted indicators. Blank or 1 shows that the indicator is specific to this rule.
Search terms
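Tokens from the SPL search body that are not predicates on a known field, which is why they don't surface in the Indicators table. In a raw-event search, Splunk would match bare tokens against _raw (the untyped raw event text) anywhere they appear. This rule starts with mstats, so the terms below are command keywords, aggregation arguments, and group-by field names rather than strings matched in event text.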
| Stage | Term |
|---|---|
| 1 | mstats |
| 1 | avg |
| 1 | process.* |
| 1 | as |
| 1 | process.* |
| 1 | where |
| 1 | by |
| 1 | process.executable.name |
| 1 | host.name |
| 1 | k8s.cluster.name |
| 1 | k8s.node.name |
| 11 | foreach |
| 11 | stdev_* |