AWS Bedrock Knowledge Base or RAG Data Source Tampering

Status: production
Severity: medium
Time window: 6m
Group by: cloud.account.id
Author: Elastic
Source: github.com/elastic/detection-rules

Detects control-plane mutations to AWS Bedrock knowledge bases and their backing RAG data sources via CloudTrail. An adversary with access to Bedrock Agent APIs can poison the corpus that RAG-enabled models treat as authoritative by ingesting attacker-controlled documents (IngestKnowledgeBaseDocuments, StartIngestionJob), deleting legitimate documents (DeleteKnowledgeBaseDocuments), or repointing or altering the data source itself (CreateDataSource, UpdateDataSource, DeleteDataSource, UpdateKnowledgeBase). The query also matches DeleteKnowledgeBase, which removes the knowledge base outright. Because downstream applications and users trust model answers grounded in this stored data, tampering with the corpus is a form of stored data manipulation that can drive misinformation, fraud, or manipulated decisions at inference time. This is a New Terms rule. The rule description frames it as the first time a given identity ARN performs one of these mutations, but as configured in the rule body the new-terms field is cloud.account.id, so it fires the first time an account is seen performing one of them within the 7-day history window.

MITRE ATT&CK coverage

Tactic	Techniques
Impact	`T1565.001` Data Manipulation: Stored Data Manipulation

Event coverage

Provider	Event
AWS-bedrock	CreateDataSource
AWS-bedrock	DeleteDataSource
AWS-bedrock	DeleteKnowledgeBase
AWS-bedrock	DeleteKnowledgeBaseDocuments
AWS-bedrock	IngestKnowledgeBaseDocuments
AWS-bedrock	StartIngestionJob
AWS-bedrock	UpdateDataSource
AWS-bedrock	UpdateKnowledgeBase
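
The rule's setup note says that two of these events (IngestKnowledgeBaseDocuments and DeleteKnowledgeBaseDocuments) are CloudTrail data events that are off by default, while the configuration actions are management events. A minimal Python sketch of that split may help when reasoning about which covered events a trail can actually deliver. It calls no AWS API; the helper name `visible_events` is hypothetical, and treating every event the note does not name as a management event is an assumption.

```python
# Event names come from the coverage table; the data/management split
# follows the rule's setup note. Events the note does not name explicitly
# are ASSUMED to be management events.
DATA_EVENTS = frozenset({
    "IngestKnowledgeBaseDocuments",
    "DeleteKnowledgeBaseDocuments",
})
MANAGEMENT_EVENTS = frozenset({
    "CreateDataSource",
    "DeleteDataSource",
    "DeleteKnowledgeBase",
    "StartIngestionJob",
    "UpdateDataSource",
    "UpdateKnowledgeBase",
})


def visible_events(data_events_enabled: bool) -> frozenset:
    """Return the covered events a trail would deliver (hypothetical helper)."""
    if data_events_enabled:
        return MANAGEMENT_EVENTS | DATA_EVENTS
    return MANAGEMENT_EVENTS
```

With data-event logging disabled, the sketch returns six of the eight covered events, which matches the partial coverage the setup note warns about.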

Rules detecting the same action

Other rules on this platform that filter on the same API call or operation.

AWS Bedrock Delete Knowledge Base (Splunk)

Rule body elastic

[metadata]
creation_date = "2026/06/05"
integration = ["aws"]
maturity = "production"
updated_date = "2026/06/05"

[rule]
author = ["Elastic"]
description = """
Detects control-plane mutations to AWS Bedrock knowledge bases and their backing RAG data sources via CloudTrail. An
adversary with access to Bedrock Agent APIs can poison the corpus that RAG-enabled models treat as authoritative by
ingesting attacker-controlled documents (IngestKnowledgeBaseDocuments, StartIngestionJob), deleting legitimate documents
(DeleteKnowledgeBaseDocuments), or repointing/altering the data source itself (CreateDataSource, UpdateDataSource,
DeleteDataSource, UpdateKnowledgeBase). Because downstream applications and users trust model answers grounded in this
stored data, tampering with the corpus is a stored data manipulation that can drive misinformation, fraud, or
manipulated decisions at inference time. This is a New Terms rule that looks for the first time a given identity ARN
performs one of these knowledge base or data source mutations within the history window.
"""
false_positives = [
    """
    Legitimate knowledge base maintenance, content onboarding, and scheduled re-ingestion performed by data engineering
    teams, MLOps automation, or infrastructure-as-code pipelines will generate these events. Validate the calling
    identity, user agent, and source IP against known automation and approved operators. If a known maintenance workflow
    is causing noise, it can be exempted from this rule.
    """,
]
from = "now-6m"
index = ["logs-aws.cloudtrail-*"]
language = "kuery"
license = "Elastic License v2"
name = "AWS Bedrock Knowledge Base or RAG Data Source Tampering"
note = """## Triage and analysis

### Investigating AWS Bedrock Knowledge Base or RAG Data Source Tampering

AWS Bedrock knowledge bases provide Retrieval-Augmented Generation (RAG) by grounding model responses in a stored
corpus that is synchronized from a configured data source. Because RAG-enabled applications present these grounded
answers as authoritative, an adversary who can ingest, delete, or repoint the underlying corpus can poison the answers
returned to downstream users and systems. This rule detects control-plane changes to knowledge bases and data sources
that could enable such corpus poisoning.

#### Possible investigation steps

- **Identify the actor and context**
  - Review `aws.cloudtrail.user_identity.arn`, `aws.cloudtrail.user_identity.type`, and
    `aws.cloudtrail.user_identity.access_key_id`.
  - Examine `source.ip`, `user_agent.original`, and `aws.cloudtrail.user_identity.invoked_by` to determine whether the
    change came from an approved operator, automation, or an unexpected origin.
  - Confirm a related change request exists (content update, data source migration, scheduled ingestion).
- **Validate the specific action**
  - Inspect `event.action` and `aws.cloudtrail.flattened.request_parameters` to identify the knowledge base, data
    source, and any S3 bucket / ingestion configuration referenced.
  - For `CreateDataSource` / `UpdateDataSource`, verify the data source location (e.g., S3 bucket) is org-owned and not
    attacker-controlled.
  - For `IngestKnowledgeBaseDocuments` / `StartIngestionJob`, review what content was ingested and from where.
  - For `DeleteKnowledgeBaseDocuments` / `DeleteDataSource`, determine whether legitimate content was removed.
- **Correlate activity**
  - Look for prior enumeration of Bedrock resources or anomalous IAM/STS activity from the same identity.
  - Review `cloud.account.id` and `cloud.region` to confirm the change occurred where expected.

### False positive analysis

- **Planned content maintenance**: Routine ingestion, document updates, and re-syncs by data teams or MLOps automation
  are expected. Validate against change tickets and known automation roles.
- **Infrastructure-as-code**: Pipelines may create or update data sources during deployments. Confirm the source IP and
  ARN match expected automation.

### Response and remediation

- If unauthorized, suspend or disable the implicated knowledge base and data source to prevent further poisoned
  retrieval, and revert the corpus to a known-good state.
- Disable or rotate the credentials identified in `aws.cloudtrail.user_identity.access_key_id` if compromise is
  suspected.
- Audit recent ingestion jobs and document changes, and validate the integrity of the data source location.
- Restrict Bedrock Agent knowledge base and data source mutation permissions to a small set of trusted roles.
"""
references = [
    "https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Agents_for_Amazon_Bedrock.html"
]
risk_score = 47
rule_id = "7811b5f7-9e07-4999-87aa-a950365cd327"
setup = """## Setup

This rule requires the AWS CloudTrail integration. The data source and knowledge base configuration actions are management 
events (captured by default), but the direct document operations (`IngestKnowledgeBaseDocuments`, `DeleteKnowledgeBaseDocuments`) 
are Bedrock CloudTrail **data events** that are off by default. Without Bedrock data-event logging enabled on the trail, this rule 
provides only **partial coverage** — it will see config changes but not direct document ingestion/deletion, the primary poisoning 
vector. Enable Bedrock data-event logging for full coverage.
"""

severity = "medium"
tags = [
    "Domain: Cloud",
    "Domain: LLM",
    "Data Source: AWS",
    "Data Source: AWS CloudTrail",
    "Data Source: Amazon Web Services",
    "Data Source: Amazon Bedrock",
    "Use Case: Threat Detection",
    "Resources: Investigation Guide",
    "Tactic: Impact",
]
timestamp_override = "event.ingested"
type = "new_terms"

query = '''
data_stream.dataset: "aws.cloudtrail" and
    event.provider: "bedrock.amazonaws.com" and
    event.action: (
        "IngestKnowledgeBaseDocuments" or
        "DeleteKnowledgeBaseDocuments" or
        "UpdateKnowledgeBase" or
        "CreateDataSource" or
        "UpdateDataSource" or
        "DeleteDataSource" or
        "StartIngestionJob" or
        "DeleteKnowledgeBase"
    ) and
    event.outcome: "success"
'''


[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1565"
name = "Data Manipulation"
reference = "https://attack.mitre.org/techniques/T1565/"

[[rule.threat.technique.subtechnique]]
id = "T1565.001"
name = "Stored Data Manipulation"
reference = "https://attack.mitre.org/techniques/T1565/001/"

[rule.threat.tactic]
id = "TA0040"
name = "Impact"
reference = "https://attack.mitre.org/tactics/TA0040/"

[rule.investigation_fields]
field_names = [
    "@timestamp",
    "user.name",
    "user_agent.original",
    "source.ip",
    "aws.cloudtrail.user_identity.arn",
    "aws.cloudtrail.user_identity.type",
    "aws.cloudtrail.user_identity.access_key_id",
    "event.action",
    "event.provider",
    "event.outcome",
    "cloud.account.id",
    "cloud.region",
    "aws.cloudtrail.request_parameters",
    "aws.cloudtrail.response_elements",
]

[rule.new_terms]
field = "new_terms_fields"
value = ["cloud.account.id"]

[[rule.new_terms.history_window_start]]
field = "history_window_start"
value = "now-7d"
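
The investigation guide in the rule body asks analysts to verify that a data source touched by CreateDataSource or UpdateDataSource points at an org-owned S3 bucket. A minimal Python sketch of that check follows. It assumes the CloudTrail request parameters mirror the Bedrock API request shape (`dataSourceConfiguration.s3Configuration.bucketArn`) and have already been parsed into a nested dict; the allowlist `ORG_BUCKET_ARNS`, the bucket name, and both function names are hypothetical.

```python
from typing import Optional

# Hypothetical allowlist of org-owned buckets; replace with a real inventory.
ORG_BUCKET_ARNS = {"arn:aws:s3:::example-org-kb-docs"}


def bucket_arn(request_parameters: dict) -> Optional[str]:
    """Pull the S3 bucket ARN out of a data source request, if present.

    ASSUMPTION: the parameters follow the Bedrock API request shape.
    """
    config = request_parameters.get("dataSourceConfiguration") or {}
    s3 = config.get("s3Configuration") or {}
    return s3.get("bucketArn")


def is_suspicious(event_action: str, request_parameters: dict) -> bool:
    """Flag data source changes that point at a bucket outside the allowlist."""
    if event_action not in {"CreateDataSource", "UpdateDataSource"}:
        return False
    arn = bucket_arn(request_parameters)
    # Non-S3 or unparsable sources yield None and are left for manual review.
    return arn is not None and arn not in ORG_BUCKET_ARNS
```

Returning `False` when no bucket ARN can be extracted is a deliberate choice in this sketch: it avoids flagging non-S3 connectors automatically, so those still need a manual look.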

Stages and Predicates

Stage 1: `new_terms`

data_stream.dataset: "aws.cloudtrail" and
    event.provider: "bedrock.amazonaws.com" and
    event.action: (
        "IngestKnowledgeBaseDocuments" or
        "DeleteKnowledgeBaseDocuments" or
        "UpdateKnowledgeBase" or
        "CreateDataSource" or
        "UpdateDataSource" or
        "DeleteDataSource" or
        "StartIngestionJob" or
        "DeleteKnowledgeBase"
    ) and
    event.outcome: "success"

New terms: cloud.account.id
History since: now-7d
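
The stage above combines the query filter with a new-terms check on `cloud.account.id`. A minimal Python sketch of that logic is shown here, assuming events are already parsed into flat dicts keyed by the field names used in the query and carry a `datetime` in `@timestamp`. It illustrates the idea only and does not describe how Elastic evaluates the rule internally; the function names are hypothetical.

```python
from datetime import datetime, timedelta

ACTIONS = {
    "IngestKnowledgeBaseDocuments", "DeleteKnowledgeBaseDocuments",
    "UpdateKnowledgeBase", "CreateDataSource", "UpdateDataSource",
    "DeleteDataSource", "StartIngestionJob", "DeleteKnowledgeBase",
}


def matches(event: dict) -> bool:
    """Mirror the four predicates of the rule query."""
    return (
        event.get("data_stream.dataset") == "aws.cloudtrail"
        and event.get("event.provider") == "bedrock.amazonaws.com"
        and event.get("event.action") in ACTIONS
        and event.get("event.outcome") == "success"
    )


def new_term_alerts(events, now: datetime,
                    window=timedelta(minutes=6),
                    history=timedelta(days=7)):
    """Return matching events from the recent window whose account ID did not
    appear in the earlier part of the history window."""
    hits = [e for e in events if matches(e)]
    # Accounts already seen between now-7d and the start of the recent window.
    seen = {
        e["cloud.account.id"] for e in hits
        if now - history <= e["@timestamp"] < now - window
    }
    alerts = []
    for e in sorted(hits, key=lambda e: e["@timestamp"]):
        in_window = now - window <= e["@timestamp"] <= now
        if in_window and e["cloud.account.id"] not in seen:
            alerts.append(e)
            seen.add(e["cloud.account.id"])  # alert once per new account
    return alerts
```

Because the only new-terms field is the account ID, a second identity acting in an account that already performed these mutations within the last seven days would not raise a new alert in this sketch.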

Indicators

Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high counts point to widely used, community-vetted indicators, while a blank or a count of 1 means the indicator is specific to this rule.

Field	Kind	Values
`data_stream.dataset`	eq	`aws.cloudtrail`
`event.action`	in	`CreateDataSource` `DeleteDataSource` `DeleteKnowledgeBase` `DeleteKnowledgeBaseDocuments` `IngestKnowledgeBaseDocuments` `StartIngestionJob` `UpdateDataSource` `UpdateKnowledgeBase`
`event.outcome`	eq	`success`
`event.provider`	eq	`bedrock.amazonaws.com`
