Detection rules › Panther

AWS Bedrock Model Invocation GuardRail Intervened

Status
Experimental
Severity
informational
Log types
AWS.BedrockModelInvocation
Tags
AWS, Bedrock, Persistence, Manipulate AI Model
Reference
https://stratus-red-team.cloud/attack-techniques/AWS/aws.impact.bedrock-invoke-model/, https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
Source
github.com/panther-labs/panther-analysis

Detects when AWS Bedrock guardrail features have intervened during AI model invocations. It specifically monitors when an AI model request was blocked by Guardrails. This helps security teams identify when users attempt to generate potentially harmful or inappropriate content through AWS Bedrock models.

MITRE ATT&CK coverage

TacticTechniques
PersistenceNo specific technique
Credential AccessNo specific technique

Rule body yaml

AnalysisType: rule
Filename: aws_bedrockmodelinvocation_guardrailintervened.py
RuleID: "AWS.BedrockModelInvocation.GuardRailIntervened"
DisplayName: "AWS Bedrock Model Invocation GuardRail Intervened"
Enabled: true
LogTypes:
    - AWS.BedrockModelInvocation
Tags:
    - AWS
    - Bedrock
    - Persistence
    - Manipulate AI Model
Status: Experimental
Severity: Info
Reports:
    MITRE ATT&CK:
        - TA0006:T0018.000
Description: Detects when AWS Bedrock guardrail features have intervened during AI model invocations. It specifically monitors when an AI model request was blocked by Guardrails. This helps security teams identify when users attempt to generate potentially harmful or inappropriate content through AWS Bedrock models.
Runbook: Confirm alert details by reviewing the model ID, operation name, account ID, and the specific guardrail intervention reasons provided in the alert description. Analyze the user prompts that triggered the guardrail by examining the Bedrock console logs for the associated requestId, looking for patterns of attempted model poisoning or prompt injection techniques. If suspicious activity is confirmed, temporarily restrict the access of the malicious actor to Bedrock services, preserve all evidence of the interaction, and escalate to the security team for further analysis of potential AI model manipulation attempts. https://atlas.mitre.org/mitigations/AML.M0005
DedupPeriodMinutes: 60
Threshold: 1
Reference: https://stratus-red-team.cloud/attack-techniques/AWS/aws.impact.bedrock-invoke-model/, https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
SummaryAttributes:
  - p_any_aws_account_ids
  - p_any_aws_arns
InlineFilters:
    - All: []
Tests:
    - Name: Perform Another Operation
      ExpectedResult: false
      Log:
        accountId: "111111111111"
        identity:
            arn: arn:aws:sts::111111111111:assumed-role/role_details/regular.user
        input:
            inputBodyJson:
                messages:
                    - content:
                        - text: I have a rather normal question.
                      role: user
            inputContentType: application/json
            inputTokenCount: 0
        modelId: anthropic.claude-3-haiku-20240307-v1:0
        operation: ListModels
        output:
            outputBodyJson:
                metrics:
                    latencyMs: 249
                output:
                    message:
                        content:
                            - text: I can respond to this question
                        role: assistant
                usage:
                    inputTokens: 0
                    outputTokens: 0
                    totalTokens: 0
            outputContentType: application/json
            outputTokenCount: 0
        region: us-west-2
        requestId: bb98d9a8-bd9a-47ca-976b-f165ef1f8b67
        schemaType: ModelInvocationLog
        schemaVersion: "1.0"
        timestamp: "2025-05-15 14:17:22.000000000"
    - Name: Regular Converse Operation
      ExpectedResult: false
      Log:
        accountId: "111111111111"
        identity:
            arn: arn:aws:sts::111111111111:assumed-role/role_details/regular.user
        input:
            inputBodyJson:
                messages:
                    - content:
                        - text: I have a rather normal question.
                      role: user
            inputContentType: application/json
            inputTokenCount: 0
        modelId: anthropic.claude-3-haiku-20240307-v1:0
        operation: Converse
        output:
            outputBodyJson:
                metrics:
                    latencyMs: 249
                output:
                    message:
                        content:
                            - text: I can respond to this question
                        role: assistant
                usage:
                    inputTokens: 0
                    outputTokens: 0
                    totalTokens: 0
            outputContentType: application/json
            outputTokenCount: 0
        region: us-west-2
        requestId: bb98d9a8-bd9a-47ca-976b-f165ef1f8b67
        schemaType: ModelInvocationLog
        schemaVersion: "1.0"
        timestamp: "2025-05-15 14:17:22.000000000"
    - Name: Suspicious Converse Operation
      ExpectedResult: true
      Log:
        accountId: "111111111111"
        identity:
            arn: arn:aws:sts::111111111111:assumed-role/role_details/suspicious.user
        input:
            inputBodyJson:
                messages:
                    - content:
                        - text: I have a very suspicious question.
                      role: user
            inputContentType: application/json
            inputTokenCount: 0
        modelId: anthropic.claude-3-haiku-20240307-v1:0
        operation: Converse
        output:
            outputBodyJson:
                metrics:
                    latencyMs: 249
                output:
                    message:
                        content:
                            - text: You shouldn't ask this question
                        role: assistant
                stopReason: guardrail_intervened
                usage:
                    inputTokens: 0
                    outputTokens: 0
                    totalTokens: 0
            outputContentType: application/json
            outputTokenCount: 0
        region: us-west-2
        requestId: bb98d9a8-bd9a-47ca-976b-f165ef1f8b67
        schemaType: ModelInvocationLog
        schemaVersion: "1.0"
        timestamp: "2025-05-15 14:17:22.000000000"
    - Name: Suspicious Invoke Operation
      ExpectedResult: true
      Log:
        accountId: "111111111111"
        identity:
            arn: arn:aws:sts::111111111111:assumed-role/role_details/suspicious.user
        input:
            inputBodyJson:
                anthropic_version: bedrock-2023-05-31
                max_tokens: 100
                messages:
                    - content: I have a very suspicious question.
                      role: user
                system: You are a helpful assistant.
            inputContentType: application/json
        modelId: anthropic.claude-3-haiku-20240307-v1:0
        operation: InvokeModel
        output:
            outputBodyJson:
                amazon-bedrock-guardrailAction: INTERVENED
                amazon-bedrock-trace:
                    guardrail:
                        actionReason: Guardrail blocked.
                        input:
                            h28wrktbwagn:
                                contentPolicy:
                                    filters:
                                        - action: BLOCKED
                                          confidence: HIGH
                                          detected: true
                                          filterStrength: HIGH
                                          type: VIOLENCE
                                invocationMetrics:
                                    guardrailCoverage:
                                        textCharacters:
                                            guarded: 62
                                            total: 62
                                    guardrailProcessingLatency: 179
                                    usage:
                                        contentPolicyImageUnits: 0
                                        contentPolicyUnits: 1
                                        contextualGroundingPolicyUnits: 0
                                        sensitiveInformationPolicyFreeUnits: 0
                                        sensitiveInformationPolicyUnits: 0
                                        topicPolicyUnits: 0
                                        wordPolicyUnits: 0
                content:
                    - text: You shouldn't ask this question
                      type: text
                role: assistant
                type: message
            outputContentType: application/json
        region: us-west-2
        requestId: ba78ac1f-5ea4-4e2a-a936-92f7e13c96c4
        schemaType: ModelInvocationLog
        schemaVersion: "1.0"
        timestamp: "2025-05-15 14:14:49.000000000"

Detection logic

Condition

not (operation ne "InvokeModel" and operation ne "Converse")
output.outputBodyJSON.stopReason eq "guardrail_intervened" or output.outputBodyJSON.amazon-bedrock-trace.guardrail.actionReason starts_with "Guardrail blocked"

Exclusions

Top-level NOT(...) conjuncts: predicates this rule actively suppresses.

FieldKindExcluded values
operationneConverse
operationneInvokeModel

Indicators

Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high numbers point to widely-used, community-vetted indicators. Blank or 1 shows that the indicator is specific to this rule.

FieldKindValues
output.outputBodyJSON.amazon-bedrock-trace.guardrail.actionReasonstarts_with
  • Guardrail blocked
output.outputBodyJSON.stopReasoneq
  • guardrail_intervened

Output fields

Fields the rule emits when it matches. Chronicle authors list these in the outcome block; they appear on the detection and $risk_score drives alerting. Sentinel / Defender XDR rules build them up through project / summarize / extend stages. Sentinel maps these into alert fields via entityMappings and customDetails; Defender XDR custom detections surface them as alert fields directly.

FieldSource
modelId
operation
accountId
stopReasonoutput.outputBodyJSON.stopReason
actionReasonoutput.outputBodyJSON.amazon-bedrock-trace.guardrail.actionReason