Response rows stateful anomaly on database

Status: available
Severity: medium
Time window: 14d
Group by: Database
Source: github.com/Azure/Azure-Sentinel

'Goal: To detect anomalous data exfiltration. This query detects SQL queries that accessed a large number of rows, which is significantly higher than normal for this database. The calculation is made inside recent time window (defined by 'detectionWindow' parameter), and the anomaly is calculated based on previous training window (defined by 'trainingWindow' parameter). The user can set the minimal threshold for anomaly by changing the threshold parameters volThresholdZ and volThresholdQ (higher thresholds will detect only more severe anomalies).'

MITRE ATT&CK coverage

Tactic	Techniques
Collection	`T1213.006` Data from Information Repositories: Databases

Rules detecting the same action

Other rules on this platform that filter on the same API call or operation.

Rule body kusto

id: 9851c360-5fd5-4bae-a117-b66d8476bf5e
name: Response rows stateful anomaly on database
description: |
  'Goal: To detect anomalous data exfiltration. This query detects SQL queries that accessed a large number of rows, which is significantly higher than normal for this database.
   The calculation is made inside recent time window (defined by 'detectionWindow' parameter), and the anomaly is calculated based on previous training window (defined by 'trainingWindow' parameter). The user can set the minimal threshold for anomaly by changing the threshold parameters volThresholdZ and volThresholdQ (higher thresholds will detect only more severe anomalies).'
severity: Medium
requiredDataConnectors:
  - connectorId: AzureSql
    dataTypes:
      - AzureDiagnostics
queryFrequency: 1h
queryPeriod: 14d
triggerOperator: gt
triggerThreshold: 0
status: Available
tactics:
  - Collection
relevantTechniques:
  - T1213.006 
tags:
  - SQL
query: |
    let volumeThresholdZ = 3.0;                     // Minimal threshold for the Zscore to trigger anomaly (number of standard deviations above mean). If set higher, only very significant alerts will fire.
    let volumeThresholdQ = volumeThresholdZ;        // Minimal threshold for the Qscore to trigger anomaly (number of Inter-Percentile Ranges above high percentile). If set higher, only very significant alerts will fire.
    let volumeThresholdHardcoded = 500;             // Minimal value for the volume metric to trigger anomaly.
    let detectionWindow = 1h;                       // The size of the recent detection window for detecting anomalies.  
    let trainingWindow = detectionWindow + 14d;     // The size of the training window before the detection window for learning the normal state.
    let monitoredColumn = 'ResponseRows';           // The name of the column for volumetric anomalies.
    let processedData = materialize (
        AzureDiagnostics
        | where TimeGenerated >= ago(trainingWindow)
        | where Category == 'SQLSecurityAuditEvents' and action_id_s has_any ("RCM", "BCM") // Keep only SQL affected rows
        | project TimeGenerated, PrincipalName = server_principal_name_s, ClientIp = client_ip_s, HostName = host_name_s, ResourceId,
                  ApplicationName = application_name_s, ActionName = action_name_s, Database = strcat(LogicalServerName_s, '/', database_name_s),
                  IsSuccess = succeeded_s, AffectedRows = affected_rows_d,
                  ResponseRows = response_rows_d, Statement = statement_s
        | extend QuantityColumn = column_ifexists(monitoredColumn, 0)
        | extend WindowType = case( TimeGenerated >= ago(detectionWindow), 'detection',
                                               (ago(trainingWindow) <= TimeGenerated and TimeGenerated < ago(detectionWindow)), 'training', 'other')
        | where WindowType in ('detection', 'training'));
    let trainingSet =
        processedData
        | where WindowType == 'training'
        | summarize AvgVal = round(avg(QuantityColumn), 2), StdVal = round(stdev(QuantityColumn), 2), N = count(),
                    P99Val = round(percentile(QuantityColumn, 99), 2), P50Val = round(percentile(QuantityColumn, 50), 2)
          by Database;
    processedData
    | where WindowType == 'detection'
    | join kind = inner (trainingSet) on Database
    | extend ZScoreVal = iff(N >= 20, round(todouble(QuantityColumn - AvgVal) / todouble(StdVal + 1), 2), 0.00),
             QScoreVal = iff(N >= 20, round(todouble(QuantityColumn - P99Val) / todouble(P99Val - P50Val + 1), 2), 0.00)
    | extend IsVolumeAnomalyOnVal = iff((ZScoreVal > volumeThresholdZ and QScoreVal > volumeThresholdQ and QuantityColumn > volumeThresholdHardcoded), true, false), AnomalyScore = round((ZScoreVal + QScoreVal)/2, 0)
    | where IsVolumeAnomalyOnVal == 'true'
    | project TimeGenerated, Database, PrincipalName, ClientIp, HostName, ApplicationName, ActionName, Statement,
              IsSuccess, ResponseRows, AffectedRows, IsVolumeAnomalyOnVal, AnomalyScore, ResourceId
    | extend Name = tostring(split(PrincipalName,'@',0)[0]), UPNSuffix = tostring(split(PrincipalName,'@',1)[0])
entityMappings:
  - entityType: Account
    fieldMappings:
      - identifier: Name
        columnName: Name
      - identifier: UPNSuffix
        columnName: UPNSuffix
  - entityType: IP
    fieldMappings:
      - identifier: Address
        columnName: ClientIp
  - entityType: Host
    fieldMappings:
      - identifier: HostName
        columnName: HostName
  - entityType: CloudApplication
    fieldMappings:
      - identifier: Name
        columnName: ApplicationName
  - entityType: AzureResource
    fieldMappings:
      - identifier: ResourceId
        columnName: ResourceId
alertDetailsOverride:
  alertDisplayNameFormat: 'Response rows anomaly on database {{Database}}'
  alertDescriptionFormat: |
    'An anomaly was detected on database {{Database}} with {{ResponseRows}} response rows in a statement, which is significantly higher than normal for this database. Investigate the database activity for potential malicious data changes or deletions.'
version: 1.1.3
kind: Scheduled

Stages and Predicates

Parameters

let volumeThresholdZ = 3.0;
let volumeThresholdQ = volumeThresholdZ;
let volumeThresholdHardcoded = 500;
let detectionWindow = 1h;
let trainingWindow = detectionWindow + 14d;
let monitoredColumn = 'ResponseRows';

Let binding: `trainingSet`

let trainingSet = processedData
    | where WindowType == 'training'
    | summarize AvgVal = round(avg(QuantityColumn), 2), StdVal = round(stdev(QuantityColumn), 2), N = count(),
                P99Val = round(percentile(QuantityColumn, 99), 2), P50Val = round(percentile(QuantityColumn, 50), 2)
      by Database;

Derived from processedData.

The stages below define let processedData (the rule's main pipeline source).

Stage 1: `source`

AzureDiagnostics

Stage 2: `where`

| where TimeGenerated >= ago(trainingWindow)

Stage 3: `where`

| where Category == 'SQLSecurityAuditEvents' and action_id_s has_any ("RCM", "BCM")

Stage 4: `project`

| project TimeGenerated, PrincipalName = server_principal_name_s, ClientIp = client_ip_s, HostName = host_name_s, ResourceId,
              ApplicationName = application_name_s, ActionName = action_name_s, Database = strcat(LogicalServerName_s, '/', database_name_s),
              IsSuccess = succeeded_s, AffectedRows = affected_rows_d,
              ResponseRows = response_rows_d, Statement = statement_s

Stage 5: `extend`

| extend QuantityColumn = column_ifexists(monitoredColumn, 0)

Stage 6: `extend`

| extend WindowType = case( TimeGenerated >= ago(detectionWindow), 'detection',
                                           (ago(trainingWindow) <= TimeGenerated and TimeGenerated < ago(detectionWindow)), 'training', 'other')

WindowType =

if'detection'

elif/* macro: (ago(trainingWindow) <= TimeGenerated) */'training'

else'other'

Stage 7: `where`

| where WindowType in ('detection', 'training')

The stages below run on processedData (the outer pipeline).

Stage 8: `where`

processedData
| where WindowType == 'detection'

Stage 9: `join`

| join kind = inner (trainingSet) on Database

Stage 10: `extend`

| extend ZScoreVal = iff(N >= 20, round(todouble(QuantityColumn - AvgVal) / todouble(StdVal + 1), 2), 0.00),
         QScoreVal = iff(N >= 20, round(todouble(QuantityColumn - P99Val) / todouble(P99Val - P50Val + 1), 2), 0.00)

QScoreVal =

ifN >= 20round((todouble((QuantityColumn - P99Val)) / todouble(((P99Val - P50Val) + 1))), 2)

else0.0

ZScoreVal =

ifN >= 20round((todouble((QuantityColumn - AvgVal)) / todouble((StdVal + 1))), 2)

else0.0

Stage 11: `extend`

| extend IsVolumeAnomalyOnVal = iff((ZScoreVal > volumeThresholdZ and QScoreVal > volumeThresholdQ and QuantityColumn > volumeThresholdHardcoded), true, false), AnomalyScore = round((ZScoreVal + QScoreVal)/2, 0)

IsVolumeAnomalyOnVal =

if(ZScoreVal > "volumeThresholdZ" and QScoreVal > "volumeThresholdQ") and QuantityColumn > "volumeThresholdHardcoded"true

elsefalse

Stage 12: `where`

| where IsVolumeAnomalyOnVal == 'true'

Stage 13: `project`

| project TimeGenerated, Database, PrincipalName, ClientIp, HostName, ApplicationName, ActionName, Statement,
          IsSuccess, ResponseRows, AffectedRows, IsVolumeAnomalyOnVal, AnomalyScore, ResourceId

Stage 14: `extend`

| extend Name = tostring(split(PrincipalName,'@',0)[0]), UPNSuffix = tostring(split(PrincipalName,'@',1)[0])

Stage 15: `summarize`

summarize by Database

Indicators

Each row is a field, operator, and value that the rule matches. The corpus column counts how many other rules in the catalog look for the same combination: high numbers point to widely-used, community-vetted indicators. Blank or 1 shows that the indicator is specific to this rule.

Field	Kind	Values
`Category`	eq	`SQLSecurityAuditEvents` transforms: `cased`
`IsVolumeAnomalyOnVal`	eq	`true` transforms: `cased`
`WindowType`	eq	`detection` transforms: `cased` `training` transforms: `cased`
`WindowType`	in	`detection` transforms: `cased` `training` transforms: `cased`
`action_id_s`	match	`BCM` `RCM`

Output fields

Fields the rule emits when it matches. Chronicle authors list these in the outcome block; they appear on the detection and $risk_score drives alerting. Sentinel / Defender XDR rules build them up through project / summarize / extend stages. Sentinel maps these into alert fields via entityMappings and customDetails; Defender XDR custom detections surface them as alert fields directly.

Field	Source
`Database`	`summarize`

`j` / `k`	Scroll down / up
`d` / `u`	Half-page down / up
`gg` / `G`	Top / bottom
`h` / `l`	History back / forward
`f`	Follow link (`Shift` = new tab)
`/`	Focus search
`?`	Toggle this help
`↑` / `↓`	Navigate search results
`Enter`	Open highlighted result
`Esc`	Close results / dialog

`type:`	`events` / `rules` / `providers`
`vendor:`	`sigma` / `elastic` / `splunk` / `kusto` / `chronicle` (vendor name alone also works: `sigma:`, `kql:`, `secops:`…)
`tactic:`	TA-id, slug, or name: `credential_access`, `TA0006`
`technique:`	technique or sub-technique ID: `T1003`, `T1003.001` (alias `tech:`)
`severity:`	`critical` / `high` / `medium` / `low` / `informational` (alias `sev:`)
`risk_score`	Numeric comparison on the Elastic risk score (0 to 100): `risk_score>50`, `risk_score<=20`, `risk_score=99` (alias `risk`; Elastic rules only)
`stages:`	Rules with exactly N pipeline stages
`correlation:`	`single_event` / `sequence` / `alternatives` / `alternatives_cross_log` / `all_required` / `correlated`
`with:`	Co-occurrence event-id; stacks (`with:4624 with:4769`) to require all, while a comma list in one occurrence (`with:4624,4769`) is an either-or group. Implies multi-event
`like:`	Structural neighbors of a rule slug (equivalents + subsumption stricter / broader): `like:comsvcs_lsass_memory_dump-splunk-sysmon`
`groupby:`	Entity-grouping substring match against `group_by_keys`: `groupby:user`, `groupby:host`
`uses:`	Rules whose predicate tree touches the field (any kind, any value): `uses:CommandLine`
`excludes:`	Rules with top-level `not()` clauses on the field (FP whitelists): `excludes:ParentImage`
`field:` / `value:`	Predicate search; narrows rule cards to those with a matching leaf and drives the indicator tier. Unquoted = substring, wildcards allowed (`value:mimikatz`)
`indicator:`	Shorthand for `field:F value:V`: `indicator:Image=*\powershell.exe`
`kind:`	Filter by predicate kind. Narrows rule cards to those carrying a matching predicate leaf (`vendor:elastic kind:cidr_match`) and drives the indicator tier: `contains` / `starts_with` / `ends_with` / `regex` / `cidr` / `eq` / `in` … (operator aliases `op:`/`match:`)
`has:` / `no:`	`sample`, `field`, `notes`, `refs`, `trace`, `thirdparty`, `rule`, `pattern`, `timewindow`, `threshold`, `newterms`, `sigma`/`elastic`/`splunk`/`kusto`/`chronicle`
`-op:val`	Exclude matches; works on most operators but not `type:`/`like:`/`has:`/`no:` (use `no:<flag>` to exclude a rule flag): `tactic:execution -vendor:splunk`. Standalone `-kind:`/`-field:`/`-value:` drop every rule carrying a matching predicate leaf (`type:rules -kind:is_null`)
`field:"…"` / `value:"…"`	Quoted value = anchored exact match (also allows spaces): `value:"net user"`
`a,b`	Comma = OR inside one operator (`vendor:sigma,elastic`, `severity:high,critical`); repeating a facet merges the same way. `field:`/`value:` never split (literal commas)
`vendors:` / `stage:`	Singular and plural spellings fold to the canonical operator and value: `tactics:` = `tactic:`, `type:event` = `type:events`, `correlation:sequences` = `correlation:sequence`, `has:thresholds` = `has:threshold`
`"quoted phrase"`	Exact-match a multi-word phrase (free text)

Response rows stateful anomaly on database

MITRE ATT&CK coverage

Rules detecting the same action

Rule body kusto

Stages and Predicates

Parameters

Let binding: `trainingSet`

Stage 1: `source`

Stage 2: `where`

Stage 3: `where`

Stage 4: `project`

Stage 5: `extend`

Stage 6: `extend`

Stage 7: `where`

Stage 8: `where`

Stage 9: `join`

Stage 10: `extend`

Stage 11: `extend`

Stage 12: `where`

Stage 13: `project`

Stage 14: `extend`

Stage 15: `summarize`

Indicators

Output fields

Keyboard shortcuts

Search operators

Response rows stateful anomaly on database

MITRE ATT&CK coverage

Rules detecting the same action

Rule body kusto

Stages and Predicates

Parameters

Let binding: trainingSet

Stage 1: source

Stage 2: where

Stage 3: where

Stage 4: project

Stage 5: extend

Stage 6: extend

Stage 7: where

Stage 8: where

Stage 9: join

Stage 10: extend

Stage 11: extend

Stage 12: where

Stage 13: project

Stage 14: extend

Stage 15: summarize

Indicators

Output fields

Let binding: `trainingSet`

Stage 1: `source`

Stage 2: `where`

Stage 3: `where`

Stage 4: `project`

Stage 5: `extend`

Stage 6: `extend`

Stage 7: `where`

Stage 8: `where`

Stage 9: `join`

Stage 10: `extend`

Stage 11: `extend`

Stage 12: `where`

Stage 13: `project`

Stage 14: `extend`

Stage 15: `summarize`