Suspicious access of BEC related documents in AWS S3 buckets

Severity: medium
Time window: 14d
Group by: UserIdentityPrincipalid
Source: github.com/Azure/Azure-Sentinel

This query looks for users with suspicious spikes in the number of accessed files that relate to topics commonly targeted in Business Email Compromise (BEC) attacks. It finds access to files in AWS S3 storage that relate to topics such as invoices or payments, then flags users who access these files in significantly higher numbers than in the previous 14 days. Incidents raised by this analytic should be investigated to determine whether the user should be accessing these files and whether the access volume reflects a legitimate business need. The query contains thresholds to reduce the chance of false positives; these can be adjusted to suit individual environments. False positives can also be generated by legitimate scheduled actions that occur less often than every 14 days; additional exclusions on username or IP address entities can be added for these actions.
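
The description mentions adding exclusions on username or IP address for known scheduled jobs. A minimal sketch of what such a filter could look like follows. The allowlist names `ExcludedUsers` and `ExcludedIps`, their values, and the `SampleEvents` stand-in table are hypothetical and are not part of the published rule.

```kusto
// Hypothetical allowlists; replace with values from your own environment.
let ExcludedUsers = dynamic(["svc-finance-export"]);
let ExcludedIps = dynamic(["203.0.113.10"]);
// Stand-in rows for AWSCloudTrail so the sketch runs on its own.
let SampleEvents = datatable(TimeGenerated: datetime, UserIdentityUserName: string, SourceIpAddress: string, EventSource: string, EventName: string)
[
    datetime(2024-01-10 09:00:00), "svc-finance-export", "203.0.113.10", "s3.amazonaws.com", "GetObject",
    datetime(2024-01-10 09:05:00), "alice@example.com", "198.51.100.7", "s3.amazonaws.com", "GetObject"
];
SampleEvents
| where EventSource startswith "s3."
| where EventName =~ "GetObject"
// Drop events from allowlisted service accounts (case-insensitive) and source addresses.
| where UserIdentityUserName !in~ (ExcludedUsers)
| where SourceIpAddress !in (ExcludedIps)
```

In the real rule, the two exclusion lines would sit inside the `Events` definition, before the keyword filter, so that excluded activity never enters the baseline.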

MITRE ATT&CK coverage

Tactic	Techniques
Collection	`T1530` Data from Cloud Storage

Rule body kusto

id: f3e2d35f-1202-4215-995c-4654ef07d1d8
name: Suspicious access of BEC related documents in AWS S3 buckets
description: |
  'This query looks for users with suspicious spikes in the number of files accessed that relate to topics commonly accessed as part of Business Email Compromise (BEC) attacks.
  The query looks for access to files in AWS S3 storage that relate to topics such as invoices or payments, and then looks for users accessing these files in significantly higher numbers than in the previous 14 days. Incidents raised by this analytic should be investigated to see if the user accessing these files should be accessing them, and if the volume they accessed them at was related to a legitimate business need. 
  This query contains thresholds to reduce the chance of false positives, these can be adjusted to suit individual environments. In addition false positives could be generated by legitimate, scheduled actions that occur less often than every 14 days, additional exclusions can be added for these actions on username or IP address entities.'
severity: Medium
requiredDataConnectors:
  - connectorId: AWS
    dataTypes:
      - AWSCloudTrail
queryFrequency: 1d
queryPeriod: 14d
triggerOperator: gt
triggerThreshold: 0
tactics:
  - Collection
relevantTechniques:
  - T1530
eventGroupingSettings:
  aggregationKind: SingleAlert
query: |
  let BEC_Keywords = dynamic([ 'invoice','payment','paycheck','transfer','bank statement','bank details','closing','funds','bank account','account details','remittance','purchase','deposit',"PO#","Zahlung","Rechnung","Paiement", "virement bancaire","Bankuberweisung",'hacked','phishing']);
  // Adjust this threshold based on your environment
  let sensitivity = 2.5;
  let Events = materialize(AWSCloudTrail
  | where TimeGenerated between (ago(14d)..ago(0d))
  | where UserIdentityAccountId != "anonymous"
  | where EventSource startswith "s3."
  | where EventName =~ "GetObject"
  | extend FilePath = tostring(parse_json(RequestParameters).key)
  | where FilePath has_any(BEC_Keywords)
  );
  Events
  | summarize dcount(FilePath) by UserIdentityPrincipalid, bin(startofday(TimeGenerated), 1d)
  | summarize CountOfDocs = make_list(dcount_FilePath, 10000), TimeStamp = make_list(TimeGenerated, 10000) by UserIdentityPrincipalid
  | extend (Anomalies, Score, Baseline) = series_decompose_anomalies(CountOfDocs, sensitivity, -1, 'linefit')
  | mv-expand CountOfDocs to typeof(double), TimeStamp to typeof(datetime), Anomalies to typeof(double),Score to typeof(double), Baseline to typeof(long)
  | where Anomalies > 0
  | project TimeStamp, CountOfDocs, Baseline, Score, Anomalies, UserIdentityPrincipalid
  | join kind=inner(Events | extend TimeStamp = startofday(TimeGenerated)) on TimeStamp, UserIdentityPrincipalid
  | extend Name = iif(UserIdentityUserName contains "@", split(UserIdentityUserName, "@")[0], UserIdentityUserName)
  | extend UPNSuffix = iif(UserIdentityUserName contains "@", split(UserIdentityUserName, "@")[1], "")
  | project-reorder TimeGenerated, UserIdentityType, UserIdentityPrincipalid, UserIdentityUserName, FilePath, EventName, UserAgent, SourceIpAddress, CountOfDocs, Baseline, Score
entityMappings:
  - entityType: Account
    fieldMappings:
      - identifier: FullName
        columnName: UserIdentityUserName
      - identifier: Name
        columnName: Name
      - identifier: UPNSuffix
        columnName: UPNSuffix
  - entityType: IP
    fieldMappings:
      - identifier: Address
        columnName: SourceIpAddress
  - entityType: File
    fieldMappings:
      - identifier: Name
        columnName: FilePath
customDetails:
  UserType: UserIdentityType
  Event: EventName
  UserAgent: UserAgent
alertDetailsOverride:
  alertDisplayNameFormat: Suspicious access of {{CountOfDocs}} BEC related documents in AWS S3 buckets by {{UserIdentityUserName}}
  alertDescriptionFormat: |
    This query looks for users (in this case {{UserIdentityUserName}}) with suspicious spikes in the number of files accessed (in this case {{CountOfDocs}}) that relate to topics commonly accessed as part of Business Email Compromise (BEC) attacks. The query looks for access to files in AWS S3 storage that relate to topics such as invoices or payments, and then looks for users accessing these files in significantly higher numbers than in the previous 14 days. Incidents raised by this analytic should be investigated to see if the user accessing these files should be accessing them, and if the volume they accessed them at was related to a legitimate business need.
    This query contains thresholds to reduce the chance of false positives, these can be adjusted to suit individual environments. In addition false positives could be generated by legitimate, scheduled actions that occur less often than every 14 days, additional exclusions can be added for these actions on username or IP address entities.
version: 1.0.4
kind: Scheduled

Stages and Predicates

Parameters

let sensitivity = 2.5;

Let binding: `BEC_Keywords`

let BEC_Keywords = dynamic([ 'invoice','payment','paycheck','transfer','bank statement','bank details','closing','funds','bank account','account details','remittance','purchase','deposit',"PO#","Zahlung","Rechnung","Paiement", "virement bancaire","Bankuberweisung",'hacked','phishing']);

The stages below define let Events (the rule's main pipeline source).

Stage 1: `source`

AWSCloudTrail

Stage 2: `where`

| where TimeGenerated between (ago(14d)..ago(0d))

Stage 3: `where`

| where UserIdentityAccountId != "anonymous"

Stage 4: `where`

| where EventSource startswith "s3."

Stage 5: `where`

| where EventName =~ "GetObject"

Stage 6: `extend`

| extend FilePath = tostring(parse_json(RequestParameters).key)
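
Stage 6 pulls the S3 object key out of the JSON held in RequestParameters. The sketch below shows that extraction on a single value; the JSON is an illustrative sample, not a captured CloudTrail record.

```kusto
// Illustrative RequestParameters value for an S3 GetObject call (sample data).
print RequestParameters = '{"bucketName":"finance-archive","key":"invoices/2024/inv-001.pdf"}'
// parse_json yields a dynamic object; tostring converts the key to a plain string
// so that string operators such as has_any can be applied to it.
| extend FilePath = tostring(parse_json(RequestParameters).key)
```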

Stage 7: `where`

| where FilePath has_any(BEC_Keywords)

References BEC_Keywords (defined above).
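
The keyword filter relies on `has_any`, which matches whole terms rather than arbitrary substrings. The sketch below applies it to a few hypothetical object keys so the behavior can be checked in your own workspace; the shortened keyword list and the sample keys are assumptions made for illustration.

```kusto
// Shortened keyword list for illustration; the rule uses the full BEC_Keywords list.
let BEC_Keywords = dynamic(['invoice', 'payment', 'remittance']);
// Hypothetical object keys standing in for the FilePath column.
datatable(FilePath: string)
[
    "finance/invoice-0017.pdf",
    "finance/invoices/2024.zip",
    "hr/payment/january.csv",
    "marketing/logo.png"
]
// True when at least one keyword appears as a whole term in the key.
| extend MatchesKeyword = FilePath has_any (BEC_Keywords)
```

Because matching is term-based, a key such as `finance/invoices/2024.zip` may not match the keyword `invoice`. If plural or embedded forms matter in your environment, adding those forms to the list or using a substring operator such as `contains` is one option, at a higher query cost.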

The stages below run on Events (the outer pipeline).

Stage 8: `summarize`

Events
| summarize dcount(FilePath) by UserIdentityPrincipalid, bin(startofday(TimeGenerated), 1d)

Stage 9: `summarize`

| summarize CountOfDocs = make_list(dcount_FilePath, 10000), TimeStamp = make_list(TimeGenerated, 10000) by UserIdentityPrincipalid
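
Stages 8 and 9 build each user's daily series by hand with summarize and make_list, so days on which a user accessed no matching files do not appear in the list. As a hedged alternative, and not what the published rule does, `make-series` can emit one element per day and fill the gaps with zero. The `SampleEvents` table and the date range below are hypothetical.

```kusto
// Stand-in for the filtered Events set (sample data).
let SampleEvents = datatable(TimeGenerated: datetime, UserIdentityPrincipalid: string, FilePath: string)
[
    datetime(2024-01-01 08:00:00), "AIDAEXAMPLE:alice", "finance/invoice-1.pdf",
    datetime(2024-01-01 09:00:00), "AIDAEXAMPLE:alice", "finance/invoice-2.pdf",
    datetime(2024-01-04 10:00:00), "AIDAEXAMPLE:alice", "finance/invoice-3.pdf"
];
SampleEvents
// One distinct-file count per user per day; days without events are filled with 0.
| make-series CountOfDocs = dcount(FilePath) default = 0
    on TimeGenerated from datetime(2024-01-01) to datetime(2024-01-08) step 1d
    by UserIdentityPrincipalid
```

A regularly spaced series changes what the anomaly function sees, so the sensitivity value would likely need retuning if this form were adopted.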

The stages below score time-series anomalies with series_decompose_anomalies, using the per-day lists built by make_list in Stage 9.

Stage 10: `extend`

| extend (Anomalies, Score, Baseline) = series_decompose_anomalies(CountOfDocs, sensitivity, -1, 'linefit')
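
The second argument to `series_decompose_anomalies` is the anomaly threshold, which this rule sets to 2.5 through `sensitivity`. To get a feel for how the threshold changes what is flagged, the sketch below scores one invented series twice: once with the function's documented default of 1.5 and once with the rule's 2.5. The series values are made up for illustration.

```kusto
// Invented daily counts with one large spike on the final day.
print CountOfDocs = dynamic([2, 3, 2, 4, 3, 2, 3, 2, 4, 3, 2, 3, 2, 40])
// Lower threshold: more points are likely to be flagged.
| extend (AnomaliesLoose, ScoreLoose, BaselineLoose) = series_decompose_anomalies(CountOfDocs, 1.5, -1, 'linefit')
// The rule's threshold: fewer, more extreme points are flagged.
| extend (AnomaliesRule, ScoreRule, BaselineRule) = series_decompose_anomalies(CountOfDocs, 2.5, -1, 'linefit')
```

Comparing the two flag arrays on real data from your environment is one way to choose a sensitivity value before deploying the rule.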

Stage 11: `mv-expand`

| mv-expand CountOfDocs to typeof(double), TimeStamp to typeof(datetime), Anomalies to typeof(double),Score to typeof(double), Baseline to typeof(long)

Stage 12: `where`

| where Anomalies > 0
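
Stages 11 and 12 turn the parallel arrays back into one row per day and keep only upward spikes. In the anomaly flag array, 1 marks a positive anomaly, -1 a negative one, and 0 a normal point, so `Anomalies > 0` discards unusual drops in activity. The sketch below uses hand-written arrays in place of real function output.

```kusto
// Hand-written arrays standing in for the output of Stage 10.
print CountOfDocs = dynamic([2, 1, 40]),
      TimeStamp = dynamic(["2024-01-01", "2024-01-02", "2024-01-03"]),
      Anomalies = dynamic([0, -1, 1])
// Expanding several columns in one mv-expand pairs elements by position.
| mv-expand CountOfDocs to typeof(double), TimeStamp to typeof(datetime), Anomalies to typeof(double)
// Keep only positive anomalies; here that is the third element (CountOfDocs = 40).
| where Anomalies > 0
```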

Stage 13: `project`

| project TimeStamp, CountOfDocs, Baseline, Score, Anomalies, UserIdentityPrincipalid

Stage 14: `join`

| join kind=inner(Events | extend TimeStamp = startofday(TimeGenerated)) on TimeStamp, UserIdentityPrincipalid
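
Stage 14 joins the anomalous user-days back to the raw events so that each alert row carries the individual file accesses. A minimal sketch of that join, with hypothetical stand-in tables `AnomalousDays` and `SampleEvents`:

```kusto
// One anomalous user-day, as produced by the earlier stages (sample data).
let AnomalousDays = datatable(TimeStamp: datetime, UserIdentityPrincipalid: string, CountOfDocs: double)
[
    datetime(2024-01-03), "AIDAEXAMPLE:alice", 40.0
];
// Raw events; only those on the anomalous day for the same principal should survive.
let SampleEvents = datatable(TimeGenerated: datetime, UserIdentityPrincipalid: string, FilePath: string)
[
    datetime(2024-01-03 10:15:00), "AIDAEXAMPLE:alice", "finance/invoice-1.pdf",
    datetime(2024-01-03 10:16:00), "AIDAEXAMPLE:alice", "finance/payment-2.pdf",
    datetime(2024-01-02 09:00:00), "AIDAEXAMPLE:alice", "finance/invoice-0.pdf",
    datetime(2024-01-03 11:00:00), "AIDAEXAMPLE:bob", "finance/invoice-9.pdf"
];
AnomalousDays
// Align each event to its day so it can be matched on the day-level key.
| join kind=inner (SampleEvents | extend TimeStamp = startofday(TimeGenerated)) on TimeStamp, UserIdentityPrincipalid
| project TimeGenerated, UserIdentityPrincipalid, FilePath, CountOfDocs
```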

Stage 15: `extend`

| extend Name = iif(UserIdentityUserName contains "@", split(UserIdentityUserName, "@")[0], UserIdentityUserName)

Stage 16: `extend`

| extend UPNSuffix = iif(UserIdentityUserName contains "@", split(UserIdentityUserName, "@")[1], "")
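
Stages 15 and 16 split a UPN-style user name into the Name and UPNSuffix values used by the Account entity mapping, and leave other names untouched. The sketch below runs the same logic on two hypothetical names. It wraps the `split` results in `tostring` so both branches of `iif` are strings, which is a choice made for this example rather than something the rule does.

```kusto
// Hypothetical user names: one UPN-style, one plain role name.
datatable(UserIdentityUserName: string)
[
    "alice@example.com",
    "build-role"
]
// Part before "@" when present, otherwise the full name.
| extend Name = iif(UserIdentityUserName contains "@", tostring(split(UserIdentityUserName, "@")[0]), UserIdentityUserName)
// Part after "@" when present, otherwise an empty string.
| extend UPNSuffix = iif(UserIdentityUserName contains "@", tostring(split(UserIdentityUserName, "@")[1]), "")
```

For these inputs, the first row yields Name `alice` and UPNSuffix `example.com`, and the second keeps Name `build-role` with an empty UPNSuffix.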

Stage 17: `project-reorder`

| project-reorder TimeGenerated, UserIdentityType, UserIdentityPrincipalid, UserIdentityUserName, FilePath, EventName, UserAgent, SourceIpAddress, CountOfDocs, Baseline, Score

Indicators

Each row is a field, operator, and value that the rule matches.

Field	Kind	Values
`Anomalies`	gt	`0` transforms: `cased`
`EventName`	eq	`GetObject`
`EventSource`	starts_with	`s3.`
`FilePath`	match	`Bankuberweisung` `PO#` `Paiement` `Rechnung` `Zahlung` `account details` `bank account` `bank details` `bank statement` `closing` `deposit` `funds` `hacked` `invoice` `paycheck` `payment` `phishing` `purchase` `remittance` `transfer` `virement bancaire`
`UserIdentityAccountId`	ne	`anonymous` transforms: `cased`

Output fields

Fields the rule emits when it matches. Chronicle authors list these in the outcome block; they appear on the detection and $risk_score drives alerting. Sentinel / Defender XDR rules build them up through project / summarize / extend stages. Sentinel maps these into alert fields via entityMappings and customDetails; Defender XDR custom detections surface them as alert fields directly.

Field	Source
`Anomalies`	`project`
`Baseline`	`project`
`CountOfDocs`	`project`
`Score`	`project`
`TimeStamp`	`project`
`UserIdentityPrincipalid`	`project`
`Name`	`extend`
`UPNSuffix`	`extend`


