Office 365 and Azure Data Availability


Data Availability and Collection Times

Secureworks® Taegis™ XDR ingest time represents the time taken to collect, normalize, and store data. Differences between the timestamps on the original log data and the XDR ingest time may reflect delays in the data becoming available from the source API. All Azure and Office 365 data collection and resulting data availability follow these standards:

- Data is polled on a recurring basis known as a polling interval. Polling intervals define the number of minutes the collection application waits between requests to the API for data. Each poll queries two windows: from the current time minus the polling interval to the current time, and the equivalent window ending four hours in the past.
  - This collection ensures every minute on the clock is queried twice, once within the polling interval window and once when it is four hours in the past. This accounts for late-arriving data from Microsoft (e.g., data that is not available within the minute it is first queried for but becomes available at a later time). If data arrives later than four hours from when it occurred, it is not collected. The XDR platform deduplicates data, so only the first occurrence of a log is stored.
  - The interval used is based on data availability observed from Microsoft APIs across the variety of customers collected with XDR. The goal of the polling interval is to ensure data is captured as quickly as possible without exceeding the rate limit of a given API. Polling intervals are subject to change as changes in data availability are observed.
- Data transfer and transformation between XDR components occurs on a streaming or constant basis. Data transfer may take up to 10 minutes, but is typically completed in under one minute.
- Data is written in batches once every four minutes, but a write may take up to 10 minutes.
- Data is indexed and available for search within 60 to 90 minutes, but may be available sooner.
- Alerts generated from data depend on the nature of the detector being used. For more information on XDR Detectors, see Detector Overview. Alerting is not dependent on search or index.
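
The two-window polling and first-occurrence deduplication described above can be sketched as follows. This is a minimal illustration, not the XDR implementation: the function names and the use of an "id" key to recognize duplicates are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(hours=4)  # late-arrival lookback described above


def poll_windows(now, polling_interval):
    """Return the two (start, end) windows queried in one polling cycle."""
    current = (now - polling_interval, now)
    late = (now - LOOKBACK - polling_interval, now - LOOKBACK)
    return [current, late]


def keep_first_occurrence(logs, seen_ids):
    """Keep only logs not stored before (assumes each log has an "id" key)."""
    fresh = []
    for log in logs:
        if log["id"] not in seen_ids:
            seen_ids.add(log["id"])
            fresh.append(log)
    return fresh


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for start, end in poll_windows(now, timedelta(minutes=1)):
    print(start.isoformat(), "->", end.isoformat())
```

With a one-minute interval, the minute ending at 12:00 is queried at 12:00 and again at 16:00, which is why a log that Microsoft publishes up to four hours late can still be collected.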

Important

Users may observe a disparity between the Event Create Time and Ingest Time timestamps of an event or alert when the API makes activity logs available some time after the activity occurred. This is because Event Create Time is based on the observed timestamp from the original data, and Ingest Time is based on when Secureworks collected and normalized the log from the API.

Data Collection Variables

Microsoft Office 365 Management API

The Office 365 Management Activity API allows several variables for input when querying data. This section describes how these variables are used when collecting data.

Polling Interval: One minute
Content Type: XDR uses the following content types: Audit.AzureActiveDirectory, Audit.Exchange, Audit.SharePoint, Audit.General, and DLP.All.
startTime and endTime: Each polling interval issues two queries. The first sets startTime to the current time minus the polling interval and endTime to the current time. The second sets startTime to four hours prior minus the polling interval and endTime to four hours prior. The second query is made because the Office 365 Management Activity API is known to publish or make historical data available at a delay. Secureworks therefore always queries for data a second time at four hours ago, and removes activity logs that have already been recorded from prior queries.
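
As an illustration of how these variables combine, the sketch below builds the list-content request URLs for one polling cycle. It assumes the endpoint path from Microsoft's published Office 365 Management Activity API reference; the tenant ID is a placeholder, the function name is hypothetical, and authentication and the HTTP call itself are omitted.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

CONTENT_TYPES = [
    "Audit.AzureActiveDirectory",
    "Audit.Exchange",
    "Audit.SharePoint",
    "Audit.General",
    "DLP.All",
]
POLLING_INTERVAL = timedelta(minutes=1)
LOOKBACK = timedelta(hours=4)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # timestamps are sent in UTC


def content_urls(tenant_id, now):
    """Build one URL per content type for each of the two query windows."""
    base = (
        f"https://manage.office.com/api/v1.0/{tenant_id}"
        "/activity/feed/subscriptions/content"
    )
    windows = [
        (now - POLLING_INTERVAL, now),
        (now - LOOKBACK - POLLING_INTERVAL, now - LOOKBACK),
    ]
    urls = []
    for content_type in CONTENT_TYPES:
        for start, end in windows:
            query = urlencode({
                "contentType": content_type,
                "startTime": start.strftime(TIME_FORMAT),
                "endTime": end.strftime(TIME_FORMAT),
            })
            urls.append(f"{base}?{query}")
    return urls


# Placeholder tenant ID, for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for url in content_urls("00000000-0000-0000-0000-000000000000", now):
    print(url)
```

Five content types and two windows give ten requests per polling cycle in this sketch.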

Microsoft Graph API—Alerts Resource

The alert resource type of the Microsoft Graph API allows several variables for input when querying data. The following describes how these variables are used when collecting data.

Polling Interval: One minute
List alerts is used to retrieve a list of available alerts for a given timeframe.
After a list of IDs is returned, Get alert is used to retrieve each new alert. Listing alerts leverages the OData filter of lastModifiedDateTime. For each polling interval, alerts are queried for lastModifiedDateTime less than or equal to (le) the current time and greater than or equal to (ge) the current time minus the polling interval, as well as lastModifiedDateTime less than or equal to four hours prior and greater than or equal to four hours prior minus the polling interval. This is done because the Microsoft Graph API alerts resource is known to publish or make historical data available at a delay. Secureworks therefore always queries for data a second time at four hours ago, and removes alerts that have already been recorded from prior queries.
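
The filter expressions described above might be assembled as in the following sketch. The helper names are hypothetical, and the exact timestamp formatting XDR sends is an assumption; the resulting strings would be passed as the OData $filter query option when listing alerts.

```python
from datetime import datetime, timedelta, timezone


def odata_window_filter(field, start, end):
    """Build '<field> ge <start> and <field> le <end>' with UTC timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc).strftime(fmt)
    end_utc = end.astimezone(timezone.utc).strftime(fmt)
    return f"{field} ge {start_utc} and {field} le {end_utc}"


def alert_filters(now, polling_interval=timedelta(minutes=1),
                  lookback=timedelta(hours=4)):
    """Return the current-window and four-hours-prior filters for one poll."""
    field = "lastModifiedDateTime"
    return [
        odata_window_filter(field, now - polling_interval, now),
        odata_window_filter(field, now - lookback - polling_interval,
                            now - lookback),
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for expression in alert_filters(now):
    print(expression)
```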

Microsoft Graph API—Directory Audit Resource

The directoryAudit resource type of the Microsoft Graph API allows several variables for input when querying data. The following describes how these variables are used when collecting data.

Polling Interval: 10 minutes
list directoryAudits is used to retrieve a list of available directory audit logs for a given timeframe.
Once a list of IDs is returned, get directoryAudit is used to retrieve each new log. Listing directory audit logs leverages the OData filter of activityDateTime. For each polling interval, audit logs are queried for activityDateTime less than or equal to (le) the current time and greater than or equal to (ge) the current time minus the polling interval, as well as less than or equal to four hours prior and greater than or equal to four hours prior minus the polling interval. This is done because the Microsoft Graph API is known to publish or make historical data available at a delay. Secureworks therefore always queries for data a second time at four hours ago, and removes activity logs that have already been recorded from prior queries.
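
The list-then-get pattern described above, where only logs not seen before are retrieved, can be sketched as follows. The function and parameter names are hypothetical, and the get_log callable stands in for the get directoryAudit request; how XDR actually tracks previously recorded IDs is not documented here.

```python
def fetch_new_logs(listed_ids, seen_ids, get_log):
    """Retrieve only the logs whose IDs have not been recorded before."""
    new_logs = []
    for log_id in listed_ids:
        if log_id in seen_ids:
            continue  # already recorded by an earlier query
        seen_ids.add(log_id)
        new_logs.append(get_log(log_id))
    return new_logs


def fake_get_log(log_id):
    """Stand-in for the per-ID request, for illustration only."""
    return {"id": log_id}


seen = set()
first_poll = fetch_new_logs(["a1", "a2"], seen, fake_get_log)
# The four-hours-prior query returns "a2" again plus a late arrival "a3".
late_poll = fetch_new_logs(["a2", "a3"], seen, fake_get_log)
print(first_poll, late_poll)
```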

Microsoft Graph API—Sign In Resource

The sign-in resource of the Microsoft Graph API allows several variables for input when querying data. The following describes how these variables are used when collecting data.

Polling Interval: 10 minutes
list signIns is used to retrieve a list of available sign-in logs for a given timeframe.
Once a list of IDs is returned, get signIn is used to retrieve each new log. Listing sign-ins leverages the OData filter of createdDateTime. For each polling interval, logs are queried for createdDateTime less than or equal to (le) the current time and greater than or equal to (ge) the current time minus the polling interval, as well as less than or equal to four hours prior and greater than or equal to four hours prior minus the polling interval. This is done because the Microsoft Graph API sign-in resource is known to publish or make historical data available at a delay. Secureworks therefore always queries for data a second time at four hours ago, and removes activity logs that have already been recorded from prior queries.

Microsoft Azure Active Directory Activity Reports

Data is polled on a recurring basis using two configuration parameters—polling interval and lag time.

Polling interval: 10 minutes
Polling interval specifies the frequency with which the API is queried for data (i.e., every 10 minutes the API is queried for a 10-minute window of data).
Lag time specifies how far back the ingester goes from the current time when the data is queried. Multiple lag times are used so that data is queried more frequently. The lag times are one minute, 10 minutes, 20 minutes, 30 minutes, 40 minutes, one hour, two hours, and four hours in the past.
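
One way to picture the interaction of polling interval and lag time is the sketch below. It assumes each lag time yields a 10-minute window ending at the current time minus that lag; the source does not state the exact window boundaries, so treat the arithmetic and the function name as illustrative only.

```python
from datetime import datetime, timedelta, timezone

POLLING_INTERVAL = timedelta(minutes=10)
# Lag times listed above, in minutes: 1, 10, 20, 30, 40, 60, 120, 240.
LAG_TIMES = [timedelta(minutes=m) for m in (1, 10, 20, 30, 40, 60, 120, 240)]


def lagged_windows(now):
    """Return one (start, end) window per lag time (assumed boundaries)."""
    return [(now - lag - POLLING_INTERVAL, now - lag) for lag in LAG_TIMES]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for start, end in lagged_windows(now):
    print(start.strftime("%H:%M"), "->", end.strftime("%H:%M"))
```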

Microsoft Azure Monitor API—Activity Log Resource

Polling Interval: 30 minutes
list Activity Logs is used to retrieve all activity logs within a given timeframe.
Logs are queried using the eventTimestamp value. For each polling interval, logs are queried from the current time minus the polling interval to the current time, as well as from four hours prior minus the polling interval to four hours prior. This is done because the Microsoft Azure Monitor API is known to publish or make historical data available at a delay. Secureworks therefore always queries for data a second time at four hours ago, and removes activity logs that have already been recorded from prior queries.
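
For the 30-minute interval, the eventTimestamp filters for one polling cycle might look like the sketch below. The quoted timestamp literals follow the style shown in Microsoft's Activity Log examples as far as we are aware; the function names and exact formatting are assumptions, not the XDR implementation.

```python
from datetime import datetime, timedelta, timezone

POLLING_INTERVAL = timedelta(minutes=30)
LOOKBACK = timedelta(hours=4)


def _stamp(moment):
    """Format a timezone-aware datetime as a UTC timestamp string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def activity_log_filters(now):
    """Return the current and four-hours-prior eventTimestamp filters."""
    windows = [
        (now - POLLING_INTERVAL, now),
        (now - LOOKBACK - POLLING_INTERVAL, now - LOOKBACK),
    ]
    return [
        f"eventTimestamp ge '{_stamp(start)}' and eventTimestamp le '{_stamp(end)}'"
        for start, end in windows
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for expression in activity_log_filters(now):
    print(expression)
```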

Data Collection Content and Accuracy

In all collection cases, variables used when collecting data are altered only for the purpose of exposing additional logs. Data contained within a log is stored in the original_data field, and no input variables are used that alter the content of a response or original log. If users experience anomalies in the content of a JSON/log, Secureworks recommends working with the vendor to determine why the log is malformed as stored or returned by the respective API.

Additional References

Azure Active Directory reporting latencies
Search the audit log in the compliance center—includes data availability and latency information for Microsoft 365
