
Custom Parser Syntax


Important

The regular expression syntax supported by Taegis XDR Custom Parsers is the Golang variant.
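
As a rough illustration of what the Go variant implies, the sketch below uses Go's standard regexp package (RE2 syntax) to check which patterns compile. The helper name compiles is hypothetical; whether Custom Parsers reports unsupported syntax in the same way is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether pattern is accepted by Go's regexp package.
// Hypothetical helper for illustration only.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Named capture groups use the (?P<name>...) form.
	fmt.Println(compiles(`(?P<user>\w+)@(?P<host>[\w.]+)`)) // true
	// Lookaheads and backreferences are not part of RE2 syntax.
	fmt.Println(compiles(`foo(?=bar)`)) // false
	fmt.Println(compiles(`(a)\1`))      // false
}
```

Patterns written for PCRE-style engines that rely on lookarounds or backreferences generally need to be restructured before they compile under this syntax.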

Statements

!SAMPLE=...

A sample message. Everything to the right of the = is interpreted literally, all the way up to a newline. This field is optional, but strongly encouraged.

!SCHEMA=...

This is the schema for this message type, for example scwx.nids, scwx.netflow, scwx.auth. If not specified, the schema from the parent or closest ancestor is used.

!CONFIRMWITH

This is either PATTERN or EXPRESSION. It works in tandem with !CONFIRMSTRING to determine whether a message matches this parser. If set to PATTERN, then !CONFIRMSTRING is a regex pattern. If set to EXPRESSION, then !CONFIRMSTRING is an expression that evaluates to true or false.

!CONFIRMSTRING=

See !CONFIRMWITH.

!DISABLED=

This disables the parser. The parser is completely removed from the runtime catalog. This is useful when you don’t yet know how to handle a message but want to capture minimal documentation of its existence.

!IMPORT=

Import another parser into this parser at the current line. Variables are shared between the importing and imported parser. This allows repeating lines of parser code to be consolidated into one place.

!IMPORTONLY

Indicates that this parser is only for import (via !IMPORT). With extremely rare exceptions, all imported parsers should be !IMPORTONLY. This flag exempts the parser from many validation rules; for example, it doesn't need a parent parser or a !CONFIRMWITH/!CONFIRMSTRING pair.

!TRIMALLOFF

This disables the default behavior of running TRIM_ALL() for all parsers. In some cases, TRIM_ALL() causes problems because it removes leading or trailing braces ({ and }) and brackets ([ and ]), which leads to incorrect data for JSON fields.

!SANITIZEALLOFF

This disables the default behavior of running SANITIZE_ALL() for all parsers.

Functions

SPLIT(data, delimiter, makeGreedy)

Splits data into tokens separated by delimiter. The optional makeGreedy parameter controls how consecutive delimiters are handled: if it is TRUE, the data 0,,2 with a delimiter of , evaluates to [0,2] instead of [0,'',2].

Example

data = "aaa,bbb,ccc,eee"
values1 = SPLIT(data, ",", FALSE)
OUTPUT1$ = values1[3]

# OUTPUT1$: eee (String)

data = "aaa,bbb,ccc,,eee"
values1 = SPLIT(data, ",", FALSE)
values2 = SPLIT(data, ",", TRUE)
OUTPUT1$ = values1[3]
OUTPUT2$ = values2[3]

#OUTPUT1$: NULL (null)
#OUTPUT2$: eee (String)
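
The greedy behavior can be sketched in Go as follows. This is a minimal approximation, not the SPLIT implementation: the function name split is hypothetical, and it assumes that greedy mode simply drops every empty token, including leading and trailing ones.

```go
package main

import (
	"fmt"
	"strings"
)

// split is a hypothetical stand-in for SPLIT(data, delimiter, makeGreedy).
// Assumption: when greedy is true, all empty tokens are dropped.
func split(data, delimiter string, greedy bool) []string {
	tokens := strings.Split(data, delimiter)
	if !greedy {
		return tokens
	}
	kept := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if t != "" {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	data := "aaa,bbb,ccc,,eee"
	fmt.Printf("%q\n", split(data, ",", false)) // ["aaa" "bbb" "ccc" "" "eee"]
	fmt.Printf("%q\n", split(data, ",", true))  // ["aaa" "bbb" "ccc" "eee"]
}
```

With the non-greedy call, index 3 holds the empty token, which corresponds to the NULL value shown for OUTPUT1$ above.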

SPLIT_NAME_VALUES(data, delimiter, separator, quoteChar)

Splits data into a collection of name/value pairs, where delimiter separates the pairs and separator separates each name from its value. quoteChar indicates the character used for quoting the value.

Example

data = "User: Unknown, InitiatorPackets: 2, ResponderPackets: 1, InitiatorBytes: 120, ResponderBytes: 66"
dict = SPLIT_NAME_VALUES(data, ",", ":", "\\")
OUTPUT$ = dict["InitiatorBytes"]

# OUTPUT$: 120 (String)
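
A rough Go sketch of the same idea is shown below. The function splitNameValues is hypothetical, ignores the quoteChar parameter entirely, and assumes that whitespace around names and values is trimmed.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNameValues is a hypothetical, simplified stand-in for
// SPLIT_NAME_VALUES. It does not handle quoted values.
func splitNameValues(data, delimiter, separator string) map[string]string {
	pairs := make(map[string]string)
	for _, item := range strings.Split(data, delimiter) {
		// Split on the first separator only, so values may contain it.
		parts := strings.SplitN(item, separator, 2)
		if len(parts) != 2 {
			continue // skip items without a separator
		}
		pairs[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return pairs
}

func main() {
	data := "User: Unknown, InitiatorPackets: 2, ResponderPackets: 1, InitiatorBytes: 120, ResponderBytes: 66"
	dict := splitNameValues(data, ",", ":")
	fmt.Println(dict["InitiatorBytes"]) // 120
}
```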

JSON(data)

Converts data into a JSON object that can be accessed with square brackets containing a JSONPath expression. See https://goessner.net/articles/JsonPath/ and https://github.com/ohler55/ojg.

Example

data= "{ \"store\": { \"book\": [ { \"category\": \"reference\", \"author\": \"Nigel Rees\", \"title\": \"Sayings of the Century\", \"price\": 8.95 }, { \"category\": \"fiction\", \"author\": \"Evelyn Waugh\", \"title\": \"Sword of Honour\", \"price\": 12.99 }, { \"category\": \"fiction\", \"author\": \"Herman Melville\", \"title\": \"Moby Dick\", \"isbn\": \"0-553-21311-3\", \"price\": 8.99 }, { \"category\": \"fiction\", \"author\": \"J. R. R. Tolkien\", \"title\": \"The Lord of the Rings\", \"isbn\": \"0-395-19395-8\", \"price\": 22.99 } ], \"bicycle\": { \"color\": \"red\", \"price\": 19.95 } } }"
json= JSON(data)
OUTPUT$ = json["$.store.book[*].author"]

# OUTPUT$: "Nigel Rees","Evelyn Waugh","Herman Melville","J. R. R. Tolkien" (String)

Example usage for JSON keys that contain dots:

data= "{ \"store\": { \"book\": [ { \"id.category\": \"reference\" } ] } }"
json= JSON(data)
OUTPUT$ = json["$.store.book[0][\"id.category\"]"]

# OUTPUT$: reference (String)

CEF(data)

Parses data as a CEF-formatted message. The header fields can be accessed with an integer and the named fields can be accessed by name.

Example

!SAMPLE=Nov 6 07:49:03 10.42.0.1 %helloWorld: CEF:0|Check Point|VPN-1 & FireWall-1|Check Point|Log|Address spoofing|Unknown|act=Drop cs3Label=Protection Type cs3=IPS

values = CEF(originalData$)
OUTPUT1$= values[2]
OUTPUT2$= values["act"]
OUTPUT3$= values["Protection Type"]

# OUTPUT1$: VPN-1 & FireWall-1 (String)
# OUTPUT2$: Drop (String)
# OUTPUT3$: IPS (String)

LEEF(data, delimiterOverride)

Parses data as a LEEF-formatted message. The header fields can be accessed with an integer and the named fields can be accessed by name. Optionally, a delimiter override may be specified. LEEF extensions should be either tab-separated or they should indicate an alternate delimiter in field 6 of the header. The override parameter should be used when you know that a device is not compliant with the standard.

DATETIME(data, fmt, handle2DigitYear)

Converts a string to a time value for fields like EventTimeUsec$. The optional fmt parameter accepts a Go time.Parse layout string. If handle2DigitYear is TRUE, an appropriate year is chosen; this is usually the current year, with an edge case around the new year.

Example

data = "Sep 21 2018 17:35:54"
OUTPUT1$ = DATETIME(data, "Jan 02 2006 15:04:05")
OUTPUT2$ = data

# OUTPUT1$: 2018-09-21 17:35:54 +0000 UTC (time)
# OUTPUT2$: Sep 21 2018 17:35:54 (String)

IS_PRIVATE_IP(string)

Returns true if the passed-in IP address string is in a private IP range; otherwise, returns false. Currently only IPv4 is supported, and the address is tested against the private IP ranges defined in RFC 1918.

Example

data1 = "10.0.0.1"
data2 = "11.0.0.1"
OUTPUT1$ = IS_PRIVATE_IP(data1)
OUTPUT2$ = IS_PRIVATE_IP(data2)

# OUTPUT1$: true (bool)
# OUTPUT2$: false (bool)
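
The RFC 1918 check can be approximated in Go as shown below. This is a sketch under the assumption that the test is a simple membership check against the three RFC 1918 blocks; the function name isPrivateIPv4 is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// rfc1918 lists the three private IPv4 blocks defined in RFC 1918.
var rfc1918 = []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"}

// isPrivateIPv4 is a hypothetical stand-in for IS_PRIVATE_IP.
func isPrivateIPv4(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil || ip.To4() == nil {
		return false // not a valid IPv4 address
	}
	for _, cidr := range rfc1918 {
		_, block, err := net.ParseCIDR(cidr)
		if err != nil {
			continue
		}
		if block.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isPrivateIPv4("10.0.0.1")) // true
	fmt.Println(isPrivateIPv4("11.0.0.1")) // false
}
```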

IS_VALID_IP(string)

Returns true if the passed-in string is a valid IP address (IPv4 or IPv6), as determined by Go's net.ParseIP; otherwise, returns false.

Example

data1 = "10.0.0.1"
data2 = "999.255.255.255"
data3 = "2001:0db8:85a3:0000:0000:8a2e:0370:7334"
OUTPUT1$ = IS_VALID_IP(data1)
OUTPUT2$ = IS_VALID_IP(data2)
OUTPUT3$ = IS_VALID_IP(data3)

# OUTPUT1$: true (bool)
# OUTPUT2$: false (bool)
# OUTPUT3$: true (bool)
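
Since the function leverages net.ParseIP, the equivalent check in plain Go looks roughly like this. The wrapper name isValidIP is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// isValidIP is a hypothetical wrapper: net.ParseIP returns nil when the
// text is not a valid IPv4 or IPv6 address.
func isValidIP(s string) bool {
	return net.ParseIP(s) != nil
}

func main() {
	fmt.Println(isValidIP("10.0.0.1"))                                // true
	fmt.Println(isValidIP("999.255.255.255"))                         // false
	fmt.Println(isValidIP("2001:0db8:85a3:0000:0000:8a2e:0370:7334")) // true
}
```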

REPLACE(data, oldString, newString)

Replaces all occurrences of oldString with newString.

Example

data = "aaaBBBaaaCCC"
OUTPUT$ = REPLACE(data, "aaa", "zzz")

# OUTPUT$: zzzBBBzzzCCC (String)

REPLACE_REGEX(data, pattern, newString)

Replaces all matches of the regular expression pattern with newString.

Example

data = "aaaBBBaaaCCC"
OUTPUT$ = REPLACE_REGEX(data, "a+", "z")

# OUTPUT$: zBBBzCCC (String)
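
In Go terms, the example above behaves like a compiled pattern applied with ReplaceAllString. The sketch below assumes that REPLACE_REGEX maps onto that call; the function name replaceRegex is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceRegex is a hypothetical stand-in for REPLACE_REGEX.
func replaceRegex(data, pattern, replacement string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err // the pattern is not valid Go regexp syntax
	}
	return re.ReplaceAllString(data, replacement), nil
}

func main() {
	out, err := replaceRegex("aaaBBBaaaCCC", "a+", "z")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(out) // zBBBzCCC
}
```

Each run of one or more a characters counts as a single match, which is why three characters collapse into one z.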

STRLEN(string)

Returns the length of the passed-in string. On error, returns -1. If a NULL value is passed, returns 0 (zero) without raising an error.

Example

data = "1234567890"
OUTPUT$ = STRLEN(data)

# OUTPUT$: 10 (int)

UPPERCASE(string)

Returns the passed-in string with all Unicode letters mapped to their upper case. This is a wrapper for Go's strings.ToUpper().

Example

data = "aaabbbccc acme"
OUTPUT$ = UPPERCASE(data)

# OUTPUT$: AAABBBCCC ACME (String)

LOWERCASE(string)

Returns the passed-in string with all Unicode letters mapped to their lower case. This is a wrapper for Go's strings.ToLower().

Example

data = "AAABBBCCC ACME"
OUTPUT$ = LOWERCASE(data)

# OUTPUT$: aaabbbccc acme (String)

SANITIZE_ALL()

Cleans up null/empty values in event field variables. For example, all of these are set to null: " ", "N/A", "n/a", "null", "nil", "-". This function is run by default on all parsers unless disabled with !SANITIZEALLOFF.

Example

data = "N/A"
OUTPUT$ = data

# OUTPUT$: NULL (null)

!SANITIZEALLOFF
data = "N/A"
OUTPUT$ = data

# OUTPUT$: N/A (String)

TRIM(data)

Removes whitespace, quotes, braces, and similar characters from the beginning and end of data.

Example

data = " aaa bbb bcc "
OUTPUT1$ = "---" + data
OUTPUT2$ = "---" + TRIM(data)

# OUTPUT1$: --- aaa bbb bcc (String)
# OUTPUT2$: ---aaa bbb bcc (String)

TRIM_ALL()

Removes whitespace from the beginning/end of all event field variables. This function is run by default on all parsers unless disabled with !TRIMALLOFF.

Example

!TRIMALLOFF
data = " aaa bbb bcc "
OUTPUT2$ = data

# OUTPUT2$: aaa bbb bcc (String)

ADDFIELD(collection, fieldName, fieldValues)

Adds a field to an array of objects. The values of the field for each object are specified by fieldValues (also an array). The name of the new field is specified by fieldName. If collection is NULL, a new array of objects is created, each with a single field (fieldName) with the provided values.

Example

keys = ["httpSourceName", "httpSourceId"]
values = [json["$.httpSourceName"], json["$.httpSourceId"]]

eventMetadata$.record$ = ADDFIELD(NULL, "key$", keys)
eventMetadata$.record$ = ADDFIELD(eventMetadata$.record$, "value$", values)

# event_metadata = {
#     "httpSourceName": json["$.httpSourceName"]
#     "httpSourceId": json["$.httpSourceId"]
# }
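
The way ADDFIELD builds up an array of objects one field at a time can be sketched in Go as follows. The function addField is hypothetical, works on string values only, and assumes that values beyond the length of the collection are ignored; the sample values are made up.

```go
package main

import "fmt"

// addField is a hypothetical stand-in for ADDFIELD. If collection is nil,
// a new slice is created with one object per value.
func addField(collection []map[string]string, fieldName string, values []string) []map[string]string {
	if collection == nil {
		collection = make([]map[string]string, len(values))
		for i := range collection {
			collection[i] = map[string]string{}
		}
	}
	for i := range collection {
		if i < len(values) {
			collection[i][fieldName] = values[i]
		}
	}
	return collection
}

func main() {
	keys := []string{"httpSourceName", "httpSourceId"}
	values := []string{"collector-a", "42"} // made-up sample values

	records := addField(nil, "key$", keys)
	records = addField(records, "value$", values)
	for _, r := range records {
		fmt.Println(r["key$"], "=", r["value$"])
	}
}
```

Because the values are matched to objects by position, the two arrays need to be in the same order.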

URL_PARSE(url, silent)

Parses a URL into a collection of its component parts.

Tip

For more on working with parsing, see Creating, Editing, and Enabling a Custom Parser in XDR.

If silent is true, an invalid URL does not throw an error; instead, all fields are set to null. For badly formatted URLs, the function always attempts to extract as much as possible. The URL is expected to be in one of these formats:

scheme:opaque?query#fragment
scheme://userinfo@host/path?query#fragment

Examples

http://user:password@192.1.1.1:8080/1/asdfasdfasdf.html?key=value&key2=value2#topOfTheMorning
hTtps://Example.com:443/here//is/path.html?a=1+6&x=%2f%2Fkey=%41%0Avalue&b=ddd#top
https://example.com/foo/bar/bar/../baz.html?a=1&b=2
example.com/foo/bar/bar/../baz.html?a=1&b=2

If the scheme is not provided (for example, example.com/index.html instead of http://example.com/index.html), then http is assumed and returned in the scheme value.

The resulting collection object contains the following values, if possible, given the URL:

Examples

data = "hTtps://Example.com:443/here//is/path.html?a=1+6&x=%2f%2Fkey=%41%0Avalue&b=ddd#top"
urlParts = URL_PARSE(data, FALSE)
OUTPUT$ = urlParts["path_raw"]

# OUTPUT$: /here//is/path.html (String)
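
The default-scheme behavior and the split into components can be approximated with Go's net/url package. This is a sketch only: parseURL is a hypothetical helper, the map keys are illustrative names rather than the documented field names, and it handles only the scheme://userinfo@host/path?query#fragment form.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseURL is a hypothetical helper, not the URL_PARSE implementation.
// Assumption: "http" is prepended when no "://" separator is present.
func parseURL(raw string) (map[string]string, error) {
	if !strings.Contains(raw, "://") {
		raw = "http://" + raw
	}
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	// Illustrative key names only.
	return map[string]string{
		"scheme":   u.Scheme,
		"host":     u.Hostname(),
		"port":     u.Port(),
		"path":     u.Path,
		"query":    u.RawQuery,
		"fragment": u.Fragment,
	}, nil
}

func main() {
	for _, raw := range []string{
		"hTtps://Example.com:443/here//is/path.html?a=1+6&b=ddd#top",
		"example.com/foo/bar/bar/../baz.html?a=1&b=2",
	} {
		parts, err := parseURL(raw)
		if err != nil {
			fmt.Println("invalid URL:", err)
			continue
		}
		fmt.Println(parts["scheme"], parts["host"], parts["path"])
	}
}
```

Note that net/url lowercases the scheme but leaves the path as written, so repeated slashes and .. segments are preserved.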

CONTAINS(string, substring)

Wraps Go's strings.Contains(string, substring) and returns a bool.

Example

data     = "aaabbbccc acme"
OUTPUT1$ = CONTAINS(data, "roadrunner")
OUTPUT2$ = CONTAINS(data, "acme")

# OUTPUT1$: false    (bool)
# OUTPUT2$: true    (bool)

IDX_OF_TLD(string)

Returns an int64 indicating the position of the top-level domain within the string, for use with indexOfTopPrivateDomain$. If -1 is returned, set IsTopPrivateDomainParsed$ to false; otherwise, set IsTopPrivateDomainParsed$ to true.

Example

OUTPUT0$ = IDX_OF_TLD("aaa http://example.com")
OUTPUT1$ = IDX_OF_TLD("http://example.com")
OUTPUT2$ = IDX_OF_TLD("")

# OUTPUT0$: 0    (int)
# OUTPUT1$: 0    (int)
# OUTPUT2$: -1    (int)

PARSE_ERROR(string,string)

Explicitly raises an error in the parser. The first parameter is the error text, which is evaluated as a string. The optional second parameter is a string that is cast to a boolean to indicate whether a generic event should be created; it defaults to true if not provided. The message does not normalize to any other schema.

Examples

# Creates a generic event
test = IF someVal != "Expected_value" THEN PARSE_ERROR("bad data received") ELSE "ok"

# Doesn't create a generic event
tenantId$ = TENANT_LOOKUP("ngav_id", vals["Account"], PARSE_ERROR("Unable to find Taegis tenant id for Deep Armor account " + vals["Account"],"False"))

TENANT_LOOKUP(label, value, default)

Looks up the tenant id in Taegis Tenant Manager based on a label and a value from the message. If no tenant is found, the specified default expression is evaluated. Note that if a tenant is found, the third parameter is not evaluated. This gives the caller the option to provide a default value or to use the PARSE_ERROR() function to raise an error.

Example

tenantId$ = TENANT_LOOKUP("VendorName", messageValues["customerId"], PARSE_ERROR("Customer Id not on file"))
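
The point that the third parameter is evaluated only when no tenant is found can be illustrated in Go by passing the default as a function. Everything here is hypothetical: lookupTenant, the in-memory map standing in for Taegis Tenant Manager, and the sample identifiers.

```go
package main

import (
	"errors"
	"fmt"
)

// lookupTenant is a hypothetical stand-in for TENANT_LOOKUP. The fallback
// is a function so that it runs only when no tenant matches.
func lookupTenant(tenants map[string]string, value string, fallback func() (string, error)) (string, error) {
	if id, ok := tenants[value]; ok {
		return id, nil
	}
	return fallback()
}

func main() {
	tenants := map[string]string{"cust-001": "tenant-42"} // made-up sample data
	parseError := func() (string, error) {
		return "", errors.New("customer id not on file")
	}

	id, err := lookupTenant(tenants, "cust-001", parseError)
	fmt.Println(id, err) // tenant-42 <nil>

	id, err = lookupTenant(tenants, "cust-999", parseError)
	fmt.Println(id == "", err) // true customer id not on file
}
```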

BASE64_DECODE(string)

Returns the decoded plain-text string for a Base64-encoded input string.

Example

OUTPUT$ = BASE64_DECODE("aG1lZXBcISBobWVlcFwh")

#OUTPUT$: hmeep\! hmeep\!    (String)
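
The same decoding can be reproduced with Go's encoding/base64 package. The sketch assumes the standard (padded) Base64 alphabet; the function name decodeBase64 is hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeBase64 is a hypothetical stand-in for BASE64_DECODE.
// Assumption: input uses the standard Base64 alphabet.
func decodeBase64(s string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	out, err := decodeBase64("aG1lZXBcISBobWVlcFwh")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(out) // hmeep\! hmeep\!
}
```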

INT(string, base)

Returns the integer value of a numeric string interpreted in the specified base.

Example

OUTPUT$ = INT("4e0", 16)

#OUTPUT$: 1248    (int)
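
The conversion in the example is 4×256 + 14×16 + 0 = 1248. A comparable call in Go is strconv.ParseInt, shown here as a sketch; the function name toInt is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// toInt is a hypothetical stand-in for INT(string, base).
func toInt(s string, base int) (int64, error) {
	return strconv.ParseInt(s, base, 64)
}

func main() {
	n, err := toInt("4e0", 16)
	if err != nil {
		fmt.Println("conversion error:", err)
		return
	}
	fmt.Println(n) // 1248
}
```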

STRING(valueType)

Attempts to cast the variable input into a string representation.

# In some cases "key" can be a string, empty (NULL), an array, or even a map.
key = json["$.requestParameters.key"]
# By calling STRING(), you guarantee that objectKey$ is set with a value.
objectKey$ = STRING(key)
# Note: ParserValue.StringValue() isn't used directly because the addition operator breaks when appending two valuetype.OBJECT values to make a list.
# valuetype.LIST, valuetype.OBJECT, and valuetype.JSONDATA return their JSON string representation; all other types are cast to their string analogs.

OBJKEYS(value)

Returns a list of the keys of a map or JSON object.

Example

 # Suppose the original json was:
 { 
    "values" : {
        "c" : "x", 
        "b" : "y", 
        "a" : "z"
    }
 }
keys = OBJKEYS(json["$.values"])
# keys is now an array of ["a", "b", "c"]
# NOTE: this function puts the keys in alphabetical order

OBJVALUES(value)

Returns a list of the values of a map or JSON object.

Example

 # Suppose the original json was:
 {
    "values" : {
        "c" : "x", 
        "b" : {
            "foo" : "bar"
        }, 
        "a" : "z"
    }
 }
vals = OBJVALS(json["$.values"])
# vals is now an array of ["z", "{ 'foo' : 'bar' }", "x"]
# NOTE: this function puts the values in alphabetical order by their key. This ensures that OBJKEYS and OBJVALS output their elements in the same order, which is important when combining these functions with ADDFIELD().
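
The ordering guarantee described in the notes above can be sketched in Go, where map iteration order is otherwise unspecified. The function names sortedKeys and valuesByKey are hypothetical, and the sketch handles string values only.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of obj in alphabetical order.
func sortedKeys(obj map[string]string) []string {
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// valuesByKey returns the values of obj ordered by their key, so the
// result lines up index-for-index with sortedKeys.
func valuesByKey(obj map[string]string) []string {
	keys := sortedKeys(obj)
	vals := make([]string, len(keys))
	for i, k := range keys {
		vals[i] = obj[k]
	}
	return vals
}

func main() {
	obj := map[string]string{"c": "x", "b": "y", "a": "z"}
	fmt.Println(sortedKeys(obj))  // [a b c]
	fmt.Println(valuesByKey(obj)) // [z y x]
}
```

Because both lists share one ordering, element i of the keys always pairs with element i of the values.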

FLATTEN(json, keyLabel, valueLabel)

Converts arbitrary JSON to a list of objects.

Each object has two fields: a key and a value, both of type string. The keyLabel and valueLabel parameters are optional, with default values "key$" and "value$" respectively. This function is intended to provide a convenient way to put JSON data into schema fields of type KeyValuePairsIndexed; for example, the tags field on the generic schema or the evidence.sourceData.record field of ThirdPartyAlert.

Example

 # Suppose the original json was:
{
    "val" : { 
        "x": [
            "1",
            "2",
            "3"
        ] 
    }
}

# The output would be:
[
    {
        "key$": "val.x.0",
        "value$": "1"
    },
    {
        "key$": "val.x.1", 
        "value$": "2"
    },
    {
        "key$": "val.x.2", 
        "value$": "3"
    }
]
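
A minimal Go sketch of this kind of flattening is shown below. It is not the FLATTEN implementation: the names pair, flatten, joinPath, and flattenJSON are hypothetical, and the dotted path format with numeric array indexes is inferred from the example output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// pair holds one flattened key/value entry.
type pair struct{ Key, Value string }

// joinPath appends a segment to a dotted path.
func joinPath(prefix, segment string) string {
	if prefix == "" {
		return segment
	}
	return prefix + "." + segment
}

// flatten walks decoded JSON and appends one pair per leaf value.
func flatten(prefix string, v interface{}, out *[]pair) {
	switch x := v.(type) {
	case map[string]interface{}:
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic output order
		for _, k := range keys {
			flatten(joinPath(prefix, k), x[k], out)
		}
	case []interface{}:
		for i, item := range x {
			flatten(joinPath(prefix, strconv.Itoa(i)), item, out)
		}
	default:
		*out = append(*out, pair{prefix, fmt.Sprint(x)})
	}
}

// flattenJSON decodes doc and returns its flattened pairs.
func flattenJSON(doc string) ([]pair, error) {
	var root interface{}
	if err := json.Unmarshal([]byte(doc), &root); err != nil {
		return nil, err
	}
	var out []pair
	flatten("", root, &out)
	return out, nil
}

func main() {
	pairs, err := flattenJSON(`{"val": {"x": ["1", "2", "3"]}}`)
	if err != nil {
		fmt.Println("invalid JSON:", err)
		return
	}
	for _, p := range pairs {
		fmt.Printf("%s = %s\n", p.Key, p.Value)
	}
}
```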
