Splunk Commands – Exploring of Join Commands and its Arguments

0

SOC analysts have come across number of Splunk commands where, each has its own set of features that help us understand data better. With these commands, we can generate reports, alerts, and dashboards exactly how we want them. Today, we’ll look at the join command, which is one of Splunk’s most popular commands.

Also Read: Splunk Commands – Append , Chart and Dedup

Description:

The traditional join command joins the results from the main results pipeline with the search pipeline results provided as the last argument. Optionally specifies the exact fields to join on. If no fields are specified, all fields that are shared by both result sets will be used. The join command is used to merge the results of a sub search with the main search results. Each result set must have at least one field in common. The self-join command can also be used to join a collection of search results to itself.

Synopsis

Joining of results from the main results pipeline with the results from the sub pipelines.

Syntax

join [join-options]*<field-list> [ subsearch ]

————————————————————————————

Index = “your_index”

| join type = (inner | left | outer) “common_field”

[ | search index = “your_index”

……………………….

]

Required arguments

sub search

Description: A search pipeline. Read more about how sub searches work in the Search manual.

Optional arguments

field-list:

Syntax: <field>, …

Description: Specify the exact fields to use for the join. If none arespecified, uses all fields that are common to both result sets.

join-options:

Syntax:

type=(inner|outer|left) | usetime=<bool> | earlier=<bool> |overwrite=<bool> | max=<int>

Description: Options to the join command.

Join options:

1-type:

Syntax: type=inner | outer | left

Description: Indicates the type of join to perform. Basically, the difference between an inner and a left (or outer) join is how they treat events in the main pipeline that do not match any in the subpipeline. In both cases, events that match are joined. The results of an inner join will not include any events with no matches. A left (or outer) join does not require each event to have matching field values, and the joined result retains each event? even if there is no match with any rows of the sub search. Defaults to inner.

2-usetime:

Syntax: usetime=<bool>

Description: Indicates whether to limit matches to sub-results that are earlier or later than the main result to join with. Defaults to false.

3-earlier:

Syntax: earlier=<bool>

Description: If usetime=true, specify whether to join with matches that are earlier (true) or later (false) than the main result. Defaults to true.

4-overwrite:

Syntax: overwrite=<bool>

Description: Indicates if fields from the sub results should overwrite those from the main result if they have the same field name. Defaults to true.

5-max:

Syntax: max=<int>

Description: Indicates the maximum number of sub-results each main result can join with. If max=0 means no limit. Defaults to 1.

Examples:

Set A Query:

sourcetype=pan:traffic

| table dest_port, dest_location, sourcetype

Also Read: Splunk Commands – Field-value pair matching , Boolean and comparison , Operator and Wildcards

Set B Query:

sourcetype=pan:threat

| table action, dest_port, dest_location, http_category

Example 1-The below example illustrates inner Join type.

Inner Join:  Only the matched events are included in the result in the inner join.

Inner Join Query:

sourcetype=pan:traffic

| table dest_port, dest_location, sourcetype

| join type=inner dest_location

[| search sourcetype=pan:threat

| rename http_category as pan:threat-http_category

| fields action, dest_location, pan:threat-http_category]

Also Read: Latest IOCs – Threat Actor URLs , IP’s & Malware Hashes

Explanation:

In the Set A query, the sourcetype used in Pan: traffic. In the Set B query, the sourcetype used in Pan: threat. In both the queries, a common field is dest_location. The inner type will check the 2 set queries and fetch the results which are common. In the dest_location, the common location is “Ireland” in both the query sets. Hence the join: inner query has fetched the “Ireland” location details only.

Example 2-The below example illustrates the Left or Outer Join type.

Left or Outer Join:  All the main search events are comprised along with the matched results in left or outer join.

Left or OuterJoin Query:

sourcetype=pan:traffic

| table dest_port, dest_location, sourcetype

| join type=leftdest_location

[| search sourcetype=pan:threat

| rename http_category as pan:threat-http_category

| fields action, dest_location, pan:threat-http_category]

Also Read: Latest Cyber Security News – Hacker News !

Explanation:

In the Set A query, the sourcetype used in Pan: traffic. In the Set B query, the sourcetype used in Pan: threat. In both the queries, a common field is dest_location. The left or outer type will check the 2 set queries and combine the results which is common. In the dest_location, the common location is “Ireland” in both the query sets. Hence the join: left or outer query has fetched the “Ireland” location details from Set B and combined them with the Set A results. So that we could see Set A results as well as the common location details which are in Set B.

Conclusion:

Join command is used to fetch data from other datatype or index or sourcetype and to combine with the existing query. In most of the Splunk rules, we need to join commands to produce the best results.

Previous articleSecurity Operations Center (SOC) Roles and Responsibilities
Next articleAzure Sentinel for IT Security and its SIEM Architecture
Anusthika Jeyashankar
Ambitious Blue Teamer; Enthused Security Analyst

LEAVE A REPLY

Please enter your comment!
Please enter your name here