CVE-2022-33891 – Apache Spark Shell Command Injection – Detection & Response

July 22, 2022

On July 18, Apache Spark published a security bulletin disclosing a shell command injection vulnerability (CVE-2022-33891), rated Important in severity. Security researcher Kostya Kortchinsky (Databricks) is credited with reporting the flaw.

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. Proof-of-concept (PoC) exploit code for this vulnerability is available on GitHub.

Affected versions

Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1

Solution

We recommend that users upgrade Apache Spark to version 3.1.3, 3.2.2, 3.3.0, or later to fix CVE-2022-33891.
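To triage an inventory of clusters, the affected ranges listed above can be turned into a small check. The following is a minimal sketch, not an official tool: the function names `parse_version` and `is_affected` are hypothetical, and the ranges are copied literally from the list above (so 3.1.0, which that list does not mention, is reported as not affected).

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor.patch' of a Spark version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised Spark version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """Return True if the version is in a range this article lists as affected."""
    v = parse_version(version)
    return (
        v <= (3, 0, 3)                      # 3.0.3 and earlier
        or (3, 1, 1) <= v <= (3, 1, 2)      # 3.1.1 to 3.1.2
        or (3, 2, 0) <= v <= (3, 2, 1)      # 3.2.0 to 3.2.1
    )


if __name__ == "__main__":
    for candidate in ["2.4.8", "3.0.3", "3.1.2", "3.1.3", "3.2.1", "3.2.2", "3.3.0"]:
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

Tuples compare element by element, which keeps the range checks readable without a third-party version library.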

Vulnerable component

http://localhost:8080/?doAs=`[command injection here]`
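For reviewing captured request URLs, a small helper can pull out the doAs value and flag shell metacharacters. This is a sketch under stated assumptions: the names `doas_values` and `has_suspicious_doas` are hypothetical, and apart from the backtick shown in the URL above, the set of characters treated as suspicious is an illustrative choice rather than an exhaustive list.

```python
from typing import List
from urllib.parse import parse_qs, urlsplit

# The URL above uses backticks; the other characters are an assumption
# added for illustration, since a shell also treats them specially.
SHELL_METACHARACTERS = set("`$;|&<>\n")


def doas_values(url: str) -> List[str]:
    """Return the percent-decoded values of every doAs query parameter."""
    query = urlsplit(url).query
    return parse_qs(query, keep_blank_values=True).get("doAs", [])


def has_suspicious_doas(url: str) -> bool:
    """Return True if any doAs value contains a shell metacharacter."""
    return any(
        any(ch in SHELL_METACHARACTERS for ch in value)
        for value in doas_values(url)
    )


if __name__ == "__main__":
    for url in [
        "http://localhost:8080/?doAs=alice",
        "http://localhost:8080/?doAs=`id`",
        "http://localhost:8080/?doAs=%60id%60",
    ]:
        print(url, has_suspicious_doas(url))
```

Because `parse_qs` percent-decodes values, the check also covers requests where the backtick is sent as `%60`.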

How does it work?

The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by using a raw Linux command.

Commands injected through the ?doAs parameter are executed, but nothing is reflected back on the page, so this is a blind OS command injection. Your commands run, but the response gives no indication of whether they succeeded, or even whether the program you invoked exists on the target.

A value passed in the ?doAs URL parameter causes Spark to start a background Linux bash process: the cmdSeq sequence runs bash with the command line id -Gn followed by the supplied value. A bash process running id -Gn is therefore a good indicator that your server is vulnerable or has already been compromised.
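The process indicator described above can be expressed as a simple predicate over process-creation events. This is a minimal sketch: the function names are hypothetical, and it assumes your telemetry gives you the executable path and the full command line as strings.

```python
import posixpath
from typing import Optional


def is_group_lookup_via_bash(executable: str, command_line: str) -> bool:
    """Match a bash process whose command line contains 'id -Gn'."""
    return posixpath.basename(executable) == "bash" and "id -Gn" in command_line


def lookup_argument(command_line: str) -> Optional[str]:
    """Return the text following 'id -Gn', i.e. the value taken from doAs."""
    _, found, rest = command_line.partition("id -Gn")
    return rest.strip() if found else None


if __name__ == "__main__":
    event = ("/usr/bin/bash", "bash -c id -Gn `touch /tmp/x`")
    if is_group_lookup_via_bash(*event):
        print("indicator matched, argument:", lookup_argument(event[1]))
```

The predicate compares the executable's basename, so it matches regardless of the directory bash is installed in. A match with a plain username still suggests the server runs the vulnerable code path; a match whose argument contains backticks or other shell syntax deserves immediate investigation.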

If an attacker sends a reverse shell command, there is a high chance that the attacker gains access to the Apache Spark server.

private def getUnixGroups(username: String): Set[String] = {
  // Vulnerable: the username is concatenated into a string that bash interprets
  val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
  // we need to get rid of the trailing "\n" from the result of command execution
  Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
  // Replacement line from the linked pull request: id is called directly, without a shell
  // Utils.executeAndGetOutput(idPath :: "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
}
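The difference between the two ways of invoking id in the Scala snippet above can be illustrated without running anything. The sketch below only builds the argument vectors; the function names and the default `/usr/bin/id` path are assumptions made for illustration, not part of Spark.

```python
from typing import List


def vulnerable_argv(username: str) -> List[str]:
    """Mirror of the bash -c form: the username becomes part of a shell script."""
    return ["bash", "-c", "id -Gn " + username]


def fixed_argv(username: str, id_path: str = "/usr/bin/id") -> List[str]:
    """Mirror of the direct form: the username is one literal argument to id."""
    return [id_path, "-Gn", username]


if __name__ == "__main__":
    payload = "`touch /tmp/pwned`"
    # In the first vector the backticks sit inside the script text that bash
    # parses, so bash would perform command substitution on them.
    print(vulnerable_argv(payload))
    # In the second vector no shell is involved; id receives the backticks
    # as ordinary characters in a (nonexistent) user name.
    print(fixed_argv(payload))
```

Passing the value as its own argv element means no shell ever parses it, which is why dropping `bash -c` removes the injection without any input filtering.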

Vulnerable source code: https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

Detection & Response:

Splunk:

index=* c-uri="*?doAs=`*"

index=* (Image="*\\bash" AND (CommandLine="*id -Gn*"))

QRadar:

SELECT UTF8(payload) from events where LOGSOURCENAME(logsourceid) ilike '%Linux%' and "Image" ilike '%\bash' and ("Process CommandLine" ilike '%id -Gn%')

SELECT UTF8(payload) from events where "URL" ilike '%?doAs=`%'

Elastic Query:

url.original:*?doAs\=`*

(process.executable:*\\bash AND process.command_line:*id\ \-Gn*)

Carbon Black:

(process_name:*\\bash AND process_cmdline:*id\ \-Gn*)

FireEye:

(process:`*\bash` args:`id -Gn`)

Graylog:

(Image.keyword:*\\bash AND CommandLine.keyword:*id\ \-Gn*)

c-uri.keyword:*?doAs=`*

RSA NetWitness:

(web.page contains '?doAs=`')

((Image contains 'bash') && (CommandLine contains 'id -Gn'))

Logpoint:

(Image="*\\bash" CommandLine IN "*id -Gn*")

c-uri="*?doAs=`*"

Source/References:

https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

https://github.com/HuskyHacks/cve-2022-33891

https://github.com/W01fh4cker/cve-2022-33891/blob/main/cve_2022_33891_poc.py
