CVE-2022-33891: Apache Spark Shell Command Injection – Detection & Response


Apache Spark published a security bulletin on July 18, 2022 that discloses a shell command injection vulnerability (CVE-2022-33891). The severity is rated Important. Security researcher Kostya Kortchinsky (Databricks) is credited with reporting the flaw.

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. Proof-of-concept exploit code for this vulnerability is available on GitHub.

Affected versions

Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1


We recommend that users upgrade promptly to Apache Spark 3.1.3, 3.2.2, or 3.3.0 or later, which fix CVE-2022-33891.
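A quick way to triage many installations is to compare each reported version string against the ranges listed above. The Python sketch below is only an illustration: the names AFFECTED_RANGES, parse_version, and is_affected are hypothetical, and it assumes the version is reported in plain x.y.z form. Versions outside the listed ranges are reported as "not listed as affected" rather than as safe.

```python
import re

# Ranges copied from the "Affected versions" list in this article.
# Each entry is (lowest affected, highest affected), inclusive.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),   # 3.0.3 and earlier
    ((3, 1, 1), (3, 1, 2)),   # 3.1.1 to 3.1.2
    ((3, 2, 0), (3, 2, 1)),   # 3.2.0 to 3.2.1
]


def parse_version(text):
    """Extract (major, minor, patch) from strings such as '3.1.2' or 'version 3.2.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """Return True when the version falls inside one of the listed ranges."""
    version = parse_version(version_text)
    return any(low <= version <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ["2.4.8", "3.0.3", "3.1.2", "3.1.3", "3.2.1", "3.2.2", "3.3.0"]:
        status = "affected" if is_affected(candidate) else "not listed as affected"
        print(candidate, status)
```

Tuples compare element by element in Python, so (3, 1, 2) <= (3, 1, 3) behaves like a numeric version comparison without any string tricks.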

Vulnerable component

http://localhost:8080/?doAs=`[command injection here]`
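To make the request pattern concrete, the following Python sketch pulls the doAs value out of a URL and flags it when it contains shell metacharacters. It is a hedged illustration rather than a complete filter: the function name suspicious_doas and the character set SHELL_METACHARACTERS are assumptions made for this example, and a legitimate user name should never need any of these characters.

```python
from urllib.parse import urlsplit, parse_qs

# Characters that could let a value break out of an `id -Gn <user>` shell command.
# Illustrative assumption, not an exhaustive list.
SHELL_METACHARACTERS = set("`$;|&<>(){}\n\\'\"")


def suspicious_doas(url):
    """Return the doAs values in `url` that contain shell metacharacters."""
    query = urlsplit(url).query
    # parse_qs also decodes percent-encoding, so %60 is seen as a backtick.
    values = parse_qs(query, keep_blank_values=True).get("doAs", [])
    return [v for v in values if any(ch in SHELL_METACHARACTERS for ch in v)]


if __name__ == "__main__":
    for url in [
        "http://localhost:8080/?doAs=alice",
        "http://localhost:8080/?doAs=`id`",
        "http://localhost:8080/?doAs=%60id%60",
    ]:
        print(url, "->", suspicious_doas(url))
```

Decoding before matching matters here because a literal search for a backtick in raw logs would miss percent-encoded payloads.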

How does it work?

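The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by handing the user name to a raw Linux shell command. The affected code path is reachable when ACLs are enabled for the Spark UI (spark.acls.enable).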

Commands supplied through the ?doAs parameter are executed, but nothing is reflected back on the page, so this is a blind OS command injection. The commands run, but the response gives no indication of whether they succeeded, or even whether the program being invoked exists on the target.

An OS command passed in the ?doAs URL parameter causes Spark to spawn a background bash process: the cmdSeq sequence runs bash with the command line id -Gn followed by the supplied value. A bash process running id -Gn is therefore a good indicator that the server is vulnerable or has already been compromised.

If the attacker sends a reverse shell command, there is also a high chance of giving the attacker's machine access to the Apache Spark server.

Vulnerable source code:

private def getUnixGroups(username: String): Set[String] = {
  // Vulnerable: the user name is concatenated into a string that bash interprets
  val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
  // we need to get rid of the trailing "\n" from the result of command execution
  Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
}

The fix drops the bash invocation and calls id directly, passing the user name as a single argument:

Utils.executeAndGetOutput(idPath :: "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
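The difference between the two command constructions in the Spark source can be sketched outside Spark. The Python snippet below is an analogue written for illustration, not Spark code: it only builds the argument lists and never executes them. The function names are hypothetical, and id_path="id" is a placeholder because the snippet in this article does not show how idPath is resolved.

```python
def vulnerable_argv(username):
    # Mirrors Seq("bash", "-c", "id -Gn " + username): bash parses the whole
    # string, so backticks, $(...) or ; inside username become shell syntax.
    return ["bash", "-c", "id -Gn " + username]


def fixed_argv(username, id_path="id"):
    # Mirrors idPath :: "-Gn" :: username :: Nil: no shell is involved, so the
    # user name reaches `id` as one literal argument.
    return [id_path, "-Gn", username]


if __name__ == "__main__":
    payload = "`whoami`"  # hypothetical attacker-supplied doAs value
    print(vulnerable_argv(payload))  # third element is shell code bash will expand
    print(fixed_argv(payload))       # payload stays an inert argument to id
```

In the first form the payload becomes part of the script text that bash evaluates. In the second, id would simply fail to find a user with that odd name.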

Detection & Response:


Splunk:

index=* c-uri="*?doAs=`*"
index=* (Image="*\\bash" AND (CommandLine="*id -Gn*"))


Qradar:

SELECT UTF8(payload) from events where LOGSOURCENAME(logsourceid) ilike '%Linux%' and "Image" ilike '%\bash' and ("Process CommandLine" ilike '%id -Gn%')
SELECT UTF8(payload) from events where "URL" ilike '%?doAs=`%'

Elastic Query:

(process.executable:*\\bash AND process.command_line:*id\ \-Gn*)

Carbon Black:

(process_name:*\\bash AND process_cmdline:*id\ \-Gn*)


(process:`*\bash` args:`id -Gn`)


(Image.keyword:*\\bash AND CommandLine.keyword:*id\ \-Gn*)

RSA Netwitness:

( contains '?doAs=`')
((Image contains 'bash') && (CommandLine contains 'id -Gn'))


(Image="*\\bash" CommandLine IN "*id -Gn*")
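Where none of these platforms is available, the same two indicators can be checked with a short script. The Python sketch below is a minimal illustration under stated assumptions: events are dictionaries using the Image and CommandLine field names that appear in the queries above, and the helper names are hypothetical. Because Linux executable paths use forward slashes, the sketch compares only the final path component and accepts either separator.

```python
import re


def is_bash(image):
    """True when the final path component of the executable is 'bash'."""
    return re.split(r"[\\/]", image)[-1] == "bash"


def matches_process_indicator(event):
    """Process indicator: bash running a command line that contains 'id -Gn'."""
    return is_bash(event.get("Image", "")) and "id -Gn" in event.get("CommandLine", "")


def matches_uri_indicator(uri):
    """Web indicator: a doAs parameter whose value starts with a backtick."""
    return "?doAs=`" in uri


if __name__ == "__main__":
    sample_events = [  # hypothetical events for demonstration only
        {"Image": "/usr/bin/bash", "CommandLine": "bash -c id -Gn `whoami`"},
        {"Image": "/usr/bin/id", "CommandLine": "id -Gn alice"},
    ]
    for event in sample_events:
        print(event["Image"], matches_process_indicator(event))
    print(matches_uri_indicator("/?doAs=`whoami`"))
```

The URI check is a literal match, so percent-encoded payloads would need to be decoded before it is applied.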


Balaganesh is an Incident Responder, Certified Ethical Hacker, Penetration Tester, Security blogger, and Founder & Author of Soc Investigation.

