Intezer researchers Nicole Fishbein and Ryan Robinson have reported how they discovered misconfigured Apache Airflow servers exposed on the internet, leaking sensitive information, including data from accounts run by major tech companies.
Apache Airflow is an open-source workflow management platform for automating business and IT tasks that is used by many companies across the world.
Intezer’s researchers said: “These unsecured instances expose sensitive information of companies across the media, finance, manufacturing, information technology (IT), biotech, e-commerce, health, energy, cybersecurity, and transportation industries.”
The researchers advised that hardcoded passwords should be avoided and that long names should be used for images and dependencies. Even if you believe the application is securely walled off from the internet, they warned, you will not be safe if you follow poor development practices.
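As a minimal sketch of the first recommendation, the snippet below reads a password from the environment instead of embedding it in workflow code. The helper `get_secret` and the variable name `REPORTING_DB_PASSWORD` are hypothetical and are not taken from the Intezer report; a dedicated secrets manager would serve the same purpose.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from the environment, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Required secret {name!r} is not set in the environment")
    return value


# Hypothetical usage: the value is injected by the deployment environment
# and never committed to the repository alongside the workflow code.
# db_password = get_secret("REPORTING_DB_PASSWORD")
```

Failing immediately when the secret is absent avoids the temptation to fall back to a default password in code.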
According to Intezer, the great majority of these problems were discovered in servers running Airflow v1.x from 2015, which are still in use by many businesses.
In Airflow version 2, many security features were added, including a REST API that requires authentication for all operations. Moreover, the current version does not log sensitive information and requires the administrator to explicitly verify the configuration rather than rely on defaults.
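To illustrate what an authenticated call to the version 2 REST API might look like, here is a rough sketch that builds (but does not send) a request to the `/api/v1/dags` endpoint using HTTP Basic authentication. It assumes the deployment has a basic-auth API backend enabled; the host name, user name, and password are placeholders, and `build_dag_list_request` is a hypothetical helper.

```python
import base64
import urllib.request


def build_dag_list_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the DAG list with HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/dags",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Placeholder values for illustration only; real credentials should come
# from the environment or a secrets manager, not from source code.
request = build_dag_list_request("https://airflow.example.internal", "api_user", "example-password")
# Sending it would be: urllib.request.urlopen(request)
```

A request without valid credentials should be rejected by a correctly configured version 2 server, which is the behavior the exposed v1.x instances lacked.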
Customer records and sensitive data could be exposed as a result of security weaknesses caused by delayed patching, which could violate data protection rules such as the GDPR.
“There is also the possibility that Airflow plugins or features can be abused to run malicious code. An example of how an attacker can abuse a native ‘Variables’ feature in Airflow is if any code or images placed in the variables form is used to build evaluated code strings,” the analysis continues. “Variables are able to be edited by any visiting user which means that malicious code could be injected. One entity we observed was using variables to store internal container image names to execute. These container image variables could be edited and swapped out with an image containing and running unauthorized or malicious code.”
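One possible mitigation for the image-swapping scenario is to validate any image name read from an editable variable before running it. The sketch below is an assumption about how that could be done, not a technique described in the report: it accepts only references from a trusted registry that are pinned by digest. The registry prefix and the helper `validate_image` are hypothetical.

```python
import re

# Hypothetical policy: images must come from this registry and be pinned by digest.
TRUSTED_PREFIX = "registry.example.internal/"
DIGEST_PINNED = re.compile(r"^[a-z0-9./_-]+@sha256:[0-9a-f]{64}$")


def validate_image(image_ref: str) -> str:
    """Return image_ref if it is from the trusted registry and digest-pinned; otherwise raise."""
    if not image_ref.startswith(TRUSTED_PREFIX):
        raise ValueError(f"Untrusted registry in image reference: {image_ref!r}")
    if not DIGEST_PINNED.match(image_ref):
        raise ValueError(f"Image reference is not pinned by digest: {image_ref!r}")
    return image_ref
```

With a check like this, an attacker who edits the variable to point at a different registry or a mutable tag would cause the task to fail rather than run the substituted image.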
Airflow uses standard Python to create and schedule workflows, providing users with a dynamic and convenient way to work with the platform. There are several concepts and features in Airflow that make it flexible and popular among users:
- Directed Acyclic Graph (DAG) – the primary concept in Airflow that represents a collection of tasks with defined dependencies and relationships.
- Task – the basic unit of execution in Airflow, with each node in the DAG represented by a task.
- Variables – a way to store content and settings in a key-value store, including passwords and API keys that are stored as masked strings. Airflow also supports variable encryption using Fernet.
- Connections – A feature that stores parameters (username, password, host) needed to connect to external systems.
- Logs – Airflow supports logging mechanisms as well as the ability to emit metrics.
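The DAG and task concepts above can be illustrated without Airflow itself. The following sketch uses Python's standard `graphlib` module (Python 3.9+) to order a set of tasks by their dependencies; it is a conceptual illustration only, not Airflow's API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks it depends on.
# Task names are hypothetical.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

# A valid execution order runs every task after the tasks it depends on.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['extract', 'transform', 'load', 'notify']
```

Because the graph must be acyclic, adding a dependency from `extract` back to `notify` would make `static_order()` raise `graphlib.CycleError`.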
The security firm warns: “Disruption of clients’ activities as a result of poor cybersecurity procedures can lead to legal action such as class action lawsuits.”