Installs and configures the Airflow workflow management platform. More information about Airflow can be found here: https://github.com/airbnb/airflow
Supported platforms: Ubuntu (tested on 14.04 and 16.04) and CentOS (tested on 7.2).
The Airflow all and oracle packages are not supported, because the Oracle package has dependencies that cannot be installed automatically. I will look into how to solve this and add support for those packages at a later stage.
Please follow instructions in the contributing doc.
- Use the relevant cookbooks to install and configure Airflow.
- Use environment variables in /etc/default/airflow (on Ubuntu) or /etc/sysconfig/airflow (on CentOS) to configure Airflow during the startup process. (More information about Airflow environment variables: Setting Configuration Options)
- Make sure to run airflow initdb as part of your startup script.
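As a rough illustration of the environment-variable approach, the sketch below builds lines for the services env file using Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention. The helper names (`airflow_env_name`, `render_env_file`) and the example values are assumptions made for illustration; they are not part of this cookbook.

```ruby
# Hypothetical helper: builds the environment variable name Airflow reads
# for a given airflow.cfg section and key (AIRFLOW__SECTION__KEY).
def airflow_env_name(section, key)
  "AIRFLOW__#{section.upcase}__#{key.upcase}"
end

# Hypothetical helper: renders KEY=value lines suitable for an env file
# such as /etc/default/airflow or /etc/sysconfig/airflow.
def render_env_file(settings)
  settings.flat_map do |section, entries|
    entries.map { |key, value| "#{airflow_env_name(section, key)}=#{value}" }
  end.join("\n") + "\n"
end

# Example values are placeholders, not cookbook defaults.
example = {
  'core' => { 'parallelism' => 48, 'load_examples' => 'False' },
  'webserver' => { 'web_server_port' => 8080 }
}

puts render_env_file(example)
```

Variables set this way take precedence over the matching entries in airflow.cfg, which makes them convenient for per-host overrides.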
- default - Executes other recipes.
- directories - Creates required directories.
- user - Creates OS user and group.
- packages - Installs OS and pip packages.
- config - Handles airflow.cfg.
- services - Creates services env file.
- webserver - Configures service for webserver.
- scheduler - Configures service for scheduler.
- worker - Configures service for worker.
- flower - Configures service for flower.
- kerberos - Configures service for kerberos.
- packages - Installs Airflow and supporting packages.
- ["airflow"]["airflow_package"] - Airflow package name, defaults to 'apache-airflow'. Use 'airflow' for installing version 1.8.0 or lower.
- ["airflow"]["version"] - The version of airflow to install, defaults to latest (nil).
- ["airflow"]["user"] - The user that runs Airflow and owns all related folders.
- ["airflow"]["group"] - Airflow user group.
- ["airflow"]["user_uid"] - Airflow user uid.
- ["airflow"]["group_gid"] - Airflow group gid.
- ["airflow"]["user_home_directory"] - Airflow user home directory.
- ["airflow"]["shell"] - Airflow user shell.
- ["airflow"]["directories_mode"] - The permissions with which the Airflow and user directories are created.
- ["airflow"]["config_file_mode"] - The permissions with which airflow.cfg is created.
- ["airflow"]["bin_path"] - Path to the bin folder, default is based on platform.
- ["airflow"]["run_path"] - Base directory for pid files.
- ["airflow"]["is_upstart"] - Whether upstart should be used for services, determined automatically.
- ["airflow"]["init_system"] - The init system to use when configuring services. Only upstart and systemd are supported; the default is based on the ["airflow"]["is_upstart"] value.
- ["airflow"]["env_path"] - The path to the services env file, determined automatically.
- ["airflow"]["python_runtime"] - Python runtime as used by the poise-python cookbook.
- ["airflow"]["python_version"] - Python version to install as used by the poise-python cookbook.
- ["airflow"]["pip_version"] - Pip version to install (true installs the latest) as used by the poise-python cookbook.
- default['airflow']['packages'] - The Python packages to install for Airflow.
- default['airflow']['dependencies'] - The dependencies of the packages listed in default['airflow']['packages']. These are OS packages, not Python packages.
This cookbook lets you configure any airflow.cfg parameter dynamically by using an attribute structure of the following form (see the attributes file for airflow.cfg examples): ["airflow"]["config"]["CONFIG_SECTION"]["CONFIG_ENTRY"]
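To make the section/entry nesting concrete, here is a minimal sketch that uses a plain Ruby hash in place of the node attributes and renders it in airflow.cfg's INI layout. The `render_airflow_cfg` helper and the sample values are assumptions for illustration only; the cookbook's own template may render the file differently.

```ruby
# Plain hash standing in for node["airflow"]["config"].
# In a wrapper cookbook the same entry would typically be set with:
#   default['airflow']['config']['core']['parallelism'] = 48
# Values below are placeholders, not cookbook defaults.
config = {
  'core' => { 'dags_folder' => '/usr/local/lib/airflow/dags', 'parallelism' => 48 },
  'webserver' => { 'web_server_port' => 8080 }
}

# Hypothetical helper: renders each CONFIG_SECTION as an INI section header
# followed by its CONFIG_ENTRY lines.
def render_airflow_cfg(config)
  config.map do |section, entries|
    lines = entries.map { |key, value| "#{key} = #{value}" }
    (["[#{section}]"] + lines).join("\n")
  end.join("\n\n") + "\n"
end

puts render_airflow_cfg(config)
```

Each first-level key becomes a section such as `[core]`, and each nested key becomes one `key = value` line inside it.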
Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
The following entries must be set in the knife credentials vault and updated when the cookbook is added to a node via a role or run list:
- pg-airflow : Must contain a password key/value for Postgres access.
- ldap-bind or ldap-bind-dev : Must include username and password key/values that contain the LDAP bind system user credentials.
The following data bags are used in this cookbook:
- None
The following core pip packages should be reinstalled if a higher version gets installed during deployment:
- pip install SQLAlchemy==1.3.15
- pip install werkzeug==0.16.1
- pip install JPype1==0.6.3 # versions 0.7.2 and 0.7.4 broke OracleOperator and Hook
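The pins above could also be kept as data and turned into a single pip command. The sketch below writes them in the `{ name:, version: }` shape that appears elsewhere in this README; whether the cookbook's package attributes accept exactly this shape for these packages is an assumption, and `pip_requirement` is a hypothetical helper, not a cookbook function.

```ruby
# Pins taken from the list above. The entry shape is an assumption based on
# the { name:, version: } form used elsewhere in this README.
PINNED = [
  { name: 'SQLAlchemy', version: '1.3.15' },
  { name: 'werkzeug', version: '0.16.1' },
  { name: 'JPype1', version: '0.6.3' }
].freeze

# Hypothetical helper: turns an entry into a pip requirement specifier.
# A bare version is treated as an exact pin; a version that already starts
# with a comparison operator (for example '<0.55.0') is passed through.
def pip_requirement(entry)
  version = entry[:version]
  return entry[:name] if version.nil?

  version.match?(/\A[<>=!~]/) ? "#{entry[:name]}#{version}" : "#{entry[:name]}==#{version}"
end

puts "pip install #{PINNED.map { |entry| pip_requirement(entry) }.join(' ')}"
```

Keeping the pins in one list makes it easier to re-apply them after a deployment pulls in a newer version.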
Rust is required to compile the cryptography package during installation. Install Rust on the OS by running the following commands as the airflow user on the host, and follow the on-screen instructions. This downloads the required scripts and binaries into ~/.cargo in the airflow home directory.
$ sudo su - airflow
$ cd
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
The following core pip packages should be reinstalled if a higher version gets installed during deployment:
- pip uninstall SQLAlchemy
- pip install SQLAlchemy==1.3.23
The following package should be installed for the version 2 upgrade deprecation checks:
- pip install apache-airflow-upgrade-check
Removed: { name: 'websocket-client', version: '<0.55.0' }
2.0.2
- [Ali] increase parallelism level from 32 to 48
2.6.0
- uninstall pendulum 2.1.2 due to bug () and install 2.1.0 manually