Airflow Chef Cookbook

Installs and configures the Airflow workflow management platform. More information about Airflow can be found here: https://github.com/airbnb/airflow

Supported Platforms

Ubuntu (Tested on 14.04, 16.04). CentOS (Tested on 7.2).

Limitations

The Airflow all and oracle packages are not supported, because the oracle package has dependencies that cannot be installed automatically. I will look into how to solve this and add support for those packages at a later stage.

Contributing

Please follow the instructions in the contributing doc.

Usage

  • Use the relevant cookbooks to install and configure Airflow.
  • Use environment variables in /etc/default/airflow (for Ubuntu) or /etc/sysconfig/airflow (for CentOS) to configure Airflow during the startup process. (More information about Airflow environment variables: Setting Configuration Options)
  • Make sure to run airflow initdb as part of your startup script.
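
As a rough sketch of the environment-variable approach above: Airflow reads configuration overrides from variables named AIRFLOW__{SECTION}__{KEY} (note the double underscores). The helper names and sample values below are hypothetical and only illustrate how lines for the env file could be produced; the cookbook's services recipe may build the file differently.

```ruby
# Build the environment variable name Airflow looks up for a given
# airflow.cfg section and key: AIRFLOW__{SECTION}__{KEY}.
def airflow_env_name(section, key)
  "AIRFLOW__#{section.to_s.upcase}__#{key.to_s.upcase}"
end

# Turn a { section => { key => value } } hash into KEY=VALUE lines
# suitable for /etc/default/airflow or /etc/sysconfig/airflow.
def airflow_env_lines(overrides)
  overrides.flat_map do |section, entries|
    entries.map { |key, value| "#{airflow_env_name(section, key)}=#{value}" }
  end
end

# Hypothetical sample values, not taken from the cookbook.
sample = {
  "core" => { "parallelism" => 48 },
  "webserver" => { "web_server_port" => 8080 }
}

puts airflow_env_lines(sample)
```

Each printed line can be placed in the env file as-is; values set this way take precedence over the matching airflow.cfg entries.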

Recipes

  • default - Executes other recipes.
  • directories - Creates required directories.
  • user - Creates OS user and group.
  • packages - Installs OS and pip packages.
  • config - Handles airflow.cfg
  • services - Creates services env file.
  • webserver - Configures service for webserver.
  • scheduler - Configures service for scheduler.
  • worker - Configures service for worker.
  • flower - Configures service for flower.
  • kerberos - Configures service for kerberos.
  • packages - Installs Airflow and supporting packages.

Attributes

User config
  • ["airflow"]["airflow_package"] - Airflow package name, defaults to 'apache-airflow'. Use 'airflow' for installing version 1.8.0 or lower.
  • ["airflow"]["version"] - The version of airflow to install, defaults to latest (nil).
  • ["airflow"]["user"] - The user Airflow is executed with and owner of all related folders.
  • ["airflow"]["group"] - Airflow user group.
  • ["airflow"]["user_uid"] - Airflow user uid
  • ["airflow"]["group_gid"] - Airflow group gid
  • ["airflow"]["user_home_directory"] - Airflow user home directory.
  • ["airflow"]["shell"] - Airflow user shell.
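
To illustrate how the package name and version attributes relate, here is a minimal sketch in plain Ruby. The pip_requirement helper is hypothetical and is not part of the cookbook; it only shows the assumed semantics that a nil version leaves the package unpinned (latest) while a set version pins it exactly.

```ruby
# Build a pip requirement string from a package name and an optional
# version. A nil version means "no pin", so pip resolves the latest release.
def pip_requirement(package, version = nil)
  version.nil? ? package : "#{package}==#{version}"
end

puts pip_requirement("apache-airflow")             # no version pin
puts pip_requirement("apache-airflow", "1.10.15")  # exact pin
```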
General config
  • ["airflow"]["directories_mode"] - The permissions with which the Airflow and user directories are created.
  • ["airflow"]["config_file_mode"] - The permissions with which airflow.cfg is created.
  • ["airflow"]["bin_path"] - Path to the bin folder, default is based on platform.
  • ["airflow"]["run_path"] - Base directory for PID files.
  • ["airflow"]["is_upstart"] - Whether upstart should be used for services, determined automatically.
  • ["airflow"]["init_system"] - The init system to use when configuring services. Only upstart and systemd are supported; the default is based on the ["airflow"]["is_upstart"] value.
  • ["airflow"]["env_path"] - The path to the services env file, determined automatically.
Python config
  • ["airflow"]["python_runtime"] - Python runtime as used by poise-python cookbook.
  • ["airflow"]["python_version"] - Python version to install as used by poise-python cookbook.
  • ["airflow"]["pip_version"] - Pip version to install (true - installs latest) as used by poise-python cookbook.
Package config
  • default['airflow']['packages'] - The Python packages to install for Airflow.
  • default['airflow']['dependencies'] - The dependencies of the packages listed in default['airflow']['packages']. These are OS packages, not Python packages.
airflow.cfg

This cookbook lets you configure any airflow.cfg parameter dynamically by using an attribute structure like the following (see the attributes file for airflow.cfg examples): ["airflow"]["config"]["CONFIG_SECTION"]["CONFIG_ENTRY"]
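
The mapping from the nested attribute structure to airflow.cfg sections can be sketched in plain Ruby as follows. This is an illustration only, not the cookbook's actual template logic: the airflow_config hash is a hypothetical stand-in for node["airflow"]["config"], the render_airflow_cfg helper is invented for this example, and the sample entries are assumptions.

```ruby
# Hypothetical stand-in for node["airflow"]["config"]; in a real Chef run
# these values would come from node attributes rather than a literal hash.
airflow_config = {
  "core" => {
    "parallelism" => 48,
    "load_examples" => false
  },
  "webserver" => {
    "web_server_port" => 8080
  }
}

# Render the nested hash as INI-style text: one [section] header per
# top-level key, followed by its "entry = value" lines.
def render_airflow_cfg(config)
  config.map do |section, entries|
    lines = entries.map { |entry, value| "#{entry} = #{value}" }
    (["[#{section}]"] + lines).join("\n")
  end.join("\n\n") + "\n"
end

puts render_airflow_cfg(airflow_config)
```

In an attributes file or role, the same entries would typically be written in the form default["airflow"]["config"]["core"]["parallelism"] = 48, with CONFIG_SECTION and CONFIG_ENTRY matching the section and key names used in airflow.cfg.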

License

Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Author

Sergey Bahchissaraitsev

Required Chef Vaults

The following entries must be set in the knife credentials vault and updated when the cookbook is added to a node via a role or run-list:

  1. pg-airflow : Must contain a password key/value for Postgres access
  2. ldap-bind or ldap-bind-dev : Must include username and password key/values that contain the LDAP bind system user credentials.

Required Chef Data Bags

The following data bags are used in this cookbook:

  • None

Airflow version upgrade 1.10.4 to 1.10.7

The following core pip packages should be reinstalled if a higher version is installed during deployment:

  • pip install SQLAlchemy==1.3.15
  • pip install werkzeug==0.16.1
  • pip install JPype1==0.6.3 # versions 0.7.2 and 0.7.4 broke OracleOperator and Hook

Airflow version upgrade 1.10.7 to 1.10.15

Rust is required to compile the cryptography package during installation. Install Rust on the OS by running the following commands as the airflow user on the host, and follow the on-screen instructions. This downloads the required scripts and binaries into ~/.cargo under the airflow home directory.

    $ sudo su - airflow
    $ cd
    $ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

The following core pip packages should be reinstalled if a higher version is installed during deployment:

  • pip uninstall SQLAlchemy
  • pip install SQLAlchemy==1.3.23

The following package should be installed for the deprecation checks required before upgrading to version 2:

  • pip install apache-airflow-upgrade-check

Airflow version 2

Removed: { name: 'websocket-client', version: '<0.55.0' }

2.0.2

  • [Ali] Increase parallelism level from 32 to 48

2.6.0

  • uninstall pendulum 2.1.2 due to bug () and install 2.1.0 manually
