Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data. This charm manages a dedicated client node as a place to run Hadoop-related jobs. However, its main purpose is to serve as a base layer for other client charms, such as Spark or Zeppelin.
This is the base layer for charms that wish to connect to a core Hadoop cluster.
Including this layer provides and manages the relation to `hadoop-plugin`. All your reactive charm needs to do is respond to one or more of the states listed below.
The plugin charm provides the appropriate Hadoop libraries for the cluster, and sets up the standard Hadoop config files in `/etc/hadoop/conf`.
To create a charm layer using this base layer, you need only include it in a `layer.yaml` file:
```yaml
includes: ['layer:hadoop-client']
```
This will fetch this layer from interfaces.juju.solutions and incorporate it into your charm layer. You can then add handlers under the `reactive/` directory. Note that any file under `reactive/` will be expected to contain handlers, whether as Python decorated functions or executables using the external handler protocol.
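For example, a minimal handler file might look like the following (the file name `reactive/my_charm.py` and the `myservice.installed` state are illustrative, not part of this layer):

```python
# reactive/my_charm.py -- any .py file under reactive/ is scanned for handlers.
from charms.reactive import when_not, set_state


@when_not('myservice.installed')
def install_myservice():
    # Install your payload here, then record that the work is done
    # so this handler does not run again.
    set_state('myservice.installed')
```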
This layer, via the `hadoop-plugin` interface, will set the following states:
**`hadoop.hdfs.ready`**

The Hadoop cluster has indicated that HDFS is ready to store data. Handlers reacting to this state will be passed an instance of the `hadoop-plugin` class, and can use the following methods to access information about HDFS (see the example below):

* `hadoop.namenodes()`
* `hadoop.hdfs_port()`
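For instance, a handler reacting only to HDFS readiness might look like this (a sketch; `write_hdfs_config()` is a hypothetical helper in your charm, not part of this layer):

```python
from charms.reactive import when


@when('hadoop.hdfs.ready')
def configure_hdfs(hadoop):
    # 'hadoop' is the plugin instance passed to the handler.
    # write_hdfs_config() is a hypothetical helper that renders your
    # service's configuration with the HDFS connection details.
    write_hdfs_config(namenodes=hadoop.namenodes(),
                      port=hadoop.hdfs_port())
```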
**`hadoop.yarn.ready`**

The Hadoop cluster has indicated that Yarn is ready to process data. Handlers reacting to this state will be passed an instance of the `hadoop-plugin` class, and can use the following methods to access information about Yarn (see the example below):

* `hadoop.resourcemanagers()`
* `hadoop.yarn_port()`
* `hadoop.yarn_hs_ipc_port()`
* `hadoop.yarn_hs_http_port()`
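A Yarn-only handler follows the same pattern (again a sketch; `update_yarn_config()` is a hypothetical helper):

```python
from charms.reactive import when


@when('hadoop.yarn.ready')
def configure_yarn(hadoop):
    # update_yarn_config() is a hypothetical helper; the accessors on
    # 'hadoop' are the methods listed above.
    update_yarn_config(resourcemanagers=hadoop.resourcemanagers(),
                       port=hadoop.yarn_port(),
                       hs_ipc_port=hadoop.yarn_hs_ipc_port(),
                       hs_http_port=hadoop.yarn_hs_http_port())
```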
**`hadoop.ready`**

The Hadoop cluster has indicated that both HDFS and Yarn are ready. This is a combination of the previous two states, and the instance it provides supports any of the previously mentioned methods.
An example using these states would be:
```python
@when('hadoop.ready')
def configure_service(hadoop):
    update_config(hadoop.namenodes(), hadoop.hdfs_port(),
                  hadoop.resourcemanagers(), hadoop.yarn_port())
    restart_service()
```
This layer supports the following options, which can be set in `layer.yaml`:

* `packages`: A list of system packages to be installed when Hadoop is being installed.
* `groups`: A list of system groups to be created when Hadoop is being configured.
* `users`: A list of system users to be created when Hadoop is being configured.
* `dirs`: A mapping of directories to be created when Hadoop is being configured. Each entry should contain the following keys, as shown in the example below:
  * `path`: The path of the directory to create
  * `perms`: The octal permissions for the directory
  * `owner`: The system user that should own the directory
  * `group`: The system group that should own the directory
An example `layer.yaml` using these options might be:

```yaml
includes: ['layer:hadoop-client']
options:
  hadoop-client:
    groups: [spark]
    users: [spark]
    dirs:
      spark_home:
        path: /var/lib/spark
        perms: 0755
        owner: spark
        group: spark
```