
A week ago we announced and open sourced Cloudbreak, the first Docker-based Hadoop as a Service API. In this post we'd like to introduce the technical details and the building blocks of the architecture. Cloudbreak is built on the foundation of cloud providers' APIs, Apache Ambari, Docker containers, Serf and dnsmasq. It is a cloud-agnostic solution – as all the Hadoop services and components run inside Docker containers – and these containers are shipped across the different cloud providers.

Cloudbreak product documentation: http://sequenceiq.com/cloudbreak

Cloudbreak API documentation: http://docs.cloudbreak.apiary.io/


How it works

From the Docker containers' point of view we have two kinds of containers – based on their Ambari role – server and agent. There is one Docker container running the Ambari server, and there are many Docker containers running the Ambari agents. The Docker image used is always the same: sequenceiq/ambari – the Ambari role is decided based on the $AMBARI_ROLE variable.

For example, on Amazon EC2 this is how we start the containers:

docker run -d -p <LIST of ports> -e SERF_JOIN_IP=$SERF_JOIN_IP --dns 127.0.0.1 --name ${NODE_PREFIX}${INSTANCE_IDX} -h ${NODE_PREFIX}${INSTANCE_IDX}.${MYDOMAIN} --entrypoint /usr/local/serf/bin/start-serf-agent.sh  $IMAGE $AMBARI_ROLE
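
The actual entrypoint shipped in the sequenceiq/ambari image is start-serf-agent.sh; the following is only a hypothetical sketch of what such an entrypoint has to do, assuming – as the command above suggests – that the role arrives as the script's first argument:

#!/bin/bash
# Hypothetical sketch of an entrypoint such as start-serf-agent.sh -- NOT the
# actual script from the sequenceiq/ambari image, only an illustration of how
# one image can take on either Ambari role.

AMBARI_ROLE=$1                       # "server" or "agent", passed after $IMAGE

# dnsmasq answers DNS queries inside the container (hence --dns 127.0.0.1)
dnsmasq &

# join the Serf cluster through the first node if its address is known
JOIN_FLAG=""
[ -n "$SERF_JOIN_IP" ] && JOIN_FLAG="-join=$SERF_JOIN_IP"
/usr/local/serf/bin/serf agent -tag ambari-role="$AMBARI_ROLE" $JOIN_FLAG &

# start the Ambari process matching the role
if [ "$AMBARI_ROLE" = "server" ]; then
  ambari-server start
else
  ambari-agent start
fi

wait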


As we start up the instances and the Docker containers on the hosts, we'd like them to join each other and be able to communicate – though we don't know the IP addresses beforehand. This can be challenging in cloud environments – where your IP address and DNS name are dynamically allocated – and you don't want to collect this information before launching the Docker containers. For that we use Serf – and pass along the IP address SERF_JOIN_IP=$SERF_JOIN_IP of the first container. Using a gossip protocol, the Serf agents automatically discover each other, set the DNS names, and configure the routing between the nodes. Serf reconfigures the DNS server dnsmasq running inside the container, and keeps it up to date with the information about joining and leaving nodes. As you can see, at startup we always pass --dns 127.0.0.1 as the DNS server for the container to use.
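
To illustrate the mechanism – this is a hypothetical handler, not the one shipped in the image – a Serf member-join handler can append each new node to a hosts file that dnsmasq serves (assuming dnsmasq is configured to read it via its addn-hosts option) and then reload dnsmasq:

#!/bin/bash
# Hypothetical member-join handler (illustration only). For member-join events
# Serf writes one member per line on stdin: name, address, role and tags,
# separated by tabs.

HOSTS_FILE=/etc/serf-hosts   # assumed file referenced by dnsmasq's addn-hosts

while read -r name address role tags; do
  echo "$address $name" >> "$HOSTS_FILE"
done

# dnsmasq re-reads /etc/hosts and addn-hosts files on SIGHUP, so the new
# names resolve immediately on this node
pkill -HUP dnsmasq

Such a handler would be registered when the agent starts, for example with serf agent -event-handler "member-join=/usr/local/serf/handlers/member-join.sh" (the path is made up).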


As you can see, there is no cloud-specific code at the Docker container level; the same technology can be used on bare metal as well. Check our previous blog posts about a multi-node Hadoop cluster on any host.

Obviously there is some configuration on the host as well – for that, and to handle the early initialization of a cloud instance, we use CloudInit. We will write a blog post about this for every cloud provider we support.
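
As a rough illustration (not the actual Cloudbreak template – the values below are made up), cloud-init can be handed a plain shell script as user data, which runs once at first boot and is enough to prepare the host and launch the Ambari container shown above:

#!/bin/bash
# Hypothetical cloud-init user-data script (illustration only, Amazon Linux
# style). Cloudbreak would fill in the per-instance values; these are made up.

SERF_JOIN_IP=10.0.0.4
NODE_PREFIX=amb
INSTANCE_IDX=1
MYDOMAIN=example.com
IMAGE=sequenceiq/ambari
AMBARI_ROLE=agent

# install and start Docker on the host
yum install -y docker
service docker start

docker pull $IMAGE

# same run command as above (port mappings omitted for brevity)
docker run -d -e SERF_JOIN_IP=$SERF_JOIN_IP --dns 127.0.0.1 \
  --name ${NODE_PREFIX}${INSTANCE_IDX} -h ${NODE_PREFIX}${INSTANCE_IDX}.${MYDOMAIN} \
  --entrypoint /usr/local/serf/bin/start-serf-agent.sh $IMAGE $AMBARI_ROLE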

For additional information you can check our slides from the Hadoop Summit 2014.

Once Ambari is started, it will install the selected components based on the passed Hadoop blueprint – and start the desired services.


Used Technologies

Apache Ambari

The Apache Ambari project is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs.

Ambari enables System Administrators to:

  1. Provision a Hadoop Cluster

     • Ambari provides a step-by-step wizard for installing Hadoop services across any number of hosts.

     • Ambari handles configuration of Hadoop services for the cluster.

  2. Manage a Hadoop Cluster

     • Ambari provides central management for starting, stopping, and reconfiguring Hadoop services across the entire cluster.

  3. Monitor a Hadoop Cluster

     • Ambari provides a dashboard for monitoring health and status of the Hadoop cluster.

     • Ambari leverages Ganglia for metrics collection.

     • Ambari leverages Nagios for system alerting and will send emails when your attention is needed (e.g. a node goes down, remaining disk space is low, etc.).


Ambari enables the integration of Hadoop provisioning, management and monitoring capabilities into applications with the Ambari REST APIs. Ambari Blueprints are a declarative definition of a cluster. With a Blueprint, you can specify a Stack, the Component layout and the Configurations to materialize a Hadoop cluster instance (via a REST API) without having to use the Ambari Cluster Install Wizard.
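
For illustration – the endpoints are the standard Ambari Blueprints REST API, while the credentials, names and file names below are placeholders – provisioning a cluster from a blueprint takes two calls: registering the blueprint, then posting a cluster creation template that maps its host groups to real hosts:

# register a blueprint (my-blueprint.json describes the stack, host groups and components)
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @my-blueprint.json \
  http://ambari-server:8080/api/v1/blueprints/my-blueprint

# instantiate a cluster from it by mapping the host groups to actual hosts
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @my-cluster-template.json \
  http://ambari-server:8080/api/v1/clusters/my-cluster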


Docker

Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

The main features of Docker are:

  1. Lightweight, portable

  2. Build once, run anywhere

  3. VM-like isolation – without the overhead of a VM

     • In traditional virtualization, each virtualized application includes not only the application and the necessary binaries and libraries, but also an entire guest operating system.

     • A Docker Engine container comprises just the application and its dependencies. It runs as an isolated process in userspace on the host operating system, sharing the kernel with other containers.

  4. Containers are isolated

  5. It can be automated and scripted


Serf

Serf is a tool for cluster membership, failure detection, and orchestration that is decentralized, fault-tolerant and highly available. Serf runs on every major platform: Linux, Mac OS X, and Windows. It is extremely lightweight. Serf uses an efficient gossip protocol to solve three major problems:

  • Membership: Serf maintains cluster membership lists and is able to execute custom handler scripts when that membership changes. For example, Serf can maintain the list of Hadoop servers of a cluster and notify the members when nodes come online or go offline.

  • Failure detection and recovery: Serf automatically detects failed nodes within seconds, notifies the rest of the cluster, and executes handler scripts allowing you to handle these events. Serf will attempt to recover failed nodes by reconnecting to them periodically.

  • Custom event propagation: Serf can broadcast custom events and queries to the cluster. These can be used to trigger deploys, propagate configuration, etc. Events are simple fire-and-forget broadcasts, and Serf makes a best effort to deliver messages in the face of offline nodes or network partitions. Queries provide a simple realtime request/response mechanism (a short example follows this list).
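
As a small illustration of the last point (the event name and payload below are made up), a custom event or query can be fired from any member with the serf CLI:

# broadcast a fire-and-forget custom event to every member of the cluster
serf event deploy "hdfs-site updated"

# ask every member a question and collect the answers in realtime
serf query uptime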

