Docker 和 PID 1 僵尸进程问题 已翻译 100%

oschina 投递于 2015/01/21 08:03 (共 21 段, 翻译完成于 01-28)
阅读 14314
收藏 74
2
加载中

When building Docker containers, you should be aware of the PID 1 zombie reaping problem. That problem can cause unexpected and obscure-looking issues when you least expect it. This article explains the PID 1 problem, explains how you can solve it, and presents a pre-built solution that you can use: Baseimage-docker.

When done, you may want to read part 2: Baseimage-docker, fat containers and “treating containers as VMs”.

Introduction

About a year ago — back in the Docker 0.6 days — we first introduced Baseimage-docker. This is a minimal Ubuntu base image that is modified for Docker-friendliness. Other people can pull Baseimage-docker from the Docker Registry and use it as a base image for their own images.

已有 1 人翻译此段
我来翻译

We were early adopters of Docker, using Docker for continuous integration and for building development environments way before Docker hit 1.0. We developed Baseimage-docker in order to solve some problems with the way Docker works. For example, Docker does not run processes under a special init process that properly reaps child processes, so that it is possible for the container to end up with zombie processes that cause all sorts of trouble. Docker also does not do anything with syslog so that it’s possible for important messages to get silently swallowed, etcetera.

However, we’ve found that a lot of people have problems understanding the problems that we’re solving. Granted, these are low-level Unix operating system-level mechanisms that few people know about or understand. So in this blog article we will describe the most important problem that we’re solving — the PID 1 problem zombie reaping problem — in detail.

Zombies

已有 1 人翻译此段
我来翻译

We figured that:

  1. The problems that we solved are applicable to a lot of people.

  2. Most people are not even aware of these problems, so things can break in unexpected ways (Murphey’s law).

  3. It’s inefficient if everybody has to solve these problems over and over.

So in our spare time we extracted our solution into a reusable base image that everyone can use: Baseimage-docker. This image also adds a bunch of useful tools that we believe most Docker image developers would need. We use Baseimage-docker as a base image for all our Docker images.

The community seemed to like what we did: we are the most popular third party image on the Docker Registry, only ranking below the official Ubuntu and CentOS images.

已有 1 人翻译此段
我来翻译

The PID 1 problem: reaping zombies

Recall that Unix processes are ordered in a tree. Each process can spawn child processes, and each process has a parent except for the top-most process.

This top-most process is the init process. It is started by the kernel when you boot your system. This init process is responsible for starting the rest of the system, such as starting the SSH daemon, starting the Docker daemon, starting Apache/Nginx, starting your GUI desktop environment, etc. Each of them may in turn spawn further child processes.

Unix process hierarchy

Nothing special so far. But consider what happens if a process terminates. Let’s say that the bash (PID 5) process terminates. It turns into a so-called “defunct process”, also known as a “zombie process”.

Zombie process

已有 1 人翻译此段
我来翻译

Why does this happen? It’s because Unix is designed in such a way that parent processes must explicitly “wait” for child process termination, in order to collect its exit status. The zombie process exists until the parent process has performed this action, using the waitpid() family of system calls. I quote from the man page:

“A child that terminates, but has not been waited for becomes a “zombie”. The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child.”

In every day language, people consider “zombie processes” to be simply runaway processes that cause havoc. But formally speaking — from a Unix operating system point of view — zombie processes have a very specific definition. They are processes that have terminated but have not (yet) been waited for by their parent processes.

已有 1 人翻译此段
我来翻译

Most of the time this is not a problem. The action of calling waitpid() on a child process in order to eliminate its zombie, is called “reaping”. Many applications reap their child processes correctly. In the above example with sshd, if bash terminates then the operating system will send a SIGCHLD signal to sshd to wake it up. Sshd notices this and reaps the child process.

Zombie process reaping

But there is a special case. Suppose the parent process terminates, either intentionally (because the program logic has determined that it should exit), or caused by a user action (e.g. the user killed the process). What happens then to its children? They no longer have a parent process, so they become “orphaned” (this is the actual technical term).

已有 1 人翻译此段
我来翻译

And this is where the init process kicks in. The init process — PID 1 — has a special task. Its task is to “adopt” orphaned child processes (again, this is the actual technical term). This means that the init process becomes the parent of such processes, even though those processes were never created directly by the init process.

Consider Nginx as an example, which daemonizes into the background by default. This works as follows. First, Nginx creates a child process. Second, the original Nginx process exits. Third, the Nginx child process is adopted by the init process.

Orphaned process adoption

You may see where I am going. The operating system kernel automatically handles adoption, so this means that the kernel expects the init process to have a special responsibility: the operating system expects the init process to reap adopted children too.

已有 1 人翻译此段
我来翻译

This is a very important responsibility in Unix systems. It is such a fundamental responsibility that many many pieces of software are written to make use of this. Pretty much all daemon software expect that daemonized child processes are adopted and reaped by init.

Although I used daemons as an example, this is in no way limited to just daemons. Every time a process exits even though it has child processes, it’s expecting the init process to perform the cleanup later on. This is described in detail in two very good books: Operating System Concepts by Silberschatz et al, and Advanced Programming in the UNIX Environment by Stevens et al.

Operating System Concepts by Silberschatz et al Avanced Programming in the Unix Environment by Stevens et al

已有 1 人翻译此段
我来翻译

Why zombie processes are harmful

Why are zombie processes a bad thing, even though they’re terminated processes? Surely the original application memory has already been freed, right? Is it anything more than just an entry that you see in ps?

You’re right, the original application memory has been freed. But the fact that you still see it in ps means that it’s still taking up some kernel resources. I quote the Linux waitpid man page:

“As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes.”

已有 1 人翻译此段
我来翻译

Relationship with Docker

So how does this relate to Docker? Well, we see that a lot of people run only one process in their container, and they think that when they run this single process, they’re done. But most likely, this process is not written to behave like a proper init process. That is, instead of properly reaping adopted processes, it’s probably expecting another init process to do that job, and rightly so.

Let’s look at a concrete example. Suppose that your container contains a web server that runs a CGI script that’s written in bash. The CGI script calls grep. Then the web server decides that the CGI script is taking too long and kills the script, but grep is not affected and keeps running. When grep finishes, it becomes a zombie and is adopted by the PID 1 (the web server). The web server doesn’t know about grep, so it doesn’t reap it, and the grep zombie stays in the system.

已有 1 人翻译此段
我来翻译
本文中的所有译文仅用于学习和交流目的,转载请务必注明文章译者、出处、和本文链接。
我们的翻译工作遵照 CC 协议,如果我们的工作有侵犯到您的权益,请及时联系我们。
加载中

评论(7)

eechen
eechen
Linux Process Status:
R (task_running) : 可执行状态
S (task_interruptible): 可中断的睡眠状态
D (task_uninterruptible): 不可中断的睡眠状态
T (task_stopped or task_traced): 暂停状态或跟踪状态
Z (task_dead - exit_zombie): 退出状态,进程成为僵尸进程
X (task_dead - exit_dead): 退出状态,进程即将被销毁

http://man7.org/linux/man-pages/man1/top.1.html
running表示正在运行的进程,sleeping表示睡眠的进程,stopped表示暂停的进程,zombie表示已结束但还没有从进程表中删除的僵尸进程.
wwq1001
wwq1001
存在即合理吧
LinkerLin
LinkerLin
我一直在怀疑自建云的必要性,何不使用公有云,成本低太多了。
LinkerLin
LinkerLin
我一直在怀疑自建云的必要性,何不使用公有云,成本低太多了。
ringz
ringz
"Phusion 乘客认为这个进程依然存活" .....
kerriganA
kerriganA
好吧,其实这一个问题我也遇到过。。。
动弹
动弹
介绍一段里面的 `从Docker 登记下载`, 建议保留为 `从Docker Registry拉取基础镜像`
返回顶部
顶部