加载中

1. Abstract

(This is a modified version of the keynote talk given by Rob Pike at the SPLASH 2012 conference in Tucson, Arizona, on October 25, 2012.)

The Go programming language was conceived in late 2007 as an answer to some of the problems we were seeing developing software infrastructure at Google. The computing landscape today is almost unrelated to the environment in which the languages being used, mostly C++, Java, and Python, had been created. The problems introduced by multicore processors, networked systems, massive computation clusters, and the web programming model were being worked around rather than addressed head-on. Moreover, the scale has changed: today's server programs comprise tens of millions of lines of code, are worked on by hundreds or even thousands of programmers, and are updated literally every day. To make matters worse, build times, even on large compilation clusters, have stretched to many minutes, even hours.

Go was designed and developed to make working in this environment more productive. Besides its better-known aspects such as built-in concurrency and garbage collection, Go's design considerations include rigorous dependency management, the adaptability of software architecture as systems grow, and robustness across the boundaries between components.

This article explains how these issues were addressed while building an efficient, compiled programming language that feels lightweight and pleasant. Examples and explanations will be taken from the real-world problems faced at Google.

1. 摘要

(本文是根据Rob Pike于2012年10月25日在Tucson, Arizona举行的SPLASH 2012大会上所做的主题演讲进行修改后所撰写的。)

针对我们在Google公司内开发软件基础设施时遇到的一些问题,我们于2007年末构思出Go编程语言。当今的计算领域同创建如今所使用的编程语言(使用最多的有C++、Java和Python)时的环境几乎没什么关系了。由多核处理器、系统的网络化、大规模计算机集群和Web编程模型带来的编程问题都是以迂回的方式而不是迎头而上的方式解决的。此外,程序的规模也已发生了变化:现在的服务器程序由成百上千甚至成千上万的程序员共同编写,源代码也以数百万行计,而且实际上还需要每天都进行更新。更加雪上加霜的是,即使在大型编译集群之上进行一次build,所花的时间也已长达数十分钟甚至数小时。

之所以设计开发Go,就是为了提高这种环境下的工作效率。Go语言设计时考虑的因素,除了大家较为了解的内置并发和内存垃圾自动回收这些方面之外,还包括严格的依赖管理、对随系统增大而在体系结构方面发生变化的适应性、跨组件边界的健壮性(robustness)。

本文将详细讲解在构造一门轻量级并让人感觉愉悦的、高效的编译型编程语言时,这些问题是如何得到解决的。讲解过程中使用的例子都是来自Google公司中所遇到的现实问题。

2. Introduction

Go is a compiled, concurrent, garbage-collected, statically typed language developed at Google. It is an open source project: Google imports the public repository rather than the other way around.

Go is efficient, scalable, and productive. Some programmers find it fun to work in; others find it unimaginative, even boring. In this article we will explain why those are not contradictory positions. Go was designed to address the problems faced in software development at Google, which led to a language that is not a breakthrough research language but is nonetheless an excellent tool for engineering large software projects.

2. 简介

Go语言开发自Google,是一门支持并发编程和内存垃圾回收的编译型静态类型语言。它是一个开源的项目:Google从公共的代码库中导入代码而不是相反。

Go语言运行效率高,具有较强的可伸缩性(scalable),而且使用它进行工作时的效率也很高。有些程序员发现用它编程很有意思;还有一些程序员认为它缺乏想象力甚至很烦人。在本文中我们将解释为什么这两种观点并不相互矛盾。Go是为解决Google在软件开发中遇到的问题而设计的,虽然因此而设计出的语言不会是一门在研究领域里具有突破性进展的语言,但它却是大型软件项目中软件工程方面的一个非常棒的工具。

3. Go at Google

Go is a programming language designed by Google to help solve Google's problems, and Google has big problems.

The hardware is big and the software is big. There are many millions of lines of software, with servers mostly in C++ and lots of Java and Python for the other pieces. Thousands of engineers work on the code, at the "head" of a single tree comprising all the software, so from day to day there are significant changes to all levels of the tree. A large custom-designed distributed build system makes development at this scale feasible, but it's still big.

And of course, all this software runs on zillions of machines, which are treated as a modest number of independent, networked compute clusters.

In short, development at Google is big, can be slow, and is often clumsy. But it is effective.

The goals of the Go project were to eliminate the slowness and clumsiness of software development at Google, and thereby to make the process more productive and scalable. The language was designed by and for people who write—and read and debug and maintain—large software systems.

Go's purpose is therefore not to do research into programming language design; it is to improve the working environment for its designers and their coworkers. Go is more about software engineering than programming language research. Or to rephrase, it is about language design in the service of software engineering.

But how can a language help software engineering? The rest of this article is an answer to that question.

3. Google公司中的Go语言

为了帮助解决Google自己的问题,Google设计了Go这门编程语言,可以说,Google有很大的问题。

硬件的规模很大而且软件的规模也很大。软件的代码行数以百万计,服务器软件绝大多数用的是C++,还有很多用的是Java,剩下的一部分还用到了Python。成千上万的工程师在这些代码上工作,这些代码位于由所有软件组成的一棵树上的“头部”,所以每天这棵树的各个层次都会发生大量的修改动作。尽管使用了一个大型自主设计的分布式Build系统才让这种规模的开发变得可行,但这个规模还是太大 了。

当然,所有这些软件都是运行在无数台机器之上的,但这些无数台的机器只是被看做数量并不多若干互相独立而仅通过网络互相连接的计算机集群。

简言之,Google公司的开发规模很大,速度可能会比较慢,看上去往往也比较笨拙。但很效果。

Go项目的目标是要消除Google公司软件开发中的慢速和笨拙,从而让开发过程更加高效并且更加具有可伸缩性。该语言的设计者和使用者都是要为大型软件系统编写、阅读和调试以及维护代码的人。

因此,Go语言的目的不是要在编程语言设计方面进行科研;它要能为它的设计者以及设计者的同事们改善工作环境。Go语言考虑更多的是软件工程而不是编程语言方面的科研。或者,换句话说,它是为软件工程服务而进行的语言设计。

但是,编程语言怎么会对软件工程有所帮助呢?下文就是该问题的答案。

4. Pain points

When Go launched, some claimed it was missing particular features or methodologies that were regarded as de rigueur for a modern language. How could Go be worthwhile in the absence of these facilities? Our answer to that is that the properties Go does have address the issues that make large-scale software development difficult. These issues include:

  • slow builds
  • uncontrolled dependencies
  • each programmer using a different subset of the language
  • poor program understanding (code hard to read, poorly documented, and so on)
  • duplication of effort
  • cost of updates
  • version skew
  • difficulty of writing automatic tools
  • cross-language builds

Individual features of a language don't address these issues. A larger view of software engineering is required, and in the design of Go we tried to focus on solutions to these problems.

As a simple, self-contained example, consider the representation of program structure. Some observers objected to Go's C-like block structure with braces, preferring the use of spaces for indentation, in the style of Python or Haskell. However, we have had extensive experience tracking down build and test failures caused by cross-language builds where a Python snippet embedded in another language, for instance through a SWIG invocation, is subtly and invisibly broken by a change in the indentation of the surrounding code. Our position is therefore that, although spaces for indentation is nice for small programs, it doesn't scale well, and the bigger and more heterogeneous the code base, the more trouble it can cause. It is better to forgo convenience for safety and dependability, so Go has brace-bounded blocks.

4. 痛之所在

当Go刚推出来时,有人认为它缺乏某些大家公认的现代编程语言中所特有的特性或方法论。缺了这些东西,Go语言怎么可能会有存在的价值?我们回答这个问题的答案在于,Go的确具有一些特性,而这些特性可以解决困扰大规模软件开发的一些问题。这些问题包括:

  • Build速度缓慢
  • 失控的依赖关系
  • 每个程序员使用同一门语言的不同子集
  • 程序难以理解(代码难以阅读,文档不全面等待)
  • 很多重复性的劳动
  • 更新的代价大
  • 版本偏斜(version skew)
  • 难以编写自动化工具
  • 语言交叉Build(cross-language build)产生的问题

一门语言每个单个的特性都解决不了这些问题。这需要从软件工程的大局观,而在Go语言的设计中我们试图致力于解决所有这些问题。

举个简单而独立的例子,我们来看看程序结果的表示方式。有些评论者反对Go中使用象C一样用花括号表示块结构,他们更喜欢Python或Haskell风格式,使用空格表示缩进。可是,我们无数次地碰到过以下这种由语言交叉Build造成的Build和测试失败:通过类似SWIG调用的方式,将一段Python代码嵌入到另外一种语言中,由于修改了这段代码周围的一些代码的缩进格式,从而导致Python代码也出乎意料地出问题了并且还非常难以觉察。 因此,我们的观点是,虽然空格缩进对于小规模的程序来说非常适用,但对大点的程序可不尽然,而且程序规模越大、代码库中的代码语言种类越多,空格缩进造成的问题就会越多。为了安全可靠,舍弃这点便利还是更好一点,因此Go采用了花括号表示的语句块。

5. Dependencies in C and C++

A more substantial illustration of scaling and other issues arises in the handling of package dependencies. We begin the discussion with a review of how they work in C and C++.

ANSI C, first standardized in 1989, promoted the idea of#ifndef"guards" in the standard header files. The idea, which is ubiquitous now, is that each header file be bracketed with a conditional compilation clause so that the file may be included multiple times without error. For instance, the Unix header file<sys/stat.h>looks schematically like this:

/* Large copyright and licensing notice */
#ifndef _SYS_STAT_H_
#define _SYS_STAT_H_
/* Types and other definitions */
#endif

The intent is that the C preprocessor reads in the file but disregards the contents on the second and subsequent readings of the file. The symbol_SYS_STAT_H_, defined the first time the file is read, "guards" the invocations that follow.

This design has some nice properties, most important that each header file can safely#includeall its dependencies, even if other header files will also include them. If that rule is followed, it permits orderly code that, for instance, sorts the#includeclauses alphabetically.

But it scales very badly.

5.C和C++中的依赖

在处理包依赖(package dependency)时会出现一些伸缩性以及其它方面的问题,这些问题可以更加实质性的说明上个小结中提出的问题。让我们先来回顾一下C和C++是如何处理包依赖的。

ANSI C第一次进行标准化是在1989年,它提倡要在标准的头文件中使用#ifndef这样的"防护措施"。 这个观点现已广泛采用,就是要求每个头文件都要用一个条件编译语句(clause)括起来,这样就可以将该头文件包含多次而不会导致编译错误。比如,Unix中的头文件<sys/stat.h>看上去大致是这样的:

/* Large copyright and licensing notice */
#ifndef _SYS_STAT_H_
#define _SYS_STAT_H_
/* Types and other definitions */
#endif

此举的目的是让C的预处理器在第二次以及以后读到该文件时要完全忽略该头文件。符号_SYS_STAT_H_在文件第一次读到时进行定义,可以“防止”后继的调用。

这么设计有一些好处,最重要的是可以让每个头文件能够安全地include它所有的依赖,即时其它的头文件也有同样的include语句也不会出问题。 如果遵循此规则,就可以通过对所有的#include语句按字母顺序进行排序,让代码看上去更整洁。

但是,这种设计的可伸缩性非常差。

In 1984, a compilation ofps.c, the source to the Unixpscommand, was observed to#include<sys/stat.h>37 times by the time all the preprocessing had been done. Even though the contents are discarded 36 times while doing so, most C implementations would open the file, read it, and scan it all 37 times. Without great cleverness, in fact, that behavior is required by the potentially complex macro semantics of the C preprocessor.

The effect on software is the gradual accumulation of#includeclauses in C programs. It won't break a program to add them, and it's very hard to know when they are no longer needed. Deleting a#includeand compiling the program again isn't even sufficient to test that, since another#includemight itself contain a#includethat pulls it in anyway.

Technically speaking, it does not have to be like that. Realizing the long-term problems with the use of#ifndefguards, the designers of the Plan 9 libraries took a different, non-ANSI-standard approach. In Plan 9, header files were forbidden from containing further#includeclauses; all#includeswere required to be in the top-level C file. This required some discipline, of course—the programmer was required to list the necessary dependencies exactly once, in the correct order—but documentation helped and in practice it worked very well. The result was that, no matter how many dependencies a C source file had, each#includefile was read exactly once when compiling that file. And, of course, it was also easy to see if an#includewas necessary by taking it out: the edited program would compile if and only if the dependency was unnecessary.

在1984年,有人发现在编译Unix中ps命令的源程序ps.c时,在整个的预处理过程中,它包含了<sys/stat.h>这个头文件37次之多。尽管在这么多次的包含中有36次它的文件的内容都不会被包含进来,但绝大多数C编译器实现都会把"打开文件并读取文件内容然后进行字符串扫描"这串动作做37遍。这么做可真不聪明,实际上,C语言的预处理器要处理的宏具有如此复杂的语义,其势必导致这种行为。

对软件产生的效果就是在C程序中不断的堆积#include语句。多加一些#include语句并不会导致程序出问题,而且想判断出其中哪些是再也不需要了的也很困难。删除一条#include语句然后再进行编译也不太足以判断出来,因为还可能有另外一条#include所包含的文件中本身还包含了你刚刚删除的那条#include语句。

从技术角度讲,事情并不一定非得弄成这样。在意识到使用#ifndef这种防护措施所带来的长期问题之后,Plan 9的library的设计者采取了一种不同的、非ANSI标准的方法。Plan 9禁止在头文件中使用#include语句,并要求将所有的#include语句放到顶层的C文件中。 当然,这么做需要一些训练 —— 程序员需要一次列出所有需要的依赖,还要以正确的顺序排列 —— 但是文档可以帮忙而且实践中效果也非常好。这么做的结果是,一个C源程序文件无论需要多少依赖,在对它进行编译时,每个#include文件只会被读一次。当然,这样一来,对于任何#include语句都可以通过先拿掉然后在进行编译的方式判断出这条#include语句到底有无include的必要:当且仅当不需要该依赖时,拿掉#include后的源程序才能仍然可以通过编译。

The most important result of the Plan 9 approach was much faster compilation: the amount of I/O the compilation requires can be dramatically less than when compiling a program using libraries with#ifndefguards.

Outside of Plan 9, though, the "guarded" approach is accepted practice for C and C++. In fact, C++ exacerbates the problem by using the same approach at finer granularity. By convention, C++ programs are usually structured with one header file per class, or perhaps small set of related classes, a grouping much smaller than, say,<stdio.h>. The dependency tree is therefore much more intricate, reflecting not library dependencies but the full type hierarchy. Moreover, C++ header files usually contain real code—type, method, and template declarations—not just the simple constants and function signatures typical of a C header file. Thus not only does C++ push more to the compiler, what it pushes is harder to compile, and each invocation of the compiler must reprocess this information. When building a large C++ binary, the compiler might be taught thousands of times how to represent a string by processing the header file<string>. (For the record, around 1984 Tom Cargill observed that the use of the C preprocessor for dependency management would be a long-term liability for C++ and should be addressed.)

Plan 9的这种方式产生的一个最重要的结果是编译速度比以前快了很多:采用这种方式后编译过程中所需的I/O量,同采用#ifndef的库相比,显著地减少了不少。

但在Plan 9之外,那种“防护”式的方式依然是C和C++编程实践中大家广为接受的方式。实际上,C++还恶化了该问题,因为它把这种防护措施使用到了更细的粒度之上。按照惯例,C++程序通常采用每个类或者一小组相关的类拥有一个头文件这种结构,这种分组方式要更小,比方说,同<stdio.h>相比要小。因而其依赖树更加错综复杂,它反映的不是对库的依赖而是对完整类型层次结构的依赖。而且,C++的头文件通常包含真正的代码 —— 类型、方法以及模板声明 ——不像一般的C语言头文件里面仅仅有一些简单的常量定义和函数签名。这样,C++就把更多的工作推给了编译器,这些东西编译起来要更难一些,而且每次编译时编译器都必须重复处理这些信息。当要build一个比较大型的C++二进制程序时,编译器可能需要成千上万次地处理头文件<string>以了解字符串的表示方式。(根据当时的记录,大约在1984年,Tom Cargill说道,在C++中使用C预处理器来处理依赖管理将是个长期的不利因素,这个问题应该得到解决。)

The construction of a single C++ binary at Google can open and read hundreds of individual header files tens of thousands of times. In 2007, build engineers at Google instrumented the compilation of a major Google binary. The file contained about two thousand files that, if simply concatenated together, totaled 4.2 megabytes. By the time the#includeshad been expanded, over 8 gigabytes were being delivered to the input of the compiler, a blow-up of 2000 bytes for every C++ source byte.

As another data point, in 2003 Google's build system was moved from a single Makefile to a per-directory design with better-managed, more explicit dependencies. A typical binary shrank about 40% in file size, just from having more accurate dependencies recorded. Even so, the properties of C++ (or C for that matter) make it impractical to verify those dependencies automatically, and today we still do not have an accurate understanding of the dependency requirements of large Google C++ binaries.

在Google,Build一个单个的C++二进制文件就能够数万次地打开并读取数百个头文件中的每个头文件。在2007年,Google的build工程师们编译了一次Google里一个比较主要的C++二进制程序。该文件包含了两千个文件,如果只是将这些文件串接到一起,总大型为4.2M。将#include完全扩展完成后,就有8G的内容丢给编译器编译,也就是说,C++源代码中的每个自己都膨胀成到了2000字节。 还有一个数据是,在2003年Google的Build系统转变了做法,在每个目录中安排了一个Makefile,这样可以让依赖更加清晰明了并且也能好的进行管理。一般的二进制文件大小都减小了40%,就因为记录了更准确的依赖关系。即使如此,C++(或者说C引起的这个问题)的特性使得自动对依赖关系进行验证无法得以实现,直到今天我们仍然我发准确掌握Google中大型的C++二进制程序的依赖要求的具体情况。

The consequence of these uncontrolled dependencies and massive scale is that it is impractical to build Google server binaries on a single computer, so a large distributed compilation system was created. With this system, involving many machines, much caching, and much complexity (the build system is a large program in its own right), builds at Google are practical, if still cumbersome.

Even with the distributed build system, a large Google build can still take many minutes. That 2007 binary took 45 minutes using a precursor distributed build system; today's version of the same program takes 27 minutes, but of course the program and its dependencies have grown in the interim. The engineering effort required to scale up the build system has barely been able to stay ahead of the growth of the software it is constructing.

由于这种失控的依赖关系以及程序的规模非常之大,所以在单个的计算机上build出Google的服务器二进制程序就变得不太实际了,因此我们创建了一个大型分布式编译系统。该系统非常复杂(这个Build系统本身也是个大型程序)还使用了大量机器以及大量缓存,藉此在Google进行Build才算行得通了,尽管还是有些困难。 即时采用了分布式Build系统,在Google进行一次大规模的build仍需要花几十分钟的时间才能完成。前文提到的2007年那个二进制程序使用上一版本的分布式build系统花了45分钟进行build。现在所花的时间是27分钟,但是,这个程序的长度以及它的依赖关系在此期间当然也增加了。为了按比例增大build系统而在工程方面所付出的劳动刚刚比软件创建的增长速度提前了一小步。

6. Enter Go

When builds are slow, there is time to think. The origin myth for Go states that it was during one of those 45 minute builds that Go was conceived. It was believed to be worth trying to design a new language suitable for writing large Google programs such as web servers, with software engineering considerations that would improve the quality of life of Google programmers.

Although the discussion so far has focused on dependencies, there are many other issues that need attention. The primary considerations for any language to succeed in this context are:

  • It must work at scale, for large programs with large numbers of dependencies, with large teams of programmers working on them.
  • It must be familiar, roughly C-like. Programmers working at Google are early in their careers and are most familiar with procedural languages, particularly from the C family. The need to get programmers productive quickly in a new language means that the language cannot be too radical.
  • It must be modern. C, C++, and to some extent Java are quite old, designed before the advent of multicore machines, networking, and web application development. There are features of the modern world that are better met by newer approaches, such as built-in concurrency.

With that background, then, let us look at the design of Go from a software engineering perspective.

6. 走进 Go 语言

当编译缓慢进行时,我们有充足的时间来思考。关于 Go 的起源有一个传说,话说正是一次长达45分钟的编译过程中,Go 的设想出现了。人们深信,为类似谷歌网络服务这样的大型程序编写一门新的语言是很有意义的,软件工程师们认为这将极大的改善谷歌程序员的生活质量。

尽管现在的讨论更专注于依赖关系,这里依然还有很多其他需要关注的问题。这一门成功语言的主要因素是:

  • 它必须适应于大规模开发,如拥有大量依赖的大型程序,且又一个很大的程序员团队为之工作。
  • 它必须是熟悉的,大致为 C 风格的。谷歌的程序员在职业生涯的早期,对函数式语言,特别是 C 家族更加熟稔。要想程序员用一门新语言快速开发,新语言的语法不能过于激进。
  • 它必须是现代的。C、C++以及Java的某些方面,已经过于老旧,设计于多核计算机、网络和网络应用出现之前。新方法能够满足现代世界的特性,例如内置的并发。

说完了背景,现在让我们从软件工程的角度谈一谈 Go 语言的设计。

返回顶部
顶部