Apache Storm recently became a top-level project, marking a huge milestone for the project and for me personally. It's crazy to think that four years ago Storm was nothing more than an idea in my head, and now it's a thriving project with a large community used by a ton of companies. In this post I want to look back at how Storm got to this point and the lessons I learned along the way.
The topics I will cover through Storm's history naturally follow whatever key challenges I had to deal with at those points in time. The first 25% of this post is about how Storm was conceived and initially created, so the main topics covered there are the technical issues I had to figure out to enable the project to exist. The rest of the post is about releasing Storm and establishing it as a widely used project with active user and developer communities. The main topics discussed there are marketing, communication, and community development.
Any successful project requires two things:
It solves a useful problem
You are able to convince a significant number of people that your project is the best solution to their problem
What I think many developers fail to understand is that achieving that second condition is as hard and as interesting as building the project itself. I hope this becomes apparent as you read through Storm's history.
Storm originated out of my work at BackType. At BackType we built analytics products to help businesses understand their impact on social media both historically and in realtime. Before Storm, the realtime portions of our implementation were done using a standard queues and workers approach. For example, we would write the Twitter firehose to a set of queues, and then Python workers would read those tweets and process them. Oftentimes these workers would send messages through another set of queues to another set of workers for further processing.
We were very unsatisfied with this approach. It was brittle – we had to make sure the queues and workers all stayed up – and it was very cumbersome to build apps. Most of the logic we were writing had to do with where to send/receive messages, how to serialize/deserialize messages, and so on. The actual business logic was a small portion of the codebase. Plus, it didn't feel right – the logic for one application would be spread across many workers, all of which were deployed separately. It felt like all that logic should be self-contained in one application.
In December of 2010, I had my first big realization. That's when I came up with the idea of a "stream" as a distributed abstraction. Streams would be produced and processed in parallel, but they could be represented in a single program as a single abstraction. That led me to the idea of "spouts" and "bolts" – a spout produces brand new streams, and a bolt takes in streams as input and produces streams as output. The key insight was that spouts and bolts were inherently parallel, similar to how mappers and reducers are inherently parallel in Hadoop. Bolts would simply subscribe to whatever streams they need to process and indicate how the incoming stream should be partitioned to the bolt. Finally, the top-level abstraction I came up with was the "topology", a network of spouts and bolts.
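To make the abstractions concrete, here is a minimal sketch of what wiring up a topology looks like in Storm's Java API. The spout and bolt classes (TweetSpout, SplitIntoWordsBolt, WordCountBolt) are hypothetical placeholders, and the backtype.storm package names are from the project's pre-Apache era:

```java
import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class TweetWordCount {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // A spout produces brand new streams -- here, a stream of tweets.
        builder.setSpout("tweets", new TweetSpout(), 4); // 4 parallel tasks

        // A bolt subscribes to streams and indicates how the incoming
        // stream is partitioned: shuffleGrouping spreads tuples randomly.
        builder.setBolt("split", new SplitIntoWordsBolt(), 8)
               .shuffleGrouping("tweets");

        // fieldsGrouping sends tuples with the same "word" value to the
        // same task, so each task can keep local per-word counts.
        builder.setBolt("count", new WordCountBolt(), 8)
               .fieldsGrouping("split", new Fields("word"));

        // The topology -- the top-level abstraction -- is submitted whole.
        StormSubmitter.submitTopology("tweet-word-count", new Config(),
                                      builder.createTopology());
    }
}
```

The entire pipeline – parallelism, partitioning, and the network of streams – is declared in one self-contained program, which was exactly the point.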
I tested these abstractions against our use cases at BackType and everything fit together very nicely. I especially liked the fact that all the grunt work we were dealing with before – sending/receiving messages, serialization, deployment, etc. – would be automated by these new abstractions.
Before embarking on building Storm, I wanted to validate my ideas against a wider set of use cases. So I sent out this tweet:
I'm working on a new kind of stream processing system. If this sounds interesting to you, ping me. I want to learn your use cases.
— Nathan Marz (@nathanmarz), December 14, 2010
A bunch of people responded and we emailed back and forth with each other. It became clear that my abstractions were very, very sound.
I then embarked on designing Storm. I quickly hit a roadblock when trying to figure out how to pass messages between spouts and bolts. My initial thoughts were that I would mimic the queues and workers approach we were doing before and use a message broker like RabbitMQ to pass the intermediate messages. I actually spent a bunch of time diving into RabbitMQ to see how it could be used for this purpose and what that would imply operationally. However, the whole idea of using message brokers for intermediate messages didn't feel right and I decided to sit on Storm until I could better think things through.
The reason I thought I needed those intermediate brokers was to provide guarantees on the processing of data. If a bolt failed to process a message, it could replay it from whatever broker it got the message from. However, a lot of things bothered me about intermediate message brokers:
They were a huge, complex moving part that would have to be scaled alongside Storm.
They create uncomfortable situations, such as what to do when a topology is redeployed. There might still be intermediate messages on the brokers that are no longer compatible with the new version of the topology. So those messages would have to be cleaned up/ignored somehow.
They make fault-tolerance harder. I would have to figure out what to do not just when Storm workers went down, but also when individual brokers went down.
They're slow. Instead of sending messages directly between spouts and bolts, the messages go through a 3rd party, and not only that, the messages need to be persisted to disk.
I had an instinct that there should be a way to get that message processing guarantee without using intermediate message brokers. So I spent a lot of time pondering how to get that guarantee with spouts and bolts passing messages directly to one another. Without intermediate message persistence, it was implied that retries would have to come from the source (the spout). The tricky thing was that the failure of processing could happen anywhere downstream from the spout, on a completely different server, and this would have to be detected with perfect accuracy.
After a few weeks of thinking about the problem I finally had my flash of insight. I developed an algorithm based on random numbers and xors that would only require about 20 bytes to track each spout tuple, regardless of how much processing was triggered downstream. It's easily one of the best algorithms I ever developed and one of the few times in my career I can say I would not have come up with it without a good computer science education.
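To illustrate the idea (this is a simplified sketch of the XOR trick, not Storm's actual acker implementation): every tuple gets a random 64-bit id, and a single 64-bit "ack val" is kept per spout tuple. Ids are XORed in when tuples are emitted and XORed out when they are acked, and since x ^ x = 0, the ack val hits zero exactly when every tuple in the tree has been accounted for:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Simplified sketch of XOR-based tuple tracking. Class and method names
// are illustrative, not Storm's real acker code.
public class AckerSketch {
    private final Random rand = new Random();
    // One 64-bit "ack val" per spout tuple, no matter how big the tree gets.
    private final Map<Long, Long> ackVals = new HashMap<>();

    // Each emitted tuple is tagged with a random 64-bit id.
    public long newTupleId() {
        return rand.nextLong();
    }

    // The spout emitted a new root tuple: start tracking its tree.
    public void init(long rootId) {
        ackVals.put(rootId, rootId);
    }

    // A bolt finished processing inputId and emitted outputIds anchored to
    // the same root tuple. XOR the consumed id out and the new ids in;
    // returns true when the whole tuple tree has been fully processed.
    public boolean ack(long rootId, long inputId, long[] outputIds) {
        long v = ackVals.get(rootId) ^ inputId;
        for (long id : outputIds) {
            v ^= id;
        }
        ackVals.put(rootId, v);
        return v == 0; // since x ^ x == 0, zero means every emit was acked
    }
}
```

Because the ids are random 64-bit values, the odds of the ack val accidentally reaching zero before the tree completes are vanishingly small, and the bookkeeping per spout tuple stays constant-size no matter how much downstream processing is triggered.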
Once I figured out this algorithm, I knew I was onto something big. It massively simplified the design of the system by avoiding all the aforementioned problems, along with making things way more performant. (Amusingly, the day I figured out the algorithm I had a date with a girl I'd met recently. But I was so excited by what I'd just discovered that I was completely distracted the whole time. Needless to say, I did not do well with the girl.)
Over the next 5 months, I built the first version of Storm. From the beginning I knew I wanted to open source it, so I made some key decisions in the early days with that in mind. First off, I made all of Storm's APIs in Java, but implemented Storm in Clojure. By keeping Storm's APIs 100% Java, Storm was ensured to have a very large amount of potential users. By doing the implementation in Clojure, I was able to be a lot more productive and get the project working sooner.
I also planned from the beginning to make Storm usable from non-JVM languages. Topologies are defined as Thrift data structures, and topologies are submitted using a Thrift API. Additionally, I designed a protocol so that spouts and bolts could be implemented in any language. Making Storm accessible from other languages makes the project accessible by more people. It makes it much easier for people to migrate to Storm, as they don't necessarily have to rewrite their existing realtime processing in Java. Instead they can port their existing code to run on Storm's multi-language API.
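As a sketch of what this looks like from the Java side (along the lines of the later storm-starter examples; splitsentence.py is a hypothetical user-supplied script that speaks Storm's JSON-over-stdin/stdout multi-language protocol):

```java
import backtype.storm.task.ShellBolt;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;

import java.util.Map;

// The bolt's actual logic lives in a Python script; this Java shell just
// launches the subprocess and declares the output schema to the topology.
public class SplitSentence extends ShellBolt implements IRichBolt {
    public SplitSentence() {
        super("python", "splitsentence.py");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
```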
I was a long time Hadoop user and used my knowledge of Hadoop's design to make Storm's design better. For example, one of the most aggravating issues I dealt with from Hadoop was that in certain cases Hadoop workers would not shut down and the processes would just sit there doing nothing. Eventually these "zombie processes" would accumulate, soaking up resources and making the cluster inoperable. The core issue was that Hadoop put the burden of worker shutdown on the worker itself, and for a variety of reasons workers would sometimes fail to shut themselves down. So in Storm's design, I put the burden of worker shutdown on the same Storm daemon that started the worker in the first place. This turned out to be a lot more robust and Storm has never had issues with zombie processes.
Another problem I faced with Hadoop was if the JobTracker died for any reason, any running jobs would terminate. This was a real hair-puller when you had jobs that had been running for many days. With Storm, it was even more unacceptable to have a single point of failure like that since topologies are meant to run forever. So I designed Storm to be "process fault-tolerant": if a Storm daemon is killed and restarted it has absolutely no effect on running topologies. It makes the engineering more challenging, since you have to consider the effect of the process being kill -9'd and restarted at any point in the program, but it makes things far more robust.
A key decision I made early on in development was to assign one of our interns, Jason Jackson, to develop an automated deploy for Storm on AWS. This massively accelerated Storm's development, as it made it easy for me to test clusters of all different sizes and configurations. I really cannot emphasize enough how important this tool was, as it enabled me to iterate much, much faster.
In May of 2011, BackType got into acquisition talks with Twitter. The acquisition made a lot of sense for us for a variety of reasons. Additionally, it was attractive to me because I realized I could do so much more with Storm by releasing it from within Twitter than from within BackType. Being able to make use of the Twitter brand was very compelling.
During acquisition talks I announced Storm to the world by writing a post on BackType's tech blog. The purpose of the post was actually just to raise our valuation in the negotiations with Twitter. And it worked: Twitter became extremely interested in the technology, and when they did their tech due-diligence on us, the entire due-diligence turned into a big demo of Storm.
The post had some surprising other effects. In the post I casually referred to Storm as "the Hadoop of realtime", and this phrase really caught on. To this day people still use it, and it even gets butchered into "realtime Hadoop" by many people. This accidental branding was really powerful and helped with adoption later on.
We officially joined Twitter in July of 2011, and I immediately began planning Storm's release.
There are two ways you can go about releasing open source software. The first is to "go big", build a lot of hype for the project, and then get as much exposure as possible on release. This approach can be risky though, since if the quality isn't there or you mess up the messaging, you will alienate a huge number of people from the project on day one. That could kill any chance the project had to be successful.
The second approach is to quietly release the code and let the software slowly gain adoption. This avoids the risks of the first approach, but it has its own risk of people viewing the project as insignificant and ignoring it.