Towards the end of last year I attended a workshop with my colleagues in ThoughtWorks to discuss the nature of “event-driven” applications. Over the last few years we've been building lots of systems that make a lot of use of events, and they've been often praised, and often damned. Our North American office organized a summit, and ThoughtWorks senior developers from all over the world showed up to share ideas.

The biggest outcome of the summit was recognizing that when people talk about “events”, they actually mean some quite different things. So we spent a lot of time trying to tease out what some useful patterns might be. This note is a brief summary of the main ones we identified.

Event Notification

This happens when a system sends event messages to notify other systems of a change in its domain. A key element of event notification is that the source system doesn't really care much about the response. Often it doesn't expect any answer at all, or if there is a response that the source does care about, it's indirect. There would be a marked separation between the logic flow that sends the event and any logic flow that reacts to that event.

Event notification is nice because it implies a low level of coupling, and is pretty simple to set up. It can become problematic, however, if there really is a logical flow that runs over various event notifications. The problem is that it can be hard to see such a flow as it's not explicit in any program text. Often the only way to figure out this flow is from monitoring a live system. This can make it hard to debug and modify such a flow. The danger is that it's very easy to make nicely decoupled systems with event notification, without realizing that you're losing sight of that larger-scale flow, and thus set yourself up for trouble in future years. The pattern is still very useful, but you have to be careful of the trap.

A simple example of this trap is when an event is used as a passive-aggressive command. This happens when the source system expects the recipient to carry out an action, and ought to use a command message to show that intention, but styles the message as an event instead.

An event need not carry much data on it, often just some id information and a link back to the sender that can be queried for more information. The receiver knows something has changed, may get some minimal information on the nature of the change, but then issues a request back to the sender to decide what to do next.
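
A minimal sketch of that shape in Python may help. Every name here (`CustomerChanged`, `CustomerSystem`, `ShippingSystem`) is a hypothetical stand-in, and the in-process subscriber list is only an assumption standing in for whatever messaging the real systems would use. The event carries just an id and a link back to the sender, and the receiver calls back for details.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CustomerChanged:
    """Notification event: an id plus a link back to the sender, nothing more."""
    customer_id: str
    source: "CustomerSystem"


class CustomerSystem:
    """Hypothetical source system that announces changes in its domain."""

    def __init__(self) -> None:
        self._customers: Dict[str, Dict[str, str]] = {}
        self._subscribers: List[Callable[[CustomerChanged], None]] = []

    def subscribe(self, handler: Callable[[CustomerChanged], None]) -> None:
        self._subscribers.append(handler)

    def change_address(self, customer_id: str, address: str) -> None:
        self._customers[customer_id] = {"address": address}
        event = CustomerChanged(customer_id, self)
        for handler in self._subscribers:
            handler(event)  # the source ignores whatever the receiver does

    def get_customer(self, customer_id: str) -> Dict[str, str]:
        return dict(self._customers[customer_id])


class ShippingSystem:
    """Hypothetical receiver: learns that something changed, then asks for more."""

    def __init__(self) -> None:
        self.last_seen: Dict[str, str] = {}

    def on_customer_changed(self, event: CustomerChanged) -> None:
        details = event.source.get_customer(event.customer_id)
        self.last_seen[event.customer_id] = details["address"]
```

Note that nothing in `CustomerSystem` says what shipping will do with the notification, which is exactly the low coupling, and the hidden flow, described above.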

Event-Carried State Transfer

This pattern shows up when you want to update clients of a system in such a way that they don't need to contact the source system in order to do further work. A customer management system might fire off events whenever a customer changes their details (such as an address) with events that contain details of the data that changed. A recipient can then update its own copy of customer data with the changes, so that it never needs to talk to the main customer system in order to do its work in the future.

An obvious down-side of this pattern is that there's lots of data schlepped around and lots of copies. But that's less of a problem in an age of abundant storage. What we gain is greater resilience, since the recipient systems can function if the customer system becomes unavailable. We reduce latency, as there's no remote call required to access customer information. We don't have to worry about load on the customer system to satisfy queries from all the consumer systems. But it does involve more complexity on the receiver, since it has to sort out maintaining all the state, when it's usually easier just to call the sender for more information when needed.
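
As a rough illustration, here is a Python sketch of a recipient keeping its own copy. The names `CustomerAddressChanged` and `LocalCustomerReplica` are invented for this example. The point to notice is that the event carries the changed data itself, so the recipient never calls back to the customer system.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CustomerAddressChanged:
    """The event carries the new state, not just a pointer to it."""
    customer_id: str
    new_address: str


class LocalCustomerReplica:
    """Hypothetical recipient-side copy of the customer data it cares about."""

    def __init__(self) -> None:
        self._addresses: Dict[str, str] = {}

    def apply(self, event: CustomerAddressChanged) -> None:
        # All the state-keeping complexity lands here, on the receiver.
        self._addresses[event.customer_id] = event.new_address

    def address_of(self, customer_id: str) -> str:
        # Answered locally: no remote call, works if the source is down.
        return self._addresses[customer_id]
```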

Event-Sourcing

The core idea of event sourcing is that whenever we make a change to the state of a system, we record that state change as an event, and we can confidently rebuild the system state by reprocessing the events at any time in the future. The event store becomes the principal source of truth, and the system state is purely derived from it. For programmers, the best example of this is a version-control system. The log of all the commits is the event store and the working copy of the source tree is the system state.
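
The core idea can be sketched in a few lines of Python. The account domain and the names `Deposited`, `Withdrawn`, `Account` and `rebuild` are assumptions chosen for illustration, not part of any particular framework. The log is the source of truth, and the balance is derived state that can be thrown away and rebuilt.

```python
from dataclasses import dataclass
from typing import Iterable, List, Union


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


Event = Union[Deposited, Withdrawn]


def apply(balance: int, event: Event) -> int:
    """Pure function that folds one event into the current state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def rebuild(events: Iterable[Event]) -> int:
    """Derive the state from scratch by reprocessing every event."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance


class Account:
    def __init__(self) -> None:
        self.log: List[Event] = []  # principal source of truth
        self.balance = 0            # derived, like a working copy

    def record(self, event: Event) -> None:
        # Recording and applying happen in one ordinary synchronous call.
        self.log.append(event)
        self.balance = apply(self.balance, event)
```

After recording a few events, `rebuild(account.log)` should always agree with `account.balance`; that agreement is what lets the working state be discarded and recreated.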

Event-sourcing introduces a lot of issues, which I won't go into here, but I do want to highlight some common misconceptions. There's no need for event processing to be asynchronous: consider the case of updating a local git repository. That's entirely a synchronous operation, as is updating a centralized version-control system like Subversion. Certainly having all these commits allows you to do all sorts of interesting things (git is the great example), but the core commit is fundamentally a simple action.

Another common mistake is to assume that everyone using an event-sourced system should understand and access the event log to determine useful data. But knowledge of the event log can be limited. I'm writing this in an editor that is ignorant of all the commits in my source tree, it just assumes there is a file on the disk. Much of the processing in an event-sourced system can be based on a useful working copy. Only elements that really need the information in the event log should have to manipulate it. We can have multiple working copies with different schema, if that helps; but usually there should be a clear separation between domain processing and deriving a working copy from the event log.

When working with an event log, it is often useful to build snapshots of the working copy so that you don't have to process all the events from scratch every time you need a working copy. Indeed there is a duality here: we can look at the event log as either a list of changes, or as a list of states. We can derive one from the other. Version-control systems often mix snapshots and deltas in their event log in order to get the best performance. [1]
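
Both points can be sketched in Python, assuming for simplicity that each event is just an integer change to a running total. `SnapshottingLog` and the helper names are hypothetical. The first two functions show the changes/states duality; the class shows a snapshot being used so that only the tail of the log is replayed.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def states_from_changes(changes: Sequence[int]) -> List[int]:
    """A list of changes determines the list of states..."""
    return list(accumulate(changes))


def changes_from_states(states: List[int]) -> List[int]:
    """...and the list of states determines the changes."""
    return [after - before for before, after in zip([0] + states, states)]


def replay(changes: Sequence[int], state: int = 0) -> int:
    for change in changes:
        state += change
    return state


class SnapshottingLog:
    def __init__(self, snapshot_every: int = 100) -> None:
        self.events: List[int] = []
        self.snapshots: List[Tuple[int, int]] = []  # (events covered, state)
        self.snapshot_every = snapshot_every

    def append(self, change: int) -> None:
        self.events.append(change)
        if len(self.events) % self.snapshot_every == 0:
            self.snapshots.append((len(self.events), self.current_state()))

    def current_state(self) -> int:
        # Start from the latest snapshot and replay only the events after it.
        covered, state = self.snapshots[-1] if self.snapshots else (0, 0)
        return replay(self.events[covered:], state)
```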

Event-sourcing has many interesting benefits, which easily come to mind when thinking of the value of version-control systems. The event log provides a strong audit capability (accounting transactions are an event source for account balances). We can recreate historic states by replaying the event log up to a point. We can explore alternative histories by injecting hypothetical events when replaying. Event sourcing makes it plausible to have non-durable working copies, such as a Memory Image.
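
Two of these benefits, historic states and alternative histories, can be illustrated with a small Python sketch. The rule that a withdrawal is rejected when funds are insufficient is an assumption added only so that the order of events matters; `state_at` and `what_if` are hypothetical helper names.

```python
from typing import List, Sequence


def apply(balance: int, change: int) -> int:
    """Assumed domain rule: a change that would overdraw the account is rejected."""
    return balance if balance + change < 0 else balance + change


def replay(changes: Sequence[int]) -> int:
    balance = 0
    for change in changes:
        balance = apply(balance, change)
    return balance


def state_at(log: List[int], n: int) -> int:
    """Historic state: replay only the first n events."""
    return replay(log[:n])


def what_if(log: List[int], position: int, hypothetical: int) -> int:
    """Alternative history: inject an event and replay; the real log is untouched."""
    return replay(log[:position] + [hypothetical] + log[position:])
```

With a log of `[100, -30, 50]`, `state_at(log, 2)` gives the balance as it stood after the second event, while `what_if(log, 1, -90)` shows how an early large withdrawal would have caused the later one to be rejected.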

Event sourcing does have its problems. Replaying events becomes problematic when results depend on interactions with outside systems. We have to figure out how to deal with changes in the schema of events over time. Many people find the event processing adds a lot of complexity to an application (although I do wonder if that's more due to poor separation between components that derive a working copy and components that do the domain processing).

CQRS

Command Query Responsibility Segregation (CQRS) is the notion of having separate data structures for reading and writing information. Strictly CQRS isn't really about events, since you can use CQRS without any events present in your design. But commonly people do combine CQRS with the earlier patterns here, hence their presence at the summit.

The justification for CQRS is that in complex domains, a single model to handle both reads and writes gets too complicated, and we can simplify by separating the models. This is particularly appealing when you have a difference in access patterns, such as lots of reads and very few writes. But the gain for using CQRS has to be balanced against the additional complexity of having separate models. I find many of my colleagues are deeply wary of using CQRS, finding it often misused.
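
A deliberately small Python sketch of the separation follows, with no events involved. The order domain and the class names `OrderWriteModel` and `CustomerSpendReadModel` are invented for illustration. The write side validates and stores commands; the read side holds a differently shaped structure tuned for one query.

```python
from typing import Dict


class CustomerSpendReadModel:
    """Read side: shaped for the query 'how much has this customer spent?'."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = {}

    def project(self, customer: str, total: int) -> None:
        self._totals[customer] = self._totals.get(customer, 0) + total

    def total_spent(self, customer: str) -> int:
        return self._totals.get(customer, 0)


class OrderWriteModel:
    """Write side: enforces the rules and keeps the orders themselves."""

    def __init__(self, read_model: CustomerSpendReadModel) -> None:
        self._orders: Dict[str, Dict[str, object]] = {}
        self._read_model = read_model

    def place_order(self, order_id: str, customer: str, total: int) -> None:
        if total <= 0:
            raise ValueError("order total must be positive")
        if order_id in self._orders:
            raise ValueError(f"duplicate order id: {order_id}")
        self._orders[order_id] = {"customer": customer, "total": total}
        # Every write now has to update a second model: the extra cost of CQRS.
        self._read_model.project(customer, total)
```

Even at this size the trade-off is visible: queries become trivial, but each change to the domain touches two models.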

Making sense of these patterns

As a sort of software botanist, keen to collect samples, I find this a tricky terrain. The core problem is confusing the different patterns. On one project the capable and experienced project manager told me that event sourcing had been a disaster - any change took twice the work to update both the read and write models. Just in that phrase I can detect a potential confusion between event-sourcing and CQRS - so how can I figure out which was the culprit? The tech lead on the project claimed the main problem was lots of asynchronous communications, certainly a known complexity-booster, but not one that's a necessary part of either event-sourcing or CQRS. Furthermore we have to beware that all these patterns are good in the right place and bad when put on the wrong terrain. But it's hard to figure out what the right terrain is when we conflate the patterns.

I'd love to write some definitive treatise that sorts all this confusion out, and gives solid guidelines on how to do each pattern well, and when it should be used. Sadly I don't have the time to do it. I write this note in the hope it will be useful, but am quite aware that it falls well short of what is really needed.

Footnotes

1: I sometimes hear people say that git isn't an example of event sourcing because it stores states of files and trees in .git/objects. But whether a system uses changes or snapshots for its internal storage doesn't affect whether it's event sourced. Git happily will summon up a list of changes for me on demand. And when it compresses data into packfiles, it does use a combination of snapshots and changes, with the mix chosen for performance reasons.
