Motivation

The way we consume services from the internet today includes many instances of streaming data, both downloading from a service as well as uploading to it or peer-to-peer data transfers. Regarding data as a stream of elements instead of in its entirety is very useful because it matches the way computers send and receive them (for example via TCP), but it is often also a necessity because data sets frequently become too large to be handled as a whole. We spread computations or analyses over large clusters and call it “big data”, where the whole principle of processing them is by feeding those data sequentially—as a stream—through some CPUs.

Actors can be seen as dealing with streams as well: they send and receive series of messages in order to transfer knowledge (or data) from one place to another. We have found it tedious and error-prone to implement all the proper measures in order to achieve stable streaming between actors, since in addition to sending and receiving we also need to take care to not overflow any buffers or mailboxes in the process. Another pitfall is that Actor messages can be lost and must be retransmitted in that case lest the stream have holes on the receiving side. When dealing with streams of elements of a fixed given type, Actors also do not currently offer good static guarantees that no wiring errors are made: type-safety could be improved in this case.

(Translator's note: this paragraph spells out the motivation for designing Akka Streams:

  1. Ensure that none of the buffers a stream passes through can overflow (an actor's mailbox can be regarded as the actor's message buffer).

  2. Guarantee reliable message delivery: lost messages must be retransmitted, i.e. delivery semantics stronger than the default at-most-once.

  3. Provide type safety when processing streams whose element type is fixed.

  These concerns are central whenever you build an actor system. The first two in particular really do require rather tedious measures if you implement them yourself on top of plain actors. For a discussion of the problems with actor systems, see the article

  Why I Don't Like Akka Actors

 )

For these reasons we decided to bundle up a solution to these problems as an Akka Streams API. The purpose is to offer an intuitive and safe way to formulate stream processing setups such that we can then execute them efficiently and with bounded resource usage—no more OutOfMemoryErrors. In order to achieve this our streams need to be able to limit the buffering that they employ, and they need to be able to slow down producers if the consumers cannot keep up. This feature is called back-pressure and is at the core of the Reactive Streams initiative, of which Akka is a founding member. For you this means that the hard problem of propagating and reacting to back-pressure has been incorporated in the design of Akka Streams already, so you have one less thing to worry about; it also means that Akka Streams interoperate seamlessly with all other Reactive Streams implementations (where the Reactive Streams interfaces define the interoperability SPI while implementations like Akka Streams offer a nice user API).

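To make the back-pressure idea above concrete, here is a minimal sketch (not part of the original text). It assumes Akka 2.6 with the akka-stream module on the classpath; the object name and system name are made up for illustration. The throttle stage accepts at most 10 elements per second, and because demand propagates upstream, the Source only emits as fast as the throttle allows, so memory usage stays bounded even though a million elements are waiting to be produced.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.duration._

// A minimal back-pressure sketch, assuming Akka 2.6 ("com.typesafe.akka" %% "akka-stream").
object BackpressureSketch extends App {
  // In Akka 2.6 an implicit ActorSystem is enough to materialize streams.
  implicit val system: ActorSystem = ActorSystem("backpressure-sketch")

  Source(1 to 1000000)               // a fast, effectively unbounded producer
    .throttle(10, 1.second)          // a deliberately slow stage: at most 10 elements per second
    .runWith(Sink.foreach(println))  // demand flows upstream, so the Source never races ahead of the consumer
}
```

The same propagation of demand across stage boundaries is what lets a stage like this be connected to any other Reactive Streams implementation without extra buffering logic.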
