I was recently working a bit with Twitter’s Storm, and it got me wondering: how does it compare to another high-performance, concurrent data-processing framework, Akka?

What are Akka and Storm?

Let’s start with a short description of both systems. Storm is a distributed, real-time computation system. On a Storm cluster, you execute topologies, which process streams of tuples (data). Each topology is a graph consisting of spouts (which produce tuples) and bolts (which transform tuples). Storm takes care of cluster communication, fail-over and distributing topologies across cluster nodes.
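To make the spout/bolt vocabulary concrete, here is a minimal single-process sketch in plain Python. It is not Storm's API: the names sentence_spout, split_bolt, count_bolt and run_topology are made up for illustration, and generators stand in for the streams that Storm would move between cluster nodes.

```python
from typing import Iterable, Iterator


def sentence_spout() -> Iterator[tuple]:
    """Spout: the source of the stream; emits one-element tuples."""
    for sentence in ["storm processes streams", "akka processes messages"]:
        yield (sentence,)


def split_bolt(stream: Iterable[tuple]) -> Iterator[tuple]:
    """Bolt: turns each sentence tuple into several word tuples."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[tuple]) -> Iterator[tuple]:
    """Bolt with mutable state: emits (word, running count) tuples."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


def run_topology() -> list:
    # The "topology" is just the wiring of the graph: spout -> split -> count.
    return list(count_bolt(split_bolt(sentence_spout())))


if __name__ == "__main__":
    for tup in run_topology():
        print(tup)
```

The point of the sketch is only the shape: a source, a chain of transformations, and wiring that is fixed before any data flows.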
Akka is a toolkit for building distributed, concurrent, fault-tolerant applications. In an Akka application, the basic construct is an actor; actors process messages asynchronously, and each actor instance is guaranteed to be run using at most one thread at a time, making concurrency much easier. Actors can also be deployed remotely. There’s a clustering module coming, which will handle automatic fail-over and distribution of actors across cluster nodes.

Both systems scale very well and can handle large amounts of data. But when to use one, and when to use the other? There’s another good blog post on the subject, but I wanted to take the comparison a bit further: let’s see how elementary constructs in Storm compare to elementary constructs in Akka.
Comparing the basics

Firstly, the basic unit of data in Storm is a tuple. A tuple can have any number of elements, and each tuple element can be any object, as long as there’s a serializer for it. In Akka, the basic unit is a message, which can be any object, but it should be serializable as well (for sending it to remote actors). So here the concepts are almost equivalent.

Let’s take a look at the basic unit of computation. In Storm, we have components: bolts and spouts. A bolt can be any piece of code, which does arbitrary processing on the incoming tuples. It can also store some mutable data, e.g. to accumulate results. Moreover, bolts run in a single thread, so unless you start additional threads in your bolts, you don’t have to worry about concurrent access to the bolt’s data. This is very similar to an actor, isn’t it? Hence a Storm bolt/spout corresponds to an Akka actor. How do these two compare in detail?

Actors can receive arbitrary messages; bolts can receive arbitrary tuples. Both are expected to do some processing based on the data received. Both have internal state, which is private and protected from concurrent thread access.
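The "private state, one thread at a time" property that actors and bolts share can be sketched with a mailbox drained by a single thread. This is a toy model in plain Python, not Akka's or Storm's API; CounterActor, tell and stop_and_get_total are hypothetical names, and giving the actor a dedicated thread is a simplification made to keep the example short.

```python
import queue
import threading


class CounterActor:
    """Toy actor: messages go into a mailbox and are handled one at a time."""

    _STOP = object()  # sentinel message that ends the processing loop

    def __init__(self):
        self._mailbox = queue.Queue()
        self._total = 0  # private state, only ever touched by the actor's thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._total += message  # no lock needed: single consumer

    def stop_and_get_total(self):
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._total


def demo():
    actor = CounterActor()

    def send_many():
        for _ in range(1000):
            actor.tell(1)

    # Four threads send concurrently; the actor still updates its state serially.
    senders = [threading.Thread(target=send_many) for _ in range(4)]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()
    return actor.stop_and_get_total()


if __name__ == "__main__":
    print(demo())
```

Senders never touch the counter directly, so the only synchronization point is the mailbox itself.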
Actors & bolts: differences

One crucial difference is how actors and bolts communicate. An actor can send a message to any other actor, as long as it has the ActorRef (and if not, an actor can be looked up by name). It can also send back a reply to the sender of the message that is being handled. Storm, on the other hand, is one-way. You cannot send back messages; you also can’t send messages to arbitrary bolts. What you can do is send a tuple to a named channel (stream), which will cause the tuple (message) to be broadcast to all listeners defined in the topology. (Bolts also ack messages to the ackers, which is another form of communication.)

In Storm, multiple copies of a bolt’s/spout’s code can be run in parallel (depending on the parallelism setting). So this corresponds to a set of (potentially remote) actors, with a load-balancer actor in front of them; a concept well-known from Akka’s routing. There are a couple of choices on how tuples are routed to bolt instances in Storm (random, consistent hashing on a field), and this roughly corresponds to the various router options in Akka (round robin, consistent hashing on the message).
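The routing choices mentioned above can be illustrated with three small selection functions. This is a sketch in plain Python under stated assumptions, not the routing code of either framework: the function names are made up, and the field-based variant uses a plain hash modulo the number of instances, which is simpler than a real consistent-hash ring.

```python
import random
import zlib
from itertools import count


def shuffle_grouping(num_tasks, rng=random):
    """Random routing: any instance may receive any tuple."""
    def choose(tup):
        return rng.randrange(num_tasks)
    return choose


def fields_grouping(num_tasks, field_index):
    """Hash routing: tuples with the same field value reach the same instance."""
    def choose(tup):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        key = repr(tup[field_index]).encode("utf-8")
        return zlib.crc32(key) % num_tasks
    return choose


def round_robin(num_tasks):
    """Round-robin routing: instances take turns."""
    counter = count()
    def choose(tup):
        return next(counter) % num_tasks
    return choose


def route(tuples, choose, num_tasks):
    """Distribute tuples into one bucket per instance using the given strategy."""
    buckets = [[] for _ in range(num_tasks)]
    for tup in tuples:
        buckets[choose(tup)].append(tup)
    return buckets


if __name__ == "__main__":
    words = [("akka",), ("storm",), ("akka",), ("bolt",)]
    print(route(words, fields_grouping(2, 0), 2))
    print(route(words, round_robin(2), 2))
```

Hashing on a field matters whenever an instance keeps per-key state (a word count, for example), because all tuples for one key must then land on the same instance.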
There’s also a difference in the “weight” of a bolt and an actor. In Akka, it is normal to have lots of actors (up to millions). In Storm, the expected number of bolts is significantly smaller; this isn’t a downside of Storm in any way, but rather a design decision. Also, Akka actors typically share threads, while each bolt instance tends to have a dedicated thread.

Other features

Storm also has one crucial feature which isn’t implemented in Akka out-of-the-box: guaranteed message delivery. Storm tracks the whole tree of tuples that originate from any tuple produced by a spout. If any tuple in that tree isn’t acknowledged, the original tuple will be replayed. Also the cluster management of Storm is more advanced (automatic fail-over, automatic balancing of workers across the cluster; based on Zookeeper); however, the upcoming Akka clustering module should address that.
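How can a whole tree of tuples be tracked cheaply? As I understand Storm's documentation, the ackers keep a single number per spout tuple: the XOR of the ids of all tuples that have been emitted but not yet acked. Every id is XORed in once when the tuple is created and once when it is acked, so the value returns to zero exactly when the tree is complete. The sketch below models that idea in plain Python; ToyAcker and its methods are hypothetical names, not Storm classes, and timeouts and replay are left out.

```python
class ToyAcker:
    """Tracks one XOR value per spout tuple (keyed by the root tuple's id)."""

    def __init__(self):
        self._ack_val = {}  # root id -> XOR of ids still outstanding in its tree

    def spout_emit(self, root_id):
        """A spout emitted a new root tuple; it is the only outstanding id."""
        self._ack_val[root_id] = root_id

    def bolt_ack(self, root_id, input_id, emitted_ids=()):
        """A bolt acked input_id and, while handling it, emitted child tuples.

        The acked id and the new child ids are applied in one step, so the
        value cannot reach zero while children are still outstanding.
        Returns True when the whole tree has been processed.
        """
        value = self._ack_val[root_id] ^ input_id
        for child_id in emitted_ids:
            value ^= child_id
        if value == 0:
            del self._ack_val[root_id]
            return True
        self._ack_val[root_id] = value
        return False

    def unfinished(self):
        """Root ids whose trees are incomplete (candidates for replay)."""
        return sorted(self._ack_val)


if __name__ == "__main__":
    acker = ToyAcker()
    acker.spout_emit(1)
    print(acker.bolt_ack(1, 1, emitted_ids=(2, 4)))  # root acked, two children out
    print(acker.bolt_ack(1, 2))
    print(acker.bolt_ack(1, 4))
```

The small ids here are only for readability; with small or sequential ids, unrelated combinations could cancel out by accident, which is why a real system would use large random ids.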
Finally, the layout of the communication in Storm, the topology, is static and defined upfront. In Akka, the communication patterns can change over time and can be totally dynamic; actors can send messages to any other actors, or can even send addresses (ActorRefs).

So overall, Storm implements a specific range of usages very well, while Akka is more of a general-purpose toolkit. It would be possible to build a Storm-like system on top of Akka, but not the other way round (at least it would be very hard).

Adam