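ClusterClient lets a node talk to a cluster without being a member of it. All it needs is the location of one or more nodes to use as contact points. It connects to a ClusterReceptionist and uses that connection to send messages to specific actors in the cluster. The client's own provider must be set to remote or cluster. The receptionist has to be running on every node of the cluster, or on a group of nodes; it can be started as an ordinary actor or through the ClusterClientReceptionist extension. ClusterClient can only reach actors that have been registered through the ClusterClientReceptionist extension.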

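  At this point you may feel like complaining: this is so simple I could write it myself. That is Akka for you, though. For features that look trivial, the framework's own version tends to be more stable and more general, even if its performance is not necessarily the best. Enough talk; let's look at how ClusterClient is actually implemented.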

/**
* This actor is intended to be used on an external node that is not member
* of the cluster. It acts like a gateway for sending messages to actors
* somewhere in the cluster. From the initial contact points it will establish
* a connection to a [[ClusterReceptionist]] somewhere in the cluster. It will
* monitor the connection to the receptionist and establish a new connection if
* the link goes down. When looking for a new receptionist it uses fresh contact
* points retrieved from previous establishment, or periodically refreshed
* contacts, i.e. not necessarily the initial contact points.
*
* You can send messages via the `ClusterClient` to any actor in the cluster
* that is registered in the [[ClusterReceptionist]].
* Messages are wrapped in [[ClusterClient.Send]], [[ClusterClient.SendToAll]]
* or [[ClusterClient.Publish]].
*
* Use the factory method [[ClusterClient#props]]) to create the
* [[akka.actor.Props]] for the actor.
*
* If the receptionist is not currently available, the client will buffer the messages
* and then deliver them when the connection to the receptionist has been established.
* The size of the buffer is configurable and it can be disabled by using a buffer size
* of 0. When the buffer is full old messages will be dropped when new messages are sent
* via the client.
*
* Note that this is a best effort implementation: messages can always be lost due to the distributed
* nature of the actors involved.
*/
final class ClusterClient(settings: ClusterClientSettings) extends Actor with ActorLogging

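  Judging by its definition and the official comment, ClusterClient is just an ordinary actor that can communicate with a specific actor in the cluster (the ClusterReceptionist). It sends messages to a ClusterReceptionist inside the cluster through the initial contact points (which are simply ActorPaths), and it monitors the connection to the receptionist so that it can reconnect when the link goes down. ClusterClient does not override preStart, so let's look at its primary constructor.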

sendGetContacts()
scheduleRefreshContactsTick(establishingGetContactsInterval)
self ! RefreshContactsTick

  The constructor runs the three statements above, in that order.

def sendGetContacts(): Unit = {
  val sendTo =
    if (contacts.isEmpty) initialContactsSel
    else if (contacts.size == 1) initialContactsSel union contacts
    else contacts
  if (log.isDebugEnabled)
    log.debug(s"""Sending GetContacts to [${sendTo.mkString(",")}]""")
  sendTo.foreach { _ ! GetContacts }
}

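  sendGetContacts is simple: it sends a GetContacts message to the current contact points. It falls back to the initial contacts when no current ones are known, and adds the initial contacts to the targets when only one current contact is known.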
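
  The three branches are easier to see on plain sets. The following is only a sketch of the selection rule; ContactTargets and targets are made-up names for illustration, and strings stand in for the ActorSelections used in the real code.

```scala
object ContactTargets {
  // Same branch structure as sendGetContacts, on ordinary sets:
  // no current contacts -> initial ones, exactly one -> both, otherwise current only.
  def targets[A](initial: Set[A], contacts: Set[A]): Set[A] =
    if (contacts.isEmpty) initial
    else if (contacts.size == 1) initial ++ contacts
    else contacts
}
```

  For example, ContactTargets.targets(Set("a", "b"), Set("c")) yields all three entries, which is the case where a single known contact is not trusted on its own.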

def scheduleRefreshContactsTick(interval: FiniteDuration): Unit = {
  refreshContactsTask foreach { _.cancel() }
  refreshContactsTask = Some(context.system.scheduler.schedule(
    interval, interval, self, RefreshContactsTick))
}

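  scheduleRefreshContactsTick cancels any previous timer and starts a new one that, after an initial delay of interval, sends RefreshContactsTick to the actor itself every interval.

  The third statement sends RefreshContactsTick to self right away. The last two statements look a bit redundant to me: wouldn't setting the timer's first argument (the initial delay) to 0 do the same thing and save the third statement?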

case RefreshContactsTick ⇒ sendGetContacts()

  And how is RefreshContactsTick handled? By calling sendGetContacts again. So why call sendGetContacts directly in the primary constructor as well?

var contactPaths: HashSet[ActorPath] =
  initialContacts.to[HashSet]
val initialContactsSel =
  contactPaths.map(context.actorSelection)
var contacts = initialContactsSel

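  initialContactsSel, contacts, contactPaths and initialContacts look very much alike, don't they?

  initialContactsSel is the key one: it maps each ActorPath in initialContacts to an ActorSelection. Note that the mapping itself sends nothing. As far as I can tell, the Identify messages go out in a part of the establishing behavior that is not quoted here, once a Contacts reply has arrived. The ActorPaths point at remote actors, so how can they be selected? Remember what was said earlier: the provider must be configured as remote or cluster. Why? Take a guess: a purely local provider cannot resolve a path on another node.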

case ActorIdentity(_, Some(receptionist)) ⇒
  log.info("Connected to [{}]", receptionist.path)
  scheduleRefreshContactsTick(refreshContactsInterval)
  sendBuffered(receptionist)
  context.become(active(receptionist) orElse contactPointMessages)
  connectTimerCancelable.foreach(_.cancel())
  failureDetector.heartbeat()
  self ! HeartbeatTick // will register us as active client of the selected receptionist

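  When an ActorIdentity arrives, the client calls scheduleRefreshContactsTick to reset the timer, sends the buffered messages to the receptionist, and switches its behavior to active. From then on it can relay messages to specific actors in the cluster via Send, SendToAll and Publish.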
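
  The buffering that sendBuffered relies on is described in the class comment: a configurable size, 0 to disable, and old messages dropped when the buffer is full. Here is a minimal sketch of that policy. DropOldestBuffer is a hypothetical helper written for this post, not the data structure Akka actually uses.

```scala
import scala.collection.immutable.Queue

// Hypothetical helper, not an Akka class: a bounded FIFO that evicts
// the oldest element when a new one arrives and the buffer is full.
final case class DropOldestBuffer[A](capacity: Int, items: Queue[A] = Queue.empty[A]) {
  def add(msg: A): DropOldestBuffer[A] =
    if (capacity <= 0) this // a size of 0 disables buffering, the message is dropped
    else if (items.size >= capacity) copy(items = items.tail.enqueue(msg)) // evict oldest
    else copy(items = items.enqueue(msg))
}
```

  With a capacity of 2, adding 1, 2 and 3 in turn leaves 2 and 3 in the buffer, so the newest messages are the ones delivered once the connection is established.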

def active(receptionist: ActorRef): Actor.Receive = {
  case Send(path, msg, localAffinity) ⇒
    receptionist forward DistributedPubSubMediator.Send(path, msg, localAffinity)
  case SendToAll(path, msg) ⇒
    receptionist forward DistributedPubSubMediator.SendToAll(path, msg)
  case Publish(topic, msg) ⇒
    receptionist forward DistributedPubSubMediator.Publish(topic, msg)
  case HeartbeatTick ⇒
    if (!failureDetector.isAvailable) {
      log.info("Lost contact with [{}], reestablishing connection", receptionist)
      reestablish()
    } else
      receptionist ! Heartbeat
  case HeartbeatRsp ⇒
    failureDetector.heartbeat()
  case RefreshContactsTick ⇒
    receptionist ! GetContacts
  case Contacts(contactPoints) ⇒
    // refresh of contacts
    if (contactPoints.nonEmpty) {
      contactPaths = contactPoints.map(ActorPath.fromString).to[HashSet]
      contacts = contactPaths.map(context.actorSelection)
    }
    publishContactPoints()
  case _: ActorIdentity ⇒ // ok, from previous establish, already handled
  case ReceptionistShutdown ⇒
    if (receptionist == sender()) {
      log.info("Receptionist [{}] is shutting down, reestablishing connection", receptionist)
      reestablish()
    }
}

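  To sum up ClusterClient's behavior: using ActorSelections built from the configured initialContacts, it sends messages to remote actors (the ClusterReceptionists in the cluster), and as soon as the first ActorIdentity reply arrives it considers itself connected to the cluster. (The remaining ActorIdentity replies are ignored, so the contact that answers fastest wins.) It then periodically sends GetContacts to the ClusterReceptionist that replied first, to refresh its list of receptionist contact points. But how does it notice that this receptionist has become unreachable? Did you spot HeartbeatTick?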

val heartbeatTask = context.system.scheduler.schedule(
  heartbeatInterval, heartbeatInterval, self, HeartbeatTick)

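  We skipped over the definition of heartbeatTask earlier. It is a timer that sends HeartbeatTick to the actor itself every heartbeatInterval. Personally I dislike running code as a side effect of a field definition; it makes the source harder to follow.

  On each HeartbeatTick the client first asks failureDetector whether the receptionist is still available. If it is, the client sends Heartbeat to the receptionist, and each HeartbeatRsp that comes back updates the failureDetector's heartbeat record. If it is not, the client calls reestablish to set up a new connection.

  That wraps up the ClusterClient source. Next, let's look at the ClusterReceptionist inside the cluster. As mentioned, it can be started with actorOf or through the ClusterClientReceptionist extension. Naturally we look at the extension first.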
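
  The heartbeat()/isAvailable pair can be pictured as a simple deadline check. The sketch below is my own model of the idea, not Akka's failure detector: DeadlineDetector, acceptablePauseMillis and the explicit nowMillis arguments are all assumptions made so that the example runs without a clock.

```scala
// Hypothetical model; times are plain milliseconds supplied by the caller.
final class DeadlineDetector(acceptablePauseMillis: Long) {
  private var lastHeartbeat: Option[Long] = None

  // Called whenever a heartbeat response arrives.
  def heartbeat(nowMillis: Long): Unit = {
    lastHeartbeat = Some(nowMillis)
  }

  // Available before the first heartbeat, and afterwards for as long as the
  // silence since the last heartbeat does not exceed the acceptable pause.
  def isAvailable(nowMillis: Long): Boolean =
    lastHeartbeat.forall(last => nowMillis - last <= acceptablePauseMillis)
}
```

  With a pause of 1000 ms and a heartbeat at t = 100, the peer counts as available at t = 1100 and as lost at t = 1101, which is the point where the client would call reestablish.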

object ClusterClientReceptionist extends ExtensionId[ClusterClientReceptionist] with ExtensionIdProvider {
  override def get(system: ActorSystem): ClusterClientReceptionist = super.get(system)

  override def lookup() = ClusterClientReceptionist

  override def createExtension(system: ExtendedActorSystem): ClusterClientReceptionist =
    new ClusterClientReceptionist(system)
}

  Above is the ExtensionId definition. It also mixes in ExtensionIdProvider, which means the extension can be loaded purely through configuration, with no explicit startup code.

/**
* Extension that starts [[ClusterReceptionist]] and accompanying [[akka.cluster.pubsub.DistributedPubSubMediator]]
* with settings defined in config section `akka.cluster.client.receptionist`.
* The [[akka.cluster.pubsub.DistributedPubSubMediator]] is started by the [[akka.cluster.pubsub.DistributedPubSub]] extension.
*/
final class ClusterClientReceptionist(system: ExtendedActorSystem) extends Extension

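  Have you noticed that the official comments on the important classes are all very clear? This extension starts the ClusterReceptionist together with a DistributedPubSubMediator, and the DistributedPubSubMediator in turn is started by the DistributedPubSub extension. DistributedPubSub will be analyzed later.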

/**
 * The [[ClusterReceptionist]] actor
 */
private val receptionist: ActorRef = {
  if (isTerminated)
    system.deadLetters
  else {
    val name = config.getString("name")
    val dispatcher = config.getString("use-dispatcher") match {
      case "" ⇒ Dispatchers.DefaultDispatcherId
      case id ⇒ id
    }
    // important to use val mediator here to activate it outside of ClusterReceptionist constructor
    val mediator = pubSubMediator
    system.systemActorOf(ClusterReceptionist.props(mediator, ClusterReceptionistSettings(config))
      .withDispatcher(dispatcher), name)
  }
}

/**
 * Returns the underlying receptionist actor, particularly so that its
 * events can be observed via subscribe/unsubscribe.
 */
def underlying: ActorRef =
  receptionist

/**
 * Register the actors that should be reachable for the clients in this [[DistributedPubSubMediator]].
 */
private def pubSubMediator: ActorRef = DistributedPubSub(system).mediator

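  The code above comes from the ClusterClientReceptionist class and is the crucial part: it starts a ClusterReceptionist. The rest of the class deals with registering and unregistering services, so we skip it for now.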

/**
* [[ClusterClient]] connects to this actor to retrieve. The `ClusterReceptionist` is
* supposed to be started on all nodes, or all nodes with specified role, in the cluster.
* The receptionist can be started with the [[ClusterClientReceptionist]] or as an
* ordinary actor (use the factory method [[ClusterReceptionist#props]]).
*
* The receptionist forwards messages from the client to the associated [[akka.cluster.pubsub.DistributedPubSubMediator]],
* i.e. the client can send messages to any actor in the cluster that is registered in the
* `DistributedPubSubMediator`. Messages from the client are wrapped in
* [[akka.cluster.pubsub.DistributedPubSubMediator.Send]], [[akka.cluster.pubsub.DistributedPubSubMediator.SendToAll]]
* or [[akka.cluster.pubsub.DistributedPubSubMediator.Publish]] with the semantics described in
* [[akka.cluster.pubsub.DistributedPubSubMediator]].
*
* Response messages from the destination actor are tunneled via the receptionist
* to avoid inbound connections from other cluster nodes to the client, i.e.
* the `sender()`, as seen by the destination actor, is not the client itself.
* The `sender()` of the response messages, as seen by the client, is `deadLetters`
* since the client should normally send subsequent messages via the `ClusterClient`.
* It is possible to pass the original sender inside the reply messages if
* the client is supposed to communicate directly to the actor in the cluster.
*
*/
final class ClusterReceptionist(pubSubMediator: ActorRef, settings: ClusterReceptionistSettings)
extends Actor with ActorLogging

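  With our earlier analysis and the official comment, this actor is easy to understand. It is the actor that ClusterClient sends GetContacts to, and it runs on all nodes of the cluster or on a group of them. It can be started through the ClusterClientReceptionist extension or as an ordinary actor (actorOf). The ClusterReceptionist takes the messages relayed by ClusterClient and forwards them once more to the DistributedPubSubMediator, which delivers them to the actors registered with it (that is, the services we registered). Replies from the destination actor travel back to the client through a "tunnel" at the receptionist, which in practice just means the sender has been replaced.

  The definition also shows that it is a perfectly ordinary actor. Neither the primary constructor nor preStart contains anything special, so let's go straight to receive.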

case GetContacts ⇒
  // Consistent hashing is used to ensure that the reply to GetContacts
  // is the same from all nodes (most of the time) and it also
  // load balances the client connections among the nodes in the cluster.
  if (numberOfContacts >= nodes.size) {
    val contacts = Contacts(nodes.map(a ⇒ self.path.toStringWithAddress(a))(collection.breakOut))
    if (log.isDebugEnabled)
      log.debug("Client [{}] gets contactPoints [{}] (all nodes)", sender().path, contacts.contactPoints.mkString(","))
    sender() ! contacts
  } else {
    // using toStringWithAddress in case the client is local, normally it is not, and
    // toStringWithAddress will use the remote address of the client
    val a = consistentHash.nodeFor(sender().path.toStringWithAddress(cluster.selfAddress))
    val slice = {
      val first = nodes.from(a).tail.take(numberOfContacts)
      if (first.size == numberOfContacts) first
      else first union nodes.take(numberOfContacts - first.size)
    }
    val contacts = Contacts(slice.map(a ⇒ self.path.toStringWithAddress(a))(collection.breakOut))
    if (log.isDebugEnabled)
      log.debug("Client [{}] gets contactPoints [{}]", sender().path, contacts.contactPoints.mkString(","))
    sender() ! contacts
  }

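  The handling of GetContacts deserves special attention, since this is the message ClusterClient sends to learn the receptionist contact points in the cluster. The comment above the first if statement is clear enough: consistent hashing ensures that all nodes give the same reply to GetContacts (most of the time). If the cluster has no more than numberOfContacts nodes, every node is returned; otherwise the client's address is hashed onto the node ring and the numberOfContacts nodes following that position are returned, wrapping around to the start of the ring if needed.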
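
  The wrap-around slice in the else branch is the part that is easy to misread, so here is a small model of it. This is a sketch under simplifying assumptions: ContactSlice is a made-up name, nodes are plain strings kept in sorted order instead of cluster Addresses, and the start node is passed in directly instead of coming from consistentHash.nodeFor.

```scala
object ContactSlice {
  // Take the n nodes that follow `start` on the sorted ring, wrapping around
  // to the beginning when the end of the ring is reached.
  def slice(nodes: Vector[String], start: String, n: Int): Vector[String] = {
    val ring = nodes.sorted
    if (n >= ring.size) ring // few nodes: hand out all of them
    else {
      // counterpart of nodes.from(a).tail.take(n)
      val first = ring.dropWhile(_.compareTo(start) < 0).drop(1).take(n)
      if (first.size == n) first
      else first ++ ring.take(n - first.size) // wrap around
    }
  }
}
```

  On a ring of a, b, c, d with n = 2, starting at a gives b and c, while starting at c gives d and then wraps around to a. Since every receptionist computes this from the same sorted node set, they all hand out the same contacts for a given client.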

case msg @ (_: Send | _: SendToAll | _: Publish) ⇒
  val tunnel = responseTunnel(sender())
  tunnel ! Ping // keep alive
  pubSubMediator.tell(msg, tunnel)

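  That is the handler for Send, SendToAll and Publish. It appears to simply pass the message on to pubSubMediator, and this is where the "tunnel" mentioned in the comment above shows up.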

def responseTunnel(client: ActorRef): ActorRef = {
  val encName = URLEncoder.encode(client.path.toSerializationFormat, "utf-8")
  context.child(encName) match {
    case Some(tunnel) ⇒ tunnel
    case None ⇒
      context.actorOf(Props(classOf[ClientResponseTunnel], client, responseTunnelReceiveTimeout), encName)
  }
}

  What is going on here? It creates yet another actor, a ClientResponseTunnel, and uses it as the sender the service replies to? And there is a responseTunnelReceiveTimeout on top of that?

/**
* Replies are tunneled via this actor, child of the receptionist, to avoid
* inbound connections from other cluster nodes to the client.
*/
class ClientResponseTunnel(client: ActorRef, timeout: FiniteDuration) extends Actor with ActorLogging {
  context.setReceiveTimeout(timeout)

  private val isAsk = {
    val pathElements = client.path.elements
    pathElements.size == 2 && pathElements.head == "temp" && pathElements.tail.head.startsWith("$")
  }

  def receive = {
    case Ping ⇒ // keep alive from client
    case ReceiveTimeout ⇒
      log.debug("ClientResponseTunnel for client [{}] stopped due to inactivity", client.path)
      context stop self
    case msg ⇒
      client.tell(msg, Actor.noSender)
      if (isAsk)
        context stop self
  }
}

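  This actor does one simple thing: it forwards messages to the client. That is awfully roundabout. The receptionist creates a local proxy actor for every client, and all replies go back through that proxy. Why not have the service send the reply straight to the client? On reflection it is entirely reasonable, even necessary. The node hosting the service may have no network route to the client, or security policy may forbid direct communication, and in those cases relaying replies through the proxy is essential. Either way: whatever Akka does is right, whatever Akka does is good.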
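
  Two small details of the tunnel are worth trying out in isolation: how the child name is derived from the client path, and how isAsk recognizes a temporary ask actor. The sketch below copies both checks onto plain strings. TunnelNaming and the sample path are my own inventions for illustration; only URLEncoder.encode and the shape of the isAsk test come from the source above.

```scala
import java.net.URLEncoder

object TunnelNaming {
  // The client path is URL-encoded so that it can serve as a single child
  // actor name (a raw path contains '/' characters, which a name cannot).
  def tunnelName(clientPath: String): String =
    URLEncoder.encode(clientPath, "utf-8")

  // Same test as isAsk: exactly two path elements, "temp" followed by a
  // name starting with '$'.
  def isAsk(pathElements: Seq[String]): Boolean =
    pathElements.size == 2 && pathElements.head == "temp" && pathElements(1).startsWith("$")
}
```

  TunnelNaming.isAsk(Seq("temp", "$a")) is true, so such a tunnel stops after relaying one reply, while a client at Seq("user", "client") keeps its tunnel until the receive timeout fires.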

case Heartbeat ⇒
if (verboseHeartbeat) log.debug("Heartbeat from client [{}]", sender().path)
sender() ! HeartbeatRsp
updateClientInteractions(sender())

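  Then there is the handling of Heartbeat messages from clients. The logic is simple, but one thing is worth noting: it maintains a list of clients. In other words every ClusterReceptionist keeps track of its clients. I really disagree with this. The number of clients could be huge, and just maintaining that list would eat a lot of memory. The list enables richer functionality, but it invites an OOM. And if there never are many clients, that would suggest Akka is not yet widely known or used by large companies.

  That's it for ClusterClient. A sharp reader may object: I still haven't seen how a message gets from the ClusterReceptionist to the actual service actor. How does pubSubMediator.tell(msg, tunnel) route it? Fair point, but be patient; that is covered in the next chapter (DistributedPubSubMediator). After all, the official ClusterClient documentation itself recommends using DistributedPubSubMediator for this kind of functionality. To me that is yet another trap: if you recommend DistributedPubSubMediator, why ship a ClusterClient module at all? Just deprecate it.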
