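Lifecycle monitoring, also known as DeathWatch, is a commonly used mechanism in Akka programming. Suppose we hold the ActorRef of some actor and want to receive a message once that actor has died; calling watch achieves exactly that.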

    import akka.actor.{ Actor, Props, Terminated }

    class WatchActor extends Actor {
      val child = context.actorOf(Props.empty, "child")
      context.watch(child) // <-- this is the only call needed for registration
      var lastSender = context.system.deadLetters

      def receive = {
        case "kill" =>
          context.stop(child); lastSender = sender()
        case Terminated(`child`) => lastSender ! "finished"
      }
    }

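The example above comes from the official documentation. DeathWatch is very convenient to use: call context.watch, and once the watched actor stops for whatever reason, the watcher receives a Terminated message. When pattern matching, that message exposes a single field, the ActorRef of the actor that stopped. It looks simple, but how is it actually implemented?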

    /**
     * Registers this actor as a Monitor for the provided ActorRef.
     * This actor will receive a Terminated(subject) message when watched
     * actor is terminated.
     *
     * `watch` is idempotent if it is not mixed with `watchWith`.
     *
     * It will fail with an [[IllegalStateException]] if the same subject was watched before using `watchWith`.
     * To clear the termination message, unwatch first.
     *
     * *Warning*: This method is not thread-safe and must not be accessed from threads other
     * than the ordinary actor message processing thread, such as [[java.util.concurrent.CompletionStage]] and [[scala.concurrent.Future]] callbacks.
     *
     * @return the provided ActorRef
     */
    def watch(subject: ActorRef): ActorRef

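Above is the official Scaladoc for watch in ActorContext. It is straightforward: watch an actor and you will receive the corresponding Terminated message. It also notes that the method is not thread-safe.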

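If you have read my earlier source-analysis articles, you know that context is an instance of ActorContext, and that ActorContext is one functional facet of ActorCell, so the concrete implementation of watch must live in ActorCell. Because ActorCell mixes in quite a few traits, I will skip how to track down which one implements watch and give the answer directly: dungeon.DeathWatch.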

    private[akka] trait DeathWatch { this: ActorCell =>

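First of all, it is a trait with a self-type annotation. I have complained about this style before, so I will not elaborate here. Let's see how watch is implemented.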

    override final def watch(subject: ActorRef): ActorRef = subject match {
      case a: InternalActorRef =>
        if (a != self) {
          if (!watchingContains(a))
            maintainAddressTerminatedSubscription(a) {
              a.sendSystemMessage(Watch(a, self)) // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
              updateWatching(a, None)
            }
          else
            checkWatchingSame(a, None)
        }
        a
    }

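A few simple points can be read off this source: 1. an actor cannot watch itself; 2. if the subject is already being watched, checkWatchingSame is called; 3. if it has not been watched yet, a Watch system message is sent to the watched actor; 4. in that same case, the watching information is then updated.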
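The four points can be replayed in a small standalone model. This is only a sketch: `WatchSketch`, `Ref`, `Watcher` and `sentWatch` are hypothetical names invented for illustration, and the conflict check is modelled on the Scaladoc of watch quoted earlier (an IllegalStateException when the registered message differs), not on the actual source of checkWatchingSame.

```scala
// Toy model of the watcher-side bookkeeping. All names here are
// hypothetical stand-ins, not Akka's real classes.
object WatchSketch {
  final case class Ref(name: String)

  final class Watcher(val self: Ref) {
    // None = default Terminated on death, Some(msg) = custom message.
    private var watching: Map[Ref, Option[Any]] = Map.empty
    // Every Watch "system message" this watcher would have sent.
    var sentWatch: Vector[Ref] = Vector.empty

    def watch(subject: Ref, message: Option[Any] = None): Ref = {
      if (subject != self) { // point 1: watching yourself is silently ignored
        watching.get(subject) match {
          case None => // points 3 and 4: notify the subject, then remember it
            sentWatch = sentWatch :+ subject
            watching = watching.updated(subject, message)
          case Some(previous) => // point 2: already watched, only check consistency
            if (previous != message)
              throw new IllegalStateException(
                s"$subject already watched with $previous; unwatch first")
        }
      }
      subject
    }

    def isWatching(subject: Ref): Boolean = watching.contains(subject)
  }

  def main(args: Array[String]): Unit = {
    val parent = new Watcher(Ref("parent"))
    parent.watch(Ref("child"))
    parent.watch(Ref("child"))  // idempotent: no second Watch is sent
    parent.watch(Ref("parent")) // self-watch: ignored
    println(parent.sentWatch)
  }
}
```

Keeping the "send Watch" and "update the map" steps in the same branch is what makes repeated calls idempotent in this model: the second call finds the entry and never reaches the send.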

    /**
     * This map holds a [[None]] for actors for which we send a [[Terminated]] notification on termination,
     * ``Some(message)`` for actors for which we send a custom termination message.
     */
    private var watching: Map[ActorRef, Option[Any]] = Map.empty

    // when all actor references have uid, i.e. actorFor is removed
    private def watchingContains(subject: ActorRef): Boolean =
      watching.contains(subject) || (subject.path.uid != ActorCell.undefinedUid &&
        watching.contains(new UndefinedUidActorRef(subject)))

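The check for whether a subject is already watched is implemented in an interesting way. watching is a Map. The check first tests whether the Map contains the ActorRef; if it does not, and the ActorRef carries a UID, it creates an UndefinedUidActorRef and tests whether watching contains that instead. Isn't that odd? If the Map does not contain the ref, how could wrapping it in an UndefinedUidActorRef suddenly produce a match? It looks strange at first, but it is not. Let's look at how ActorRef defines equals.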

    /**
     * Equals takes path and the unique id of the actor cell into account.
     */
    final override def equals(that: Any): Boolean = that match {
      case other: ActorRef => path.uid == other.path.uid && path == other.path
      case _ => false
    }

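The logic is clear: two ActorRefs are equal only if their paths are equal and their uids are equal. I will not analyze ActorPath equality further; it simply requires every level of the path to match.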

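So can the path be the same while the uid differs? Of course. If an actor is stopped and a new one is then created with the same actorOf arguments, the uid is different while the path is the same.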

    private[akka] class UndefinedUidActorRef(ref: ActorRef) extends MinimalActorRef {
      override val path = ref.path.withUid(ActorCell.undefinedUid)
      override def provider = throw new UnsupportedOperationException("UndefinedUidActorRef does not provide")
    }

UndefinedUidActorRef is simply a new ActorRef with the same path as the original but with its uid set to ActorCell.undefinedUid.
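To see why the second lookup can succeed where the first fails, here is a minimal sketch of path-plus-uid equality. `UidLookupSketch`, `Ref` and `withoutUid` are hypothetical names, and the value chosen for `UndefinedUid` is just a placeholder for this model; only the shape of `watchingContains` follows the source quoted above.

```scala
// Toy model of path + uid equality; illustrative names, not Akka's classes.
object UidLookupSketch {
  val UndefinedUid = 0 // placeholder sentinel for "no uid" in this sketch

  // A case class compares both fields, like the equals shown above.
  final case class Ref(path: String, uid: Int) {
    def withoutUid: Ref = copy(uid = UndefinedUid)
  }

  // Same shape as watchingContains: exact match first, then the uid-less key.
  def watchingContains(watching: Set[Ref], subject: Ref): Boolean =
    watching.contains(subject) ||
      (subject.uid != UndefinedUid && watching.contains(subject.withoutUid))

  def main(args: Array[String]): Unit = {
    val first = Ref("/user/parent/child", uid = 101)
    val second = Ref("/user/parent/child", uid = 202) // same name, new incarnation
    println(first == second)
    // A ref stored without a uid is still found when queried with a uid-carrying ref.
    println(watchingContains(Set(first.withoutUid), second))
  }
}
```

In this model the fallback only matters when the stored key has no uid, which is the situation the source comment ties to uid-less references.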

maintainAddressTerminatedSubscription checks whether the actor is local. For a local actor it just runs the block that follows; remote actors need some extra handling, which is not analyzed here.

    private def updateWatching(ref: InternalActorRef, newMessage: Option[Any]): Unit =
      watching = watching.updated(ref, newMessage)

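updateWatching is simple: it inserts the ActorRef to be watched into the watching Map. As for the value stored for that ActorRef, the comment on watching already answers it: None means the default Terminated notification and Some(message) a custom termination message. See watchWith for how that is used; it is not analyzed here. Next, let's look at how the watched actor reacts when it receives the Watch message.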

    case Watch(watchee, watcher) => addWatcher(watchee, watcher)

The Watch message hits the branch above in ActorCell.systemInvoke.

    protected def addWatcher(watchee: ActorRef, watcher: ActorRef): Unit = {
      val watcheeSelf = watchee == self
      val watcherSelf = watcher == self

      if (watcheeSelf && !watcherSelf) {
        if (!watchedBy.contains(watcher)) maintainAddressTerminatedSubscription(watcher) {
          watchedBy += watcher
          if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), s"now watched by $watcher"))
        }
      } else if (!watcheeSelf && watcherSelf) {
        watch(watchee)
      } else {
        publish(Warning(self.path.toString, clazz(actor), "BUG: illegal Watch(%s,%s) for %s".format(watchee, watcher, self)))
      }
    }

In the normal case the first if branch is taken. It is fairly simple: it checks whether watchedBy already holds the watcher, and if not, adds it.
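The three branches of addWatcher can be sketched as a pure decision function. `AddWatcherSketch`, `Cell` and the `Outcome` cases are hypothetical names used only to label which branch was taken; actors are reduced to plain strings.

```scala
// Toy model of the watchee-side registration; illustrative names only.
object AddWatcherSketch {
  sealed trait Outcome
  case object Registered extends Outcome        // we are the watchee: remember the watcher
  case object AlreadyRegistered extends Outcome // we are the watchee: nothing to do
  case object ForwardedToWatch extends Outcome  // we are the watcher: delegate to watch()
  case object Illegal extends Outcome           // neither or both: logged as a bug

  final class Cell(val self: String) {
    var watchedBy: Set[String] = Set.empty

    def addWatcher(watchee: String, watcher: String): Outcome = {
      val watcheeSelf = watchee == self
      val watcherSelf = watcher == self
      if (watcheeSelf && !watcherSelf) {
        if (watchedBy.contains(watcher)) AlreadyRegistered
        else { watchedBy += watcher; Registered }
      } else if (!watcheeSelf && watcherSelf) ForwardedToWatch
      else Illegal
    }
  }

  def main(args: Array[String]): Unit = {
    val child = new Cell("child")
    println(child.addWatcher("child", "parent"))
    println(child.addWatcher("child", "parent"))
    println(child.addWatcher("other", "child"))
    println(child.addWatcher("child", "child"))
  }
}
```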

    private var watchedBy: Set[ActorRef] = ActorCell.emptyActorRefSet

watchedBy is a Set, so the ActorRefs in it are unique. When this actor is stopped, at what point are the watchers in watchedBy notified? That question is actually fairly involved.

To know when watchedBy is notified, we need to understand the stop logic. So how is stop implemented in ActorCell?

    // ➡➡➡ NEVER SEND THE SAME SYSTEM MESSAGE OBJECT TO TWO ACTORS ⬅⬅⬅
    final def stop(): Unit = try dispatcher.systemDispatch(this, Terminate()) catch handleException

stop is implemented in the Dispatch trait. It is simple: it uses the current dispatcher to send a Terminate system message to the cell itself.

    case Terminate() => terminate()

On receiving the Terminate message, the cell calls the terminate method.

    protected def terminate() {
      setReceiveTimeout(Duration.Undefined)
      cancelReceiveTimeout

      // prevent Deadletter(Terminated) messages
      unwatchWatchedActors(actor)

      // stop all children, which will turn childrenRefs into TerminatingChildrenContainer (if there are children)
      children foreach stop

      if (systemImpl.aborting) {
        // separate iteration because this is a very rare case that should not penalize normal operation
        children foreach {
          case ref: ActorRefScope if !ref.isLocal => self.sendSystemMessage(DeathWatchNotification(ref, true, false))
          case _ =>
        }
      }

      val wasTerminating = isTerminating

      if (setChildrenTerminationReason(ChildrenContainer.Termination)) {
        if (!wasTerminating) {
          // do not process normal messages while waiting for all children to terminate
          suspendNonRecursive()
          // do not propagate failures during shutdown to the supervisor
          setFailed(self)
          if (system.settings.DebugLifecycle) publish(Debug(self.path.toString, clazz(actor), "stopping"))
        }
      } else {
        setTerminated()
        finishTerminate()
      }
    }

The logic of terminate is clear: it tells the child actors to stop. So how are the children stopped?

    final def stop(actor: ActorRef): Unit = {
      if (childrenRefs.getByRef(actor).isDefined) {
        @tailrec def shallDie(ref: ActorRef): Boolean = {
          val c = childrenRefs
          swapChildrenRefs(c, c.shallDie(ref)) || shallDie(ref)
        }

        if (actor match {
          case r: RepointableRef => r.isStarted
          case _ => true
        }) shallDie(actor)
      }
      actor.asInstanceOf[InternalActorRef].stop()
    }

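This is fairly simple: it checks whether the given actor is a child of the current actor; if it is and it has already started, the child is marked as dying by swapping childrenRefs via shallDie. Finally it calls stop() on the child itself, so the stop propagates recursively.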

    override def shallDie(actor: ActorRef): ChildrenContainer = TerminatingChildrenContainer(c, Set(actor), UserRequest)

shallDie simply creates a TerminatingChildrenContainer, which then replaces childrenRefs.

    @tailrec final protected def setChildrenTerminationReason(reason: ChildrenContainer.SuspendReason): Boolean = {
      childrenRefs match {
        case c: ChildrenContainer.TerminatingChildrenContainer =>
          swapChildrenRefs(c, c.copy(reason = reason)) || setChildrenTerminationReason(reason)
        case _ => false
      }
    }

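The last if statement in terminate calls setChildrenTerminationReason. When there are children, childrenRefs is already a TerminatingChildrenContainer at this point, so it returns true and the actor suspends itself while it waits for the children to terminate. When there are no children it returns false, and setTerminated and finishTerminate are called right away.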
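The "wait for the children, then finish" behaviour can be sketched as a tiny state machine. This is a simplified model under stated assumptions: `TerminateSketch`, `Cell`, `childTerminated` and `stopRequests` are hypothetical names, children are plain strings, and the container swap is reduced to a boolean flag.

```scala
// Toy model of terminate(): finish at once without children,
// otherwise wait until every child has reported its termination.
object TerminateSketch {
  final class Cell(val name: String, initialChildren: Set[String]) {
    private var children: Set[String] = initialChildren
    private var terminating = false
    var finished = false
    // Children that were asked to stop, in the order they were asked.
    var stopRequests: Vector[String] = Vector.empty

    def terminate(): Unit = {
      children.foreach { c => stopRequests = stopRequests :+ c }
      if (children.nonEmpty) terminating = true // suspended, waiting for children
      else finishTerminate()                    // nothing to wait for
    }

    // Called when a child has fully stopped.
    def childTerminated(child: String): Unit = {
      children -= child
      if (terminating && children.isEmpty) finishTerminate()
    }

    private def finishTerminate(): Unit = {
      finished = true
    }
  }

  def main(args: Array[String]): Unit = {
    val parent = new Cell("parent", Set("a", "b"))
    parent.terminate()
    println(parent.finished) // still waiting for "a" and "b"
    parent.childTerminated("a")
    parent.childTerminated("b")
    println(parent.finished)
  }
}
```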

    private def finishTerminate() {
      val a = actor
      /* The following order is crucial for things to work properly. Only change this if you're very confident and lucky.
       *
       * Please note that if a parent is also a watcher then ChildTerminated and Terminated must be processed in this
       * specific order.
       */
      try if (a ne null) a.aroundPostStop()
      catch handleNonFatalOrInterruptedException { e => publish(Error(e, self.path.toString, clazz(a), e.getMessage)) }
      finally try dispatcher.detach(this)
      finally try parent.sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))
      finally try stopFunctionRefs()
      finally try tellWatchersWeDied()
      finally try unwatchWatchedActors(a) // stay here as we expect an emergency stop from handleInvokeFailure
      finally {
        if (system.settings.DebugLifecycle)
          publish(Debug(self.path.toString, clazz(a), "stopped"))

        clearActorFields(a, recreate = false)
        clearActorCellFields(this)
        actor = null
      }
    }

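So finishTerminate is eventually called: immediately if there are no children to wait for, otherwise once the children have terminated. Inside finishTerminate, tellWatchersWeDied is invoked.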

    protected def tellWatchersWeDied(): Unit =
      if (!watchedBy.isEmpty) {
        try {
          // Don't need to send to parent parent since it receives a DWN by default
          def sendTerminated(ifLocal: Boolean)(watcher: ActorRef): Unit =
            if (watcher.asInstanceOf[ActorRefScope].isLocal == ifLocal && watcher != parent)
              watcher.asInstanceOf[InternalActorRef].sendSystemMessage(DeathWatchNotification(self, existenceConfirmed = true, addressTerminated = false))

          /*
           * It is important to notify the remote watchers first, otherwise RemoteDaemon might shut down, causing
           * the remoting to shut down as well. At this point Terminated messages to remote watchers are no longer
           * deliverable.
           *
           * The problematic case is:
           *  1. Terminated is sent to RemoteDaemon
           *   1a. RemoteDaemon is fast enough to notify the terminator actor in RemoteActorRefProvider
           *   1b. The terminator is fast enough to enqueue the shutdown command in the remoting
           *  2. Only at this point is the Terminated (to be sent remotely) enqueued in the mailbox of remoting
           *
           * If the remote watchers are notified first, then the mailbox of the Remoting will guarantee the correct order.
           */
          watchedBy foreach sendTerminated(ifLocal = false)
          watchedBy foreach sendTerminated(ifLocal = true)
        } finally {
          maintainAddressTerminatedSubscription() {
            watchedBy = ActorCell.emptyActorRefSet
          }
        }
      }

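What does tellWatchersWeDied do? It sends a DeathWatchNotification message to every ActorRef in watchedBy except the parent, which already receives one from finishTerminate; remote watchers are notified first, then local ones. Note that the first argument of DeathWatchNotification is self, i.e. the actor being stopped.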
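The notification order can be expressed as a small pure function. A minimal sketch, assuming watchers are reduced to a name plus an `isLocal` flag; `NotifyOrderSketch`, `Watcher` and `notificationOrder` are hypothetical names for illustration.

```scala
// Toy model of the order in which watchers are notified.
object NotifyOrderSketch {
  final case class Watcher(name: String, isLocal: Boolean)

  // Remote watchers first, then local ones; the parent is skipped because
  // it gets its own notification elsewhere.
  def notificationOrder(watchedBy: Seq[Watcher], parent: Watcher): Seq[Watcher] = {
    val others = watchedBy.filterNot(_ == parent)
    others.filterNot(_.isLocal) ++ others.filter(_.isLocal)
  }

  def main(args: Array[String]): Unit = {
    val parent = Watcher("parent", isLocal = true)
    val watchers = Seq(
      Watcher("localA", isLocal = true),
      Watcher("remoteB", isLocal = false),
      parent,
      Watcher("remoteC", isLocal = false))
    notificationOrder(watchers, parent).foreach(w => println(w.name))
  }
}
```

Two passes over the same collection, one per locality, mirror the two `watchedBy foreach` calls in the source and keep each group in its original relative order.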

    case DeathWatchNotification(a, ec, at) => watchedActorTerminated(a, ec, at)

And how does the watcher react when it receives the DeathWatchNotification? It hits the branch above and calls watchedActorTerminated.

    /**
     * When this actor is watching the subject of [[akka.actor.Terminated]] message
     * it will be propagated to user's receive.
     */
    protected def watchedActorTerminated(actor: ActorRef, existenceConfirmed: Boolean, addressTerminated: Boolean): Unit = {
      watchingGet(actor) match {
        case None => // We're apparently no longer watching this actor.
        case Some(optionalMessage) =>
          maintainAddressTerminatedSubscription(actor) {
            watching = removeFromMap(actor, watching)
          }
          if (!isTerminating) {
            self.tell(optionalMessage.getOrElse(Terminated(actor)(existenceConfirmed, addressTerminated)), actor)
            terminatedQueuedFor(actor)
          }
      }
      if (childrenRefs.getByRef(actor).isDefined) handleChildTerminated(actor)
    }

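Clearly, when the current actor is in a normal state and is still watching the given actor, watchedActorTerminated sends itself either Terminated(actor) or, if a custom message was stored for that actor, the custom message. That is how the watcher ends up receiving the termination message for the watched actor.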
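The choice between the default and the custom message comes down to a single `getOrElse`. The following sketch isolates that decision; `TerminationMessageSketch` and `onDeathWatchNotification` are hypothetical names, `Terminated` here is a local stand-in rather than akka.actor.Terminated, and actors are plain strings.

```scala
// Toy model of how the watcher turns a death notification into a user message.
object TerminationMessageSketch {
  final case class Terminated(actor: String) // local stand-in, not Akka's class

  // Returns the updated watching map and the message delivered to self, if any.
  def onDeathWatchNotification(
      watching: Map[String, Option[Any]],
      actor: String,
      isTerminating: Boolean): (Map[String, Option[Any]], Option[Any]) =
    watching.get(actor) match {
      case None => (watching, None) // no longer watching: nothing is delivered
      case Some(optionalMessage) =>
        val delivered =
          if (isTerminating) None // a stopping watcher does not notify itself
          else Some(optionalMessage.getOrElse(Terminated(actor)))
        (watching - actor, delivered)
    }

  def main(args: Array[String]): Unit = {
    val watching = Map[String, Option[Any]]("a" -> None, "b" -> Some("b is done"))
    println(onDeathWatchNotification(watching, "a", isTerminating = false)._2)
    println(onDeathWatchNotification(watching, "b", isTerminating = false)._2)
    println(onDeathWatchNotification(watching, "c", isTerminating = false)._2)
  }
}
```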

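In short, leaving aside the maintenance of child-actor state and other complications: the watcher records which actors it watches, the watched actor records which actors watch it, and at the very last moment of stopping the watched actor notifies its watchers, each of which turns that notification into a Terminated message for itself. There is also the remote case, which is more complicated and will be analyzed later.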
