Part 1: not disposing a subscription

Judging by the number of talks, articles and discussions related to reactive programming in Swift, it looks like the community has been taken by storm. It's not that the concept of reactiveness itself is a shiny new thing. The idea of using it for development within the Apple ecosystem has been played with for a long time. Frameworks like ReactiveCocoa have existed for years and did an awesome job of bringing reactive programming to Objective-C. However, the new and exciting features of Swift make it even more convenient to go all in on the "signals as your apps’ building blocks" model.

Here at Polidea, we’ve also embraced the reactive paradigm, mostly in the form of RxSwift, the port of C#-originated Reactive Extensions. And we couldn’t be happier! It helps us build more expressive and better-architected apps faster and easier. Unifying various patterns (target-action, completion block, notification) under a universal API that is easy to use, easy to compose and easy to test has so many benefits. Also, introducing new team members is way easier now that so much logic is written with methods familiar either from sequences (map, filter, zip, flatMap) or from other languages that Reactive Extensions have been ported to.

The process of learning RxSwift, however, hasn’t been painless. We’ve made many mistakes, fallen into many traps and eventually arrived at the other end to share what we’ve learned along the way. This is what this series is about: showing you the most common pitfalls to avoid when going reactive. They all come from the everyday practical use of RxSwift in non-trivial applications. It took us many hours to learn our lessons and we hope that with our help it’s going to take you only a few minutes to enjoy the benefits of reactive programming without ever encountering its dark side.

So, let’s start!

Not disposing a subscription

When you started using RxSwift for the first time, you've probably tried to observe some events by writing:
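A call of that kind might look like the sketch below. The `taps` subject and the printed message are illustrative assumptions, chosen only to keep the example short.

```swift
import RxSwift

// Hypothetical event source; any Observable would behave the same way here.
let taps = PublishSubject<Int>()

// The Disposable returned by subscribe is dropped on the floor,
// which is what the compiler complains about.
taps.subscribe(onNext: { value in
    print("received \(value)")
})

taps.onNext(1) // prints "received 1"
```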

Such an expression was, however, openly criticized by Xcode with the default "Result of call to 'subscribe' is unused" warning. Luckily, there's an easy fix available just around the corner. Telling the compiler that we ignore the call result with _ = would be enough, right? So now it's:
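With the same hypothetical `taps` subject as an assumed event source, the silenced version would read roughly like this:

```swift
import RxSwift

// Hypothetical event source, as before.
let taps = PublishSubject<Int>()

// Compiles without a warning, but the Disposable is thrown away for good:
// there is no longer any handle for ending this subscription from the outside.
_ = taps.subscribe(onNext: { value in
    print("received \(value)")
})
```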

and everything is fixed, isn't it? If you think so, prepare yourself for a treat. There is probably a whole lot of low-hanging fruit in the form of undisposed subscriptions, just waiting to be picked from your memory-management tree. Ignoring the subscription’s result is a clear path to memory leaks. While there are situations in which you'll be spared any problems, in the worst-case scenario both your observable and the observer closure will never be released. The bad news is that by ignoring the value returned from the subscribe method you're giving up control over which scenario is going to happen.

To understand the problem, I'll show you the mental model of the subscription process in terms of memory-management first. Then, I'll derive the best practices. Finally, I'm going to peek into RxSwift source code to understand what is actually happening in the current (v3.X/4.0) implementation and how it relates to the mental model presented earlier.

The mental model for subscription memory-management

Calling subscribe creates a reference cycle that retains both the observable and the observer. Neither of them is going to be released unless the cycle is broken, and it’s broken only in two situations:

  • when the observable sequence completes, either with .completed or .error event,
  • when someone explicitly calls .dispose() on the reference cycle manager returned by the subscribe method.

The details may vary, but the basic idea of what it means to subscribe holds regardless of your particular observable, observer or subscription. The crucial thing to spot is that ignoring the reference cycle manager, aka the disposable, strips you of the possibility to break the reference cycle yourself. It is your gateway drug into the memory arrangement, and once it's not available, there is no going back. If you use the _ = syntax, you basically state that the only way for the observable and observer to be released is by completing the observable sequence.

This might sometimes be exactly what you want! For example, if you're calling Observable.just, it doesn't really matter that you won't ensure breaking the cycle. The single element is being emitted instantaneously, followed by .completed event. There are, however, many situations in which you might not be entirely sure of the completion possibilities for observable in question:

  • you're given the Observable from another object and the documentation doesn't state whether it completes,
  • you're given the Observable from another object and the documentation does state it completes, but there have been some changes in the internal implementation of that object along the way and no one remembered to update documentation,
  • the Observable is explicitly not completing (examples include Variable, Observable.interval, subjects),
  • there is an error in observable implementation, such as forgetting to send .completed event in Observable.create closure.
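The last case is easy to reproduce. The following is a minimal sketch, assuming a hypothetical `loadValue` observable that was meant to be one-shot but never signals completion:

```swift
import RxSwift

// Hypothetical one-shot observable with a bug: it never sends .completed.
let loadValue = Observable<Int>.create { observer in
    observer.onNext(42)
    // Missing: observer.onCompleted()
    return Disposables.create()
}

var didComplete = false
_ = loadValue.subscribe(
    onNext: { print("value: \($0)") },
    onCompleted: { didComplete = true }
)
// didComplete is still false at this point, so nothing has told the
// subscription to tear itself down, and the disposable has been discarded.
```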

Since you're rarely in control of all the observables in your app, and even then there's a possibility for a mistake, the rule of thumb is to make sure yourself that the reference cycle will be broken. Either keep the reference to the disposable and call the .dispose() method when the time comes, or use a handy helper like DisposeBag that's gonna do it for you. You might also provide a separate cycle-breaking observable with the .takeUntil operator. Which way to choose depends on your particular situation, but always remember that:
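The three options can be put side by side in a short sketch. The `events` and `stop` subjects and the `log` array are assumptions made for illustration; in an app the DisposeBag would usually be a property of the owning object rather than an Optional variable.

```swift
import RxSwift

let events = PublishSubject<Int>()   // hypothetical event source
var log: [String] = []

// 1. Keep the disposable and call dispose() when the time comes.
let manual = events.subscribe(onNext: { log.append("manual \($0)") })

// 2. Hand the disposable to a DisposeBag, which disposes of its contents
//    when the bag itself is deallocated.
var bag: DisposeBag? = DisposeBag()
events.subscribe(onNext: { log.append("bag \($0)") }).disposed(by: bag!)

// 3. Complete the sequence from the outside with a separate "stop" observable.
//    The operator is spelled takeUntil(_:) in RxSwift 3 to 5; RxSwift 6 renamed
//    it to take(until:) and keeps the old spelling as deprecated.
let stop = PublishSubject<Void>()
_ = events.takeUntil(stop).subscribe(onNext: { log.append("until \($0)") })

events.onNext(1)   // all three subscriptions receive the element

manual.dispose()   // option 1
bag = nil          // option 2: deallocating the bag disposes of its contents
stop.onNext(())    // option 3: the takeUntil sequence completes

events.onNext(2)   // no subscription is left to receive this
```

After the last line `log` contains exactly three entries, one per subscription, all for the first element.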

Subscription creates a reference cycle between the observable and the observer. It might be broken implicitly, when the observable completes, or explicitly, via a .dispose() call. If you're not 100% sure when or whether the observable will complete, break the subscription reference cycle yourself!

Now that we've cleared things up, I feel like I owe you a little bit of explanation. The mental model I've drawn above is, well, a mental model, and therefore not strictly correct. What's happening in the current RxSwift implementation (version 3.x/4.x at the time of writing) is a little bit more complicated. To understand the actual behavior, let us have a deeper dive into the RxSwift internals.

The implementation of the subscribe method

Where is the subscribe method implemented? The first place to search would be, unsurprisingly, the ObservableType.swift file. It contains the declaration of the subscribe method as a part of the ObservableType protocol:

What implements this protocol? Basically, all the various types of observables. Let's concentrate on the major implementation called Observable, since it's a base class for all but one of the observables defined in RxSwift. Its version of subscribe method is short and simple:

Oh, the abstract method. We need to look into the Observable subclasses then. A quick search reveals that there are 14 different overridden subscribe methods within the RxSwift source code at the time of writing. We can put each of them in one of three buckets:

  • implementations in subjects, which provide their own subscription logic due to the extraordinary place they occupy in the RxSwift lore,
  • implementations in connectable observables, which must deal with subscriptions in a special way due to their ability of multicasting,
  • implementation in Producer, a subclass of Observable which provides the subscription logic for most of the operators you've grown to love and use.

Let's concentrate on the Producer type, since it represents the variant of observable that is simplest to reason about: the emitter of the sequence of events, from a single source to a single recipient. It's definitely the most common use case. Almost all the operators are derived from the Producer base class. While a few of them provide dedicated subscription logic that's optimized further for their particular needs (see Just, Empty or Error for basic examples), the vast majority use the following implementation of subscribe from Producer (some scheduler-related logic was stripped for better readability):

So, what's happening here? First, the observable creates a SinkDisposer object. Then it uses the SinkDisposer instance to create two additional objects: sink and subscription. They both have the same type: Disposable, which is a protocol exposing a single dispose method. These two objects are being passed back to SinkDisposer via a setter method, which suggests, correctly, that their references will be kept. After all that setup is done, the SinkDisposer is being returned. So, when we're calling .dispose() on the object returned from the subscribe method to break the subscription, we're actually calling it on the SinkDisposer instance.
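The steps above can be sketched with a few toy types. This is a simplified model, not the RxSwift source: every `Toy…` name is a stand-in invented for illustration, the observer is reduced to a plain closure, and all scheduler logic is left out.

```swift
// Toy stand-ins that mirror the shape of the steps described above.
// None of these are the real RxSwift types.
protocol ToyDisposable: AnyObject {
    func dispose()
}

final class ToyNopDisposable: ToyDisposable {
    func dispose() {}
}

final class ToySinkDisposer: ToyDisposable {
    private var sink: ToyDisposable?
    private var subscription: ToyDisposable?

    func setSinkAndSubscription(sink: ToyDisposable, subscription: ToyDisposable) {
        self.sink = sink
        self.subscription = subscription
    }

    func dispose() {
        sink?.dispose()
        subscription?.dispose()
        sink = nil
        subscription = nil
    }
}

class ToyProducer<Element> {
    // 1. Create the disposer. 2. Let run() build the sink and the subscription.
    // 3. Store both in the disposer. 4. Return the disposer to the caller.
    func subscribe(_ observer: @escaping (Element) -> Void) -> ToyDisposable {
        let disposer = ToySinkDisposer()
        let sinkAndSubscription = run(observer, cancel: disposer)
        disposer.setSinkAndSubscription(sink: sinkAndSubscription.sink,
                                        subscription: sinkAndSubscription.subscription)
        return disposer
    }

    // Abstract in spirit: each concrete producer supplies its own run().
    func run(_ observer: @escaping (Element) -> Void,
             cancel: ToySinkDisposer) -> (sink: ToyDisposable, subscription: ToyDisposable) {
        fatalError("subclasses must override run")
    }
}

// A trivial producer that emits one element at subscription time.
final class ToyJust<Element>: ToyProducer<Element> {
    private let element: Element
    init(_ element: Element) { self.element = element }

    override func run(_ observer: @escaping (Element) -> Void,
                      cancel: ToySinkDisposer) -> (sink: ToyDisposable, subscription: ToyDisposable) {
        observer(element)
        return (sink: ToyNopDisposable(), subscription: ToyNopDisposable())
    }
}

var received: [Int] = []
let disposable = ToyJust(7).subscribe { received.append($0) }
disposable.dispose()   // what the caller holds is the disposer, nothing else
```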

So far, so good. One mystery down, still a few to go. Let's dive into two crucial steps performed here: let sinkAndSubscription = run(observer, cancel: disposer) and disposer.setSinkAndSubscription(sink: sinkAndSubscription.sink, subscription: sinkAndSubscription.subscription) methods. They are, as you'll see, the essential parts of creating the reference cycle that keeps the subscription alive.

Sinking in the sea of Observables

The run method is provided by the Producer, but only in an abstract variant:

The actual logic is specific to the particular Producer subclass. Before we check them, it's crucial to understand the pattern that is very common across the RxSwift operators implementation: sink. This is the way that RxSwift deals with the complexity of observable streams and how it separates the creation of the observable from the logic that is being run the moment you subscribe to it.

The idea is simple: when you use a particular operator (say you map the existing observable), it returns an instance of a particular observable type dedicated to the task at hand. So calling Observable.just(1) gives you back an instance of the Just class, which is a subclass of the Producer optimized for returning just one element and then completing. When you call Observable<Int>.just(1).map { $0 == 42 }, you're being given back an instance of the Map class, which is a subclass of the Producer optimized for applying the closure to each element in the .next event. However, at the very moment you create an observable, there's nothing being actually sent to anyone yet, because no one has subscribed. The actual work of passing the events starts during the subscribe method, more precisely: in the run method that we're so interested in.

That’s where the sink pattern shines. Each observable type has its own dedicated Sink subclass. For the interval operator, represented by the Timer observable, there is the TimerSink. For the flatMap operator, represented by the FlatMap observable, there is the FlatMapSink. For the catchErrorJustReturn operator, represented by the Catch observable, there is the CatchSink. I think you get the idea!

But what is this Sink object, exactly? It is the place that stores the actual operator logic. So, for the interval, the TimerSink is the place that schedules sending events after each period and keeps track of the internal state (i.e. how many events were already sent). For the flatMap, the FlatMapSink (and its superclass, MergeSink) is the place that subscribes to the observables returned from flatmapping closure, keeps track of them and passes their events further. You may basically think of a Sink as a wrapper for the observer. It listens for the events from observable, applies the operator-related logic and then passes those transformed events further down the stream.
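As a rough illustration of the "wrapper for the observer" idea, here is a toy sink for a map-like operator. `ToyMapSink` is a name invented for this sketch and models only .next events with plain closures; the real Sink classes are considerably richer.

```swift
// Toy illustration only, not an RxSwift type.
final class ToyMapSink<Input, Output> {
    private let transform: (Input) -> Output
    private let observer: (Output) -> Void   // the wrapped downstream observer

    init(transform: @escaping (Input) -> Output,
         observer: @escaping (Output) -> Void) {
        self.transform = transform
        self.observer = observer
    }

    // Called for every upstream element: apply the operator logic,
    // then pass the transformed element further down the stream.
    func on(_ element: Input) {
        observer(transform(element))
    }
}

var output: [String] = []
let sink = ToyMapSink<Int, String>(
    transform: { "item \($0)" },
    observer: { output.append($0) }
)
for element in [1, 2, 3] {
    sink.on(element)
}
// output is now ["item 1", "item 2", "item 3"]
```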

This is how RxSwift isolates the creation of observables from the execution of subscription logic for Producer-based observables. The former is encapsulated in the Observable subclass, the latter is provided by the Sink subclass. The separation of responsibilities greatly simplifies the actual objects’ implementations and makes it possible to write multiple variants of Sink optimized for different scenarios.

Sink full of knowledge

Now that we know what the sink pattern is, let's go back to the run method. Each of these Producer subclasses provides its own run implementation. While details may vary, it usually can be abstracted into three steps:

  • create a sink object as an instance of a class that derives from Sink type,
  • create a subscription instance, usually by running sink.run method,
  • return both instances wrapped in a tuple.

To clarify things further, please look at the FlatMap.run example:
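The general shape of such a run method can also be shown with toy stand-ins. The sketch below is not the FlatMap source: it uses a simpler map-like operator, an array as an assumed synchronous event source, and invented `Toy…` names throughout.

```swift
// Toy stand-ins, not RxSwift types.
protocol ToyDisposable: AnyObject {
    func dispose()
}

final class ToyNopDisposable: ToyDisposable {
    func dispose() {}
}

// Plays the role of the SinkDisposer that is passed in as `cancel`.
final class ToyCancel: ToyDisposable {
    private(set) var isDisposed = false
    func dispose() { isDisposed = true }
}

final class ToyMapSink<Input, Output>: ToyDisposable {
    private let transform: (Input) -> Output
    private let observer: (Output) -> Void
    private let cancel: ToyCancel

    init(transform: @escaping (Input) -> Output,
         observer: @escaping (Output) -> Void,
         cancel: ToyCancel) {
        self.transform = transform
        self.observer = observer
        self.cancel = cancel
    }

    func on(_ element: Input) {
        guard !cancel.isDisposed else { return }
        observer(transform(element))
    }

    // "Subscribes" to the source and returns the object that would undo it.
    func run(_ source: [Input]) -> ToyDisposable {
        for element in source { on(element) }
        return ToyNopDisposable()
    }

    func dispose() {}
}

struct ToyMap<Input, Output> {
    let source: [Input]
    let transform: (Input) -> Output

    func run(_ observer: @escaping (Output) -> Void,
             cancel: ToyCancel) -> (sink: ToyDisposable, subscription: ToyDisposable) {
        // 1. Create the sink, handing it everything it needs to do the job.
        let sink = ToyMapSink(transform: transform, observer: observer, cancel: cancel)
        // 2. Create the subscription by running the sink against the source.
        let subscription = sink.run(source)
        // 3. Return both instances wrapped in a tuple.
        return (sink: sink, subscription: subscription)
    }
}

var output: [Int] = []
let parts = ToyMap(source: [1, 2, 3], transform: { $0 * 10 })
    .run({ output.append($0) }, cancel: ToyCancel())
parts.subscription.dispose()
// output is now [10, 20, 30]
```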

The most important thing from the memory-management perspective is that in the moment of subscription the sink is given everything that's needed to do the job:

  • the events source (aka Observable),
  • the event recipient (called observer),
  • the operator-related data (for example, the flatmapping closure),
  • and the SinkDisposer instance (under the name cancel).

sink is free to store as many of these references as it sees fit for providing the required behavior of the operator. At the minimum, it's gonna store the observer and, what's gonna be crucial later, the SinkDisposer. Possibly more! Looking at the memory graph, sink quickly becomes the Northern Star in the constellation of objects related to the subscription.

There is, however, one more object returned from observable's run method. It's subscription. This is the object that takes care of the logic that should be run when the subscription is being disposed of. Remember create operator? It takes a closure that returns Disposable, an object responsible for performing the cleanup. This is the same Disposable that's returned from AnonymousObservableSink's run method as subscription. For each operator there might be some tasks to cancel, some resources to free, some internal subscription to dispose of. They're all enclosed in the subscription object, and the ability to perform the cleanup is exposed via subscription.dispose method.
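A minimal sketch of that cleanup hook, assuming a hypothetical `ticker` observable and a `cleanedUp` flag that stands in for real resource handling:

```swift
import RxSwift

var cleanedUp = false

// Hypothetical source; the closure's return value is the cleanup hook.
let ticker = Observable<Int>.create { observer in
    observer.onNext(0)
    return Disposables.create {
        // Cancel tasks, free resources, dispose of inner subscriptions, and so on.
        cleanedUp = true
    }
}

let disposable = ticker.subscribe(onNext: { print("tick \($0)") })
disposable.dispose()   // runs the cleanup closure above
```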

The Producer's reference cycle: Sink and SinkDisposer

Knowing that, let's get back to the last component of the subscribe method implementation. Before the SinkDisposer is returned, the setSinkAndSubscription method is called. It does exactly what you might expect: the sink and subscription objects are passed via the setter and kept in the SinkDisposer properties. They are referenced strongly, but wrapped into Optionals, which makes it possible to set the references to nil later.

Have you already spotted the reference cycle from our mental model? It's hidden in plain sight! sink stores the reference to SinkDisposer, and SinkDisposer stores the reference to sink. That's why the subscription doesn't release itself on the scope exit. Two objects keep each other alive, in an eternal hug of memory-lockup, until the end of the app. And since sink keeps SinkDisposer as a non-Optional property, the one and only way of breaking the cycle is by asking the SinkDisposer to set the sink Optional reference to nil. And guess what? This is exactly what's happening in the SinkDisposer.dispose method. It calls dispose on sink, then it calls dispose on subscription and then it nils out references to break the retain cycle. So for the Producer-based observables, the SinkDisposer is the reference cycle manager from the mental model that we've introduced earlier.

After all those details, you might wonder how come the reference cycle breaks itself when observable completes? Well, we've just stated that it requires SinkDisposer.dispose() method, so the answer is simple. The central point of subscription process, sink object, keeps the reference to SinkDisposer and also receives all the events from the observable. So once it gets either .completed or .error event and once its own logic determines that this is the sequence completion, it simply calls dispose method on its SinkDisposer reference. This way the cycle is being broken from the inside.
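Both ways of breaking the cycle can be observed in a toy model that uses weak references as probes. The `Toy…` types below are invented for this sketch and keep only the two references that form the cycle; the subscription object and the calls to dispose on sink and subscription are left out.

```swift
// Toy model of the cycle; these are not the real RxSwift classes.
enum ToyEvent {
    case next(Int)
    case completed
}

final class ToySinkDisposer {
    private var sink: ToySink?            // strong but Optional, so it can be set to nil

    func setSink(_ sink: ToySink) {
        self.sink = sink
    }

    func dispose() {
        sink = nil                        // breaks the cycle
    }
}

final class ToySink {
    private let observer: (ToyEvent) -> Void
    private let cancel: ToySinkDisposer   // strong, non-Optional back-reference

    init(observer: @escaping (ToyEvent) -> Void, cancel: ToySinkDisposer) {
        self.observer = observer
        self.cancel = cancel
    }

    func on(_ event: ToyEvent) {
        observer(event)
        if case .completed = event {
            cancel.dispose()              // the cycle is broken from the inside
        }
    }
}

// Weak probes: they observe the objects without keeping them alive.
struct WeakRefs {
    weak var sink: ToySink?
    weak var disposer: ToySinkDisposer?
}

// Mimics a caller that keeps no strong reference to the disposable.
func subscribeAndForget() -> WeakRefs {
    let disposer = ToySinkDisposer()
    let sink = ToySink(observer: { _ in }, cancel: disposer)
    disposer.setSink(sink)
    return WeakRefs(sink: sink, disposer: disposer)
}

// Explicit disposal.
let first = subscribeAndForget()
let aliveBeforeDispose = first.sink != nil       // true: the pair keeps itself alive
first.disposer?.dispose()
let releasedAfterDispose = first.sink == nil && first.disposer == nil      // true

// Completion of the sequence.
let second = subscribeAndForget()
second.sink?.on(.completed)
let releasedAfterCompletion = second.sink == nil && second.disposer == nil // true
```

If neither `dispose()` nor a `.completed` event ever arrives, the weak probes stay non-nil for as long as the process runs, which is the leak described above.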

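To summarize the process, this is the actual reference cycle in the usual Producer-based observable subscription: the SinkDisposer holds strong references to both the sink and the subscription, while the sink holds strong references to the observer and back to the SinkDisposer.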

The road goes ever on and on

Aren't you curious what happens in non-Producer-based cases, such as subjects or connectable observables? The concept is very similar. There is always a reference cycle that's controlled by some kind of reference cycle manager and there is always a way of breaking this cycle by dispose method invocation. I encourage you to dive into RxSwift source code and see for yourself!

Now it is clear where the mental model comes from. The details of a particular subscription vary, and each observable type has specific optimizations applied for better performance and cleaner architecture. However, the basic idea prevails: there's a reference cycle and the only way of breaking this cycle is either by completing the observable or through the reference cycle manager.

Relying on the completion of the observable, while useful in many real-life situations, should always be a road taken with much care and deliberation. If you're not sure of how to handle the subscription's memory management, or you simply want your code to be more resilient to the future changes, it's always best to default to supplying a mechanism of breaking the reference cycle explicitly.

That's all for this time. More ways to shoot yourself in the foot with RxSwift are coming. Next time we're going to look at memory management from a different perspective, focusing not on the subscription process, but on what's being passed to operators. Until then, don't forget to follow Polidea on Twitter for more mobile development related posts!

https://www.polidea.com/blog/8-Mistakes-to-Avoid-while-Using-RxSwiftPart-1/
