How Blink works

Author: haraken@

Last update: 2018 Aug 14

Status: PUBLIC

Working on Blink is not easy. It's not easy for new Blink developers because there are a lot of Blink-specific concepts and coding conventions that have been introduced to implement a very fast rendering engine. It's not easy even for experienced Blink developers because Blink is huge and extremely sensitive to performance, memory and security.

This document aims at providing a 10k foot overview of "how Blink works", which I hope will help Blink developers get familiar with the architecture quickly:

The document is NOT a thorough tutorial of Blink's detailed architectures and coding rules (which are likely to change and be outdated). Rather the document concisely describes Blink's fundamentals that are not likely to change in short term and points out resources you can read if you want to learn more.
The document does NOT explain specific features (e.g., ServiceWorkers, editing). Rather the document explains fundamental features used by a broad range of the code base (e.g., memory management, V8 APIs).

For more general information about Blink's development, see the Chromium wiki page.

What Blink does

Process / thread architecture

Processes

Threads

Initialization of Blink

Directory structure

Content public APIs and Blink public APIs

Directory structure and dependencies

WTF

Memory management

Task scheduling

Page, Frame, Document, DOMWindow etc

Concepts

OOPIF

Detached Frame / Document

Web IDL bindings

V8 and Blink

Isolate, Context, World

What Blink does

Blink is a rendering engine of the web platform. Roughly speaking, Blink implements everything that renders content inside a browser tab:

Implement the specs of the web platform (e.g., HTML standard), including DOM, CSS and Web IDL
Embed V8 and run JavaScript
Request resources from the underlying network stack
Build DOM trees
Calculate style and layout
Embed Chrome Compositor and draw graphics

Blink is embedded by many customers such as Chromium, Android WebView and Opera via content public APIs.

From the code base perspective, "Blink" normally means //third_party/blink/. From the project perspective, "Blink" normally means projects that implement web platform features. Code that implement web platform features span //third_party/blink/, //content/renderer/, //content/browser/ and other places.

Process / thread architecture

Processes

Chromium has a multi-process architecture. Chromium has one browser process and N sandboxed renderer processes. Blink runs in a renderer process.

How many renderer processes are created? For security reasons, it is important to isolate memory address regions between cross-site documents (this is called Site Isolation). Conceptually each renderer process should be dedicated to at most one site. Realistically, however, it's sometimes too heavy to limit each renderer process to a single site when users open too many tabs or the device does not have enough RAM. Then a renderer process may be shared by multiple iframes or tabs loaded from different sites. This means that iframes in one tab may be hosted by different renderer processes and that iframes in different tabs may be hosted by the same renderer process. There is no 1:1 mapping between renderer processes, iframes and tabs.

Given that a renderer process runs in a sandbox, Blink needs to ask the browser process to dispatch system calls (e.g., file access, play audio) and access user profile data (e.g., cookie, passwords). This browser-renderer process communication is realized by Mojo. (Note: In the past we were using Chromium IPC and a bunch of places are still using it. However, it's deprecated and uses Mojo under the hood.) On the Chromium side, Servicification is ongoing and abstracting the browser process as a set of "service"s. From the Blink perspective, Blink can just use Mojo to interact with the services and the browser process.

If you want to learn more:

Multi-process Architecture
Mojo programming in Blink: platform/mojo/MojoProgrammingInBlink.md

Threads

How many threads are created in a renderer process?

Blink has one main thread, N worker threads and a couple of internal threads.

Almost all important things happen on the main thread. All JavaScript (except workers), DOM, CSS, style and layout calculations run on the main thread. Blink is highly optimized to maximize the performance of the main thread, assuming the mostly single-threaded architecture.

Blink may create multiple worker threads to run Web Workers, ServiceWorker and Worklets.

Blink and V8 may create a couple of internal threads to handle webaudio, database, GC etc.

For cross-thread communications, you have to use message passing using PostTask APIs. Shared memory programming is discouraged except a couple of places that really need to use it for performance reasons. This is why you don't see many MutexLocks in the Blink code base.

If you want to learn more:

Thread programming in Blink: platform/wtf/ThreadProgrammingInBlink.md
Workers: core/workers/README.md

Initialization & finalization of Blink

Blink is initialized by BlinkInitializer::Initialize(). This method must be called before executing any Blink code.

On the other hand, Blink is never finalized; i.e., the renderer process is forcibly exited without being cleaned up. One reason is performance. The other reason is in general it's really hard to clean up everything in the renderer process in a gracefully ordered manner (and it's not worth the effort).

Directory structure

Content public APIs and Blink public APIs

Content public APIs are the API layer that enables embedders to embed the rendering engine. Content public APIs must be carefully maintained because they are exposed to embedders.

Blink public APIs are the API layer that exposes functionalities from //third_party/blink/ to Chromium. This API layer is just historical artifact inherited from WebKit. In the WebKit era, Chromium and Safari shared the implementation of WebKit, so the API layer was needed to expose functionalities from WebKit to Chromium and Safari. Now that Chromium is the only embedder of //third_party/blink/, the API layer does not make sense. We're actively decreasing # of Blink public APIs by moving web-platform code from Chromium to Blink (the project is called Onion Soup).

Directory structure and dependencies

//third_party/blink/ has the following directories. See this document for a more detailed definition of these directories:

platform/

○ A collection of lower level features of Blink that are factored out of a monolithic core/. e.g., geometry and graphics utils.

core/ and modules/

○ The implementation of all web-platform features defined in the specs. core/ implements features tightly coupled with DOM. modules/ implements more self-contained features. e.g., webaudio, indexeddb.

bindings/core/ and bindings/modules/

○ Conceptually bindings/core/ is part of core/, and bindings/modules/ is part of modules/. Files that heavily use V8 APIs are put in bindings/{core,modules}.

controller/

○ A set of high-level libraries that use core/ and modules/. e.g., devtools front-end.

Dependencies flow in the following order:

Chromium => controller/ => modules/ and bindings/modules/ => core/ and bindings/core/ => platform/ => low-level primitives such as //base, //v8 and //cc

Blink carefully maintains the list of low-level primitives exposed to //third_party/blink/.

If you want to learn more:

Directory structure and dependencies: blink/renderer/README.md

WTF

WTF is a "Blink-specific base" library and located at platform/wtf/. We are trying to unify coding primitives between Chromium and Blink as much as possible, so WTF should be small. This library is needed because there are a number of types, containers and macros that really need to be optimized for Blink's workload and Oilpan (Blink GC). If types are defined in WTF, Blink has to use the WTF types instead of types defined in //base or std libraries. The most popular ones are vectors, hashsets, hashmaps and strings. Blink should use WTF::Vector, WTF::HashSet, WTF::HashMap, WTF::String and WTF::AtomicString instead of std::vector, std::*set, std::*map and std::string.

If you want to learn more:

How to use WTF: platform/wtf/README.md

Memory management

As far as Blink is concerned, you need to care about three memory allocators:

PartitionAlloc
Oilpan (a.k.a. Blink GC)
malloc (discouraged)

You can allocate an object on PartitionAlloc's heap by using USING_FAST_MALLOC():

class SomeObject {

USING_FAST_MALLOC(SomeObject);

static std::unique_ptr<SomeObject> Create() {

return std::make_unique<SomeObject>(); // Allocated on PartitionAlloc's heap.

}

};

The lifetime of objects allocated by PartitionAlloc should be managed by scoped_refptr<> or std::unique_ptr<>. It is strongly discouraged to manage the lifetime manually. Manual delete is banned in Blink.

You can allocate an object on Oilpan's heap by using GarbageCollected:

class SomeObject : public GarbageCollected<SomeObject> {

static SomeObject* Create() {

return new SomeObject; // Allocated on Oilpan's heap.

}

};

The lifetime of objects allocated by Oilpan is automatically managed by garbage collection. You have to use special pointers (e.g., Member<>, Persistent<>) to hold objects on Oilpan's heap. See this API reference to get familiar with programming restrictions about Oilpan. The most important restriction is that you are not allowed to touch any other Oilpan's object in a destructor of Oilpan's object (because the destruction order is not guaranteed).

If you use neither USING_FAST_MALLOC() nor GarbageCollected, objects are allocated on system malloc's heap. This is strongly discouraged in Blink. All Blink objects should be allocated by PartitionAlloc or Oilpan, as follows:

Use Oilpan by default.
Use PartitionAlloc only when 1) the lifetime of the object is very clear and std::unique_ptr<> is enough, 2) allocating the object on Oilpan introduces a lot of complexity or 3) allocating the object on Oilpan introduces a lot of unnecessary pressure to the garbage collection runtime.

Regardless of whether you use PartitionAlloc or Oilpan, you have to be really careful not to create dangling pointers (Note: raw pointers are strongly discouraged) or memory leaks.

If you want to learn more:

How to use PartitionAlloc: platform/wtf/allocator/Allocator.md
How to use Oilpan: platform/heap/BlinkGCAPIReference.md
Oilpan GC design: platform/heap/BlinkGCDesign.md

Task scheduling

To improve responsiveness of the rendering engine, tasks in Blink should be executed asynchronously whenever possible. Synchronous IPC / Mojo and any other operations that may take several milliseconds are discouraged (although some are inevitable e.g., user's JavaScript execution).

All tasks in a renderer process should be posted to Blink Scheduler with proper task types, like this:

// Post a task to frame's scheduler with a task type of kNetworking

frame->GetTaskRunner(TaskType::kNetworking)->PostTask(..., WTF::Bind(&Function));

Blink Scheduler maintains multiple task queues and smartly prioritizes tasks to maximize user-perceived performance. It is important to specify proper task types to let Blink Scheduler schedule the tasks correctly and smartly.

If you want to learn more:

How to post tasks: platform/scheduler/PostTask.md

Page, Frame, Document, DOMWindow etc

Concepts

Page, Frame, Document, ExecutionContext and DOMWindow are the following concepts:

A Page corresponds to a concept of a tab (if OOPIF explained below is not enabled). Each renderer process may contain multiple tabs.
A Frame corresponds to a concept of a frame (the main frame or an iframe). Each Page may contain one or more Frames that are arranged in a tree hierarchy.
A DOMWindow corresponds to a window object in JavaScript. Each Frame has one DOMWindow.
A Document corresponds to a window.document object in JavaScript. Each Frame has one Document.
An ExecutionContext is a concept that abstracts a Document (for the main thread) and a WorkerGlobalScope (for a worker thread).

Renderer process : Page = 1 : N.

Page : Frame = 1 : M.

Frame : DOMWindow : Document (or ExecutionContext) = 1 : 1 : 1 at any point in time, but the mapping may change over time. For example, consider the following code:

iframe.contentWindow.location.href = "https://example.com";

In this case, a new Window and a new Document are created for https://example.com. However, the Frame may be reused.

(Note: Precisely speaking, there are some cases where a new Document is created but the Window and the Frame are reused. There are even more complex cases.)

If you want to learn more:

core/frame/FrameLifecycle.md

Out-of-Process iframes (OOPIF)

Site Isolation makes things more secure but even more complex. :) The idea of Site Isolation is to create one renderer process per site. (A site is a page’s registrable domain + 1 label, and its URL scheme. For example, https://mail.example.com and https://chat.example.com are in the same site, but https://noodles.com and https://pumpkins.com are not.) If a Page contains one cross-site iframe, the Page may be hosted by two renderer processes. Consider the following page:

<body>

</body>

The main frame and the <iframe> may be hosted by different renderer processes. A frame local to the renderer process is represented by LocalFrame and a frame not local to the renderer process is represented by RemoteFrame.

From the perspective of the main frame, the main frame is a LocalFrame and the <iframe> is a RemoteFrame. From the perspective of the <iframe>, the main frame is a RemoteFrame and the <iframe> is a LocalFrame.

Communications between a LocalFrame and RemoteFrame (which may exist in different renderer processes) are handled via the browser process.

If you want to learn more:

Design docs: Site isolation design docs
How to write code with site isolation: core/frame/SiteIsolation.md

Detached Frame / Document

Frame / Document may be in a detached state. Consider the following case:

doc = iframe.contentDocument;

iframe.remove(); // The iframe is detached from the DOM tree.

doc.createElement("div"); // But you still can run scripts on the detached frame.

The tricky fact is that you can still run scripts or DOM operations on the detached frame. Since the frame has already been detached, most DOM operations will fail and throw errors. Unfortunately, behaviors on detached frames are not really interoperable among browsers nor well-defined in the specs. Basically the expectation is that JavaScript should keep running but most DOM operations should fail with some proper exceptions, like this:

void someDOMOperation(...) {

if (!script_state_->ContextIsValid()) { // The frame is already detached

…; // Set an exception etc

return;

}

This means that in common cases Blink needs to do a bunch of clean-up operations when the frame gets detached. You can do this by inheriting from ContextLifecycleObserver, like this:

class SomeObject : public GarbageCollected<SomeObject>, public ContextLifecycleObserver {

void ContextDestroyed() override {

// Do clean-up operations here.

}

~SomeObject() {

// It's not a good idea to do clean-up operations here because it's too late to do them. Also a destructor is not allowed to touch any other objects on Oilpan's heap.

}

};

Web IDL bindings

When JavaScript accesses node.firstChild, Node::firstChild() in node.h gets called. How does it work? Let's take a look at how node.firstChild works.

First of all, you need to define an IDL file per the spec:

// node.idl

interface Node : EventTarget {

[...] readonly attribute Node? firstChild;

};

The syntax of Web IDL is defined in the Web IDL spec. [...] is called IDL extended attributes. Some of the IDL extended attributes are defined in the Web IDL spec and others are Blink-specific IDL extended attributes. Except the Blink-specific IDL extended attributes, IDL files should be written in a spec-comformant manner (i.e., just copy and paste from the spec).

Second, you need to define a C++ class for Node and implement a C++ getter for firstChild:

class EventTarget : public ScriptWrappable { // All classes exposed to JavaScript must inherit from ScriptWrappable.

...;

};

class Node : public EventTarget {

DEFINE_WRAPPERTYPEINFO(); // All classes that have IDL files must have this macro.

Node* firstChild() const { return first_child_; }

};

In common cases, that's it. When you build node.idl, the IDL compiler auto-generates Blink-V8 bindings for the Node interface and Node.firstChild. The auto-generated bindings are generated in //src/out/{Debug,Release}/gen/third_party/ blink/renderer/bindings/core/v8/v8_node.h. When JavaScript calls node.firstChild, V8 calls V8Node::firstChildAttributeGetterCallback() in v8_node.h, then it calls Node::firstChild() which you defined in the above.

If you want to learn more:

How to add Web IDL bindings: bindings/IDLCompiler.md
How to use IDL extended attributes: bindings/IDLExtendedAttributes.md
Spec: Web IDL spec

V8 and Blink

Isolate, Context, World

When you write code that touches V8 APIs, it is important to understand the concept of Isolate, Context and World. They are represented by v8::Isolate, v8::Context and DOMWrapperWorld in the code base respectively.

Isolate corresponds to a physical thread. Isolate : physical thread in Blink = 1 : 1. The main thread has its own Isolate. A worker thread has its own Isolate.

Context corresponds to a global object (In case of a Frame, it's a window object of the Frame). Since each frame has its own window object, there are multiple Contexts in a renderer process. When you call V8 APIs, you have to make sure that you're in a correct context. Otherwise, v8::Isolate::GetCurrentContext() will return a wrong context and in the worst case it will end up leaking objects and causing security issues.

World is a concept to support content scripts of Chrome extensions. Worlds do not correspond to anything in web standards. Content scripts want to share DOM with the web page, but for security reasons JavaScript objects of content scripts must be isolated from the JavaScript heap of the web page. (Also a JavaScript heap of one content script must be isolated from a JavaScript heap of another content script.) To realize the isolation, the main thread creates one main world for the web page and one isolated world for each content script. The main world and the isolated worlds can access the same C++ DOM objects but their JavaScript objects are isolated. This isolation is realized by creating multiple V8 wrappers for one C++ DOM object; i.e., one V8 wrapper per world.

What's a relationship between Context, World and Frame?

Imagine that there are N Worlds on the main thread (one main world + (N - 1) isolated worlds). Then one Frame should have N window objects, each of which is used for one world. Context is a concept that corresponds to a window object. This means that when we have M Frames and N Worlds, we have M * N Contexts (but the Contexts are created lazily).

In case of a worker, there is only one World and one global object. Thus there is only one Context.

Again, when you use V8 APIs, you should be really careful about using the correct context. Otherwise you'll end up leaking JavaScript objects between isolated worlds and causing security disasters (e.g., an extension from A.com can manipulate an extension from B.com).

If you want to learn more:

bindings/core/v8/V8BindingDesign.md

V8 APIs

There are a lot of V8 APIs defined in //v8/include/v8.h. Since V8 APIs are low-level and hard to use correctly, platform/bindings/ provides a bunch of helper classes that wrap V8 APIs. You should consider using the helper classes as much as possible. If your code has to use V8 APIs heavily, the files should be put in bindings/{core,modules}.

V8 uses handles to point to V8 objects. The most common handle is v8::Local<>, which is used to point to V8 objects from a machine stack. v8::Local<> must be used after allocating v8::HandleScope on the machine stack. v8::Local<> should not be used outside the machine stack:

void function() {

v8::HandleScope scope;

v8::Local<v8::Object> object = ...; // This is correct.

}

class SomeObject : public GarbageCollected<SomeObject> {

v8::Local<v8::Object> object_; // This is wrong.

};

If you want to point to V8 objects from outside the machine stack, you need to use wrapper tracing. However, you have to be really careful not to create a reference cycle with it. In general V8 APIs are hard to use. Ask blink-review-bindings@ if you're not sure about what you're doing.

If you want to learn more:

How to use V8 APIs and helper classes: platform/bindings/HowToUseV8FromBlink.md

V8 wrappers

Each C++ DOM object (e.g., Node) has its corresponding V8 wrapper. Precisely speaking, each C++ DOM object has its corresponding V8 wrapper per world.

V8 wrappers have strong references to their corresponding C++ DOM objects. However, the C++ DOM objects have only weak references to the V8 wrappers. So if you want to keep V8 wrappers alive for a certain period of time, you have to do that explicitly. Otherwise, V8 wrappers will be prematurely collected and JS properties on the V8 wrappers will be lost...

div = document.getElementbyId("div");

child = div.firstChild;

child.foo = "bar";

child = null;

gc(); // If we don't do anything, the V8 wrapper of |firstChild| is collected by the GC.

assert(div.firstChild.foo === "bar"); //...and this will fail.

If we don't do anything, child is collected by the GC and thus child.foo is lost. To keep the V8 wrapper of div.firstChild alive, we have to add a mechanism that "keeps the V8 wrapper of div.firstChild alive as long as the DOM tree which div belongs to is reachable from V8".

There are two ways to keep V8 wrappers alive: ActiveScriptWrappable and wrapper tracing.

If you want to learn more:

How to manage lifetime of V8 wrappers: bindings/core/v8/V8Wrapper.md
How to use wrapper tracing: platform/bindings/TraceWrapperReference.md

Rendering pipeline

There is a long journey from when an HTML file is delivered to Blink to when pixels are displayed on the screen. The rendering pipeline is architectured as follows.

Read this excellent deck to learn what each phase of the rendering pipeline does. (I don't think I can write a better explanation than the deck :-)

If you want to learn more:

Questions?

You can ask any questions to blink-dev@chromium.org (for general questions) or platform-architecture-dev@chromium.org (for architecture-related questions). We're always happy to help! :D

How Blink works的更多相关文章

Multi-process Resource Loading
For Developers‎ > ‎Design Documents‎ > ‎ Multi-process Resource Loading 目录 1 This design doc n ...
SetForegroundWindow Win32-API not always works on Windows-7
BIG NOTE After messing with this API for the last 2 months, the solution/s below are all not stable ...
spring注解源码分析--how does autowired works？
1. 背景注解可以减少代码的开发量,spring提供了丰富的注解功能.我们可能会被问到,spring的注解到底是什么触发的呢?今天以spring最常使用的一个注解autowired来跟踪代码,进行d ...
[Unity][Heap sort]用Unity动态演示堆排序的过程(How Heap Sort Works)
[Unity][Heap sort]用Unity动态演示堆排序的过程 How Heap Sort Works 最近做了一个用Unity3D动态演示堆排序过程的程序. I've made this ap ...
How PhoneGap & Titanium Works
转载自 http://www.appcelerator.com/blog/2012/05/comparing-titanium-and-phonegap/ How PhoneGap Works As ...
Blink, 通向哈里·波特的魔法世界
<哈里·波特>的故事里面,魔法界的新闻报纸都是动画的,配图带有动画效果.能够回放新闻的主要场景. 初次看到这个,感觉还挺新鲜的.不过现在,Blink 这样的 App 可以让这个魔法世界的幻 ...
Saying that Java is nice because it works on every OS is like saying that anal sex is nice because it works on every gender.
Saying that Java is nice because it works on every OS is like saying that anal sex is nice because i ...
How Garbage Collection Really Works
Java Memory Management, with its built-in garbage collection, is one of the language's finest achiev ...
Blink Without Delay: 不使用 delay() 函数而使 LED 闪烁
不使用 delay() 函数而使 LED 闪烁有些时候你需要同时做两件事.例如,你可能希望在读取按键按下状态同时让LED闪烁. 在这种情况下,你不能使用 delay(),因为Arduino程序会在d ...

随机推荐

LaTeX 写算法伪码
本系列文章由 @YhL_Leo 出品,转载请注明出处. 文章链接: http://blog.csdn.net/yhl_leo/article/details/50054953 LaTeX写算法伪码,需 ...
PNG文件结构分析
http://blog.163.com/iwait2012@126/blog/static/16947232820124411174877/ PNG文件结构分析对于一个PNG文件来说,其文件头总是由 ...
在AutoLyout中动态获得cell的高度和 autoLyout中的小随笔
autoLyout中动态获得cell的高度和autoLyout小总结一.在autoLyout中通过动态的方式来获取cell 的方式呢? 1. 在布局时候要有对于cell中contentV ...
hadoop学习；datajoin；chain签名；combine（）
hadoop有种简化机制来管理job和control的非线性作业之间的依赖.job对象时mapreduce的表现形式.job对象的实例化可通过传递一个jobconf对象到作业的构造函数中来实现. x. ...
创建一个Spring的HelloWorld程序
Spring IOC IOC指的是控制反转,把对象的创建.初始化.销毁等工作都交给Spring容器.由spring容器来控制对象的生命周期.下图能够说明我们传统创建类的方式和使用Spring之后的差别 ...
Unity3D的场景单位和 3D建模软件的单位之间的关系
转载自 : http://www.ceeger.com/Unity/Doc/2011/3D_to_Unity.html Date:2011-08-24 03:52 Unity的系统单位为米,其他3D软 ...
angularjs 自定义服务
<!DOCTYPE HTML> <html ng-app="myApp"> <head> <meta http-equiv="C ...
javascript中如何获取对象名
javascript中如何获取对象名一.总结一句话总结:将对象传入参数,看参数是否为函数(js中的对象和函数是一个意思么(函数肯定是对象)),对象参数.name属性即可获得 //版本4 funct ...
spark pipeline 例子
""" Pipeline Example. """ # $example on$ from pyspark.ml import Pipeli ...
修改host方法
打开路径 C:\Windows\System32\drivers\etc 将hosts文件拷贝出来修改之后放回去覆盖即可以下是一个例子,想得到ip可以先ping一下那个域名. 左边是ip,右边是域名 ...

How Blink works

What Blink does

Process / thread architecture

Processes

Threads

Initialization & finalization of Blink

Directory structure

Content public APIs and Blink public APIs

Directory structure and dependencies

WTF

Memory management

Task scheduling

Page, Frame, Document, DOMWindow etc

Concepts

Out-of-Process iframes (OOPIF)

Detached Frame / Document

Web IDL bindings

V8 and Blink

Isolate, Context, World

V8 APIs

V8 wrappers

Rendering pipeline

Questions?

How Blink works的更多相关文章

随机推荐

热门专题