做一个快乐的互联网搬运工～

逃逸分析

逃逸分析的概念

在编译程序优化理论中，逃逸分析是一种确定指针动态范围的方法——分析在程序的哪些地方可以访问到指针。它涉及到指针分析和形状分析。当一个变量(或对象)在子程序中被分配时，一个指向变量的指针可能逃逸到其它执行线程中，或是返回到调用者子程序。

——维基百科

Go在一定程度消除了堆和栈的区别，因为go在编译的时候进行逃逸分析，来决定一个对象放栈上还是放堆上，不逃逸的对象放栈上，可能逃逸的放堆上。

逃逸分析的作用

堆与栈相比，栈具有速度快，易分配、易释放的特点，而堆适合不可预知大小的内存分配，但分配、回收成本大。通过逃逸分析，可以尽量把那些不需要分配到堆上的变量直接分配到栈上，堆上的变量少了，会减轻分配堆内存的开销，同时也会减少gc的压力，提高程序的运行速度。

另外，通过逃逸分析，还可以进行同步消除，如果你定义的对象的方法上有同步锁，但在运行时，却只有一个线程在访问，此时逃逸分析后的机器码，会去掉同步锁运行。

如何逃逸分析

根据维基百科的定义我们可以简化一下概念：如果一个函数返回对一个变量的引用，那么它就会发生逃逸。

看一个例子：

package main

import "fmt"

//func foo() *int {

//	t:=3

//	return &t

//}

func boo() int {

	t:=3

	return t

}

func main()  {

	//x:=foo()

	//fmt.Println(*x)

	//# command-line-arguments

	//./memory.go:7:9: &t escapes to heap

	//./memory.go:6:2: moved to heap: t

	//./memory.go:17:14: *x escapes to heap

	//./memory.go:17:13: main ... argument does not escape

	var x int

	x=boo()

	fmt.Println(x)

	//# command-line-arguments

	//./memory.go:20:13: x escapes to heap

	//./memory.go:20:13: main ... argument does not escape

}

其中有两个函数，一个传值，一个传地址，我们可以通过命令

go build -gcflags '-m -l' main.go

发现，传值变量没有发生逃逸，传址变量发生了逃逸。

但是代码中，main函数里的x也逃逸了，这是因为有些函数参数为interface类型，比如fmt.Println(a ...interface{})，编译期间很难确定其参数的具体类型，也会发生逃逸。

并且，如果一个变量需要申请的内存过大，超过了栈的存储能力，即便在函数返回后，它不会被引用，还是会被分配到堆上。

因此，当出现地址引用、申请内存过大、无法确定内存大小（比如切片）、无法确定引用类型（比如interface）这4种情况的时候，会发生逃逸。

所以啊，指针是把双刃剑，减少逃逸代码，有助于提升代码运行效率。

实例

type S struct {}

func main()  {

	var x S

	y:=&x

	_=*identity(y)

}

func identity(z *S) *S {

	return z

}

//# command-line-arguments

//./memory2.go:11:15: leaking param: z to result ~r1 level=0

//./memory2.go:7:5: main &x does not escape

//没有发生逃逸

//因为identity这个函数仅仅输入一个变量，

//又将这个变量作为返回输出，但identity并没有引用z，所以这个变量没有逃逸，

//而x没有被引用，且生命周期也在mian里，x没有逃逸，分配在栈上。

//但是参数z发生了泄漏（也有的说法是z是流式的），参数泄漏不是内存泄漏，

//仅仅是指而是指该传入参数的内容的生命期，超过函数调用期，也就是函数返回后，该参数的内容仍然存活

example1

package main

type M struct {}

func main()  {

	var x M

	_=*ref(x)

}

func ref(z M) *M {

	return &z

}

//# command-line-arguments

//./memory3.go:11:9: &z escapes to heap

//./memory3.go:10:10: moved to heap: z

//变量z发生了逃逸

//go都是值传递，ref函数copy了x的值，传给z，返回z的指针，

//然后在函数外被引用，说明z这个变量在函数內声明，可能会被函数外的其他程序访问。

//所以z逃逸了，分配在堆上。

example2

type Q struct {

	A *int

}

func main()  {

	var i int

	refStruct(i)

}

func refStruct(y int) (z Q) {

	z.A=&y

	return z

}

//# command-line-arguments

//./memory4.go:13:6: &y escapes to heap

//./memory4.go:12:16: moved to heap: y

//y发生逃逸

//在struct里好像并没有区别，有可能被函数外的程序访问就会逃逸

example3

type Q2 struct {

	A *int

}

func main()  {

	var i int

	refStruct2(&i)

}

func refStruct2(y *int) (z Q2) {

	z.A=y

	return z

}

//# command-line-arguments

//./memory5.go:12:17: leaking param: y to result z level=0

//./memory5.go:9:13: main &i does not escape

//y没有逃逸，只是发生了参数泄漏

//因为y只作为一个复制的载体，并不会被外部引用

example4

type D struct {

	J *int

}

func main()  {

	var x D

	var i int

	ref2(&i,&x)

}

func ref2(y *int,z *D)  {

	z.J=y

	//return z

}

//# command-line-arguments

//./memory6.go:13:11: leaking param: y

//./memory6.go:13:18: ref2 z does not escape

//./memory6.go:10:7: &i escapes to heap

//./memory6.go:9:6: moved to heap: i

//./memory6.go:10:10: main &x does not escape

//y没有逃逸，只是泄漏，因为显而易见的原因

//z也没有发生逃逸，因为它作为一个在函数内被定义的变量，只是被赋了个值而已，并不会再被外部引用

//而i发生了逃逸，因为i作为一个在main函数里被定义的变量，被传到函数ref2里，它这块内存锁标定的地址被外部引用，因此发生逃逸。

//x没有发生逃逸，因为它作为一个在main函数里被定义的变量，只是被赋值，并没有将自己的地址进行传递，因此没有发生逃逸

//加上return，是如下结果

//# command-line-arguments

//./memory6.go:13:11: leaking param: y

//./memory6.go:13:18: leaking param: z to result ~r2 level=0

//./memory6.go:10:7: &i escapes to heap

//./memory6.go:9:6: moved to heap: i

//./memory6.go:10:10: main &x does not escape

//大体结果一样，只不过z发生了泄漏

example5

看完这些例子，你应该对逃逸分析有了一个大体的了解，或者，更凌乱了。

内存管理

对于内置runtime system的编程语言，通常会抛弃传统的内存分配方式，改为自主管理内存。这样可以完成类似预分配、内存池、垃圾回收等操作，以避开频繁地向操作系统申请、释放内存，产生过多的系统调用而导致的性能问题。

golang的runtime system同样实现了一套内存池机制，接管了所有的内存申请和释放的动作。

基础概念

Golang程序启动伊始，会向系统申请一块内存（虚拟的地址空间），如下图：

盗了下面链接一个作者的图～

预申请的内存分为三部分：spans、bitmap、arena。

arena就是通俗意义上的堆区，它被分割成页，每一页8KB，所以一共有512GB/8KB页。

bitmap用于标识arena中哪些地址保存了对象，以及对象是否包含指针，如下图：

其中1个byte（8bit）标识arena中4页的信息，也就是说2bit标识1页，所以bitmap的大小为(（512GB/8KB）/4)*1B=16GB。

spans用于表示arena区中的某一页属于哪个mspan（就是由arena页组合起来的内存管理基本单元），如下图：

每个指针对应一页，所以spans的大小为（512GB/8K）*8B。

核心数据结构

内存管理单元

mspan

mspan是内存分配和管理的基本单位，用于管理预分配个数为sizeClass的连续地址内存。

如下源码：

package runtime

// class  bytes/obj  bytes/span  objects  tail waste  max waste

//     1          8        8192     1024           0     87.50%

//     2         16        8192      512           0     43.75%

//     3         32        8192      256           0     46.88%

//     4         48        8192      170          32     31.52%

//     5         64        8192      128           0     23.44%

//     6         80        8192      102          32     19.07%

//     7         96        8192       85          32     15.95%

//     8        112        8192       73          16     13.56%

//     9        128        8192       64           0     11.72%

//    10        144        8192       56         128     11.82%

//    11        160        8192       51          32      9.73%

//    12        176        8192       46          96      9.59%

//    13        192        8192       42         128      9.25%

//    14        208        8192       39          80      8.12%

//    15        224        8192       36         128      8.15%

//    16        240        8192       34          32      6.62%

//    17        256        8192       32           0      5.86%

//    18        288        8192       28         128     12.16%

//    19        320        8192       25         192     11.80%

//    20        352        8192       23          96      9.88%

//    21        384        8192       21         128      9.51%

//    22        416        8192       19         288     10.71%

//    23        448        8192       18         128      8.37%

//    24        480        8192       17          32      6.82%

//    25        512        8192       16           0      6.05%

//    26        576        8192       14         128     12.33%

//    27        640        8192       12         512     15.48%

//    28        704        8192       11         448     13.93%

//    29        768        8192       10         512     13.94%

//    30        896        8192        9         128     15.52%

//    31       1024        8192        8           0     12.40%

//    32       1152        8192        7         128     12.41%

//    33       1280        8192        6         512     15.55%

//    34       1408       16384       11         896     14.00%

//    35       1536        8192        5         512     14.00%

//    36       1792       16384        9         256     15.57%

//    37       2048        8192        4           0     12.45%

//    38       2304       16384        7         256     12.46%

//    39       2688        8192        3         128     15.59%

//    40       3072       24576        8           0     12.47%

//    41       3200       16384        5         384      6.22%

//    42       3456       24576        7         384      8.83%

//    43       4096        8192        2           0     15.60%

//    44       4864       24576        5         256     16.65%

//    45       5376       16384        3         256     10.92%

//    46       6144       24576        4           0     12.48%

//    47       6528       32768        5         128      6.23%

//    48       6784       40960        6         256      4.36%

//    49       6912       49152        7         768      3.37%

//    50       8192        8192        1           0     15.61%

//    51       9472       57344        6         512     14.28%

//    52       9728       49152        5         512      3.64%

//    53      10240       40960        4           0      4.99%

//    54      10880       32768        3         128      6.24%

//    55      12288       24576        2           0     11.45%

//    56      13568       40960        3         256      9.99%

//    57      14336       57344        4           0      5.35%

//    58      16384       16384        1           0     12.49%

//    59      18432       73728        4           0     11.11%

//    60      19072       57344        3         128      3.57%

//    61      20480       40960        2           0      6.87%

//    62      21760       65536        3         256      6.25%

//    63      24576       24576        1           0     11.45%

//    64      27264       81920        3         128     10.00%

//    65      28672       57344        2           0      4.91%

//    66      32768       32768        1           0     12.50%

const (

	_MaxSmallSize   = 32768

	smallSizeDiv    = 8

	smallSizeMax    = 1024

	largeSizeDiv    = 128

	_NumSizeClasses = 67

	_PageShift      = 13

)

var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 288, 320, 352, 384, 416, 448, 480, 512, 576, 640, 704, 768, 896, 1024, 1152, 1280, 1408, 1536, 1792, 2048, 2304, 2688, 3072, 3200, 3456, 4096, 4864, 5376, 6144, 6528, 6784, 6912, 8192, 9472, 9728, 10240, 10880, 12288, 13568, 14336, 16384, 18432, 19072, 20480, 21760, 24576, 27264, 28672, 32768}

var class_to_allocnpages = [_NumSizeClasses]uint8{0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 3, 2, 3, 1, 3, 2, 3, 4, 5, 6, 1, 7, 6, 5, 4, 3, 5, 7, 2, 9, 7, 5, 8, 3, 10, 7, 4}

type divMagic struct {

	shift    uint8

	shift2   uint8

	mul      uint16

	baseMask uint16

}

var class_to_divmagic = [_NumSizeClasses]divMagic{{0, 0, 0, 0}, {3, 0, 1, 65528}, {4, 0, 1, 65520}, {5, 0, 1, 65504}, {4, 9, 171, 0}, {6, 0, 1, 65472}, {4, 10, 205, 0}, {5, 9, 171, 0}, {4, 11, 293, 0}, {7, 0, 1, 65408}, {4, 9, 57, 0}, {5, 10, 205, 0}, {4, 12, 373, 0}, {6, 7, 43, 0}, {4, 13, 631, 0}, {5, 11, 293, 0}, {4, 13, 547, 0}, {8, 0, 1, 65280}, {5, 9, 57, 0}, {6, 9, 103, 0}, {5, 12, 373, 0}, {7, 7, 43, 0}, {5, 10, 79, 0}, {6, 10, 147, 0}, {5, 11, 137, 0}, {9, 0, 1, 65024}, {6, 9, 57, 0}, {7, 6, 13, 0}, {6, 11, 187, 0}, {8, 5, 11, 0}, {7, 8, 37, 0}, {10, 0, 1, 64512}, {7, 9, 57, 0}, {8, 6, 13, 0}, {7, 11, 187, 0}, {9, 5, 11, 0}, {8, 8, 37, 0}, {11, 0, 1, 63488}, {8, 9, 57, 0}, {7, 10, 49, 0}, {10, 5, 11, 0}, {7, 10, 41, 0}, {7, 9, 19, 0}, {12, 0, 1, 61440}, {8, 9, 27, 0}, {8, 10, 49, 0}, {11, 5, 11, 0}, {7, 13, 161, 0}, {7, 13, 155, 0}, {8, 9, 19, 0}, {13, 0, 1, 57344}, {8, 12, 111, 0}, {9, 9, 27, 0}, {11, 6, 13, 0}, {7, 14, 193, 0}, {12, 3, 3, 0}, {8, 13, 155, 0}, {11, 8, 37, 0}, {14, 0, 1, 49152}, {11, 8, 29, 0}, {7, 13, 55, 0}, {12, 5, 7, 0}, {8, 14, 193, 0}, {13, 3, 3, 0}, {7, 14, 77, 0}, {12, 7, 19, 0}, {15, 0, 1, 32768}}

var size_to_class8 = [smallSizeMax/smallSizeDiv + 1]uint8{0, 1, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31}

var size_to_class128 = [(_MaxSmallSize-smallSizeMax)/largeSizeDiv + 1]uint8{31, 32, 33, 34, 35, 36, 36, 37, 37, 38, 38, 39, 39, 39, 40, 40, 40, 41, 42, 42, 43, 43, 43, 43, 43, 44, 44, 44, 44, 44, 44, 45, 45, 45, 45, 46, 46, 46, 46, 46, 46, 47, 47, 47, 48, 48, 49, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 52, 52, 53, 53, 53, 53, 54, 54, 54, 54, 54, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 57, 57, 57, 57, 57, 57, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 60, 60, 60, 60, 60, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66}

/usr/local/go/src/runtime/sizeclasses.go

可以看到，

mspan共有67种sizeClass，每个mspan按照自身的sizeClass分割成若干个object，每个object可存储一个对象。

比如，某mspan的sizeClass为3，那么其划分的object大小就是32B，可以存储(16B，32B]大小的的对象。同时，sizeClass还决定了mspan能够分到的页数，比如，sizeClass对应能分到的页数是1。

另外，数组中最大的数是32768，也就是32KB，超过32KB即是大对象，sizeClass=0就表示大对象，不通过mspan，直接由mheap分配内存。对于，小于16B的是微小对象，分配器会将其合并，将几个对象分配到同一个object中。

有一点我有些恍惚，最大的容量是32KB，而arena中每一页的大小是8KB，也就是说，定然存在跨页存储，虽然没有找到可靠的资料，但是，观察一下源码可以发现，页数为1的最大object容量是8192B，也就正好是8KB。当object为32KB时，页数为4，当object为24KB时，页数为3，也就是说，在golang写死的sizeClass里，针对页面容量、可存放对象大小和页数已经预置了对应关系，或许，这也是写死sizeClass的原因之一。

mspan的结构体如下：

type mspan struct {

next *mspan // next span in list, or nil if none

prev *mspan // previous span in list, or nil if none

list *mSpanList // For debugging. TODO: Remove.

startAddr uintptr // address of first byte of span aka s.base()

npages uintptr // number of pages in span

manualFreeList gclinkptr // list of free objects in mSpanManual spans

// freeindex is the slot index between 0 and nelems at which to begin scanning

// for the next free object in this span.

// Each allocation scans allocBits starting at freeindex until it encounters a 0

// indicating a free object. freeindex is then adjusted so that subsequent scans begin

// just past the newly discovered free object.

// If freeindex == nelem, this span has no free objects.

// allocBits is a bitmap of objects in this span.

// If n >= freeindex and allocBits[n/8] & (1<<(n%8)) is 0

// then object n is free;

// otherwise, object n is allocated. Bits starting at nelem are

// undefined and should never be referenced.

// Object n starts at address n*elemsize + (start << pageShift).

freeindex uintptr

// TODO: Look up nelems from sizeclass and remove this field if it

// helps performance.

nelems uintptr // number of object in the span.

// Cache of the allocBits at freeindex. allocCache is shifted

// such that the lowest bit corresponds to the bit freeindex.

// allocCache holds the complement of allocBits, thus allowing

// ctz (count trailing zero) to use it directly.

// allocCache may contain bits beyond s.nelems; the caller must ignore

// these.

allocCache uint64

// allocBits and gcmarkBits hold pointers to a span's mark and

// allocation bits. The pointers are 8 byte aligned.

// There are three arenas where this data is held.

// free: Dirty arenas that are no longer accessed

// and can be reused.

// next: Holds information to be used in the next GC cycle.

// current: Information being used during this GC cycle.

// previous: Information being used during the last GC cycle.

// A new GC cycle starts with the call to finishsweep_m.

// finishsweep_m moves the previous arena to the free arena,

// the current arena to the previous arena, and

// the next arena to the current arena.

// The next arena is populated as the spans request

// memory to hold gcmarkBits for the next GC cycle as well

// as allocBits for newly allocated spans.

// The pointer arithmetic is done "by hand" instead of using

// arrays to avoid bounds checks along critical performance

// paths.

// The sweep will free the old allocBits and set allocBits to the

// gcmarkBits. The gcmarkBits are replaced with a fresh zeroed

// out memory.

allocBits *gcBits

gcmarkBits *gcBits

// sweep generation:

// if sweepgen == h->sweepgen - 2, the span needs sweeping

// if sweepgen == h->sweepgen - 1, the span is currently being swept

// if sweepgen == h->sweepgen, the span is swept and ready to use

// if sweepgen == h->sweepgen + 1, the span was cached before sweep began and is still cached, and needs sweeping

// if sweepgen == h->sweepgen + 3, the span was swept and then cached and is still cached

// h->sweepgen is incremented by 2 after every GC

sweepgen uint32

divMul uint16 // for divide by elemsize - divMagic.mul

baseMask uint16 // if non-0, elemsize is a power of 2, & this will get object allocation base

allocCount uint16 // number of allocated objects

spanclass spanClass // size class and noscan (uint8)

state mSpanState // mspaninuse etc

needzero uint8 // needs to be zeroed before allocation

divShift uint8 // for divide by elemsize - divMagic.shift

divShift2 uint8 // for divide by elemsize - divMagic.shift2

scavenged bool // whether this span has had its pages released to the OS

elemsize uintptr // computed from sizeclass or from npages

unusedsince int64 // first time spotted by gc in mspanfree state

limit uintptr // end of data in span

speciallock mutex // guards specials list

specials *special // linked list of special records sorted by offset.

}

/usr/local/go/src/runtime/mheap.go

next *mspan

指向链表里的下一个mspan

prev *mspan

指向链表里的上一个mspan

list *mSpanList

mspan链表

startAddr uintptr

管理页的起始地址，直接指向arena区域的的某个位置

npages uintptr

管理的页数

manualFreeList gclinkptr

空闲对象列表

freeindex uintptr

freeindex是0到nelems之间的位置索引，在该索引处开始扫描此mspan的下一个空闲对象。每次分配从freeindex开始扫描allocBits，直到遇到空闲对象。然后调整freeindex，以便于后续扫描发现下一个空闲对象。如果freeindex等于nelems，那么这片mspan没有空闲对象。

nelems uintptr

管理的对象数

allocBits *gcBits

allocBits是此mspan中对象的位图。当n>=freeindex并且其位图标记为0，那么这个对象就是空闲的，否则，这个对象就是被占用的。

allocCount uint16

已分配的对象的个数

elemsize uintptr

对象的大小

内存管理组件

内存分配由内存分配器完成。分配器由3种组件构成：mcache、mcentral、mheap。

mcache

首先我们需要先简单了解一个概念，Go调度器的基本结构，也就是G-P-M模型：

G: 表示Goroutine，每个Goroutine对应一个G结构体，G存储Goroutine的运行堆栈、状态以及任务函数，可重用。G并非执行体，每个G需要绑定到P才能被调度执行。
P: Processor，表示逻辑处理器，对G来说，P相当于CPU核，G只有绑定到P(在P的local runq中)才能被调度。对M来说，P提供了相关的执行环境(Context)，如内存分配状态(mcache)，任务队列(G)等，P的数量决定了系统内最大可并行的G的数量（前提：物理CPU核数 >= P的数量），P的数量由用户设置的GOMAXPROCS决定，但是不论GOMAXPROCS设置为多大，P的数量最大为256。
M: Machine，OS线程抽象，代表着真正执行计算的资源，在绑定有效的P后，进入schedule循环；而schedule循环的机制大致是从Global队列、P的Local队列以及wait队列中获取G，切换到G的执行栈上并执行G的函数，调用goexit做清理工作并回到M，如此反复。M并不保留G状态，这是G可以跨M调度的基础，M的数量是不定的，由Go Runtime调整，为了防止创建过多OS线程导致系统调度不过来，目前默认最大限制为10000个。

其中，P是一个虚拟资源，同时只能被一个协程访问，所以P中的数据不需要锁。为了分配对象时有更好的性能，各个P中都有各种sizeClass的mspan的缓存，这样就可以直接给G分配。其结构如下：

其中，对于67中sizeClass各有两个，一个用于分配包含指针的对象，一个用于分配不包含指针的对象，因此一个P中共有134个mspan的缓存。对于无指针对象的mspan在进行垃圾回收时，不需要进一步扫描它是否引用了其他活跃的对象。

mcache在初始化时没有任何mspan资源，在使用过程中会动态地从mcentral申请，之后缓存下来。当对象<=32KB时，使用mcache相应规格的mspan分配。

type mcache struct {

// The following members are accessed on every malloc,

// so they are grouped here for better caching.

next_sample int32 // trigger heap sample after allocating this many bytes

local_scan uintptr // bytes of scannable heap allocated

// Allocator cache for tiny objects w/o pointers.

// See "Tiny allocator" comment in malloc.go.

// tiny points to the beginning of the current tiny block, or

// nil if there is no current tiny block.

// tiny is a heap pointer. Since mcache is in non-GC'd memory,

// we handle it by clearing it in releaseAll during mark

// termination.

tiny uintptr

tinyoffset uintptr

local_tinyallocs uintptr // number of tiny allocs not counted in other stats

// The rest is not accessed on every malloc.

alloc [numSpanClasses]*mspan // spans to allocate from, indexed by spanClass

stackcache [_NumStackOrders]stackfreelist

// Local allocator stats, flushed during GC.

local_largefree uintptr // bytes freed for large objects (>maxsmallsize)

local_nlargefree uintptr // number of frees for large objects (>maxsmallsize)

local_nsmallfree [_NumSizeClasses]uintptr // number of frees for small objects (<=maxsmallsize)

// flushGen indicates the sweepgen during which this mcache

// was last flushed. If flushGen != mheap_.sweepgen, the spans

// in this mcache are stale and need to the flushed so they

// can be swept. This is done in acquirep.

flushGen uint32

}

/usr/local/go/src/runtime/mcache.go

其中numSpanClasses的就是134，而且mcache中也没有任何加锁信息。

mcentral

为所有mcache提供切分好的mspan资源，包含已分配出去的和未分配出去的。每个mcentral对应一种sizeClass的mspan，因此共有134个mcentral。其结构如下：

当P中没有特定大小的mspan时，就会向mcentral申请。mcentral由于是共享资源，因此需要有锁。

type mcentral struct {

	lock      mutex

	spanclass spanClass

	nonempty  mSpanList // list of spans with a free object, ie a nonempty free list

	empty     mSpanList // list of spans with no free objects (or cached in an mcache)

	// nmalloc is the cumulative count of objects allocated from

	// this mcentral, assuming all spans in mcaches are

	// fully-allocated. Written atomically, read under STW.

	nmalloc uint64

}

/usr/local/go/src/runtime/mcentral.go

mcentral的结构体源码很短，清晰地体现出了它的特点。其中的nmalloc用于累计已分配的mspan数量。

mcache向mcentral获取、归还mspan的流程：

获取：加锁；从nonempty中取出一个可用mspan；将其从nonempty链表删除；将其加入到empty链表；解锁。
归还：加锁；将mspan从empty链表删除；加入到nonempty链表；解锁。

mheap

代表Go程序持有的所有堆空间。当mcentral没有空闲的mspan时，会向mheap申请。若mheap也没有，会向操作系统申请。mheap主要用于大对象的内存分配，以及管理未切割的mspan，用于给mcentral切割成小对象。其结构如下：

mheap的源码如下：

type mheap struct {

lock mutex

free mTreap // free and non-scavenged spans

scav mTreap // free and scavenged spans

sweepgen uint32 // sweep generation, see comment in mspan

sweepdone uint32 // all spans are swept

sweepers uint32 // number of active sweepone calls

// allspans is a slice of all mspans ever created. Each mspan

// appears exactly once.

// The memory for allspans is manually managed and can be

// reallocated and move as the heap grows.

// In general, allspans is protected by mheap_.lock, which

// prevents concurrent access as well as freeing the backing

// store. Accesses during STW might not hold the lock, but

// must ensure that allocation cannot happen around the

// access (since that may free the backing store).

allspans []*mspan // all spans out there

// sweepSpans contains two mspan stacks: one of swept in-use

// spans, and one of unswept in-use spans. These two trade

// roles on each GC cycle. Since the sweepgen increases by 2

// on each cycle, this means the swept spans are in

// sweepSpans[sweepgen/2%2] and the unswept spans are in

// sweepSpans[1-sweepgen/2%2]. Sweeping pops spans from the

// unswept stack and pushes spans that are still in-use on the

// swept stack. Likewise, allocating an in-use span pushes it

// on the swept stack.

sweepSpans [2]gcSweepBuf

_ uint32 // align uint64 fields on 32-bit for atomics

// Proportional sweep

// These parameters represent a linear function from heap_live

// to page sweep count. The proportional sweep system works to

// stay in the black by keeping the current page sweep count

// above this line at the current heap_live.

// The line has slope sweepPagesPerByte and passes through a

// basis point at (sweepHeapLiveBasis, pagesSweptBasis). At

// any given time, the system is at (memstats.heap_live,

// pagesSwept) in this space.

// It's important that the line pass through a point we

// control rather than simply starting at a (0,0) origin

// because that lets us adjust sweep pacing at any time while

// accounting for current progress. If we could only adjust

// the slope, it would create a discontinuity in debt if any

// progress has already been made.

pagesInUse uint64 // pages of spans in stats mSpanInUse; R/W with mheap.lock

pagesSwept uint64 // pages swept this cycle; updated atomically

pagesSweptBasis uint64 // pagesSwept to use as the origin of the sweep ratio; updated atomically

sweepHeapLiveBasis uint64 // value of heap_live to use as the origin of sweep ratio; written with lock, read without

sweepPagesPerByte float64 // proportional sweep ratio; written with lock, read without

// TODO(austin): pagesInUse should be a uintptr, but the 386

// compiler can't 8-byte align fields.

// Page reclaimer state

// reclaimIndex is the page index in allArenas of next page to

// reclaim. Specifically, it refers to page (i %

// pagesPerArena) of arena allArenas[i / pagesPerArena].

// If this is >= 1<<63, the page reclaimer is done scanning

// the page marks.

// This is accessed atomically.

reclaimIndex uint64

// reclaimCredit is spare credit for extra pages swept. Since

// the page reclaimer works in large chunks, it may reclaim

// more than requested. Any spare pages released go to this

// credit pool.

// This is accessed atomically.

reclaimCredit uintptr

// scavengeCredit is spare credit for extra bytes scavenged.

// Since the scavenging mechanisms operate on spans, it may

// scavenge more than requested. Any spare pages released

// go to this credit pool.

// This is protected by the mheap lock.

scavengeCredit uintptr

// Malloc stats.

largealloc uint64 // bytes allocated for large objects

nlargealloc uint64 // number of large object allocations

largefree uint64 // bytes freed for large objects (>maxsmallsize)

nlargefree uint64 // number of frees for large objects (>maxsmallsize)

nsmallfree [_NumSizeClasses]uint64 // number of frees for small objects (<=maxsmallsize)

// arenas is the heap arena map. It points to the metadata for

// the heap for every arena frame of the entire usable virtual

// address space.

// Use arenaIndex to compute indexes into this array.

// For regions of the address space that are not backed by the

// Go heap, the arena map contains nil.

// Modifications are protected by mheap_.lock. Reads can be

// performed without locking; however, a given entry can

// transition from nil to non-nil at any time when the lock

// isn't held. (Entries never transitions back to nil.)

// In general, this is a two-level mapping consisting of an L1

// map and possibly many L2 maps. This saves space when there

// are a huge number of arena frames. However, on many

// platforms (even 64-bit), arenaL1Bits is 0, making this

// effectively a single-level map. In this case, arenas[0]

// will never be nil.

arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena

// heapArenaAlloc is pre-reserved space for allocating heapArena

// objects. This is only used on 32-bit, where we pre-reserve

// this space to avoid interleaving it with the heap itself.

heapArenaAlloc linearAlloc

// arenaHints is a list of addresses at which to attempt to

// add more heap arenas. This is initially populated with a

// set of general hint addresses, and grown with the bounds of

// actual heap arena ranges.

arenaHints *arenaHint

// arena is a pre-reserved space for allocating heap arenas

// (the actual arenas). This is only used on 32-bit.

arena linearAlloc

// allArenas is the arenaIndex of every mapped arena. This can

// be used to iterate through the address space.

// Access is protected by mheap_.lock. However, since this is

// append-only and old backing arrays are never freed, it is

// safe to acquire mheap_.lock, copy the slice header, and

// then release mheap_.lock.

allArenas []arenaIdx

// sweepArenas is a snapshot of allArenas taken at the

// beginning of the sweep cycle. This can be read safely by

// simply blocking GC (by disabling preemption).

sweepArenas []arenaIdx

// _ uint32 // ensure 64-bit alignment of central

// central free lists for small size classes.

// the padding makes sure that the mcentrals are

// spaced CacheLinePadSize bytes apart, so that each mcentral.lock

// gets its own cache line.

// central is indexed by spanClass.

central [numSpanClasses]struct {

mcentral mcentral

pad [cpu.CacheLinePadSize - unsafe.Sizeof(mcentral{})%cpu.CacheLinePadSize]byte

}

spanalloc fixalloc // allocator for span*

cachealloc fixalloc // allocator for mcache*

treapalloc fixalloc // allocator for treapNodes*

specialfinalizeralloc fixalloc // allocator for specialfinalizer*

specialprofilealloc fixalloc // allocator for specialprofile*

speciallock mutex // lock for special record allocators.

arenaHintAlloc fixalloc // allocator for arenaHints

unused *specialfinalizer // never set, just here to force the specialfinalizer type into DWARF

}

/usr/local/go/src/runtime/mheap.go

其中具备了，锁信息、spans信息、arena区信息、central信息。

内存分配规则

大致的分配规则如下：

<=16B的对象（小对象）使用mcache的tiny分配器分配；
(16B,32KB]的对象（一般对象），首先计算对象的规格大小，然后使用mcache中相应规格大小的mspan分配；
32KB以上的对象（大对象），直接从mheap上分配；

大致的分配流程可以从下图看出：

如果mcache没有相应规格大小的mspan，则向mcentral申请；

如果mcentral没有相应规格大小的mspan，则向mheap申请；

如果mheap也没有合适大小的mspan，则向操作系统申请。

参考资料

https://www.cnblogs.com/qcrao-2018/p/10453260.html

https://gocn.vip/article/355

https://swanspouse.github.io/2018/08/22/golang-memory-model/

https://www.cnblogs.com/zkweb/p/7880099.html

https://studygolang.com/articles/13344

Go-内存To Be的更多相关文章

故障重现(内存篇2)，JAVA内存不足导致频繁回收和swap引起的性能问题
背景起因: 记起以前的另一次也是关于内存的调优分享下有个系统平时运行非常稳定运行(没经历过大并发考验),然而在一次活动后,人数并发一上来后,系统开始卡. 我按经验开始调优,在每个关键步骤的加入如 ...
In-Memory：在内存中创建临时表和表变量
在Disk-Base数据库中,由于临时表和表变量的数据存储在tempdb中,如果系统频繁地创建和更新临时表和表变量,大量的IO操作集中在tempdb中,tempdb很可能成为系统性能的瓶颈.在SQL ...
In-Memory：内存优化表的事务处理
内存优化表(Memory-Optimized Table,简称MOT)使用乐观策略(optimistic approach)实现事务的并发控制,在读取MOT时,使用多行版本化(Multi-Row ve ...
试试SQLSERVER2014的内存优化表
试试SQLSERVER2014的内存优化表 SQL Server 2014中的内存引擎(代号为Hekaton)将OLTP提升到了新的高度. 现在,存储引擎已整合进当前的数据库管理系统,而使用先进内存技 ...
故障重现, JAVA进程内存不够时突然挂掉模拟
背景,服务器上的一个JAVA服务进程突然挂掉,查看产生了崩溃日志,如下: # Set larger code cache with -XX:ReservedCodeCacheSize= # This ...
死磕内存篇 --- JAVA进程和linux内存间的大小关系
运行个JAVA 用sleep去hold住 package org.hjb.test; public class TestOnly { public static void main(String[] ...
【知识必备】内存泄漏全解析，从此拒绝ANR，让OOM远离你的身边，跟内存泄漏say byebye
一.写在前面对于C++来说,内存泄漏就是new出来的对象没有delete,俗称野指针:而对于java来说,就是new出来的Object放在Heap上无法被GC回收:而这里就把我之前的一篇内存泄漏的总 ...
C++内存对齐总结
大家都知道,C++空类的内存大小为1字节,为了保证其对象拥有彼此独立的内存地址.非空类的大小与类中非静态成员变量和虚函数表的多少有关. 而值得注意的是,类中非静态成员变量的大小与编译器内存对齐的设置有 ...
java: web应用中不经意的内存泄露
前面有一篇讲解如何在spring mvc web应用中一启动就执行某些逻辑,今天无意发现如果使用不当,很容易引起内存泄露,测试代码如下: 1.定义一个类App package com.cnblogs. ...
查看w3wp进程占用的内存及.NET内存泄露,死锁分析
一基础知识在分析之前,先上一张图: 从上面可以看到,这个w3wp进程占用了376M内存,启动了54个线程. 在使用windbg查看之前,看到的进程含有 *32 字样,意思是在64位机器上已32位方 ...

随机推荐

Acwing143. 最大异或对
在给定的N个整数A1,A2……ANA1,A2……AN中选出两个进行xor(异或)运算,得到的结果最大是多少? 输入格式第一行输入一个整数N. 第二行输入N个整数A1A1-ANAN. 输出格式输出一 ...
git把本地代码上传（更新）到github上
# 初始化目录为本地仓库 git init # 添加所有文件到暂存去 git add . # 提交所有文件 git commit -m "init" # 添加远程仓库地址 git ...
web框架的本质（使用socket实现的最基础的web框架、使用wsgiref实现的web框架）
import socket def handle_request(client): data = client.recv(1024) client.send("HTTP/1.1 200 OK ...
javascript基础总汇
## javaScript是什么:1.JavaScript 运行在客户端(浏览器)的编程语言2.用来给HTML网页增加动态功能3.用来给HTML网页增加动态功能.4.Netscape在最初将其脚本语言 ...
Windows下的vue-devtools工具的安装
详细教程在这个链接里: https://www.cnblogs.com/xqmyhome/p/10972772.html
微信小程序wxss制作扭蛋机
小程序制作扭蛋机 2019-09-24 13:26:53 公司要制作活动小程序,其中有一个扭蛋机的效果实现抽奖的功能.在网上找了好久竟没有找到(不知道是不是我找代码的方式有问题).最后还是自己做一个吧 ...
Java编码技巧与代码优化
本文参考整理自https://mp.weixin.qq.com/s/-u6ytFRp-ZAqdLBsMmuDMw 对于在本文中有所疑问的点可以去该文章查看详情常量&变量直接赋值常量值, 禁 ...
第三讲JdbcRealm及Authentication Strategy
1.使用shiro框架来完成认证工作,默认情况下使用的是IniRealm.如果需要使用其他Realm,那么需要进行相关的配置. 2.ini配置文件讲解: [main] section是你配置应用程序的 ...
srs-librtmp pusher(push h264 raw)
Simple Live System Using SRS https://www.cnblogs.com/dong1/p/5100792.html 1.上面是推送文件,改成推送缓存封装了三个函数 i ...
layui树形表格支持非异步和异步加载
layui树形表格支持非异步和异步加载. 仓库地址:https://gitee.com/uniqid/ 使用示例如下: <div class="uui-admin-common-bod ...

Go-内存To Be