echo list | go tool pprof -alloc_space gateway http://10.2.1.93:8421/debug/pprof/heap > abc.log

echo list | go tool pprof -inuse_space gateway http://10.2.1.93:8421/debug/pprof/heap > inuse.log

https://deferpanic.com/blog/understanding-golang-memory-usage/

Understanding Go Lang Memory Usage

Mon, Dec 22, 2014

Warning: This is an intro to memory with the Go language - you can deep dive down the rabbit hole as far as you want to go.

Most beginning Go developers try out a simple hello world à la:

package main

import (
 "fmt"
 "time"
)

func main() {
 fmt.Println("hi")
 time.Sleep(30 * time.Second)
}

and then they go completely crazy.

138 f!*%!G of $%@# memory!? This laptop only has 16G!

Virtual vs Resident

Go manages memory differently than what you might be used to. It reserves a large chunk of address space right off the bat (VIRT), but your RSS is much closer to what is really in use.

What is the difference between RSS and VIRT ?

VIRT or the virtual address space size is the amount that a program has mapped in and is able to access.

RSS, or the resident set size, is the amount of that memory actually held in RAM right now.
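
To see both numbers for a running process without reaching for top or ps, one option on Linux is to read /proc/self/status, where VmSize corresponds to VIRT and VmRSS to RSS. This is only a sketch: parseStatus is a made-up helper name, and the /proc layout is a Linux assumption that does not hold on other systems.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatus extracts VmSize (VIRT) and VmRSS (RSS), both in kB, from text
// in the format of /proc/<pid>/status. Hypothetical helper for illustration.
func parseStatus(status string) (virtKB, rssKB uint64) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		n, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue // not a numeric line, e.g. "Name: blahdo"
		}
		switch fields[0] {
		case "VmSize:":
			virtKB = n
		case "VmRSS:":
			rssKB = n
		}
	}
	return virtKB, rssKB
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status (Linux only):", err)
		return
	}
	virt, rss := parseStatus(string(data))
	fmt.Printf("VIRT: %d kB, RSS: %d kB\n", virt, rss)
}
```

On a typical run VIRT comes out far larger than RSS, which is exactly the gap described above.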

If you are curious about how go actually goes about doing this check out:

https://github.com/golang/go/blob/master/src/runtime/malloc1.go

    // On a 64-bit machine, allocate from a single contiguous reservation.
    // 128 GB (MaxMem) should be big enough for now.
    // Actually we reserve 136 GB (because the bitmap ends up being 8 GB)

It’s important to note that on a 32-bit architecture the memory reservation is done completely differently.

Garbage Collection

Now that we know the difference between resident and virtual memory we can talk about how Go does garbage collection to understand how our program is working.

Chances are you are writing some long-lived daemon - be it a web app server or something more complex. Generally you will make quite a few allocations throughout its lifetime. Knowing how the memory is dealt with is essential.

Typically, if 2 minutes pass without a garbage collection, one is forced. If a span then goes unused for 5 minutes, the scavenger is allowed to release it back to the operating system.

So if you suspect that your memory usage should be going back down give it ~ 7 minutes just to verify.
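
If waiting is not practical, the runtime/debug package offers FreeOSMemory, which forces a collection and then tries to return as much memory to the OS as it can. A minimal sketch, assuming a made-up helper releaseDemo and an arbitrary 64 MiB buffer; how much RSS actually drops afterwards depends on the Go version and the operating system.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// releaseDemo allocates size bytes, drops them, and asks the runtime to give
// memory back. It returns HeapAlloc while the buffer is live and after the
// release. Hypothetical helper for illustration.
func releaseDemo(size int) (live, after uint64) {
	var m runtime.MemStats

	buf := make([]byte, size)
	for i := 0; i < len(buf); i += 4096 {
		buf[i] = 1 // touch the buffer so its pages become resident
	}
	runtime.ReadMemStats(&m)
	live = m.HeapAlloc
	runtime.KeepAlive(buf) // buf is guaranteed reachable up to this point

	buf = nil            // drop the only reference
	debug.FreeOSMemory() // force a GC, then return what is possible to the OS
	runtime.ReadMemStats(&m)
	return live, m.HeapAlloc
}

func main() {
	live, after := releaseDemo(64 << 20)
	fmt.Printf("HeapAlloc with buffer live:   %d MiB\n", live>>20)
	fmt.Printf("HeapAlloc after FreeOSMemory: %d MiB\n", after>>20)
}
```

This is handy for checking whether memory can come back at all; calling it routinely in production code is rarely a good idea since it forces a full collection each time.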

Be aware that currently the gc is non-compacting - what this really means is that if even a single live byte sits on a page, the scavenger is prevented from madvising that page away.

Last but not least - on go 1.3 goroutine stacks, which are 8k/pop, don’t get released - they get re-used later on. Don’t fret though - Go still has plenty of room for improvement in the GC department. So if your code is spawning a ton of goroutines and your RES is staying high this could be why.
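
One way to check whether goroutine stacks are what keeps RES high is to watch the StackInuse field of runtime.MemStats next to runtime.NumGoroutine. The sketch below parks a batch of goroutines and compares stack usage before and during; stackUsage and the goroutine count are made up for illustration, and the starting stack size differs between Go releases.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackUsage starts n goroutines that block until released. It returns
// StackInuse before they start and while all of them are parked.
// Hypothetical helper for illustration.
func stackUsage(n int) (before, during uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	before = m.StackInuse

	var started, done sync.WaitGroup
	release := make(chan struct{})
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-release // park until the measurement is taken
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&m)
	during = m.StackInuse

	close(release)
	done.Wait()
	return before, during
}

func main() {
	before, during := stackUsage(10000)
	fmt.Printf("goroutines alive now: %d\n", runtime.NumGoroutine())
	fmt.Printf("StackInuse before: %d KiB, with 10000 parked goroutines: %d KiB\n",
		before/1024, during/1024)
}
```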

So, now we know what to look at from outside our program and we know what to expect from GC.

Analyzing Memory Usage

Let’s take a small example of how we might look at our memory. In our example we’ll allocate 10 slices of 100 megabytes each.

Then we’ll include a couple different ways of looking at the memory usage.

One method is by using the runtime package and looking at the ReadMemStats function.

The other method is using this super sweet web interface via the pprof package. This allows us to remotely grab our pprof data which we’ll explore shortly.

Yet another method, which Dave Cheney suggested we mention, is to set gctrace through the GODEBUG environment variable.

Note: This was done on 64-bit Linux with Go 1.4.

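package main

import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
        "runtime"
        "sync"
)

// bigBytes allocates a 100 MB slice and returns a pointer to it.
func bigBytes() *[]byte {
        s := make([]byte, 100000000)
        return &s
}

func main() {
        var wg sync.WaitGroup

        // Serve the pprof endpoints in the background.
        go func() {
                log.Println(http.ListenAndServe("localhost:6060", nil))
        }()

        // Baseline numbers before we allocate anything big.
        var mem runtime.MemStats
        runtime.ReadMemStats(&mem)
        log.Println(mem.Alloc)
        log.Println(mem.TotalAlloc)
        log.Println(mem.HeapAlloc)
        log.Println(mem.HeapSys)

        // Allocate 10 x 100 MB; each slice becomes garbage right away.
        for i := 0; i < 10; i++ {
                s := bigBytes()
                if s == nil {
                        log.Println("oh noes")
                }
        }

        // The same numbers after the allocations.
        runtime.ReadMemStats(&mem)
        log.Println(mem.Alloc)
        log.Println(mem.TotalAlloc)
        log.Println(mem.HeapAlloc)
        log.Println(mem.HeapSys)

        // Block forever so the process stays up for pprof and ps.
        wg.Add(1)
        wg.Wait()
}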

There are typically two options you might choose when using pprof to look at memory.

One option is ‘-alloc_space’, which tells you how many megabytes have been allocated in total.

The other, ‘-inuse_space’, tells you how many are still in use.
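
The same kind of profile can also be written from inside the program with the runtime/pprof package, which helps when there is no HTTP endpoint to point pprof at. A minimal sketch, assuming a made-up helper heapProfile and an arbitrary output file name heap.pb.gz:

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfile runs a GC so the statistics are up to date, then returns the
// heap profile in the format go tool pprof reads.
// Hypothetical helper for illustration.
func heapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		fmt.Fprintln(os.Stderr, "heap profile failed:", err)
		os.Exit(1)
	}
	// heap.pb.gz is an arbitrary file name chosen for this sketch.
	if err := os.WriteFile("heap.pb.gz", data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to heap.pb.gz\n", len(data))
}
```

With recent Go releases the resulting file should be readable with either flag, for example go tool pprof -inuse_space heap.pb.gz or go tool pprof -alloc_space heap.pb.gz.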

We can launch pprof and point it at our in-app webserver to get the topk abusers.

Then if we want we can use list to see where some of that usage is coming from:

In Use

vagrant@vagrant-ubuntu-raring-64:~/blahdo$ go tool pprof -inuse_space blahdo http://localhost:6060/debug/pprof/heap

Fetching profile from http://localhost:6060/debug/pprof/heap

Saved profile in /home/vagrant/pprof/pprof.blahdo.localhost:6060.inuse_objects.inuse_space.025.pb.gz

Entering interactive mode (type "help" for commands)

(pprof) top5

190.75MB of 191.25MB total (99.74%)

Dropped 3 nodes (cum <= 0.96MB)

      flat  flat%   sum%        cum   cum%

  190.75MB 99.74% 99.74%   190.75MB 99.74%  main.main

         0     0% 99.74%   190.75MB 99.74%  runtime.goexit

         0     0% 99.74%   190.75MB 99.74%  runtime.main

(pprof) quit

Allocated

vagrant@vagrant-ubuntu-raring-64:~/blahdo$ go tool pprof -alloc_space blahdo http://localhost:6060/debug/pprof/heap

Fetching profile from http://localhost:6060/debug/pprof/heap

Saved profile in /home/vagrant/pprof/pprof.blahdo.localhost:6060.alloc_objects.alloc_space.027.pb.gz

Entering interactive mode (type "help" for commands)

(pprof) top5

572.25MB of 572.75MB total (99.91%)

Dropped 3 nodes (cum <= 2.86MB)

      flat  flat%   sum%        cum   cum%

  572.25MB 99.91% 99.91%   572.25MB 99.91%  main.main

         0     0% 99.91%   572.25MB 99.91%  runtime.goexit

         0     0% 99.91%   572.25MB 99.91%  runtime.main

Topk is nice, but what is nicer is the list command, which shows where the actual damage is being done in the context of the rest of the program.

(pprof) list

Total: 572.75MB

ROUTINE ======================== main.main in /home/vagrant/blahdo/main.go

  572.25MB   572.25MB (flat, cum) 99.91% of Total

         .          .     23:   var mem runtime.MemStats

         .          .     24:   runtime.ReadMemStats(&mem)

         .          .     25:   log.Println(mem.Alloc)

         .          .     26:

         .          .     27:   for i := 0; i < 10; i++ {

  572.25MB   572.25MB     28:           s := bigBytes()

         .          .     29:           if s == nil {

         .          .     30:                   log.Println("oh noes")

         .          .     31:           }

         .          .     32:   }

         .          .     33:

Those of you following at home have probably noticed quite a few differences in the memory usage being reported – why is that?

Let’s look at ps =>

vagrant@vagrant-ubuntu-raring-64:~$ ps aux | grep blahdo

vagrant   4817  0.2 10.7 699732 330524 pts/1   Sl+  00:13   0:00 ./blahdo

Now let’s look at our log output =>

vagrant@vagrant-ubuntu-raring-64:~/blahdo$ ./blahdo

2014/12/23 00:19:37 279672

2014/12/23 00:19:37 336152

2014/12/23 00:19:37 279672

2014/12/23 00:19:37 819200

2014/12/23 00:19:37 300209920

2014/12/23 00:19:37 1000420968

2014/12/23 00:19:37 300209920

2014/12/23 00:19:37 500776960

Finally - let’s look at using gctrace:

vagrant@vagrant-ubuntu-raring-64:~/blahdo$ GODEBUG=gctrace=1 ./blahdo

gc1(1): 1+0+95+0 us, 0 -> 0 MB, 21 (21-0) objects, 2 goroutines, 15/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc2(1): 0+0+81+0 us, 0 -> 0 MB, 52 (53-1) objects, 3 goroutines, 20/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc3(1): 0+0+77+0 us, 0 -> 0 MB, 151 (169-18) objects, 4 goroutines, 25/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc4(1): 0+0+110+0 us, 0 -> 0 MB, 325 (393-68) objects, 4 goroutines, 33/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc5(1): 0+0+138+0 us, 0 -> 0 MB, 351 (458-107) objects, 4 goroutines, 40/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

2014/12/23 02:27:14 277960

2014/12/23 02:27:14 332680

2014/12/23 02:27:14 277960

2014/12/23 02:27:14 884736

gc6(1): 1+0+181+0 us, 0 -> 95 MB, 599 (757-158) objects, 6 goroutines, 52/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc7(1): 1+0+454+19 us, 95 -> 286 MB, 438 (759-321) objects, 6 goroutines, 52/0/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc8(1): 1+0+167+0 us, 190 -> 477 MB, 440 (762-322) objects, 6 goroutines, 54/1/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

gc9(1): 2+0+191+0 us, 190 -> 477 MB, 440 (765-325) objects, 6 goroutines, 54/1/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

2014/12/23 02:27:14 300206864

2014/12/23 02:27:14 1000417040

2014/12/23 02:27:14 300206864

2014/12/23 02:27:14 500842496

GC forced

gc10(1): 3+0+1120+22 us, 190 -> 286 MB, 455 (789-334) objects, 6 goroutines, 54/31/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

scvg0: inuse: 96, idle: 381, sys: 477, released: 0, consumed: 477 (MB)

GC forced

gc11(1): 2+0+270+0 us, 95 -> 95 MB, 438 (789-351) objects, 6 goroutines, 54/39/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

scvg1: 0 MB released

scvg1: inuse: 96, idle: 381, sys: 477, released: 0, consumed: 477 (MB)

GC forced

gc12(1): 85+0+353+1 us, 95 -> 95 MB, 438 (789-351) objects, 6 goroutines, 54/37/0 sweeps, 0(0) handoff, 0(0) steal, 0/0/0 yields

These differences matter because most ops tools look at your application from the operating system’s point of view - which is not necessarily what is truly going on inside the runtime.

More options can be found in the runtime package

Short Answer:

RES - what the process has resident in RAM at that moment; it does not include anything that has not been paged in yet or has been paged out.
mem.Alloc - the bytes that were allocated and are still in use
mem.TotalAlloc - the bytes allocated over the lifetime of the process, including those already freed
mem.HeapAlloc - what’s being used on the heap right now (the same value as mem.Alloc, as the log output shows)
mem.HeapSys - heap memory obtained from the OS: what the heap is using plus what has been reclaimed but not handed back
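
The gap between mem.Alloc and mem.TotalAlloc is easy to reproduce in isolation. The sketch below allocates buffers, keeps none of them, and reports how much each counter moved; churn and sink are made-up names, and the sizes are arbitrary.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is package-level so every buffer is a real heap allocation.
var sink []byte

// churn allocates rounds buffers of size bytes each, keeps none of them, and
// returns how much Alloc and TotalAlloc changed.
// Hypothetical helper for illustration.
func churn(rounds, size int) (allocGrowth int64, totalGrowth uint64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	for i := 0; i < rounds; i++ {
		sink = make([]byte, size) // the previous buffer becomes garbage
	}
	sink = nil

	runtime.GC() // collect everything we just dropped
	runtime.ReadMemStats(&after)
	return int64(after.Alloc) - int64(before.Alloc),
		after.TotalAlloc - before.TotalAlloc
}

func main() {
	allocGrowth, totalGrowth := churn(10, 10<<20)
	fmt.Printf("Alloc changed by %d KiB\n", allocGrowth/1024)
	fmt.Printf("TotalAlloc grew by %d MiB\n", totalGrowth>>20)
}
```

TotalAlloc only ever goes up, so it is the counter to watch for allocation churn, while Alloc tracks what is live after collection.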

Further - it is important to note that with pprof you are only getting a sampling - not the true values.
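
The sampling is controlled by runtime.MemProfileRate. Setting it to 1 makes the profiler record every allocation, at a noticeable runtime cost, so it is better suited to a test run than to production. A minimal sketch; setExactHeapProfiling is a made-up helper name, and the rate should be changed once, as early in the program as possible.

```go
package main

import (
	"fmt"
	"runtime"
)

// setExactHeapProfiling makes the heap profiler record every allocation
// instead of a sample and returns the previous sampling rate.
// Hypothetical helper for illustration.
func setExactHeapProfiling() int {
	prev := runtime.MemProfileRate
	runtime.MemProfileRate = 1
	return prev
}

func main() {
	// Do this first, before the program starts allocating.
	prev := setExactHeapProfiling()
	fmt.Printf("previous rate: one sample per %d bytes allocated on average\n", prev)
	fmt.Println("current rate:", runtime.MemProfileRate)
}
```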

In general - when looking at this sort of stuff it’s best to not focus on the numbers but focus on the problem.

We at deferpanic believe in measuring everything but we feel “modern day” ops tools are horrible and focus on the effect of a problem but not the actual problem.

If your car won’t start you may think that is the problem, but it’s not. It’s not even the fact that the gas tank is empty. The real problem is that you did not put gas into the tank, and now you are noticing a stream of consequences from the original problem.

If you were just monitoring the RES output from ps for a go binary - it might tell you that there’s a problem but you have no clue what the problem is until you start deep diving. We want to fix that.

Edit:

The next paragraph is left un-edited. It was not written to degrade ops or devops people. The intent was to show the difference between app level metrics and os level metrics. We realize this was not written well and apologize. We feel that existing ‘ops’ tools don’t give the developer the full information needed to fix their problems.

We also feel that existing app level metric tools leave a lot to be desired.

Ops people play a very vital role and we are extremely thankful for all their work - indeed it’s the developers’ code that is messing things up - this is what we are looking at.

End Edit

Let the ops people have their 300+ graphs with a bajillion gauges and counters and meters and histograms. As people who actually write software we are more interested in finding the real solution by finding the real problem.
