Computer Architecture - Memory Optimization: vm + OOM
http://www.cnblogs.com/dkblog/archive/2011/09/06/2168721.html
https://www.kernel.org/doc/Documentation/vm/
Location of the memory tuning parameters:
[root@server1 vm]# pwd
/proc/sys/vm
[root@server1 vm]# ls
block_dump extfrag_threshold memory_failure_recovery numa_zonelist_order stat_interval
compact_memory extra_free_kbytes min_free_kbytes oom_dump_tasks swappiness
dirty_background_bytes hugepages_treat_as_movable min_slab_ratio oom_kill_allocating_task unmap_area_factor
dirty_background_ratio hugetlb_shm_group min_unmapped_ratio overcommit_memory vfs_cache_pressure
dirty_bytes laptop_mode mmap_min_addr overcommit_ratio would_have_oomkilled
dirty_expire_centisecs legacy_va_layout nr_hugepages page-cluster zone_reclaim_mode
dirty_ratio lowmem_reserve_ratio nr_hugepages_mempolicy panic_on_oom
dirty_writeback_centisecs max_map_count nr_overcommit_hugepages percpu_pagelist_fraction
drop_caches memory_failure_early_kill nr_pdflush_threads scan_unevictable_pages
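Any of these can be read or set either through /proc or with sysctl. A quick sketch using vm.swappiness (the values shown are placeholders, not recommendations):
$ cat /proc/sys/vm/swappiness                                 # read the current value
$ sudo sysctl -w vm.swappiness=10                             # change it at runtime (placeholder value)
$ echo 'vm.swappiness = 10' | sudo tee -a /etc/sysctl.conf    # make the change persistent
$ sudo sysctl -p                                              # reload /etc/sysctl.conf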
Per-process OOM settings (these files actually live under each process's directory, /proc/<PID>/):
[root@server1 ]# pwd
/proc/
[root@server1 ]# ls | grep oom
oom_adj
oom_score
oom_score_adj
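As a hedged example (the PIDs and values are placeholders), a process can be shielded from or exposed to the OOM killer through these files; oom_score_adj ranges from -1000 (never kill) to +1000 (kill first), and oom_adj is the older, coarser interface:
$ cat /proc/1234/oom_score                         # badness score the OOM killer currently assigns to PID 1234
$ echo -1000 | sudo tee /proc/1234/oom_score_adj   # exempt PID 1234 from the OOM killer
$ echo 500   | sudo tee /proc/5678/oom_score_adj   # make PID 5678 a preferred victim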
/proc/slabinfo
/proc/buddyinfo
/proc/zoneinfo
/proc/meminfo
[root@monitor /]# slabtop
Active / Total Objects (% used) : 347039 / 361203 (96.1%)
Active / Total Slabs (% used) : 24490 / 24490 (100.0%)
Active / Total Caches (% used) : 88 / 170 (51.8%)
Active / Total Size (% used) : 98059.38K / 99927.38K (98.1%)
Minimum / Average / Maximum Object : 0.02K / 0.28K / 4096.00K
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
115625 115344 99% 0.10K 3125 37 12500K buffer_head
73880 73437 99% 0.19K 3694 20 14776K dentry
42184 42180 99% 0.99K 10546 4 42184K ext4_inode_cache
20827 20384 97% 0.06K 353 59 1412K size-64
16709 13418 80% 0.05K 217 77 868K anon_vma_chain
15792 15708 99% 0.03K 141 112 564K size-32
11267 10323 91% 0.20K 593 19 2372K vm_area_struct
10806 10689 98% 0.64K 1801 6 7204K proc_inode_cache
9384 5232 55% 0.04K 102 92 408K anon_vma
7155 7146 99% 0.07K 135 53 540K selinux_inode_security
7070 7070 100% 0.55K 1010 7 4040K radix_tree_node
6444 6443 99% 0.58K 1074 6 4296K inode_cache
5778 5773 99% 0.14K 214 27 856K sysfs_dir_cache
3816 3765 98% 0.07K 72 53 288K Acpi-Operand
2208 2199 99% 0.04K 24 92 96K Acpi-Namespace
1860 1830 98% 0.12K 62 30 248K size-128
1440 1177 81% 0.19K 72 20 288K size-192
1220 699 57% 0.19K 61 20 244K filp
660 599 90% 1.00K 165 4 660K size-1024
[root@monitor xx]# cat /proc/meminfo |grep HugePage
AnonHugePages: kB
HugePages_Total:
HugePages_Free:
HugePages_Rsvd:
HugePages_Surp:      0
1. vi /etc/sysctl.conf and add:
vm.nr_hugepages = 10
2. sysctl -p
[root@monitor /]# cat /proc/meminfo |grep Huge
AnonHugePages: 2048 kB
HugePages_Total: 10
HugePages_Free: 10
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize:       2048 kB
3. Use the huge pages from an application:
[root@monitor /]# mkdir /hugepages
[root@monitor /]# mount -t hugetlbfs none /hugepages
[root@monitor /]# dd if=/dev/zero of=/hugepages/a.out bs=1M count=5
Huge pages: hugetlbfs support is built on top of the multiple page size support that is provided by most modern
architectures
Users can use the huge page support in Linux kernel by either using the mmap system call or
standard Sysv shared memory system calls (shmget, shmat)
cat /proc/meminfo | grep HugePage
Improving TLB performance: Kernel must usually flush TLB entries upon a context switch
Use free, contiguous physical pages
Automatically via the buddy allocator
/proc/buddyinfo
Manually via hugepages (not pageable)
Linux supports large sized pages through the hugepages mechanism
Sometimes known as bigpages, largepages or the hugetlbfs filesystem
Consequences
TLB cache hit more likely
Reduces PTE visit count
Tuning TLB performance
Check the size of huge pages:
x86info -a | grep "Data TLB"
dmesg
cat /proc/meminfo
Enable hugepages
1.In /etc/sysctl.conf
vm.nr_hugepages = n
2. Kernel parameter   // passed on the kernel command line at boot
hugepages=n
Configure hugetlbfs if needed by application
mmap system call requires that hugetlbfs is mounted
mkdir /hugepages
mount -t hugetlbfs none /hugepages
shmat and shmget system calls do not require hugetlbfs
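A minimal sketch of preparing the system for an application that creates SysV shared memory segments backed by huge pages (the page count and group ID are assumptions for illustration; the application itself must pass SHM_HUGETLB to shmget):
$ sudo sysctl -w vm.nr_hugepages=128            # reserve 128 huge pages (assumed count)
$ sudo sysctl -w vm.hugetlb_shm_group=1001      # group allowed to create SHM_HUGETLB segments (assumed GID)
$ grep Huge /proc/meminfo                       # confirm HugePages_Total / HugePages_Free
$ ipcs -m                                       # inspect shared memory segments once the application runs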
Trace every system call made by a program
strace -o /tmp/strace.out -p PID
grep mmap /tmp/strace.out
Summarize system calls
strace -c -p PID or
strace -c COMMAND
strace command
Other uses
Investigate lock contentions
Identify problems caused by improper file permissions
Pinpoint IO problems
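For instance, a hedged sketch of chasing the file-permission case mentioned above (myapp and the PID are placeholders):
$ strace -c -p 4321                                           # syscall count/time summary for a running process
$ strace -f -e trace=open,openat myapp 2>&1 | grep EACCES     # show opens that fail with permission errors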
Strategies for using memory
1. Reduce overhead for tiny memory objects
Slab cache
cat /proc/slabinfo
echo 'ext4_inode_cache 108 54 8' > /proc/slabinfo
2.Reduce or defer service time for slower subsystems
Filesystem metadata: buffer cache, slab cache   // caches file metadata
Disk IO: page cache                             // caches file data
Interprocess communications: shared memory
Network IO: buffer cache, arp cache, connection tracking
3.Considerations when tuning memory
How should pages be reclaimed to avoid pressure?
Larger writes are usually more efficient due to re-sorting
Memory parameter settings:
vm.min_free_kbytes:
1. If memory is exhausted completely the system can crash, so a reserve of free memory is kept.
2. When a process requests an allocation and free memory falls below the reserve, the kernel reclaims pages (swapping other memory out to SWAP if necessary) to make enough room for the request.
Tuning vm.min_free_kbytes should only be necessary when an application regularly needs to allocate a large block of memory, then frees that same memory.
When it matters:
It may well be the case that
the system has too little disk bandwidth,
too little CPU power, or
too little memory to handle its load.
Linux provides min_free_kbytes to set the threshold at which the system starts reclaiming memory, i.e. it controls how much memory stays free. The higher the value, the earlier the kernel starts reclaiming and the more free memory is kept.
http://www.cnblogs.com/itfriend/archive/2011/12/14/2287160.html
Consequences
Reduces service time for demand paging
Memory is not available for other usage
Can cause pressure on ZONE_NORMAL
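A small sketch of inspecting and raising the reserve (the value is purely illustrative):
$ cat /proc/sys/vm/min_free_kbytes            # current reserve, in KB
$ sudo sysctl -w vm.min_free_kbytes=65536     # keep roughly 64 MB free (illustrative value)
$ grep -E 'MemTotal|MemFree' /proc/meminfo    # observe the effect on free memory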
Case: a Linux server's memory usage exceeded its threshold and triggered an alert.
Troubleshooting: first, look at overall memory usage with the free command, which showed:
             total       used       free     shared    buffers     cached
Mem:
-/+ buffers/cache:
Swap:
From this, total memory is 24675796 KB, 22617644 KB is used, and only 2058152 KB is free. Next, run top and press shift + M to sort by memory: the largest process used only about 18 GB and every other process was negligible. So where did the remaining ~4 GB (22617644 KB minus 18 GB) go? Going further, cat /proc/meminfo showed nearly 4 GB ( KB) of Slab memory:
......
Mapped: kB
Slab: kB
PageTables: kB
......
Slab holds caches of kernel data structures. Use slabtop to see how this memory is used:
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
% .21K 3494744K dentry_cache
% .09K 33404K buffer_head
% .74K 120832K ext3_inode_cache
Most of it (roughly 3.5 GB) was being used by dentry_cache.
Resolution:
drop_caches
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches   [includes the buffer cache and page cache]
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches   [i.e. dentries and inodes are not counted as part of the buffer cache or page cache]
To free slab objects and pagecache: [frees everything]
echo 3 > /proc/sys/vm/drop_caches
http://www.kernel.org/doc/Documentation/sysctl/vm.txt
Note: run sync to flush data to disk before dropping caches.
Method 1 (writing to /proc) requires root privileges. If you are not root
but have sudo rights, you can use the sysctl command instead:
$ sync
$ sudo sysctl -w vm.drop_caches=3
$ sudo sysctl -w vm.drop_caches=   # restore drop_caches afterwards
After the operation, check whether it took effect with: sudo sysctl -a | grep drop_caches
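A minimal before/after sketch to see how much memory a drop actually releases (the numbers will vary per system):
$ free -m                            # note the buffers and cached columns before
$ sync                               # flush dirty data to disk first
$ sudo sysctl -w vm.drop_caches=3    # drop page cache, dentries and inodes
$ free -m                            # buffers/cached should now be much smaller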
Overcommitting physical memory relies on swap:   // avoid on database servers, because being swapped out is very slow
vm.overcommit_memory
0 = heuristic overcommit          // the kernel decides how much overcommit to allow
1 = always overcommit             // allocations are never refused up front
2 = strict accounting: commit all of swap plus a percentage of RAM (the percentage may be > 100)
    CommitLimit = SWAP + RAM * overcommit_ratio / 100
vm.overcommit_ratio:   // the percentage of physical RAM counted toward the commit limit; commonly 50% or less
Specifies the percentage of physical memory allowed to be overcommitted
when vm.overcommit_memory is set to 2
View Committed_AS in /proc/meminfo
An estimate of how much RAM is required to avoid an out of memory (OOM) condition
for the current workload on a system
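As an illustration, the current commitment can be compared against the commit limit straight from /proc/meminfo; the strict-overcommit settings below are examples, not recommendations:
$ grep -E 'CommitLimit|Committed_AS' /proc/meminfo   # commit limit vs. address space currently committed
$ sudo sysctl -w vm.overcommit_memory=2              # strict accounting: refuse commits beyond CommitLimit
$ sudo sysctl -w vm.overcommit_ratio=50              # count 50% of RAM toward the commit limit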
Slab cache
Tiny kernel objects are stored in the slab
The extra tracking overhead is cheaper than dedicating a full page to each object
Example: filesystem metadata (dentry and inode caches )
Monitoring
/proc/slabinfo
slabtop
vmstat -m
Tuning a particular slab cache
echo "cache_name limit batchcount shared" > /proc/slabinfo
limit the maximum number of objects that will be cached for each CPU
batchcount the maximum number of global cache objects that will be transferred to the per-CPU cache when it becomes empty
shared the sharing behavior for Symmetric MultiProcessing (SMP) systems
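For example, to watch one cache before tuning it (the echo line follows the format above with hypothetical limit/batchcount/shared values and only applies to kernels built with the classic SLAB allocator):
$ grep dentry /proc/slabinfo                               # object counts, size and slab usage for the dentry cache
$ vmstat -m | head                                         # the same information, summarized
$ sudo sh -c 'echo "dentry 256 64 8" > /proc/slabinfo'     # hypothetical limit/batchcount/shared values (root, SLAB only)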
ARP cache
ARP entries map hardware addresses to protocol addresses
1. Cached in /proc/net/arp
By default, the cache is limited to 512 entries as a soft limit and 1024 entries as a hard limit   // above 512, entries are pruned automatically
2. Garbage collection removes stale or older entries
[root@server1 proc]# cat /proc/net/arp
IP address HW type Flags HW address Mask Device
112.74.75.247 0x1 0x2 70:f9:6d:ee:67:af * eth1
10.24.223.247            0x1         0x2         70:f9:6d:ee:67:af     *        eth0
Insufficient ARP cache leads to
Intermittent timeouts between hosts
ARP thrashing
Too much ARP cache puts pressure on ZONE_NORMAL
List entries   // show cached ARP entries
ip neighbor list
Flush cache   // clear cached entries
ip neighbor flush dev ethX
Adjust where GC will leave the ARP table alone
net.ipv4.neigh.default.gc_thresh1
default 128   // below 128 entries, nothing is removed by GC, whether the entries are stale or not
Soft upper limit   // soft limit: above 512 entries, entries older than 5 seconds can be removed
net.ipv4.neigh.default.gc_thresh2
default 512
Becomes a hard limit after 5 seconds
Hard upper limit   // hard limit
net.ipv4.neigh.default.gc_thresh3
Garbage collection frequency in seconds   // how often GC runs; above gc_thresh1, stale entries (entries expire after about 5 minutes) are removed
net.ipv4.neigh.default.gc_interval
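A sketch of raising the ARP cache limits on a large layer-2 segment (the values are illustrative, not recommendations):
$ ip neighbor list | wc -l                                  # how many entries are cached right now
$ sudo sysctl -w net.ipv4.neigh.default.gc_thresh1=512      # GC leaves the table alone below this count
$ sudo sysctl -w net.ipv4.neigh.default.gc_thresh2=2048     # soft limit (illustrative)
$ sudo sysctl -w net.ipv4.neigh.default.gc_thresh3=4096     # hard limit (illustrative)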
page cache:
A large percentage of paging activity is due to I/O
File reads: each page of file read from disk into memory
These pages form the page cache
Page cache is always checked for IO requests
Directory reads
Reading and writing regular files
Reading and writing via block device files, DISK IO
Accessing memory mapped files, mmap
Accessing swapped out pages
Pages in the page cache are associated with file data
Tuning the page cache:
View page cache allocation in /proc/meminfo
Tune length/size of memory
vm.lowmem_reserve_ratio
vm.vfs_cache_pressure
Tune arrival/completion rate
vm.page-cluster
vm.zone_reclaim_mode
vfs_cache_pressure
Controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects
1. At the default value of vfs_cache_pressure=100,
the kernel will attempt to reclaim dentries and inodes at a "fair" rate with respect to pagecache and swapcache reclaim
2. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches
3. When vfs_cache_pressure=0, the kernel will never reclaim dentries and inodes due to memory pressure and this can easily lead to out-of-memory conditions
4. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes
Summary:
0: never reclaim dentries and inodes
1-99: prefer to retain dentries and inodes
100: reclaim dentries and inodes in balance with page cache and swap cache reclaim
100+: prefer to reclaim dentries and inodes
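For example (the values are illustrative): a file server that benefits from cached metadata might lower the pressure, while a box short on memory might raise it:
$ cat /proc/sys/vm/vfs_cache_pressure        # default is 100
$ sudo sysctl -w vm.vfs_cache_pressure=50    # prefer to keep dentries and inodes cached
$ sudo sysctl -w vm.vfs_cache_pressure=200   # reclaim dentries and inodes more aggressively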
page-cluster
1.page-cluster controls the number of pages which are written to swap in a single attempt
2. It is a logarithmic value - setting it to zero means "1 page", setting it to 1 means "2 pages",
setting it to 2 means "4 pages", etc.
3. The default value is three (eight pages at a time)
4. There may be some small benefit in tuning this to a different value if your workload is swap-intensive   // 2^n pages are written to SWAP per attempt; mainly relevant when the system swaps heavily (virtualization, cloud environments)
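A one-line illustration; the value 0 is an assumption sometimes used with fast random-access swap devices (SSD, zram), not a general recommendation:
$ sudo sysctl -w vm.page-cluster=0    # swap one page per attempt instead of the default eight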
zone_reclaim_mode:
Zone_reclaim_mode allows someone to set more or less aggressive approaches to reclaim memory
when a zone runs out of memory
If it is set to zero then no zone reclaim occurs
Allocations will be satisfied from other zones / nodes in the system
This value is a bitmask ORed together from:
1 = Zone reclaim on                        // enable zone reclaim
2 = Zone reclaim writes dirty pages out    // reclaim by writing out dirty pages
4 = Zone reclaim swaps pages
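A short sketch for a NUMA machine (the settings are illustrative; whether to enable zone reclaim depends on whether local memory latency matters more than keeping caches):
$ numactl --hardware                       # show NUMA nodes and per-node free memory (if numactl is installed)
$ cat /proc/sys/vm/zone_reclaim_mode       # current mode
$ sudo sysctl -w vm.zone_reclaim_mode=0    # never reclaim within a zone; fall back to other zones/nodes
$ sudo sysctl -w vm.zone_reclaim_mode=3    # 1|2: reclaim locally and write out dirty pages (illustrative)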
Anonymous pages:
Anonymous pages can be another large consumer of memory
Are not associated with a file, but instead contain:
Program data – arrays, heap allocations, etc.   // open file data, by contrast, lives in the page cache
Anonymous memory regions
Dirty memory mapped process private pages
IPC shared memory region pages
View summary usage
grep Anon /proc/meminfo
cat /proc/PID/statm
Anonymous pages = RSS - Shared
Anonymous pages are eligible for swap; because they have no backing file, swapping them out is the only way to evict them
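A small sketch of the RSS - Shared calculation above for a single process (the PID is a placeholder; /proc/<PID>/statm reports page counts, and the conversion assumes 4 KB pages):
$ grep Anon /proc/meminfo        # system-wide anonymous memory
$ awk '{ printf "anon pages: %d (%.1f MB)\n", $2 - $3, ($2 - $3) * 4096 / 1048576 }' /proc/1234/statm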