## 1. The Problem

When I tried to connect to my company's production database with Navicat, it reported "Too many connections" because the number of concurrent connections had reached the configured upper limit. I had planned to add the Keepalive option to the connection string to work around this, but that caused a failure of its own: the application could no longer connect to the database. The option is not supported when .NET Core runs on Linux. Linux manages long-lived TCP connections at the operating-system level, so MySQL.Data.dll cannot handle them itself.

## 2. The Consideration of AWS

When using AWS Aurora for MySQL, the default value of max_connections can be seen in the parameter group. Typically it is GREATEST({log(DBInstanceClassMemory/805306368)*45},{log(DBInstanceClassMemory/8187281408)*1000}) for the test environment and GREATEST({log(DBInstanceClassMemory/805306368,2)*45},{log(DBInstanceClassMemory/8187281408,2)*1000}) for the production environment.

Written out, these correspond to the following two expressions:

\[\max\left[\left(\ln\frac{InstanceMemory}{805306368}\right)\cdot45,\ \left(\ln\frac{InstanceMemory}{8187281408}\right)\cdot1000\right]
\]

and

\[\max\left[\left(\log_2\frac{InstanceMemory}{805306368}\right)\cdot45,\ \left(\log_2\frac{InstanceMemory}{8187281408}\right)\cdot1000\right]
\]
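The two expressions can be evaluated directly. The following is a minimal sketch that follows the reading of the formula given above; the function name `default_max_connections` is made up for illustration, the result is left unrounded because the text does not say how Aurora rounds it, and the memory value passed in is assumed to be whatever DBInstanceClassMemory resolves to on the instance.

```python
import math

C_SMALL = 805306368    # bytes, constant in the first term
C_LARGE = 8187281408   # bytes, constant in the second term


def default_max_connections(memory_bytes: float, base: float = math.e) -> float:
    """Evaluate GREATEST(log(M/C_SMALL)*45, log(M/C_LARGE)*1000) for one base."""
    small_term = math.log(memory_bytes / C_SMALL, base) * 45
    large_term = math.log(memory_bytes / C_LARGE, base) * 1000
    return max(small_term, large_term)


if __name__ == "__main__":
    four_gib = 4 * 1024**3
    # Natural-log variant and base-2 variant for a 4 GiB instance.
    print(round(default_max_connections(four_gib, math.e), 1))
    print(round(default_max_connections(four_gib, 2), 1))
```

For a 4 GiB instance the second term is negative, so only the first term contributes to the result.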

What do the two constants mean, and why is the parameter expressed with a logarithmic function? Let's take a look.

\[805306368\ \text{bytes}\div1024\div1024=768\ \text{MiB}=0.75\ \text{GiB}
\]

\[8187281408\ \text{bytes}\div1024\div1024=7808\ \text{MiB}=7.625\ \text{GiB}
\]

Both formulas share the same basic form:

\[k\log_n\frac{M}{C}
\]

This is a logarithmic function in which the coefficient k scales the result of the logarithm. For a fixed base greater than 1, the logarithm increases monotonically with its argument M/C. Since C is a constant, a larger memory size M yields a larger connection limit, which matches our intuition.

Next, let's check a special value.

If M equals C, the argument of the logarithm is 1 and the result is 0, whatever the base and the coefficient are. Looking at the value of C, this tells us that the instance memory must be greater than 0.75 GiB (768 MiB) before the formula produces a meaningful, positive value.
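Written out as a worked step, assuming n > 1 and k > 0 as in both AWS formulas, the special case and the condition it implies are:

\[k\log_n\frac{M}{C}\Big|_{M=C}=k\log_n 1=0,\qquad k\log_n\frac{M}{C}>0\iff M>C
\]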

Based on the information above, we can plot the functions as follows:

Fig.1 The Function Graph Corresponding to the Formulas

The X-axis is the memory size in GiB, and the Y-axis is the max_connections value calculated by the formulas. The red lines use the coefficient 45 and the green lines use the coefficient 1000; the solid lines use base e (2.71828...) and the dashed lines use base 2. Note that the formula takes the greater of its two terms as the result, so for each line style we only need to look at whichever curve is higher.

For both the solid pair and the dashed pair, the two curves intersect at a memory size of about 8.5 GiB. In other words, AWS treats roughly 8 GiB of memory as a threshold. When the memory size is greater than 0.75 GiB but below the threshold, max_connections only ranges from 0 to about 109 (base e) or from 0 to about 157 (base 2). Once the memory size passes the threshold, max_connections rises quickly: when the memory size is doubled to 16 GiB, max_connections reaches approximately 740 and 1070 respectively.

Another inherent property, which comes from the logarithmic form, is that the growth rate gradually decreases as the memory size increases linearly.
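The intersection point can also be computed rather than read off the graph. The sketch below sets the two terms equal and solves for the memory size; the base cancels out, which is why the solid and dashed pairs cross at the same place. The helper names `crossover_gib` and `max_connections` are illustrative only, and the values are unrounded results of the formula as read in this article.

```python
import math

GIB = 1024**3
C1, K1 = 805306368 / GIB, 45      # 0.75 GiB term
C2, K2 = 8187281408 / GIB, 1000   # 7.625 GiB term


def crossover_gib() -> float:
    """Memory size (GiB) where K1*log(M/C1) equals K2*log(M/C2); the base cancels."""
    log_m = (K2 * math.log(C2) - K1 * math.log(C1)) / (K2 - K1)
    return math.exp(log_m)


def max_connections(memory_gib: float, base: float) -> float:
    """Greater of the two terms for a given memory size and logarithm base."""
    return max(K1 * math.log(memory_gib / C1, base),
               K2 * math.log(memory_gib / C2, base))


if __name__ == "__main__":
    m = crossover_gib()
    print(round(m, 2))
    for base in (math.e, 2):
        print(round(max_connections(m, base)), round(max_connections(16, base)))
```

Running it gives a crossover near 8.5 GiB, roughly 109 and 158 connections at that point, and roughly 741 and 1069 at 16 GiB, which roughly reproduces the figures read from the graph.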

## 3. The Calculation of Requirements
How do we determine an appropriate max_connections value? Does the AWS default policy suit my situation?

A database typically runs on a server instance, and the server has a fixed upper limit on memory, only part of which is available to the database. Looking at how MySQL works, it needs one portion of memory for global structures and another portion for each query execution thread. Each connection therefore consumes a certain amount of memory, that is:

\[\text{Available RAM} = \text{Global Buffers} + (\text{Thread Buffers} \times max\_connections)
\]

So,

\[max\_connections =\frac {\text{Available RAM} - \text{Global Buffers}} {\text{Thread Buffers}}
\]
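The rearranged equation translates into a few lines of code. This is only a sketch of the arithmetic: the function name `max_connections_budget` and the sizes in the example are hypothetical and were chosen to keep the numbers easy to follow.

```python
def max_connections_budget(available_ram: int, global_buffers: int,
                           thread_buffers: int) -> int:
    """Solve Available RAM = Global Buffers + Thread Buffers * max_connections."""
    if thread_buffers <= 0:
        raise ValueError("thread_buffers must be positive")
    remaining = available_ram - global_buffers
    # A negative remainder means the global buffers alone exceed the budget.
    return max(remaining // thread_buffers, 0)


MIB = 1024**2
# Hypothetical sizes, chosen only to illustrate the arithmetic.
example = max_connections_budget(2048 * MIB, 1024 * MIB, 10 * MIB)
print(example)  # 102
```

Integer division rounds down, which is the safe direction here, because a fractional connection cannot be served.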

The Global Buffers include key_buffer_size, innodb_buffer_pool_size, innodb_log_buffer_size, innodb_additional_mem_pool_size, net_buffer_size, and query_cache_size. The Thread Buffers include sort_buffer_size, myisam_sort_buffer_size, read_buffer_size, join_buffer_size, read_rnd_buffer_size, and thread_stack. All of these values can be checked with the following command:

    SHOW VARIABLES
    WHERE Variable_name LIKE '%buffer_size%'
    OR Variable_name LIKE '%pool_size%'
    OR Variable_name LIKE '%cache_size%'
    OR Variable_name LIKE '%thread_stack%';

On my server, the command returned the following values:

| Variable | Value (bytes) | Value (MiB) |
| ----- | ----- | ---- |
|key_buffer_size|16777216|16|
|innodb_buffer_pool_size|1586495488|1513|
|innodb_log_buffer_size|16777216|16|
|innodb_additional_mem_pool_size|N/A|N/A|
|net_buffer_size|N/A|N/A|
|query_cache_size|88158208|84|
|Global Buffers Total|1708208128|1629|
||||
|sort_buffer_size|262144|0.25|
|myisam_sort_buffer_size|8388608|8|
|read_buffer_size|262144|0.25|
|join_buffer_size|262144|0.25|
|read_rnd_buffer_size|524288|0.5|
|thread_stack|262144|0.25|
|Thread Buffers Total|9961472|9.5|

That is, with the default settings each connection uses approximately 10 MiB of memory.
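The totals in the table can be checked by summing the individual values. The snippet below copies the byte values from the table and leaves out the two variables reported as N/A; the dictionary names are illustrative.

```python
MIB = 1024**2

# Values in bytes as listed in the table; variables reported as N/A are left out.
GLOBAL_BUFFERS = {
    "key_buffer_size": 16777216,
    "innodb_buffer_pool_size": 1586495488,
    "innodb_log_buffer_size": 16777216,
    "query_cache_size": 88158208,
}
THREAD_BUFFERS = {
    "sort_buffer_size": 262144,
    "myisam_sort_buffer_size": 8388608,
    "read_buffer_size": 262144,
    "join_buffer_size": 262144,
    "read_rnd_buffer_size": 524288,
    "thread_stack": 262144,
}

global_total = sum(GLOBAL_BUFFERS.values())
thread_total = sum(THREAD_BUFFERS.values())
print(global_total, round(global_total / MIB))  # 1708208128 1629
print(thread_total, thread_total / MIB)         # 9961472 9.5
```

The per-connection total of 9.5 MiB is the figure rounded to 10 MiB in the text.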

For a real-world reference, the following figure shows measurements from my servers. The two lines represent the writer and the reader (read-write separation), the Y-axis shows the free memory size, and each server has 4 GiB of memory in total.

Fig.2 The Actual Free Memory Size

As the figure shows, 1 to 1.5 GiB of memory is free. At about 10 MiB per connection, that is enough for roughly 100 to 150 connections. According to Figure 1, an instance with 4 GiB of memory gets a max_connections of about 75 or higher, so the computed result and the measured result match quite well. This conclusion also explains why AWS designed the formula the way it did.
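The comparison can be made concrete with a short calculation. This sketch divides the observed free memory by the per-connection total of 9961472 bytes from the table and sets it against the first term of the AWS formula at 4 GiB, which is the only term that matters at that memory size. The function names are illustrative, and the free-memory bounds are the approximate values read from Figure 2.

```python
import math

GIB = 1024**3
THREAD_BUFFERS_TOTAL = 9961472  # bytes per connection, from the table above


def connections_from_free_memory(free_bytes: int) -> int:
    """How many connections the free memory could hold at the default buffer sizes."""
    return free_bytes // THREAD_BUFFERS_TOTAL


def formula_small_term(memory_bytes: int, base: float) -> float:
    """First term of the AWS formula: 45 * log(M / 805306368)."""
    return 45 * math.log(memory_bytes / 805306368, base)


low = connections_from_free_memory(1 * GIB)        # 1 GiB free
high = connections_from_free_memory(3 * GIB // 2)  # 1.5 GiB free
print(low, high)  # 107 161
print(round(formula_small_term(4 * GIB, math.e)), round(formula_small_term(4 * GIB, 2)))  # 75 109
```

Both estimates land in the same range of roughly 75 to 160 connections for a 4 GiB instance.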

## 4. The Adjustment of Parameters

Return to the basic form of the logarithmic function:

\[k\log_n\frac{M}{C}
\]

Note that C is the base point that determines how much memory is needed before the formula yields a meaningful value. If you want to reserve more fixed memory for the OS or the Global Buffers, you can raise this value, for example to 1024 MiB or higher, so that the database only starts accepting connections above that size. C can also serve as a watershed that separates ordinary memory sizes from very large memory sizes.
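To get a feel for the effect of C, the sketch below evaluates the basic form for a 4 GiB instance with k = 45 and base e while C is raised from 768 MiB to 1024 MiB and 2048 MiB. The function name `scaled_log` and the choice of test values are assumptions made for illustration.

```python
import math

MIB = 1024**2


def scaled_log(memory_bytes: float, k: float, n: float, c_bytes: float) -> float:
    """Evaluate k * log_n(M / C); values at or below zero mean no usable capacity."""
    return k * math.log(memory_bytes / c_bytes, n)


memory = 4096 * MIB
for c_mib in (768, 1024, 2048):
    # Prints 75.3, 62.4 and 31.2 for the three values of C.
    print(c_mib, round(scaled_log(memory, 45, math.e, c_mib * MIB), 1))
```

Raising C shifts the whole curve to the right, so the same instance is granted fewer connections.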

Next, consider the base n and the coefficient k. Over a wide range (for k = 1 and y > 0), the base roughly determines the slope of the curve, and k amplifies that slope. So, to adjust the ratio between memory size and max_connections, we can tune n and k. A larger n means that max_connections grows only slightly even when the memory size grows a lot, and combining it with the coefficient k can make the result look nearly linear over a limited range; the reverse also holds.
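One way to see how n and k interact is the change-of-base identity, which rewrites the basic form with a natural logarithm and gives its slope with respect to M:

\[k\log_n\frac{M}{C}=\frac{k}{\ln n}\ln\frac{M}{C},\qquad \frac{d}{dM}\left(k\log_n\frac{M}{C}\right)=\frac{k}{M\ln n}
\]

Both parameters therefore act through the single factor k/ln n. For example, base 2 with k = 45 describes the same curve as base e with k of about 64.9, since 45/ln 2 is approximately 64.9.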
