In our ELK-based database alerting system, one machine reported the following error:

2018-12-04 18:55:26.842 CST,"XXX","XXX",21106,"XXX",5c065c3d.5272,4,"idle",2018-12-04 18:51:41 CST,117/0,0,ERROR,54000,"out of memory","Cannot enlarge string buffer containing 0 bytes by 1342177281 more bytes.",,,,,,,"enlargeStringInfo, stringinfo.c:268",""

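Seeing "out of memory", I first assumed the whole database instance was in trouble, but a check of the production instance showed it was running normally. Further reading showed that PostgreSQL limits the length of a single query statement: if it exceeds 1 GB, the error above is raised.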

The 1342177281 bytes in the log above is the length of the query, roughly 1.25 GiB and therefore over the limit.
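To put the number from the log in perspective, the short Python sketch below compares it with PostgreSQL's MaxAllocSize constant (0x3fffffff, quoted from memutils.h later in this post). The helper name `describe` is made up for illustration only.

```python
GIB = 1024 ** 3
MAX_ALLOC_SIZE = 0x3FFFFFFF  # MaxAllocSize from memutils.h: 1 GiB - 1

# "by 1342177281 more bytes" in the log line above
requested = 1342177281


def describe(n: int) -> str:
    """Format a byte count together with its size in GiB."""
    return f"{n} bytes = {n / GIB:.2f} GiB"


print(describe(requested))
print(describe(MAX_ALLOC_SIZE))
print("over the limit by", requested - MAX_ALLOC_SIZE, "bytes")
```

The request is about 256 MiB above the limit, so it is refused before any memory is actually allocated.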

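Similar errors are common when using COPY. In that case, use the error message to locate the offending line and check whether a quoting or escaping problem kept that line from being terminated properly, or whether a single line really is larger than 1 GB.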
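A rough way to pre-screen a CSV file before running COPY is sketched below in Python. It is only a heuristic and rests on assumptions: the file uses `"` as the quote character, escaped quotes are doubled, and quoted fields do not legitimately span several lines. The function name `suspicious_lines` and the sample rows are hypothetical.

```python
MAX_LINE_BYTES = 0x3FFFFFFF  # assumption: reuse the 1 GiB - 1 limit as the threshold


def suspicious_lines(lines, quote='"', max_bytes=MAX_LINE_BYTES):
    """Yield (line_number, reason) for lines worth inspecting by hand."""
    for number, line in enumerate(lines, start=1):
        # With doubled-quote escaping, a well-formed single-line record
        # contains an even number of quote characters.
        if line.count(quote) % 2 == 1:
            yield number, "odd number of quote characters"
        if len(line.encode("utf-8")) >= max_bytes:
            yield number, "line is at or above the size limit"


# Made-up sample data: the second row has an unterminated quoted field.
sample = ['1,"alice","ok"\n', '2,"bob,broken\n', '3,"carol","ok"\n']
for number, reason in suspicious_lines(sample):
    print(number, reason)
```

For a real file, an open file object can be passed instead of the list, since iterating over it yields one line at a time.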

Below is the relevant code found in the PostgreSQL 9.6 source.

With the comments, the PostgreSQL source is easy to follow.

src/include/utils/memutils.h

/*
 * MaxAllocSize, MaxAllocHugeSize
 *      Quasi-arbitrary limits on size of allocations.
 *
 * Note:
 *      There is no guarantee that smaller allocations will succeed, but
 *      larger requests will be summarily denied.
 *
 * palloc() enforces MaxAllocSize, chosen to correspond to the limiting size
 * of varlena objects under TOAST. See VARSIZE_4B() and related macros in
 * postgres.h. Many datatypes assume that any allocatable size can be
 * represented in a varlena header. This limit also permits a caller to use
 * an "int" variable for an index into or length of an allocation. Callers
 * careful to avoid these hazards can access the higher limit with
 * MemoryContextAllocHuge(). Both limits permit code to assume that it may
 * compute twice an allocation's size without overflow.
 */
#define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */

src/backend/lib/stringinfo.c

/*
 * enlargeStringInfo
 *
 * Make sure there is enough space for 'needed' more bytes
 * ('needed' does not include the terminating null).
 *
 * External callers usually need not concern themselves with this, since
 * all stringinfo.c routines do it automatically. However, if a caller
 * knows that a StringInfo will eventually become X bytes large, it
 * can save some palloc overhead by enlarging the buffer before starting
 * to store data in it.
 *
 * NB: because we use repalloc() to enlarge the buffer, the string buffer
 * will remain allocated in the same memory context that was current when
 * initStringInfo was called, even if another context is now current.
 * This is the desired and indeed critical behavior!
 */
void
enlargeStringInfo(StringInfo str, int needed)
{
    int         newlen;

    /*
     * Guard against out-of-range "needed" values. Without this, we can get
     * an overflow or infinite loop in the following.
     */
    if (needed < 0)             /* should not happen */
        elog(ERROR, "invalid string enlargement request size: %d", needed);
    if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("out of memory"),
                 errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
                           str->len, needed)));

    needed += str->len + 1;     /* total space required now */

    /* Because of the above test, we now have needed <= MaxAllocSize */

    if (needed <= str->maxlen)
        return;                 /* got enough space already */

    /*
     * We don't want to allocate just a little more space with each append;
     * for efficiency, double the buffer size each time it overflows.
     * Actually, we might need to more than double it if 'needed' is big...
     */
    newlen = 2 * str->maxlen;
    while (needed > newlen)
        newlen = 2 * newlen;

    /*
     * Clamp to MaxAllocSize in case we went past it. Note we are assuming
     * here that MaxAllocSize <= INT_MAX/2, else the above loop could
     * overflow. We will still have newlen >= needed.
     */
    if (newlen > (int) MaxAllocSize)
        newlen = (int) MaxAllocSize;

    str->data = (char *) repalloc(str->data, newlen);

    str->maxlen = newlen;
}
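The size logic of enlargeStringInfo can be replayed outside the server. The following Python sketch is a simplified model of the C function above, not PostgreSQL code: `new_maxlen` and `BufferTooLarge` are hypothetical names, the exception stands in for the ERROR raised through ereport(), and `maxlen` is assumed to be positive, as it is for any initialised StringInfo.

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF  # MaxAllocSize: 1 GiB - 1


class BufferTooLarge(Exception):
    """Stands in for the ERROR that ereport() raises in the C code."""


def new_maxlen(length: int, maxlen: int, needed: int) -> int:
    """Return the allocated size the buffer would have after the call."""
    if needed < 0:
        raise ValueError(f"invalid string enlargement request size: {needed}")
    if needed >= MAX_ALLOC_SIZE - length:
        raise BufferTooLarge(
            f"Cannot enlarge string buffer containing {length} bytes "
            f"by {needed} more bytes."
        )
    needed += length + 1  # total space required, including the terminator
    if needed <= maxlen:
        return maxlen  # enough space already
    newlen = 2 * maxlen
    while needed > newlen:  # keep doubling until the request fits
        newlen *= 2
    return min(newlen, MAX_ALLOC_SIZE)  # clamp in case doubling overshot


print(new_maxlen(1000, 1024, 5000))
try:
    new_maxlen(0, 1024, 1342177281)  # the values from the log line
except BufferTooLarge as exc:
    print("ERROR:", exc)
```

The second call reproduces the detail message from the log: the request is rejected by the size check alone, before repalloc() would ever be reached.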

src/include/lib/stringinfo.h

This is the struct used to store strings:

/*-------------------------
 * StringInfoData holds information about an extensible string.
 *      data    is the current buffer for the string (allocated with palloc).
 *      len     is the current string length. There is guaranteed to be
 *              a terminating '\0' at data[len], although this is not very
 *              useful when the string holds binary data rather than text.
 *      maxlen  is the allocated size in bytes of 'data', i.e. the maximum
 *              string size (including the terminating '\0' char) that we can
 *              currently store in 'data' without having to reallocate
 *              more space. We must always have maxlen > len.
 *      cursor  is initialized to zero by makeStringInfo or initStringInfo,
 *              but is not otherwise touched by the stringinfo.c routines.
 *              Some routines use it to scan through a StringInfo.
 *-------------------------
 */
typedef struct StringInfoData
{
    char       *data;
    int         len;
    int         maxlen;
    int         cursor;
} StringInfoData;

typedef StringInfoData *StringInfo;

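The StringInfoData struct, which holds string or binary data, also shows why PostgreSQL string types do not support \u0000: strings in PostgreSQL are C strings, terminated by \0. This character is called NUL in ASCII, is written \u0000 as a Unicode escape, and is 0x00 in hexadecimal. If a string contains \0, PostgreSQL treats it as the end of the string.

PostgreSQL strings therefore cannot contain the NUL character (0x00). This is clearly different from the SQL NULL value, which PostgreSQL does support.

In practice, strip \u0000 from the data before importing it into PostgreSQL.

When importing from another database into PostgreSQL, it can be replaced like this: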

regexp_replace(stringWithNull, '\\u0000', '', 'g')

In a Java program:

str = str.replaceAll("\u0000", "");

In vim:

:%s/\%x00//g
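The same cleanup can be done in a script that prepares rows before loading them. This Python sketch assumes each row arrives as a list of values; `strip_nul` and `strip_nul_from_row` are illustrative names, not part of any PostgreSQL driver.

```python
def strip_nul(text: str) -> str:
    """Remove every NUL character (U+0000) from a string."""
    return text.replace("\x00", "")


def strip_nul_from_row(row):
    """Clean the string values of a row; leave None and other types untouched."""
    return [strip_nul(value) if isinstance(value, str) else value for value in row]


# Made-up example row: a string with an embedded NUL, a missing value, a number.
row = ["ab\x00c", None, 42]
print(strip_nul_from_row(row))
```

None is passed through unchanged, so a missing value is still loaded as an SQL NULL rather than as an empty string.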

References:

src/backend/lib/stringinfo.c

src/include/lib/stringinfo.h

src/include/utils/memutils.h

https://en.wikipedia.org/wiki/Null-terminated_string

https://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0?rq=1
