在ELK的数据库报警系统中,发现有台机器报出了下面的错误:

2018-12-04 18:55:26.842 CST,"XXX","XXX",21106,"XXX",5c065c3d.5272,4,"idle",2018-12-04 18:51:41 CST,117/0,0,ERROR,54000,"out of memory","Cannot enlarge string buffer containing 0 bytes by 1342177281 more bytes.",,,,,,,"enlargeStringInfo, stringinfo.c:268",""

当看到是发生了OOM时,以为是整个数据库实例存在了问题,线上检查发现数据库正常,后查阅资料了解到,pg对于一次执行的查询语句长度是有限制的,如果长度超过了1G,则会报出上面的错误。

上面日志中的1342177281 bytes是查询的长度。

在使用copy的时候,也常会报出类似的问题,此时就要根据报错,查看对应的行数是不是由于引号或转义问题导致了对应行没有恰当的结束,或者是一整行的内容大于了1G。

下面是翻阅pg9.6源码找到的相关内容:

结合注释,pg的源码很容易看懂。

src/include/utils/memutils.h

/*
* MaxAllocSize, MaxAllocHugeSize
* Quasi-arbitrary limits on size of allocations.
*
* Note:
* There is no guarantee that smaller allocations will succeed, but
* larger requests will be summarily denied.
*
* palloc() enforces MaxAllocSize, chosen to correspond to the limiting size
* of varlena objects under TOAST. See VARSIZE_4B() and related macros in
* postgres.h. Many datatypes assume that any allocatable size can be
* represented in a varlena header. This limit also permits a caller to use
* an "int" variable for an index into or length of an allocation. Callers
* careful to avoid these hazards can access the higher limit with
* MemoryContextAllocHuge(). Both limits permit code to assume that it may
* compute twice an allocation's size without overflow.
*/
#define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */

src/backend/lib/stringinfo.c

/*
* enlargeStringInfo
*
* Make sure there is enough space for 'needed' more bytes
* ('needed' does not include the terminating null).
*
* External callers usually need not concern themselves with this, since
* all stringinfo.c routines do it automatically. However, if a caller
* knows that a StringInfo will eventually become X bytes large, it
* can save some palloc overhead by enlarging the buffer before starting
* to store data in it.
*
* NB: because we use repalloc() to enlarge the buffer, the string buffer
* will remain allocated in the same memory context that was current when
* initStringInfo was called, even if another context is now current.
* This is the desired and indeed critical behavior!
*/
void
enlargeStringInfo(StringInfo str, int needed)
{
int newlen; /*
* Guard against out-of-range "needed" values. Without this, we can get
* an overflow or infinite loop in the following.
*/
if (needed < 0) /* should not happen */
elog(ERROR, "invalid string enlargement request size: %d", needed);
if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("out of memory"),
errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
str->len, needed))); needed += str->len + 1; /* total space required now */ /* Because of the above test, we now have needed <= MaxAllocSize */ if (needed <= str->maxlen)
return; /* got enough space already */ /*
* We don't want to allocate just a little more space with each append;
* for efficiency, double the buffer size each time it overflows.
* Actually, we might need to more than double it if 'needed' is big...
*/
newlen = 2 * str->maxlen;
while (needed > newlen)
newlen = 2 * newlen; /*
* Clamp to MaxAllocSize in case we went past it. Note we are assuming
* here that MaxAllocSize <= INT_MAX/2, else the above loop could
* overflow. We will still have newlen >= needed.
*/
if (newlen > (int) MaxAllocSize)
newlen = (int) MaxAllocSize; str->data = (char *) repalloc(str->data, newlen); str->maxlen = newlen;
}

src/include/lib/stringinfo.h

下面是字符串存储用到的结构体:

/*-------------------------
* StringInfoData holds information about an extensible string.
* data is the current buffer for the string (allocated with palloc).
* len is the current string length. There is guaranteed to be
* a terminating '\0' at data[len], although this is not very
* useful when the string holds binary data rather than text.
* maxlen is the allocated size in bytes of 'data', i.e. the maximum
* string size (including the terminating '\0' char) that we can
* currently store in 'data' without having to reallocate
* more space. We must always have maxlen > len.
* cursor is initialized to zero by makeStringInfo or initStringInfo,
* but is not otherwise touched by the stringinfo.c routines.
* Some routines use it to scan through a StringInfo.
*-------------------------
*/
typedef struct StringInfoData
{
char *data;
int len;
int maxlen;
int cursor;
} StringInfoData; typedef StringInfoData *StringInfo;

从存放字符串或二进制的结构体StringInfoData中,可以看出pg字符串类型不支持\u0000的原因,因为在pg中的字符串形式是C strings,是以\0结束的字符串,\0在ASCII中叫做NUL,Unicode编码表示为\u0000,八进制则为0x00,如果字符串中包含\0,pg会当做字符串的结束符。

pg中的字符串不支持其中包含NULL(\0x00),这个很明显是不同于NULL值的,NULL值pg是支持的。

在具体的使用中,可以将\u0000替换掉再导入pg数据库。

在其他数据库导入pg时,可以使用下面方式替换:

regexp_replace(stringWithNull, '\\u0000', '', 'g')

java程序中替换:

str.replaceAll('\u0000', '')

vim替换:

s/\x00//g;

参考:

src/backend/lib/stringinfo.c

src/include/lib/stringinfo.h

src/include/utils/memutils.h

https://en.wikipedia.org/wiki/Null-terminated_string

https://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0?rq=1

Cannot enlarge string buffer containing XX bytes by XX more bytes的更多相关文章

  1. ORA-06502:PL/SQL :numberic or value error: character string buffer too small

    今天遇到一个错误提示:ORA-06502:PL/SQL :numberic or value error: character string buffer too small,一般对应的中文信息为:O ...

  2. String、String Buffer、String Builder

    对于String.String Buffer.String Builder:我一直都只知道String是字符串常量,后两者是字符串变量: String和String Buffer是线程安全的,Stri ...

  3. String Buffer和String Builder(String类深入理解)

      String在Java里面JDK1.8后它属于一个特殊的类,在创建一个String基本对象的时候,String会向“ 字符串常量池(String constant pool)” 进行检索是否有该数 ...

  4. 中文转unicode,中文转bytes,unicode转bytes java实现

    utf-8 utf-8格式的中文由三位字节组成. UTF-8的编码规则很简单,只有二条: 1)对于单字节的符号,字节的第一位设为0,后面7位为这个符号的unicode码.因此对于英语字母,UTF-8编 ...

  5. Exception: Operation xx of contract xx specifies multiple request body parameters to be serialized without any wrapper elements.

    Operation 'CreateProductCodeStock' of contract 'IChileService' specifies multiple request body param ...

  6. Linux内核中的Kconfig、xx.defconfig、xx.config、Makefile

    什么是Kconfig.xx.defconfig.xx.config.Makefile Kconfig: 一个文本形式的文件,其中主要作用是在内核配置时候,作为配置选项. xx.deconfig: Li ...

  7. (转)JVM内存分配 -Xms128m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=512m

    在linux环境下配置项目运行环境时,部署的人员都会分配一下内存,以保证程序正常的运行.其实在开发的时候(window系统),就已经涉及到内存分配了,只是这些参数有默认值,因此一直没有去重视它. 以M ...

  8. eclipse不能自动编译XX.java为XX.classs

    问题描述:eclipse不能自动编译XX.java为XX.classs 原因:今天下午写代码,因为需要引入jstl包,引入后发现原来项目中已经引入了,然后我又把包删除了,忘记删除java build ...

  9. Python3.x:报错POST data should be bytes, an iterable of bytes

    Python3.x:报错POST data should be bytes, an iterable of bytes 问题: python3.x:报错 POST data should be byt ...

随机推荐

  1. Python序列之元组 (tuple)

    作者博文地址:http://www.cnblogs.com/spiritman/ Python的元组与列表类似,同样可通过索引访问,支持异构,任意嵌套.不同之处在于元组的元素不能修改.元组使用小括号, ...

  2. C++ 函数 内联函数

    内联函数的功能和预处理宏的功能相似,在介绍内联函数之前,先介绍一下预处理宏.宏是简单字符替换,最常见的用法:定义了一个代表某个值的全局符号.定义可调用带参数的宏.作为一种约定,习惯上总是用大写字母来定 ...

  3. 互评Beta版本——Thunder组爱阅app(探路者团队测评)

    基于NABCD评论作品,及改进建议 每个小组评论其他小组beta发布的作品. 1.根据(不限于)NABCD评论作品的选题;   N(Need,需求):在Beta中加入了书友QQ群,以及反馈建议,更好的 ...

  4. 20162325 金立清 S2 W9 C18

    20162325 2017-2018-2 <程序设计与数据结构>第9周学习总结 教材学习内容概要 堆是一棵完全二叉树,其中每个元素大于等于其所有子结点的值. 向堆中添加一个元素的方法是,首 ...

  5. React环境配置(第一个React项目)

    使用Webpack构建React项目 1. 使用NPM配置React环境 NPM及React安装自行百度 首先创建一个文件夹,the_first_React 进入到创建好的目录,npm init,然后 ...

  6. 先做一个用来测试的chrome浏览器插件

    如何制作chrome插件 在项目汇报中,有同学提到了想要了解如何制作插件,特写该篇博客供大家查阅~ 一个简单的插件需要manifest.json.popup.html.popup.js.content ...

  7. grunt入门讲解3:实例讲解使用 Gruntfile 配置任务

    这个Gruntfile 实例使用到了5个 Grunt 插件: grunt-contrib-uglify      grunt-contrib-qunitgrunt-contrib-concatgrun ...

  8. Java实现小学四则运算练习

     Github项目地址:https://github.com/feser-xuan/Arithmetic.git 1.需求分析 软件基本功能要求如下: 程序可接收一个输入参数n,然后随机产生n道加减乘 ...

  9. Littleproxy的使用

    介绍 LittleProxy是一个用Java编写的高性能HTTP代理,它基于Netty事件的网络库之上.它非常稳定,性能良好,并且易于集成到的项目中. 项目页面:https://github.com/ ...

  10. 微信小程序 功能函数 将对象的键添加到数组 (函数深入)

    // 将对象的键添加到数组 var arr = Object.keys(site); //英文 https://developer.mozilla.org/en-US/docs/Web/JavaScr ...