在ELK的数据库报警系统中,发现有台机器报出了下面的错误:

2018-12-04 18:55:26.842 CST,"XXX","XXX",21106,"XXX",5c065c3d.5272,4,"idle",2018-12-04 18:51:41 CST,117/0,0,ERROR,54000,"out of memory","Cannot enlarge string buffer containing 0 bytes by 1342177281 more bytes.",,,,,,,"enlargeStringInfo, stringinfo.c:268",""

当看到是发生了OOM时,以为是整个数据库实例存在了问题,线上检查发现数据库正常,后查阅资料了解到,pg对于一次执行的查询语句长度是有限制的,如果长度超过了1G,则会报出上面的错误。

上面日志中的1342177281 bytes是查询的长度。

在使用copy的时候,也常会报出类似的问题,此时就要根据报错,查看对应的行数是不是由于引号或转义问题导致了对应行没有恰当的结束,或者是一整行的内容大于了1G。

下面是翻阅pg9.6源码找到的相关内容:

结合注释,pg的源码很容易看懂。

src/include/utils/memutils.h

/*
* MaxAllocSize, MaxAllocHugeSize
* Quasi-arbitrary limits on size of allocations.
*
* Note:
* There is no guarantee that smaller allocations will succeed, but
* larger requests will be summarily denied.
*
* palloc() enforces MaxAllocSize, chosen to correspond to the limiting size
* of varlena objects under TOAST. See VARSIZE_4B() and related macros in
* postgres.h. Many datatypes assume that any allocatable size can be
* represented in a varlena header. This limit also permits a caller to use
* an "int" variable for an index into or length of an allocation. Callers
* careful to avoid these hazards can access the higher limit with
* MemoryContextAllocHuge(). Both limits permit code to assume that it may
* compute twice an allocation's size without overflow.
*/
#define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */

src/backend/lib/stringinfo.c

/*
* enlargeStringInfo
*
* Make sure there is enough space for 'needed' more bytes
* ('needed' does not include the terminating null).
*
* External callers usually need not concern themselves with this, since
* all stringinfo.c routines do it automatically. However, if a caller
* knows that a StringInfo will eventually become X bytes large, it
* can save some palloc overhead by enlarging the buffer before starting
* to store data in it.
*
* NB: because we use repalloc() to enlarge the buffer, the string buffer
* will remain allocated in the same memory context that was current when
* initStringInfo was called, even if another context is now current.
* This is the desired and indeed critical behavior!
*/
void
enlargeStringInfo(StringInfo str, int needed)
{
int newlen; /*
* Guard against out-of-range "needed" values. Without this, we can get
* an overflow or infinite loop in the following.
*/
if (needed < 0) /* should not happen */
elog(ERROR, "invalid string enlargement request size: %d", needed);
if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("out of memory"),
errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
str->len, needed))); needed += str->len + 1; /* total space required now */ /* Because of the above test, we now have needed <= MaxAllocSize */ if (needed <= str->maxlen)
return; /* got enough space already */ /*
* We don't want to allocate just a little more space with each append;
* for efficiency, double the buffer size each time it overflows.
* Actually, we might need to more than double it if 'needed' is big...
*/
newlen = 2 * str->maxlen;
while (needed > newlen)
newlen = 2 * newlen; /*
* Clamp to MaxAllocSize in case we went past it. Note we are assuming
* here that MaxAllocSize <= INT_MAX/2, else the above loop could
* overflow. We will still have newlen >= needed.
*/
if (newlen > (int) MaxAllocSize)
newlen = (int) MaxAllocSize; str->data = (char *) repalloc(str->data, newlen); str->maxlen = newlen;
}

src/include/lib/stringinfo.h

下面是字符串存储用到的结构体:

/*-------------------------
* StringInfoData holds information about an extensible string.
* data is the current buffer for the string (allocated with palloc).
* len is the current string length. There is guaranteed to be
* a terminating '\0' at data[len], although this is not very
* useful when the string holds binary data rather than text.
* maxlen is the allocated size in bytes of 'data', i.e. the maximum
* string size (including the terminating '\0' char) that we can
* currently store in 'data' without having to reallocate
* more space. We must always have maxlen > len.
* cursor is initialized to zero by makeStringInfo or initStringInfo,
* but is not otherwise touched by the stringinfo.c routines.
* Some routines use it to scan through a StringInfo.
*-------------------------
*/
typedef struct StringInfoData
{
char *data;
int len;
int maxlen;
int cursor;
} StringInfoData; typedef StringInfoData *StringInfo;

从存放字符串或二进制的结构体StringInfoData中,可以看出pg字符串类型不支持\u0000的原因,因为在pg中的字符串形式是C strings,是以\0结束的字符串,\0在ASCII中叫做NUL,Unicode编码表示为\u0000,八进制则为0x00,如果字符串中包含\0,pg会当做字符串的结束符。

pg中的字符串不支持其中包含NULL(\0x00),这个很明显是不同于NULL值的,NULL值pg是支持的。

在具体的使用中,可以将\u0000替换掉再导入pg数据库。

在其他数据库导入pg时,可以使用下面方式替换:

regexp_replace(stringWithNull, '\\u0000', '', 'g')

java程序中替换:

str.replaceAll('\u0000', '')

vim替换:

s/\x00//g;

参考:

src/backend/lib/stringinfo.c

src/include/lib/stringinfo.h

src/include/utils/memutils.h

https://en.wikipedia.org/wiki/Null-terminated_string

https://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0?rq=1

Cannot enlarge string buffer containing XX bytes by XX more bytes的更多相关文章

  1. ORA-06502:PL/SQL :numberic or value error: character string buffer too small

    今天遇到一个错误提示:ORA-06502:PL/SQL :numberic or value error: character string buffer too small,一般对应的中文信息为:O ...

  2. String、String Buffer、String Builder

    对于String.String Buffer.String Builder:我一直都只知道String是字符串常量,后两者是字符串变量: String和String Buffer是线程安全的,Stri ...

  3. String Buffer和String Builder(String类深入理解)

      String在Java里面JDK1.8后它属于一个特殊的类,在创建一个String基本对象的时候,String会向“ 字符串常量池(String constant pool)” 进行检索是否有该数 ...

  4. 中文转unicode,中文转bytes,unicode转bytes java实现

    utf-8 utf-8格式的中文由三位字节组成. UTF-8的编码规则很简单,只有二条: 1)对于单字节的符号,字节的第一位设为0,后面7位为这个符号的unicode码.因此对于英语字母,UTF-8编 ...

  5. Exception: Operation xx of contract xx specifies multiple request body parameters to be serialized without any wrapper elements.

    Operation 'CreateProductCodeStock' of contract 'IChileService' specifies multiple request body param ...

  6. Linux内核中的Kconfig、xx.defconfig、xx.config、Makefile

    什么是Kconfig.xx.defconfig.xx.config.Makefile Kconfig: 一个文本形式的文件,其中主要作用是在内核配置时候,作为配置选项. xx.deconfig: Li ...

  7. (转)JVM内存分配 -Xms128m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=512m

    在linux环境下配置项目运行环境时,部署的人员都会分配一下内存,以保证程序正常的运行.其实在开发的时候(window系统),就已经涉及到内存分配了,只是这些参数有默认值,因此一直没有去重视它. 以M ...

  8. eclipse不能自动编译XX.java为XX.classs

    问题描述:eclipse不能自动编译XX.java为XX.classs 原因:今天下午写代码,因为需要引入jstl包,引入后发现原来项目中已经引入了,然后我又把包删除了,忘记删除java build ...

  9. Python3.x:报错POST data should be bytes, an iterable of bytes

    Python3.x:报错POST data should be bytes, an iterable of bytes 问题: python3.x:报错 POST data should be byt ...

随机推荐

  1. 微软职位内部推荐-Software Engineer II_VS

    微软近期Open的职位: Job Title: Software Engineer II Division: Visual Studio China – Developer Division Work ...

  2. python数据分析画图体验

    对于numpy的函数,pands等,不是很熟,我来copy一下code,敲击一下,找找感觉. 默认的导入包import numpy as npimport matplotlib.pyplot as p ...

  3. 模块-Memcached、Redis

    目录 Mecache 安装 使用 Redis 安装 Python操作Redis 操作模式 连接池 操作 String Hash List Set sort set 其他常用操作 管道 发布订阅 sen ...

  4. Scrum Meeting 10.27

    1.会议内容: 姓名 今日任务 明日任务 预估时间(h) 徐越 配置SQLserver 学习本地和服务器之间的通信 4 卞忠昊 找上届代码的bug 学习安卓布局(layout)的有关知识,研究上届学长 ...

  5. Daily Srum 10.28

    这两天我们和其他两组进行了一次会议,主要讨论的是用什么框架来搭建这个平台.在线系统的那一组希望我们用nutch.solr.hbase这一套工具,这对于我们两组来说是一次挑战,毕竟我们一开始用的是关系型 ...

  6. angularJS1笔记-(17)-ng-bind-html指令

    angular不推荐大家在绑定数据的时候绑定html,但是如果你非要这么干也并不是不可以的.举个例子: <!DOCTYPE html> <html lang="en&quo ...

  7. iOS- 本地文本容错搜索引擎2-->如何实现英文(英文首字母,汉语拼音)对中文的搜索?

      1.前言 先闲说几句,最近北京的雾霾真是大,呛的我这攻城师都抗不住了.各位攻城师们一定要爱护好自己的身体!空气好时,少坐多动. 如果条件好的话,最好让你们BOSS搞个室内空气净化器.因为那几天一般 ...

  8. quartz任务管理

    导入quartz相关jar包后,要执行任务的类须实现Job接口 package quartz; import org.quartz.Job; import org.quartz.JobExecutio ...

  9. Scrum 5.0(继4.0)

    一,组员任务完成情况 首页设计初步完成但是需要优化界面,只能简单的输出信息和在首页进行登录.界面极其简单. 鸡汤版面设计有困难,问题在于用何种形式来管理用户的数据上传,但是经过小组间的讨论确定设计方向 ...

  10. 如何选择mysql存储引擎

    一.MySQL的存储引擎 完整的引擎说明还是看官方文档:http://dev.mysql.com/doc/refman/5.6/en/storage-engines.html 这里介绍一些主要的引擎 ...