python 编码方式大全 fr = open(filename

7.8.3. Standard Encodings

Python comes with a number of codecs built-in, either implemented as C functions or with dictionaries as mapping tables. The following table lists the codecs by name, together with a few common aliases, and the languages for which the encoding is likely used. Neither the list of aliases nor the list of languages is meant to be exhaustive. Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. 'utf-8' is a valid alias for the 'utf_8' codec.

Many of the character sets support the same languages. They vary in individual characters (e.g. whether the EURO SIGN is supported or not), and in the assignment of characters to code positions. For the European languages in particular, the following variants typically exist:

an ISO 8859 codeset
a Microsoft Windows code page, which is typically derived from an 8859 codeset, but replaces control characters with additional graphic characters
an IBM EBCDIC code page
an IBM PC code page, which is ASCII compatible

Codec	Aliases	Languages
ascii	646, us-ascii	English
big5	big5-tw, csbig5	Traditional Chinese
big5hkscs	big5-hkscs, hkscs	Traditional Chinese
cp037	IBM037, IBM039	English
cp424	EBCDIC-CP-HE, IBM424	Hebrew
cp437	437, IBM437	English
cp500	EBCDIC-CP-BE, EBCDIC-CP-CH, IBM500	Western Europe
cp720		Arabic
cp737		Greek
cp775	IBM775	Baltic languages
cp850	850, IBM850	Western Europe
cp852	852, IBM852	Central and Eastern Europe
cp855	855, IBM855	Bulgarian, Byelorussian, Macedonian, Russian, Serbian
cp856		Hebrew
cp857	857, IBM857	Turkish
cp858	858, IBM858	Western Europe
cp860	860, IBM860	Portuguese
cp861	861, CP-IS, IBM861	Icelandic
cp862	862, IBM862	Hebrew
cp863	863, IBM863	Canadian
cp864	IBM864	Arabic
cp865	865, IBM865	Danish, Norwegian
cp866	866, IBM866	Russian
cp869	869, CP-GR, IBM869	Greek
cp874		Thai
cp875		Greek
cp932	932, ms932, mskanji, ms-kanji	Japanese
cp949	949, ms949, uhc	Korean
cp950	950, ms950	Traditional Chinese
cp1006		Urdu
cp1026	ibm1026	Turkish
cp1140	ibm1140	Western Europe
cp1250	windows-1250	Central and Eastern Europe
cp1251	windows-1251	Bulgarian, Byelorussian, Macedonian, Russian, Serbian
cp1252	windows-1252	Western Europe
cp1253	windows-1253	Greek
cp1254	windows-1254	Turkish
cp1255	windows-1255	Hebrew
cp1256	windows-1256	Arabic
cp1257	windows-1257	Baltic languages
cp1258	windows-1258	Vietnamese
euc_jp	eucjp, ujis, u-jis	Japanese
euc_jis_2004	jisx0213, eucjis2004	Japanese
euc_jisx0213	eucjisx0213	Japanese
euc_kr	euckr, korean, ksc5601, ks_c-5601, ks_c-5601-1987, ksx1001, ks_x-1001	Korean
gb2312	chinese, csiso58gb231280, euc- cn, euccn, eucgb2312-cn, gb2312-1980, gb2312-80, iso- ir-58	Simplified Chinese
gbk	936, cp936, ms936	Unified Chinese
gb18030	gb18030-2000	Unified Chinese
hz	hzgb, hz-gb, hz-gb-2312	Simplified Chinese
iso2022_jp	csiso2022jp, iso2022jp, iso-2022-jp	Japanese
iso2022_jp_1	iso2022jp-1, iso-2022-jp-1	Japanese
iso2022_jp_2	iso2022jp-2, iso-2022-jp-2	Japanese, Korean, Simplified Chinese, Western Europe, Greek
iso2022_jp_2004	iso2022jp-2004, iso-2022-jp-2004	Japanese
iso2022_jp_3	iso2022jp-3, iso-2022-jp-3	Japanese
iso2022_jp_ext	iso2022jp-ext, iso-2022-jp-ext	Japanese
iso2022_kr	csiso2022kr, iso2022kr, iso-2022-kr	Korean
latin_1	iso-8859-1, iso8859-1, 8859, cp819, latin, latin1, L1	West Europe
iso8859_2	iso-8859-2, latin2, L2	Central and Eastern Europe
iso8859_3	iso-8859-3, latin3, L3	Esperanto, Maltese
iso8859_4	iso-8859-4, latin4, L4	Baltic languages
iso8859_5	iso-8859-5, cyrillic	Bulgarian, Byelorussian, Macedonian, Russian, Serbian
iso8859_6	iso-8859-6, arabic	Arabic
iso8859_7	iso-8859-7, greek, greek8	Greek
iso8859_8	iso-8859-8, hebrew	Hebrew
iso8859_9	iso-8859-9, latin5, L5	Turkish
iso8859_10	iso-8859-10, latin6, L6	Nordic languages
iso8859_11	iso-8859-11, thai	Thai languages
iso8859_13	iso-8859-13, latin7, L7	Baltic languages
iso8859_14	iso-8859-14, latin8, L8	Celtic languages
iso8859_15	iso-8859-15, latin9, L9	Western Europe
iso8859_16	iso-8859-16, latin10, L10	South-Eastern Europe
johab	cp1361, ms1361	Korean
koi8_r		Russian
koi8_u		Ukrainian
mac_cyrillic	maccyrillic	Bulgarian, Byelorussian, Macedonian, Russian, Serbian
mac_greek	macgreek	Greek
mac_iceland	maciceland	Icelandic
mac_latin2	maclatin2, maccentraleurope	Central and Eastern Europe
mac_roman	macroman	Western Europe
mac_turkish	macturkish	Turkish
ptcp154	csptcp154, pt154, cp154, cyrillic-asian	Kazakh
shift_jis	csshiftjis, shiftjis, sjis, s_jis	Japanese
shift_jis_2004	shiftjis2004, sjis_2004, sjis2004	Japanese
shift_jisx0213	shiftjisx0213, sjisx0213, s_jisx0213	Japanese
utf_32	U32, utf32	all languages
utf_32_be	UTF-32BE	all languages
utf_32_le	UTF-32LE	all languages
utf_16	U16, utf16	all languages
utf_16_be	UTF-16BE	all languages (BMP only)
utf_16_le	UTF-16LE	all languages (BMP only)
utf_7	U7, unicode-1-1-utf-7	all languages
utf_8	U8, UTF, utf8	all languages
utf_8_sig		all languages

7.8.4. Python Specific Encodings

A number of predefined codecs are specific to Python, so their codec names have no meaning outside Python. These are listed in the tables below based on the expected input and output types (note that while text encodings are the most common use case for codecs, the underlying codec infrastructure supports arbitrary data transforms rather than just text encodings). For asymmetric codecs, the stated purpose describes the encoding direction.

The following codecs provide unicode-to-str encoding [1] and str-to-unicode decoding [2], similar to the Unicode text encodings.

Codec	Aliases	Purpose
idna		Implements RFC 3490, see also `encodings.idna`
mbcs	dbcs	Windows only: Encode operand according to the ANSI codepage (CP_ACP)
palmos		Encoding of PalmOS 3.5
punycode		Implements RFC 3492
raw_unicode_escape		Produce a string that is suitable as raw Unicode literal in Python source code
rot_13	rot13	Returns the Caesar-cypher encryption of the operand
undefined		Raise an exception for all conversions. Can be used as the system encoding if no automatic coercion between byte and Unicode strings is desired.
unicode_escape		Produce a string that is suitable as Unicode literal in Python source code
unicode_internal		Return the internal representation of the operand

New in version 2.3: The idna and punycode encodings.

The following codecs provide str-to-str encoding and decoding [2].

Codec	Aliases	Purpose	Encoder/decoder
base64_codec	base64, base-64	Convert operand to multiline MIME base64 (the result always includes a trailing `'\n'`)	`base64.encodestring()`,`base64.decodestring()`
bz2_codec	bz2	Compress the operand using bz2	`bz2.compress()`, `bz2.decompress()`
hex_codec	hex	Convert operand to hexadecimal representation, with two digits per byte	`binascii.b2a_hex()`, `binascii.a2b_hex()`
quopri_codec	quopri, quoted-printable, quotedprintable	Convert operand to MIME quoted printable	`quopri.encode()` with `quotetabs=True`,`quopri.decode()`
string_escape		Produce a string that is suitable as string literal in Python source code
uu_codec	uu	Convert the operand using uuencode	`uu.encode()`, `uu.decode()`
zlib_codec	zip, zlib	Compress the operand using gzip	`zlib.compress()`, `zlib.decompress()`

[1]	str objects are also accepted as input in place of unicode objects. They are implicitly converted to unicode by decoding them using the default encoding. If this conversion fails, it may lead to encoding operations raising `UnicodeDecodeError`.

[2]	(1, 2) unicode objects are also accepted as input in place of str objects. They are implicitly converted to str by encoding them using the default encoding. If this conversion fails, it may lead to decoding operations raising `UnicodeEncodeError`.

python 编码方式大全 fr = open(filename_r,encoding='cp852')的更多相关文章

【python】python编码方式,chardet编码识别库
环境: python3.6 需求: 针对于打开一个文件,可以读取到文本的编码方式,根据默认的文件编码方式来获取文件,就不会出现乱码. 针对这种需求,python中有这个方式可以很好的解决: 解决策略: ...
python编码错误
初学python,遇到的最难忘的坑没有之一.这个问题起码困扰了我一周.在我写了一段代码之后经常遇见这样的报错. 本质原因是我用的python2,在编码流派中python2是比较奇葩的一派,不随大流.所 ...
系统编码，文件编码，python编码
系统编码,可以通过locale命令查看(LINUX)https://wiki.archlinux.org/index.php/Locale_(简体中文), centos7 配置文件在/etc/prof ...
vim 编码方式的设置
和所有的流行文本编辑器一样,Vim 可以很好的编辑各种字符编码的文件,这当然包括UCS-2.UTF-8 等流行的 Unicode 编码方式.然而不幸的是,和很多来自 Linux 世界的软件一样,这需要 ...
base64编码方式
一.编码的两大方式: 在python3.x中,字符串编码分为unicode和bytes两大类编码方式. 直接书写s='中国人',这种方式定义的编码方式为unicode,是通用的方式. 另一种是byte ...
Python中的幽灵—编码方式
首先要搞懂本地操作系统编码与系统编码的区别: 本地操作系统编码方式与操作系统有关,Linux默认编码方式为utf-8,Windows默认编码方式为gbk: 系统编码方式与编译器or解释器有关,Pyth ...
python批量修改文件内容及文件编码方式的处理
最近公司在做tfs迁移,后面要用新的ip地址去访问tfs 拉取代码 ,所以原来发布脚本中.bat类型的脚本中的的ip地址需要更换简单说下我们发布脚本层级目录 :每个服务站点下都会有一个发布脚本 . ...
python笔记二（数据类型和变量、编码方式、字符串的编码、字符串的格式化）
一.数据类型 python可以直接处理的数据类型有:整数.浮点数.字符串.布尔值.空值. 整数浮点数字符串:双引号内嵌套单引号,可以输出 i'm ok. 也可以用\来实现,\n 换行 \t tab ...
python文件（概念、基本操作、常用操作、文本文件的编码方式）
文件目标文件的概念文件的基本操作文件/文件夹的常用操作文本文件的编码方式 01. 文件的概念 1.1 文件的概念和作用计算机的文件,就是存储在某种长期储存设备上的一段数据长期存储 ...

随机推荐

c++变量声明、定义，const变量
变量声明和定义的主要区别: 声明不分配存储空间,定义分配存储空间. 变量可以声明多次,但只能定义一次(一个变量只能在一个源文件中定义) 声明通常放在头文件(.h)中,定义放在源文件(.cpp)中变量 ...
spring浏览器国际化的原理
We will add two languages support to our application: English and German. Depending on the locale se ...
Ansible 从MySQL数据库添加或删除用户
mysql_user - 从MySQL数据库添加或删除用户. 概要要求(在执行模块的主机上) 选项例子笔记状态支持概要从MySQL数据库添加或删除用户. 要求(在执行模块的主机上) My ...
sudoers的权限被改，又忘记了root密码，又不能重启。这么做。
报下面这个错 sudo: /etc/sudoers is world writablesudo: no valid sudoers sources found, quittingsudo: unabl ...
fiddler 抓取逍遥安卓模拟器 https包
1.打开fiddler,进行相关设置 Tools--Fiddler Options 接下来进行客户端网络配置 1 查看电脑ip地址,ipconfig 逍遥游模拟器中使用自带的浏览器,访问192.168 ...
IDEA artifacts Web Application:Exploded Web Application:Archive
首先,artifacts是maven中的一个概念,表示项目/modules如何打包,比如jar,war,war exploded,ear等打包形式,一个项目或者说module有了artifacts 就 ...
ubuntu部署jenkins
https://www.cnblogs.com/lozz/p/9962316.html 1.安装 wget -q -O - https://pkg.jenkins.io/debian/jenkins- ...
关于easyui-datagrid数据表格, 分页取出数据
在制作数据表格的时候有一个这样的属性, pagination是否显示分页列表, 分页显示的时候需要分别从数据库中取数据, 每页显示几行, 即只从数据库取出几行数据来显示, 具体代码如下 1, 显示页面 ...
算法笔记_067:蓝桥杯练习算法训练安慰奶牛（Java）
目录 1 问题描述 2 解决方案 1 问题描述问题描述 Farmer John变得非常懒,他不想再继续维护供奶牛之间供通行的道路.道路被用来连接N个牧场,牧场被连续地编号为1到N.每一个牧场都是 ...
关于session报错问题。
刚开始一直报500错误,页面不提示,也没想着去查看日志文件.好几天了,一看日志,发现是这个问题.问了一下,是session的问题. 2017/07/25 16:57:49 [error] 2300#0 ...

python 编码方式大全 fr = open(filename_r,encoding='cp852')

7.8.3. Standard Encodings

7.8.4. Python Specific Encodings

python 编码方式大全 fr = open(filename_r,encoding='cp852')的更多相关文章

随机推荐

热门专题