Python 2 encode and decode
A Unicode string is a sequence of code points, which are numbers from 0 to 0x10FFFF. This sequence needs to be represented as a sequence of bytes (meaning, values from 0–255) in memory. The rules for translating a Unicode string into a sequence of bytes are called an encoding.
UTF-8 is probably the most commonly supported encoding. UTF stands for “Unicode Transformation Format”, and the ‘8’ means that 8-bit numbers are used in the encoding.
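A minimal sketch of what "encoding" means in practice: the same string viewed as code points and as UTF-8 bytes. The sample string and the variable names are only illustrative; the u'' prefix is written so the snippet runs on Python 2 and also on Python 3.3+.

```python
# -*- coding: utf-8 -*-
# Illustrative sample: 'cafe' with an accented final letter (U+00E9).
text = u'caf\xe9'

# The code points that make up the Unicode string.
code_points = [ord(ch) for ch in text]

# Translate the code points into bytes using the UTF-8 rules.
utf8_bytes = text.encode('utf-8')

# bytearray yields integer byte values on both Python 2 and Python 3.
byte_values = list(bytearray(utf8_bytes))

print(code_points)   # [99, 97, 102, 233]
print(byte_values)   # [99, 97, 102, 195, 169]
```

The three ASCII letters take one byte each, while U+00E9 takes two bytes (0xC3 0xA9), so four code points become five bytes.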
Python’s 8-bit strings have a .decode([encoding], [errors]) method that interprets the string using the given encoding.
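A short sketch of .decode() with different values for the errors argument. The byte string is an illustrative sample (the UTF-8 form of 'cafe' with an accented e); 'strict' (the default), 'replace' and 'ignore' are standard error handlers.

```python
# Illustrative 8-bit string: UTF-8 bytes, last character is U+00E9.
raw = b'caf\xc3\xa9'

# Interpreting the bytes with the right encoding gives the Unicode string.
text = raw.decode('utf-8')

# With the wrong encoding and the default 'strict' handling, decoding fails.
try:
    raw.decode('ascii')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for each undecodable byte; 'ignore' drops it.
replaced = raw.decode('ascii', 'replace')
ignored = raw.decode('ascii', 'ignore')
```

Here replaced ends with two U+FFFD characters, one for each of the two non-ASCII bytes, and ignored is just the three ASCII letters.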
The unicode() constructor has the signature unicode(string[, encoding, errors]). All of its arguments should be 8-bit strings. The first argument is converted to Unicode using the specified encoding; if you leave off the encoding argument, the ASCII encoding is used for the conversion, so any character greater than 127 will be treated as an error.
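A hedged sketch of the unicode() constructor. The encoding is passed explicitly as 'ascii', which per the description above is what Python 2 uses when the argument is left off. The try/except at the top is not part of the API: it is an assumption added only so the snippet also runs on Python 3, where unicode does not exist and str(bytes, encoding, errors) stands in for it.

```python
try:
    unicode                 # Python 2: the built-in constructor
except NameError:
    unicode = str           # assumption: stand-in so the sketch runs on Python 3

# Pure ASCII input converts cleanly.
converted = unicode(b'abc', 'ascii')

# A byte greater than 127 is an error under the ASCII encoding.
try:
    unicode(b'\x80abc', 'ascii')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# The errors argument changes how such bytes are handled.
replaced = unicode(b'\x80abc', 'ascii', 'replace')   # U+FFFD for the bad byte
ignored = unicode(b'\x80abc', 'ascii', 'ignore')     # bad byte dropped
```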
