Python核心技术与实战笔记

基础篇

Jupyter Notebook

优点

整合所有的资源
交互性编程体验
零成本重现结果

实践站点

列表与元组

列表和元组，都是 一个可以放置任意数据类型的有序集合。

l = [1, 2, 'hello', 'world'] # 列表中同时含有 int 和 string 类型的元素

l

[1, 2, 'hello', 'world']

tup = ('jason', 22) # 元组中同时含有 int 和 string 类型的元素

tup

('jason', 22)

列表是动态的，长度大小不固定，可以随意地增加、删减或者改变元素 (mutable)
元组是静态的，场地大小固定，无法增加删除或者改变 (immutable)
都支持负数索引；
都支持切片操作；
都可以随意嵌套；
两者可以通过 list() 和 tuple() 函数相互转换；

列表和元组存储方式的差异

由于列表是动态的，所以它需要存储指针，来指向对应的元素。增加/删除的时间复杂度均为 O(1)。

l = []

l.__sizeof__() // 空列表的存储空间为 40 字节

40

l.append(1)

l.__sizeof__()

72 // 加入了元素 1 之后，列表为其分配了可以存储 4 个元素的空间 (72 - 40)/8 = 4

l.append(2)

l.__sizeof__()

72 // 由于之前分配了空间，所以加入元素 2，列表空间不变

l.append(3)

l.__sizeof__()

72 // 同上

l.append(4)

l.__sizeof__()

72 // 同上

l.append(5)

l.__sizeof__()

104 // 加入元素 5 之后，列表的空间不足，所以又额外分配了可以存储 4 个元素的空间

使用场景

如果存储的数据和数量不变，那么肯定选用元组更合适
如果存储的数据和数量是可变的，那么则用列表更合适

区别

列表是动态的，长度可变，可以随意的增加、删除或者改变元素；列表的存储空间略大于元组，性能略逊于元组；
元组是静态的，长度大小固定，不可对元素进行增加、删除、修改操作，元组相对于列表更加的轻量级、性能稍优；

思考题

# 创建空列表

# option A：list()是一个function call，Python的function call会创建stack，并且进行一系列参数检查的操作，比较expensive

empty_list = list()

# option B：[]是一个内置的C函数，可以直接被调用，因此效率高

empty_list = []

字典与集合

字典是一系列无序元素的组合，其长度大小可变，元素可以任意的删除和改变，相比于列表和元组，字典的性能更优，特别是对于查找、添加和删除操作，字典都能在常数时间复杂度内完成。而集合和字典基本相同，唯一的区别，就是集合没有件和值的配对，是一系列无序的、唯一的元素组合。

# 定义字典

d = {'name': 'jason', 'age': 20}

# 增加元素对'gender': 'male'

d['gender'] = 'male'

# 增加元素对'dob': '1999-02-01'

d['dob'] = '1999-02-01'

d

{'name': 'jason', 'age': 20, 'gender': 'male', 'dob': '1999-02-01'}

# 更新键'dob'对应的值

d['dob'] = '1998-01-01'

# 删除键为'dob'的元素对

d.pop('dob')

'1998-01-01'

d

{'name': 'jason', 'age': 20, 'gender': 'male'}

# 定义集合

s = {1, 2, 3}

# 增加元素 4 到集合

s.add(4)

s

{1, 2, 3, 4}

# 从集合中删除元素 4

s.remove(4)

s

{1, 2, 3}

d = {'b': 1, 'a': 2, 'c': 10}

# 根据字典键的升序排序

d_sorted_by_key = sorted(d.items(), key=lambda x: x[0])

 # 根据字典值的升序排序

d_sorted_by_value = sorted(d.items(), key=lambda x: x[1])

可以使用 get(key,default) 函数来进行字典索引。如果键不存在，调用该函数可以返回一个默认的值。

集合不支持索引操作，因为集合本质上是一个哈希表，和列表不一样。

字典和集合性能

字典和集合是进行性能高度优化的数据结构，特别是对于查找、添加和删除操作。

字典和集合的工作原理

字典和集合的内部结构都是一张哈希表

对于字典而言，这张哈希表存储了哈希值，键和值这三个元素
对于集合而言，区别就是哈希表内没有键和值的配对，只有单一的元素了

插入操作

每次向字典或集合插入一个元素时，Python 会首先计算键的哈希值（hash(key)），再和 mask = PyDicMinSize - 1 做与操作，计算这个元素应该插入哈希表的位置 index = hash(key) & mask。如果哈希表中此位置是空的，那么这个元素就会被插入其中。而如果此位置已被占用，Python 便会比较两个元素的哈希值和键是否相等。

若两者都相等，则表明这个元素已经存在，如果值不同，则更新值。
若两者中有一个不相等，这种情况我们通常称为哈希冲突（hash collision），意思是两个元素的键不相等，但是哈希值相等。这种情况下，Python 便会继续寻找表中空余的位置，直到找到位置为止。

查找操作

先通过哈希值找到目标位置，然后比较哈希表这个位置中元素的哈希值和键，与需要查找的元素是否相等，如果相等，则直接返回，否则继续查找，知道为空或抛出异常为止

删除操作

暂时对这个位置得到元素赋予一个特殊的值，等到重新调整哈希表的大小时，再将其删除。

字符串

Python 中字符串使用单引号、双引号或三引号表示，三者意义相同，并没有什么区别。其中，三引号的字符串通常用在多行字符串的场景。
Python 中字符串是不可变的（前面所讲的新版本 Python 中拼接操作’+='是个例外）。因此，随意改变字符串中字符的值，是不被允许的。
Python 新版本（2.5+）中，字符串的拼接变得比以前高效了许多，你可以放心使用。
Python 中字符串的格式化（string.format,f）常常用在输出、日志的记录等场景。

输入与输出

输入输出基础

生产环境中使用强制转换时，请记得加上 try except

文件输入输出

所有 I/O 都应该进行错误处理。因为 I/O 操作可能会有各种各样的情况出现，而一个健壮（robust）的程序，需要能应对各种情况的发生，而不应该崩溃（故意设计的情况除外）。

JSON 序列化与实战

json.dumps() 函数，接受 python 的基本数据类型，然后将其序列化为 string;
json.loads() 函数，接受一个合法字符串，然后将其序列化为 python 的基本数据类型；

条件与循环

在条件语句中，if 可以单独使用，但是 elif 和 else 必须和 if 同时搭配使用；而 If 条件语句的判断，除了 boolean 类型外，其他的最好显示出来。
在 for 循环中，如果需要同时访问索引和元素，你可以使用 enumerate() 函数来简化代码。
写条件与循环时，合理利用+continue+或者+break+来避免复杂的嵌套，是十分重要的。
要注意条件与循环的复用，简单功能往往可以用一行直接完成，极大地提高代码质量与效率。

异常处理

异常，通常是指程序运行的过程中遇到了错误，终止并退出。我们通常使用 try except 语句去处理异常，这样程序就不会被终止，仍能继续执行。
处理异常时，如果有必须执行的语句，比如文件打开后必须关闭等等，则可以放在 finally block 中。
异常处理，通常用在你不确定某段代码能否成功执行，也无法轻易判断的情况下，比如数据库的连接、读取等等。正常的 flow-control 逻辑，不要使用异常处理，直接用条件语句解决就可以了。

自定义函数

Python 中函数的参数可以接受任意的数据类型，使用起来需要注意，必要时请在函数开头加入数据类型的检查；
和其他语言不同，Python 中函数的参数可以设定默认值；
嵌套函数的使用，能保证数据的隐私性，提高程序运行效率；
合理地使用闭包，则可以简化程序的复杂度，提高可读性；

匿名函数

优点：

减少代码的重复性；
模块化代码；

map(function,iterable)

表示对 iterable 中的每个元素，都运用 function 这个函数，最后返回一个新的可遍历的集合。

def square(x):

    return x**2

squared = map(square, [1, 2, 3, 4, 5]) # [2， 4， 6， 8， 10]

filter(function,iterable)

表示对 iterable 中的每个元素，都使用 function 判断，并返回 True 或者 False，最后将返回 True 的元素组成一个新的可遍历的集合。

l = [1, 2, 3, 4, 5]

new_list = filter(lambda x: x % 2 == 0, l) # [2, 4]

reduce(function,iterable)

规定它有两个参数，表示对 iterable 中的每个元素以及上一次调用后的结果，运用 function 进行计算，所以最后返回的是一个单独的数值。

l = [1, 2, 3, 4, 5]

product = reduce(lambda x, y: x * y, l) # 1*2*3*4*5 = 120

面向对象

基本概念

类：一群有着相似性的事物的集合；
对象：集合中的一个事物；
属性：对象的某个静态特征；
函数：对象某个动态能力

三要素：

继承
封装
多态

模块化编程

通过绝对路径和相对路径，我们可以 import 模块；
在大型工程中模块化非常重要，模块的索引要通过绝对路径来做，而绝对路径从程序的根目录开始；
记着巧用 if __name__ == "__main__" 来避开 import 时执行；

进阶篇

Python 对象的比较、拷贝

比较操作符 == 表示比较对象间的值是否相等，而 is 表示比较对象的标识是否相等，即它们是否指向同一个内存地址。
比较操作符 is 效率优于 ==，因为 is 操作符无法被重载，执行 is 操作只是简单的获取对象的 ID，并进行比较；而 == 操作符则会递归地遍历对象的所有值，并逐一比较。
浅拷贝中的元素，是原对象中子对象的引用，因此，如果原对象中的元素是可变的，改变其也会影响拷贝后的对象，存在一定的副作用。
深度拷贝则会递归地拷贝原对象中的每一个子对象，因此拷贝后的对象和原对象互不相关。另外，深度拷贝中会维护一个字典，记录已经拷贝的对象及其 ID，来提高效率并防止无限递归的发生。

值传递与引用传递

常见的参数传递有 2 种：

值传递：通常就是拷贝对象的值，然后传递给函数里的新变量，原变量和新变量之间相互独立，互不影响
引用传递：通常是指把参数的引用传给新的变量，这样，原变量和新变量就会指向同一块内存地址。

准确来说， python 的参数传递是 赋值传递 ，或者叫做对象的 引用传递 ，python 里所有的数据类型都是对象，所以参数传递时，只是让新变量与原变量指向相同的对象而已，并不存在值传递或引用传递一说。

需要注意的是，这里的赋值或对象的引用传递，不是指一个具体的内存地址，二十指一个具体的对象。

如果对象是可变的，当其改变时，所有指向这个对象的变量都会改变；
如果对象不可变，简单的赋值只能改变其中一个变量的值，其余变量则不受影响；

装饰器

函数也是对象

def func(message):

    print('Got a message: {}'.format(message))

send_message = func

send_message('hello world')

# 输出

Got a message: hello world

函数可以作为函数参数

def get_message(message):

    return 'Got a message: ' + message

def root_call(func, message):

    print(func(message))

root_call(get_message, 'hello world')

# 输出

Got a message: hello world

函数可以嵌套函数

def func(message):

    def get_message(message):

        print('Got a message: {}'.format(message))

    return get_message(message)

func('hello world')

# 输出

Got a message: hello world

函数的返回值也可以是函数对象(闭包)

def func_closure():

    def get_message(message):

        print('Got a message: {}'.format(message))

    return get_message

send_message = func_closure()

send_message('hello world')

# 输出

Got a message: hello world

简单使用装饰器

def my_decorator(func):

    def wrapper():

        print('wrapper of decorator')

        func()

    return wrapper

def greet():

    print('hello world')

greet = my_decorator(greet)

greet()

# 输出

wrapper of decorator

hello world

更优雅的写法

def my_decorator(func):

    def wrapper():

        print('wrapper of decorator')

        func()

    return wrapper

@my_decorator

def greet():

    print('hello world')

greet()

带参数的装饰器

def my_decorator(func):

    def wrapper(message):

        print('wrapper of decorator')

        func(message)

    return wrapper

@my_decorator

def greet(message):

    print(message)

greet('hello world')

# 输出

wrapper of decorator

hello world

带自定义参数的装饰器

def repeat(num):

    def my_decorator(func):

        def wrapper(*args, **kwargs):

            for i in range(num):

                print('wrapper of decorator')

                func(*args, **kwargs)

        return wrapper

    return my_decorator

@repeat(4)

def greet(message):

    print(message)

greet('hello world')

# 输出：

wrapper of decorator

hello world

wrapper of decorator

hello world

wrapper of decorator

hello world

wrapper of decorator

hello world

上述 green() 函数被装饰以后，它的元信息会发生改变，可勇敢 greet__name__ 来查看。可通过内置装饰器来解决这个问题

import functools

def my_decorator(func):

    @functools.wraps(func)

    def wrapper(*args, **kwargs):

        print('wrapper of decorator')

        func(*args, **kwargs)

    return wrapper

@my_decorator

def greet(message):

    print(message)

greet.__name__

# 输出

'greet'

类装饰器

class Count:

    def __init__(self, func):

        self.func = func

        self.num_calls = 0

    def __call__(self, *args, **kwargs):

        self.num_calls += 1

        print('num of calls is: {}'.format(self.num_calls))

        return self.func(*args, **kwargs)

@Count

def example():

    print("hello world")

example()

# 输出

num of calls is: 1

hello world

example()

# 输出

num of calls is: 2

hello world

装饰器支持嵌套使用

@decorator1

@decorator2

@decorator3

def func():

    ...

# 等价于

decorator1(decorator2(decorator3(func)))

装饰器使用场景：

身份认证
日志记录
输入合理性检查
缓存（LRU cache）

metaclass

metaclass 是 Python 黑魔法级别的语言特性，它可以改变正常 Python 类型的创建过程。

所有 Python 的用户定义类，都是 type 这个类的实例
用户自定义类，只不过是 type 类的 __ call __ 运算符重载
metaclass 是 type 的子类，通过替换 type 的 __ call __ 运算符重载机制，超越变形正常的类

class Mymeta(type):

    def __init__(self, name, bases, dic):

        super().__init__(name, bases, dic)

        print('===>Mymeta.__init__')

        print(self.__name__)

        print(dic)

        print(self.yaml_tag)

    def __new__(cls, *args, **kwargs):

        print('===>Mymeta.__new__')

        print(cls.__name__)

        return type.__new__(cls, *args, **kwargs)

    def __call__(cls, *args, **kwargs):

        print('===>Mymeta.__call__')

        obj = cls.__new__(cls)

        cls.__init__(cls, *args, **kwargs)

        return obj

class Foo(metaclass=Mymeta):

    yaml_tag = '!Foo'

    def __init__(self, name):

        print('Foo.__init__')

        self.name = name

    def __new__(cls, *args, **kwargs):

        print('Foo.__new__')

        return object.__new__(cls)

foo = Foo('foo')

迭代器和生成器

容器时可迭代对象，可迭代对象调用 iter() 函数，可以得到一个迭代器。迭代器可以通过 next() 函数来得到下一个元素，从而支持遍历
生成器时一种特殊的迭代器，合理使用生成器，可以降低内存占用、优化程序结构、提高程序速度
生成器在 Python 2 的版本上，是协程的一种重要实现方式；而 Python 3.5 引入的 async、await 语法糖，生成器实现协程的方式就已经落后了。

协程

协程是实现并发编程的一种方式

协程和多线程的区别，主要在于两点，一是协程为单线程；二是协程由用户决定，在哪些地方交出控制权，切换到下一个任务
协程的写法更加简洁清晰；把 async/await 语法和 create_task 结合起来用，对于中小级别的并发需求已经毫无压力

生产者/消费者模型

import asyncio

import random

async def consumer(queue, id):

    while True:

        val = await queue.get()

        print('{} get a val: {}'.format(id, val))

        await asyncio.sleep(1)

async def producer(queue, id):

    for i in range(5):

        val = random.randint(1, 10)

        await queue.put(val)

        print('{} put a val: {}'.format(id, val))

        await asyncio.sleep(1)

async def main():

    queue = asyncio.Queue()

    consumer_1 = asyncio.create_task(consumer(queue, 'consumer_1'))

    consumer_2 = asyncio.create_task(consumer(queue, 'consumer_2'))

    producer_1 = asyncio.create_task(producer(queue, 'producer_1'))

    producer_2 = asyncio.create_task(producer(queue, 'producer_2'))

    await asyncio.sleep(10)

    consumer_1.cancel()

    consumer_2.cancel()

    await asyncio.gather(consumer_1, consumer_2, producer_1, producer_2, return_exceptions=True)

%time asyncio.run(main())

########## 输出 ##########

producer_1 put a val: 5

producer_2 put a val: 3

consumer_1 get a val: 5

consumer_2 get a val: 3

producer_1 put a val: 1

producer_2 put a val: 3

consumer_2 get a val: 1

consumer_1 get a val: 3

producer_1 put a val: 6

producer_2 put a val: 10

consumer_1 get a val: 6

consumer_2 get a val: 10

producer_1 put a val: 4

producer_2 put a val: 5

consumer_2 get a val: 4

consumer_1 get a val: 5

producer_1 put a val: 2

producer_2 put a val: 8

consumer_1 get a val: 2

consumer_2 get a val: 8

Wall time: 10 s

并发编程之 Futures

区别并发和并行

并发通常应用与 I/O 操作频繁的场景，比如你要从网站上下载多个文件， I/O 操作的时间可能比 CPU 运行处理的时间长得多，通过线程和任务之间互相切换的方式实现，但同一时刻，只允许有一个线程或任务执行
并行更多应用于 CPU heavy 的场景，比如 MapReduce 中的并行计算，为了加快运算速度，一般会用多台机器，多个处理器来完成。可以让多个进程完全同步同时的执行

Python 中之所以同一时刻只运行一个线程运行，其实是由于全局解释锁的存在。但对 I/O 操作而言，当其被 block 的时候，全局解释器锁便会被释放，使气体线程继续执行。

import concurrent.futures

import requests

import threading

import time

def download_one(url):

    resp = requests.get(url)

    print('Read {} from {}'.format(len(resp.content), url))

# 版本 1

def download_all(sites):

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:

        executor.map(download_one, sites)

# 版本 2

def download_all(sites):

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:

        to_do = []

        for site in sites:

            future = executor.submit(download_one, site)

            to_do.append(future)

        for future in concurrent.futures.as_completed(to_do):

            future.result()

def main():

    sites = [

        'https://en.wikipedia.org/wiki/Portal:Arts',

        'https://en.wikipedia.org/wiki/Portal:History',

        'https://en.wikipedia.org/wiki/Portal:Society',

        'https://en.wikipedia.org/wiki/Portal:Biography',

        'https://en.wikipedia.org/wiki/Portal:Mathematics',

        'https://en.wikipedia.org/wiki/Portal:Technology',

        'https://en.wikipedia.org/wiki/Portal:Geography',

        'https://en.wikipedia.org/wiki/Portal:Science',

        'https://en.wikipedia.org/wiki/Computer_science',

        'https://en.wikipedia.org/wiki/Python_(programming_language)',

        'https://en.wikipedia.org/wiki/Java_(programming_language)',

        'https://en.wikipedia.org/wiki/PHP',

        'https://en.wikipedia.org/wiki/Node.js',

        'https://en.wikipedia.org/wiki/The_C_Programming_Language',

        'https://en.wikipedia.org/wiki/Go_(programming_language)'

    ]

    start_time = time.perf_counter()

    download_all(sites)

    end_time = time.perf_counter()

    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))

if __name__ == '__main__':

    main()

## 输出

Read 151021 from https://en.wikipedia.org/wiki/Portal:Mathematics

Read 129886 from https://en.wikipedia.org/wiki/Portal:Arts

Read 107637 from https://en.wikipedia.org/wiki/Portal:Biography

Read 224118 from https://en.wikipedia.org/wiki/Portal:Society

Read 184343 from https://en.wikipedia.org/wiki/Portal:History

Read 167923 from https://en.wikipedia.org/wiki/Portal:Geography

Read 157811 from https://en.wikipedia.org/wiki/Portal:Technology

Read 91533 from https://en.wikipedia.org/wiki/Portal:Science

Read 321352 from https://en.wikipedia.org/wiki/Computer_science

Read 391905 from https://en.wikipedia.org/wiki/Python_(programming_language)

Read 180298 from https://en.wikipedia.org/wiki/Node.js

Read 56765 from https://en.wikipedia.org/wiki/The_C_Programming_Language

Read 468461 from https://en.wikipedia.org/wiki/PHP

Read 321417 from https://en.wikipedia.org/wiki/Java_(programming_language)

Read 324039 from https://en.wikipedia.org/wiki/Go_(programming_language)

Download 15 sites in 0.19936635800002023 seconds

并发编程之 Asyncio

import asyncio

import aiohttp

import time

async def download_one(url):

    async with aiohttp.ClientSession() as session:

        async with session.get(url) as resp:

            print('Read {} from {}'.format(resp.content_length, url))

async def download_all(sites):

    tasks = [asyncio.create_task(download_one(site)) for site in sites]

    await asyncio.gather(*tasks)

def main():

    sites = [

        'https://en.wikipedia.org/wiki/Portal:Arts',

        'https://en.wikipedia.org/wiki/Portal:History',

        'https://en.wikipedia.org/wiki/Portal:Society',

        'https://en.wikipedia.org/wiki/Portal:Biography',

        'https://en.wikipedia.org/wiki/Portal:Mathematics',

        'https://en.wikipedia.org/wiki/Portal:Technology',

        'https://en.wikipedia.org/wiki/Portal:Geography',

        'https://en.wikipedia.org/wiki/Portal:Science',

        'https://en.wikipedia.org/wiki/Computer_science',

        'https://en.wikipedia.org/wiki/Python_(programming_language)',

        'https://en.wikipedia.org/wiki/Java_(programming_language)',

        'https://en.wikipedia.org/wiki/PHP',

        'https://en.wikipedia.org/wiki/Node.js',

        'https://en.wikipedia.org/wiki/The_C_Programming_Language',

        'https://en.wikipedia.org/wiki/Go_(programming_language)'

    ]

    start_time = time.perf_counter()

    asyncio.run(download_all(sites))

    end_time = time.perf_counter()

    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))

if __name__ == '__main__':

    main()

## 输出

Read 63153 from https://en.wikipedia.org/wiki/Java_(programming_language)

Read 31461 from https://en.wikipedia.org/wiki/Portal:Society

Read 23965 from https://en.wikipedia.org/wiki/Portal:Biography

Read 36312 from https://en.wikipedia.org/wiki/Portal:History

Read 25203 from https://en.wikipedia.org/wiki/Portal:Arts

Read 15160 from https://en.wikipedia.org/wiki/The_C_Programming_Language

Read 28749 from https://en.wikipedia.org/wiki/Portal:Mathematics

Read 29587 from https://en.wikipedia.org/wiki/Portal:Technology

Read 79318 from https://en.wikipedia.org/wiki/PHP

Read 30298 from https://en.wikipedia.org/wiki/Portal:Geography

Read 73914 from https://en.wikipedia.org/wiki/Python_(programming_language)

Read 62218 from https://en.wikipedia.org/wiki/Go_(programming_language)

Read 22318 from https://en.wikipedia.org/wiki/Portal:Science

Read 36800 from https://en.wikipedia.org/wiki/Node.js

Read 67028 from https://en.wikipedia.org/wiki/Computer_science

Download 15 sites in 0.062144195078872144 seconds

Asyncio 是单线程的，但其内部 event loop 的机制，可以让它并发地运行多个不同的任务，并且比多线程享有更大的自主控制权。

Asyncio 中的任务，在运行过程中不会被打断，因此不会出现 race condition 的情况。尤其是在 I/O 操作 heavy 的场景下， Asyncio 比多线程的运行效率更高，因此 Asyncio 内部任务切换的损耗，远比线程切换的损耗要小，并且 Asyncio 可以开启的任务数量，也比多线程中的线程数列多得多。

但需要注意的是，很多情况下，使用 Asyncio 需要特定第三方库的支持，而如果 I/O 操作很快，并不 heavy ，建议使用多线程来解决问题。

GIL（全局解释器锁）

CPython 引进 GIL 主要是由于：

设计者为了规避类似于内存管理这样的复杂的竞争风险问题
因为 CPython 大量使用 C 语言库，但大部分 C 语言库都不是原生线程安全的（线程安全会降低性能和增加复杂度）

GIL 的设计，主要是为了方便 CPython 解释器层面的编写者，而不是 Python 应用层面的程序员。

可以使用 import dis 的方式将代码编译成 bytecode

垃圾回收机制

垃圾回收是 Python 自带的机制，用于自动释放不会再用到的内存空间
引用计数是其最简单的实现，不过切记，这只是充分非必要条件，因为循环引用需要通过不可达判定，来确定是否可以回收
Python 的自动回收算法包括标记清除和分代收集，主要针对的是循环引用的垃圾收集
调试内存泄漏方便，objgraph 是很好的可视化分析工具

编程规范

阅读者的体验 > 编程者的体验 > 机器的体验

学会合理分解代码，提高代码可读性
合理利用 assert(线上环境禁止使用)
启用上下文管理器和 with 语句精简代码
单元测试
pdf & cprofile