Reposted from: https://blog.ionelmc.ro/2015/02/09/understanding-python-metaclasses/

None of the existing articles [1] give a comprehensive explanation of how metaclasses work in Python, so I'm making my own. Metaclasses are a controversial topic [2] in Python. Many users avoid them, and I think this is largely caused by the arbitrary workflow and lookup rules, which are not well explained. There are a few key concepts that you need to understand to work efficiently with metaclasses.

If you haven't heard of metaclasses at all, they present lots of interesting opportunities for reducing boilerplate and building nicer APIs. To make this as brief as possible I'm going to assume the creative types [3] will read this. This means I'm going to skip the whole very subjective “use metaclasses for this but not for that” conundrum.

This is written with Python 3 in mind. If there is something specific to Python 2 worth mentioning, it's going to be in the footnotes [4], in itty-bitty font size :-)

A quick overview

A high level explanation is necessary before we get down to the details.

A class is an object, and just like any other object, it's an instance of something: a metaclass. The default metaclass is type. Unfortunately, due to backwards compatibility, type is a bit confusing: it can also be used as a function that returns the class [13] of an object:

>>> class Foobar:
...     pass
...
>>> type(Foobar)
<class 'type'>
>>> foo = Foobar()
>>> type(foo)
<class '__main__.Foobar'>

If you're familiar with the isinstance builtin then you'll know this:

>>> isinstance(foo, Foobar)
True
>>> isinstance(Foobar, type)
True

To put this in a picture: foo is an instance of Foobar, and Foobar is an instance of type.

But let's go back to making classes ...

Simple metaclass use

We can use type directly to make a class, without any class statement:

>>> MyClass = type('MyClass', (), {})
>>> MyClass
<class '__main__.MyClass'>

The class statement isn't just syntactic sugar: it does some extra things, like setting adequate __qualname__ and __doc__ attributes or calling __prepare__.
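To make the difference a bit more concrete, here is a minimal sketch that builds roughly the same class both ways. The names Point, PointByHand and describe are made up for this illustration.

```python
# Build "the same" class twice: once with type(), once with a class statement.
def describe(self):
    return "point at (%s, %s)" % (self.x, self.y)

# type(name, bases, namespace): the namespace dict becomes the class attributes.
PointByHand = type('PointByHand', (), {'x': 0, 'y': 0, 'describe': describe})

class Point:
    """A point."""
    x = 0
    y = 0

    def describe(self):
        return "point at (%s, %s)" % (self.x, self.y)

print(PointByHand().describe())      # point at (0, 0)
print(Point.__doc__)                 # A point.
print(PointByHand.__doc__)           # None: nobody set it for us
# The class statement qualifies the method name with the class name,
# the function defined outside a class body keeps its bare name.
print(Point.describe.__qualname__)
print(describe.__qualname__)
```

With type() you get exactly what you put in the namespace dict; the docstring and the qualified names are extras that the class statement fills in for you.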

We can make a custom metaclass:

>>> class Meta(type):
...     pass

And then we can use it [15]:

>>> class Complex(metaclass=Meta):
...     pass
>>> type(Complex)
<class '__main__.Meta'>

Now we've got a rough idea of what we'll be dealing with ...

Magic methods

One distinctive feature of Python is magic methods: they allow the programmer to override the behavior of various operators and objects. To override the call operator you'd do this:

>>> class Funky:
...     def __call__(self):
...         print("Look at me, I work like a function!")
>>> f = Funky()
>>> f()
Look at me, I work like a function!

Metaclasses rely on several magic methods so it's quite useful to know a bit more about them.

The slots

When you define a magic method in your class the function will end up as a pointer in a struct that describes the class, in addition to the entry in __dict__. That struct [7] has a field for each magic method. For some reason these fields are called type slots.

Now there's another feature, implemented via the __slots__ attribute [17]. A class with __slots__ will create instances that don't have a __dict__ (they use a little bit less memory). A side-effect of this is that instances cannot have other fields than what was specified in __slots__: if you try to set an unexpected field you'll get an exception.
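A small sketch of the __slots__ behaviour described above; the class names Plain and Slotted are invented for the example.

```python
class Plain:
    pass

class Slotted:
    __slots__ = ('x', 'y')   # the only instance fields allowed

p = Plain()
p.anything = 1               # fine: stored in p.__dict__

s = Slotted()
s.x = 1                      # fine: 'x' is listed in __slots__
try:
    s.z = 3                  # 'z' is not listed
except AttributeError as exc:
    error = exc
    print("rejected:", exc)

print(hasattr(p, '__dict__'))   # True
print(hasattr(s, '__dict__'))   # False
```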

For the scope of this article, when slots are mentioned it will mean the type slots, not __slots__.

Object attribute lookup

Now this is something that's easy to get wrong because of the many slight differences to old-style objects in Python 2. [5]

Assuming Class is the class and instance is an instance of Class, evaluating instance.foobar roughly equates to this:

  • Call the type slot for Class.__getattribute__ (tp_getattro). The default does this: [10]

    • Does Class.__dict__ have a foobar item that is a data descriptor [8]?
      • If yes, return the result of Class.__dict__['foobar'].__get__(instance, Class). [6]
    • Does instance.__dict__ have a foobar item in it?
      • If yes, return instance.__dict__['foobar'].
    • Does Class.__dict__ have a foobar item that is a descriptor, but not a data descriptor [9]?
      • If yes, return the result of Class.__dict__['foobar'].__get__(instance, Class). [6]
    • Does Class.__dict__ have a foobar item?
      • If yes, return Class.__dict__['foobar'].
  • If the attribute still wasn't found, and there's a Class.__getattr__, call Class.__getattr__('foobar').
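The precedence rules above can be checked with a small experiment. This is only a sketch: DataDesc and NonDataDesc are made-up helper descriptors, and the instance __dict__ is filled in directly so that the descriptor's __set__ doesn't get in the way.

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""
    def __get__(self, instance, owner):
        return "data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")

class NonDataDesc:
    """Non-data descriptor: defines only __get__."""
    def __get__(self, instance, owner):
        return "non-data descriptor"

class Class:
    a = DataDesc()
    b = NonDataDesc()
    c = "plain class attribute"

    def __getattr__(self, name):
        # Only reached when every other step failed.
        return "__getattr__ fallback for %r" % name

instance = Class()
# Write straight into the instance __dict__, bypassing DataDesc.__set__.
instance.__dict__.update(a="instance a", b="instance b", c="instance c")

print(instance.a)        # data descriptor (beats instance.__dict__)
print(instance.b)        # instance b (instance.__dict__ beats a non-data descriptor)
print(instance.c)        # instance c
print(instance.missing)  # __getattr__ fallback for 'missing'
```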

Still not clear? Perhaps a diagram of normal attribute lookup helps:

To avoid confusion with the “.” operator doing crazy things, I've used “:” in this diagram to signify the location.

Class attribute lookup

Because classes need to be able to support the classmethod and staticmethod properties [6], when you evaluate something like Class.foobar the lookup is slightly different from what would happen when you evaluate instance.foobar.

Assuming Class is an instance of Metaclass, evaluating Class.foobar roughly equates to this:

  • Call the type slot for Metaclass.__getattribute__ (tp_getattro). The default does this: [11]

    • Does Metaclass.__dict__ have a foobar item that is a data descriptor [8]?
      • If yes, return the result of Metaclass.__dict__['foobar'].__get__(Class, Metaclass). [6]
    • Does Class.__dict__ have a foobar item that is a descriptor (of any kind)?
      • If yes, return the result of Class.__dict__['foobar'].__get__(None, Class). [6]
    • Does Class.__dict__ have a foobar item in it?
      • If yes, return Class.__dict__['foobar'].
    • Does Metaclass.__dict__ have a foobar item that is a descriptor, but not a data descriptor [9]?
      • If yes, return the result of Metaclass.__dict__['foobar'].__get__(Class, Metaclass). [6]
    • Does Metaclass.__dict__ have any foobar item?
      • If yes, return Metaclass.__dict__['foobar'].
  • If the attribute still wasn't found, and there's a Metaclass.__getattr__, call Metaclass.__getattr__('foobar').
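The same kind of experiment works one level up. Again a sketch with invented helper descriptors (DataDesc, NonDataDesc); the attributes a, b, c and d each exercise one step of the list above.

```python
class DataDesc:
    def __init__(self, label):
        self.label = label

    def __get__(self, instance, owner):
        return "%s via data descriptor" % self.label

    def __set__(self, instance, value):
        raise AttributeError("read-only")

class NonDataDesc:
    def __init__(self, label):
        self.label = label

    def __get__(self, instance, owner):
        return "%s via non-data descriptor" % self.label

class Metaclass(type):
    a = DataDesc("Metaclass.a")
    b = NonDataDesc("Metaclass.b")
    d = "Metaclass.d"

    def __getattr__(cls, name):
        if name.startswith('__'):
            # Leave dunder lookups alone so tooling isn't confused.
            raise AttributeError(name)
        return "Metaclass.__getattr__(%r)" % name

class Class(metaclass=Metaclass):
    a = "Class.a"
    b = NonDataDesc("Class.b")
    c = "Class.c"

print(Class.a)        # Metaclass.a via data descriptor
print(Class.b)        # Class.b via non-data descriptor
print(Class.c)        # Class.c
print(Class.d)        # Metaclass.d
print(Class.missing)  # Metaclass.__getattr__('missing')
```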

The whole shebang would look like this in a diagram:

To avoid confusion with the “.” operator doing crazy things, I've used “:” in this diagram to signify the location.

Magic method lookup

For magic methods the lookup is done on the class, directly in the big struct with the slots: [12]

  • Does the object's class have a slot for that magic method (roughly object->ob_type->tp_<magicmethod> in C code)? If yes, use it. If it's NULL then the operation is not supported.

In C internals parlance:

  • object->ob_type is the class of the object.
  • ob_type->tp_<magicmethod> is the type slot.

This looks much simpler, however, the type slots are filled with wrappers around your functions, so descriptors work as expected: [12]

>>> class Magic:
...     @property
...     def __repr__(self):
...         def inner():
...             return "It works!"
...         return inner
...
>>> repr(Magic())
'It works!'
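The flip side of looking magic methods up on the class: a magic method stored on the instance is found by normal attribute access, but ignored by the operator. A minimal sketch (the class name Plain is made up):

```python
class Plain:
    def __repr__(self):
        return "repr from the class"

obj = Plain()
# Stored in obj.__dict__, so normal attribute lookup finds it...
obj.__repr__ = lambda: "repr from the instance"

print(obj.__repr__())   # repr from the instance
# ...but repr() goes through the type slot and never looks at the instance.
print(repr(obj))        # repr from the class

# Assigning on the class updates the slot, so this one does take effect.
Plain.__repr__ = lambda self: "repr patched on the class"
print(repr(obj))        # repr patched on the class
```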

That's it. Does that mean there are places that don't follow those rules and look up the slot differently? Sadly yes, read on ...

The __new__ method

One of the most common points of confusion with both classes and metaclasses is the __new__ method. It has some very special conventions.

The __new__ method is the constructor (it returns the new instance) while __init__ is just an initializer (the instance is already created when __init__ is called).

Suppose we have a class like this: [20]

class Foobar:
    def __new__(cls):
        return super().__new__(cls)

Now if you recall the previous section, you'd expect that __new__ would be looked up on the metaclass, but alas, it wouldn't be so useful that way [19], so it's looked up statically.

When the Foobar class wants this magic method it will be looked up on the same object (the class), not on an upper level like all the other magic methods. This is very important to understand, because both the class and the metaclass can define this method:

  • Foobar.__new__ is used to create instances of Foobar
  • type.__new__ is used to create the Foobar class (an instance of type in the example)
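To see the two __new__ methods doing their separate jobs, here's a small sketch that uses a custom metaclass (a hypothetical Meta instead of plain type, so that there is a place to record the call) and logs which __new__ runs when.

```python
calls = []

class Meta(type):
    def __new__(mcs, name, bases, attrs):
        calls.append("Meta.__new__ creating class %s" % name)
        return super().__new__(mcs, name, bases, attrs)

class Foobar(metaclass=Meta):
    def __new__(cls):
        calls.append("Foobar.__new__ creating an instance")
        return super().__new__(cls)

# At this point only the class exists, so only Meta.__new__ has run.
print(calls)       # ['Meta.__new__ creating class Foobar']

Foobar()
print(calls[-1])   # Foobar.__new__ creating an instance
```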

The __prepare__ method

This method is called before the class body is executed and it must return a dictionary-like object that's used as the local namespace for all the code from the class body. It was added in Python 3.0, see PEP 3115.

If your __prepare__ returns an object x then this:

class Class(metaclass=Meta):
    a = 1
    b = 2
    c = 3

Will make the following changes to x:

x['a'] = 1
x['b'] = 2
x['c'] = 3

This x object needs to look like a dictionary. Note that this x object will end up as an argument to Metaclass.__new__, and if it's not an instance of dict you need to convert it before calling super().__new__. [21]

Interestingly enough, this method doesn't have __new__'s special lookup. It appears it doesn't have its own type slot and it's looked up via the class attribute lookup described earlier. [6]
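As an illustration of what a custom namespace can do, here's a sketch where __prepare__ returns a dict subclass that records every assignment made in the class body. RecordingNamespace and the assignments attribute are invented for this example; the namespace is copied into a plain dict before it's handed to type.__new__.

```python
class RecordingNamespace(dict):
    """Dict subclass that remembers every assignment made in the class body."""
    def __init__(self):
        super().__init__()
        self.assignments = []

    def __setitem__(self, key, value):
        self.assignments.append((key, value))
        super().__setitem__(key, value)

class Meta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return RecordingNamespace()

    def __new__(mcs, name, bases, namespace, **kwargs):
        # Hand type.__new__ a plain dict copy of the namespace.
        cls = super().__new__(mcs, name, bases, dict(namespace))
        # Keep only the names the class body author wrote, not __module__ etc.
        cls.assignments = [
            item for item in namespace.assignments
            if not item[0].startswith('__')
        ]
        return cls

class Class(metaclass=Meta):
    a = 1
    b = 2
    a = 3   # rebinding is recorded as well

print(Class.assignments)   # [('a', 1), ('b', 2), ('a', 3)]
print(Class.a)             # 3
```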

Putting it all together

To start things off, a diagram of how instances are constructed:

How to read this swim lane diagram:

  • The horizontal lanes are the places where you define the functions.
  • Solid lines mean a function call.
    • A line from Metaclass.__call__ to Class.__new__ means Metaclass.__call__ will call Class.__new__.
  • Dashed lines mean something is returned.
    • Class.__new__ returns the instance of Class.
    • Metaclass.__call__ returns whatever Class.__new__ returned (and if it returned an instance of Class it will also call Class.__init__ on it). [16]
  • The number in the red circle signifies the call order.
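In code, the call flow described above looks roughly like the following. This is a simplified Python sketch of what the default Metaclass.__call__ (that is, type.__call__) does, not the real C implementation; make_instance, Class and Odd are names made up for the example.

```python
def make_instance(cls, *args, **kwargs):
    """Rough equivalent of what type.__call__ does for cls(*args, **kwargs)."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs when __new__ returned an instance of cls.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj

class Class:
    def __new__(cls, value):
        return super().__new__(cls)

    def __init__(self, value):
        self.value = value

class Odd:
    def __new__(cls):
        return 42            # not an instance of Odd

    def __init__(self):
        raise RuntimeError("never called")

print(make_instance(Class, 5).value)   # 5
print(make_instance(Odd))              # 42
print(Odd())                           # 42, the real type.__call__ agrees
```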

Creating a class is quite similar:

A few more notes:

  • Metaclass.__prepare__ just returns the namespace object (a dictionary-like object as explained before).
  • Metaclass.__new__ returns the Class object.
  • MetaMetaclass.__call__ returns whatever Metaclass.__new__ returned (and if it returned an instance of Metaclass it will also call Metaclass.__init__ on it). [16]
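The class creation steps can be approximated by hand as well. The sketch below is a rough stand-in for what a class statement does; make_class and its body_source parameter (the class body as a string) are made up for the example, and details such as keyword arguments and metaclass resolution are left out.

```python
class Meta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return {}

def make_class(metaclass, name, bases, body_source):
    """Rough sketch of a class statement; body_source is the class body as text."""
    # 1. Ask the metaclass for the namespace object.
    namespace = metaclass.__prepare__(name, bases)
    # 2. Run the class body with that namespace as its locals.
    exec(body_source, globals(), namespace)
    # 3. Call the metaclass; this goes through type(metaclass).__call__, which
    #    calls metaclass.__new__ and then metaclass.__init__.
    return metaclass(name, bases, namespace)

Class = make_class(Meta, 'Class', (), "a = 1\nb = a + 1\n")

print(type(Class) is Meta)   # True
print(Class.a, Class.b)      # 1 2
```

The standard library's types.new_class covers the same ground properly, if you ever need to do this for real.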

So you see, metaclasses allow you to customize almost every part of an object life-cycle.

Metaclasses are callables

If you look again at the diagrams, you'll notice that making an instance goes through Metaclass.__call__. This means you can use any callable as the metaclass:

>>> class Foo(metaclass=print):  # pointless, but illustrative
...     pass
...
Foo () {'__module__': '__main__', '__qualname__': 'Foo'}
>>> print(Foo)
None

If you use a function as the metaclass then subclasses won't inherit your function metaclass, but the type of whatever that function returned.
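A slightly less pointless sketch of a function metaclass, showing that it isn't inherited. The make_upper function and the attribute names are invented for this example.

```python
def make_upper(name, bases, attrs):
    """Function used as a metaclass: upper-cases non-dunder attribute names."""
    upper_attrs = {
        (key if key.startswith('__') else key.upper()): value
        for key, value in attrs.items()
    }
    return type(name, bases, upper_attrs)

class Base(metaclass=make_upper):
    colour = 'red'

class Child(Base):          # make_upper is NOT called for this class
    shape = 'round'

print(hasattr(Base, 'COLOUR'))   # True
print(type(Base))                # <class 'type'>
print(hasattr(Child, 'SHAPE'))   # False: Child was created by plain type
print(hasattr(Child, 'shape'))   # True
```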

Subclasses inherit the metaclass

One advantage compared to class decorators is the fact that subclasses inherit the metaclass.

This is a consequence of the fact that Metaclass(...) returns an object which usually has Metaclass as the __class__.
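To compare the two approaches side by side, here is a sketch that registers class names once through a metaclass and once through a class decorator. RegisteringMeta, register and the two lists are hypothetical names for this example.

```python
registered_by_meta = []
registered_by_decorator = []

class RegisteringMeta(type):
    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        registered_by_meta.append(name)

def register(cls):
    registered_by_decorator.append(cls.__name__)
    return cls

class MetaBase(metaclass=RegisteringMeta):
    pass

class MetaChild(MetaBase):             # metaclass inherited, so it registers too
    pass

@register
class DecoratedBase:
    pass

class DecoratedChild(DecoratedBase):   # the decorator is not re-applied
    pass

print(registered_by_meta)                     # ['MetaBase', 'MetaChild']
print(registered_by_decorator)                # ['DecoratedBase']
print(type(MetaChild) is RegisteringMeta)     # True
```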

Restrictions with multiple metaclasses

Just as classes can have multiple base classes, each one of those base classes may have a different metaclass. But with a twist: everything has to be linear - the inheritance tree of the metaclasses must have a single leaf.

For example, this is not accepted because there would be two leaves (Meta1 and Meta2):

>>> class Meta1(type):
...     pass
...
>>> class Meta2(type):
...     pass
...
>>> class Base1(metaclass=Meta1):
...     pass
...
>>> class Base2(metaclass=Meta2):
...     pass
...
>>> class Foobar(Base1, Base2):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

This will work (and will use the leaf as the metaclass):

>>> class Meta(type):
...     pass
...
>>> class SubMeta(Meta):
...     pass
...
>>> class Base1(metaclass=Meta):
...     pass
...
>>> class Base2(metaclass=SubMeta):
...     pass
...
>>> class Foobar(Base1, Base2):
...     pass
...
>>> type(Foobar)
<class '__main__.SubMeta'>
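A common way out of the two-leaves situation, not covered above and shown here only as a sketch, is to give the tree a single leaf yourself: a metaclass that derives from both conflicting metaclasses. CombinedMeta is a made-up name.

```python
class Meta1(type):
    pass

class Meta2(type):
    pass

class Base1(metaclass=Meta1):
    pass

class Base2(metaclass=Meta2):
    pass

try:
    class Broken(Base1, Base2):
        pass
except TypeError as exc:
    conflict = str(exc)
    print(conflict)

# A metaclass deriving from both gives the tree a single leaf again.
class CombinedMeta(Meta1, Meta2):
    pass

class Foobar(Base1, Base2, metaclass=CombinedMeta):
    pass

print(type(Foobar) is CombinedMeta)   # True
```

Whether the two metaclasses actually cooperate (for example, whether both call super() in their methods) is a separate question that this sketch doesn't answer.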

The method signatures

There are still a few important details missing, like the method signatures. Let's look at a class and a metaclass with all the important stuff implemented.

Note the extra **kwargs - those are the extra keyword arguments you can pass in the class statement. [22]

>>> class Meta(type):
...     @classmethod
...     def __prepare__(mcs, name, bases, **kwargs):
...         print(' Meta.__prepare__(mcs=%s, name=%r, bases=%s, **%s)' % (
...             mcs, name, bases, kwargs
...         ))
...         return {}

As mentioned before, __prepare__ can return objects that are not dict instances, so you need to make sure your __new__ handles that. [21]

...     def __new__(mcs, name, bases, attrs, **kwargs):
...         print(' Meta.__new__(mcs=%s, name=%r, bases=%s, attrs=[%s], **%s)' % (
...             mcs, name, bases, ', '.join(attrs), kwargs
...         ))
...         return super().__new__(mcs, name, bases, attrs)

It's uncommon to see __init__ being implemented in a metaclass because it's not that powerful - the class is already constructed when __init__ is called. It roughly equates to having a class decorator with the difference that __init__ would get run when making subclasses, while class decorators are not called for subclasses.

...     def __init__(cls, name, bases, attrs, **kwargs):
...         print(' Meta.__init__(cls=%s, name=%r, bases=%s, attrs=[%s], **%s)' % (
...             cls, name, bases, ', '.join(attrs), kwargs
...         ))
...         return super().__init__(name, bases, attrs)

The __call__ method will be called when you make instances of Class.

...     def __call__(cls, *args, **kwargs):
...         print(' Meta.__call__(cls=%s, args=%s, kwargs=%s)' % (
...             cls, args, kwargs
...         ))
...         return super().__call__(*args, **kwargs)
...

Using Meta, note the extra=1: [22]

>>> class Class(metaclass=Meta, extra=1):
...     def __new__(cls, myarg):
...         print(' Class.__new__(cls=%s, myarg=%s)' % (
...             cls, myarg
...         ))
...         return super().__new__(cls)
...
...     def __init__(self, myarg):
...         print(' Class.__init__(self=%s, myarg=%s)' % (
...             self, myarg
...         ))
...         self.myarg = myarg
...         return super().__init__()
...
...     def __str__(self):
...         return "<instance of Class; myargs=%s>" % (
...             getattr(self, 'myarg', 'MISSING'),
...         )
 Meta.__prepare__(mcs=<class '__main__.Meta'>, name='Class', bases=(),
                  **{'extra': 1})
 Meta.__new__(mcs=<class '__main__.Meta'>, name='Class', bases=(),
              attrs=[__qualname__, __new__, __init__, __str__, __module__],
              **{'extra': 1})
 Meta.__init__(cls=<class '__main__.Class'>, name='Class', bases=(),
               attrs=[__qualname__, __new__, __init__, __str__, __module__],
               **{'extra': 1})

Note that Meta.__call__ is called when we make an instance of Class (print is used here so that the last line goes through __str__; the bare expression would show the default repr instead):

>>> print(Class(1))
 Meta.__call__(cls=<class '__main__.Class'>, args=(1,), kwargs={})
 Class.__new__(cls=<class '__main__.Class'>, myarg=1)
 Class.__init__(self=<instance of Class; myargs=MISSING>, myarg=1)
<instance of Class; myargs=1>

Ending notes

Now that I wrote this all down, it seems there are lots of things going on, more than I'd like. I also suspect I've missed some significant details :-)

In a future article the practical applications of all this metaclass theory will be tackled …

[1]

I've read these articles. They are great at piquing the curiosity, but they only introduce the concept of a class factory and skip the much needed details of when __call__, __new__ or __init__ are called, in what order, and from which objects.

These ones do briefly mention the details about call order, but I just don't like the way the articles are organized. Also, no nice diagrams. All in all, they don't explain it well.

I'm not saying they are bad, they just don't resonate in my head.

A notable resource is David Beazley's awesome PyCon 2013 tutorial, Python 3 Metaprogramming, which you can just watch over and over. However, it's still introductory material and, because it's a video, searching for information is quite hard.

There's also docs.python.org/3/reference/datamodel.html, which is thorough, but it's easy to get lost in the trillions of details presented there.

[2] It can be argued that the problem is the language having all sorts of unnecessary complications around the object model, but that's not an argument against learning how it works. Take the bad with the good, as they say. You can't say “don't use metaclasses” just because you don't like this particular feature of the language or don't want to spend the time to learn how it works.
[3]
creative type

A programmer willing to go out of his comfort zone, learn new things and challenge common practices.

(I made that up)

If you're reading this then you probably want metaclasses. Or at least to understand them.

[4] The object model in Python 2 is just terribly bloated. Save yourself the trouble and use Python 3 :-)
[5] New-style objects make this especially confusing, because of the different lookup rules for magic methods.
[6]

Class.__dict__ is not a real dict but a special dict proxy that will also look in all the base classes.

For clarity I've simplified things to Class.__dict__['foobar'] but in reality the __dict__ proxy object is not used, but instead all the logic is inlined: attributes are looked up on the class of the object in the order of the __mro__ via typeobject.c:_PyType_Lookup (called from object.c: _PyObject_GenericGetAttrWithDict).

Also, in case you dared to read the C code and got suspicious: _PyObject_GenericGetAttrWithDict calls _PyType_Lookup quite early, because it might need to use the data descriptor [8] before checking the instance's __dict__.

[7]
struct
A struct in the C programming language (and many derivatives) is a complex data type declaration that defines a physically grouped list of variables to be placed under one name in a block of memory, allowing the different variables to be accessed via a single pointer, or the struct declared name which returns the same address.

Source: wikipedia.

[8] A data descriptor is an object with both __get__ and __set__ methods. See reference.
[9]

Functions implement __get__ - that's how you get the “bound method”. However, functions are non-data descriptors [8].

A normal function:

>>> def foo(*args):
...     print(args)
...
>>> foo
<function foo at 0x00000000047AD6A8>
>>> foo(1)
(1,)

We can turn it into a method:

>>> foo.__get__("stuff", None)
<bound method str.foo of 'stuff'>
>>> foo.__get__("stuff", None)(1)
('stuff', 1)
[10] The default __getattribute__ implementation for objects is in typeobject.c:slot_tp_getattr_hook. It in turn calls object.c:PyObject_GenericGetAttr and object.c:_PyObject_GenericGetAttrWithDict to do the bulk of the lookup.
[11]

For lookup in types (classes), see: typeobject.c:type_getattro and typeobject.c:PyType_Type.

Make a note of this:

>>> class Desc:
...     def __init__(self, value):
...         self.value = value
...
...     def __get__(self, instance, owner):
...         return "%r from descriptor" % self.value
...
>>> class Foo:
...     pass
...
>>> Foo.a = Desc('a')
>>> Foo.a
"'a' from descriptor"

And that's how staticmethod and classmethod descriptors work if you try to access them via class.

Note that property (which is also a descriptor) will work differently:

>>> Foo.x = property(lambda self: "whatever")
>>> Foo.x
<property object at 0x0000000003767B88>

Looks odd, it's like the descriptor doesn't work there. Why? Because that's how property.__get__ works:

>>> prop = property(lambda self: "whatever")
>>> prop.__get__(None, 123) is prop
True

Whoa. It just returns itself if there's no instance. And there wouldn't be any when accessing an attribute via the class.

[12]

In the source you're going to see things like object->ob_type->tp_call (source). No fancy schmancy lookup rules.

In case you were wondering how it works with multiple inheritance, when a class is created it will copy the slots from its bases, see type_new (which calls PyType_Ready). In other words, lookups are precomputed - a stark contrast to how normal methods are looked up. Descriptors are also handled in that phase, see update_one_slot.

All in all, there's an incredible amount of logic necessary to maintain the slots table. I suspect there are significant performance gains here (compared to old-style [13] classes that don't have slots).

[13]

There were acute differences between types (classes implemented in C) and classes (implemented in pure Python) in Python 2 and especially when using old-style classes but it's more streamlined and consistent now in Python 3.

In other words, in Python 3 class and type mean the same thing. Or at least in theory ...
