[The Road to Python] Special Edition: Celery

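Introduction to Celery and Basic Usage

Celery is a distributed task queue built on asynchronous message passing; it makes it easy to process tasks asynchronously.

A few scenarios where it is useful:

You want to run a batch command on 100 machines. That may take a long time, and you do not want your program to block while waiting for the result. Instead, the call returns a task ID right away; some time later you use that ID to fetch the result, and while the task is running you are free to do other work.
You want a scheduled task, for example checking all of your customers' records every day and sending an SMS greeting to any customer whose birthday is today.

Celery needs a message broker to send and receive task messages, and a backend to store task results. RabbitMQ or Redis is the usual choice.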

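Celery has the following advantages:

Simple: once you are familiar with Celery's workflow, configuring and using it is fairly easy.
Highly available: workers and clients automatically retry when a connection is lost or fails.
Fast: a single Celery process can handle millions of tasks per minute (with RabbitMQ and optimized settings, according to the Celery documentation).
Flexible: almost every Celery component can be extended or customized.

[Figure: basic Celery workflow]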

Installing and Using Celery

1. RabbitMQ is Celery's default broker; a single line of configuration is enough:

broker_url = 'amqp://my_user:my_password@localhost:5672//'

2. Using Redis as the broker

broker_url = 'redis://localhost:6379'                  # without a password
broker_url = 'redis://:my_password@localhost:port'     # with a password; replace "port" with the real port number

If you also want to retrieve the result of each task, configure where results are stored:

result_backend = 'redis://localhost:6379'

I. Create a Celery application to define your tasks

① Create a task in tasks.py

from celery import Celery
 
app = Celery('celery_test',
             broker='redis://localhost',
             backend='redis://localhost')
 
@app.task
def add(x,y):
    print("running...",x,y)
    return x+y

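② Start a Celery worker to listen for and execute tasks

$ celery -A tasks worker --loglevel=info    # "tasks" is the module name (tasks.py, no extension); use --loglevel=debug for more output
$ celery -A tasks worker -l info            # the same command with the short option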

③ Call the task

>>> from tasks import add
>>> add.delay(4, 4)

The worker terminal shows that it received a task. To inspect the result, assign the return value of the call to a variable:

>>> result = add.delay(4, 4)
 
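>>> result.ready()               # True once the task has finished
>>> result.get(timeout=1)        # wait for the result; raises a timeout error after 1 second
>>> result.get(propagate=False)  # if the task raised, return the exception instead of re-raising it
>>> result.traceback             # traceback of a failed task

Note: with the default JSON serializer, task results must be JSON-serializable. After changing task code, the worker must be restarted.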

II. Using Celery in a project

Celery can be set up as an application inside a package.

The directory layout is as follows:

proj/__init__.py
    /celery.py
    /tasks.py

Contents of proj/celery.py

from __future__ import absolute_import, unicode_literals
from celery import Celery
 
app = Celery('proj',
             broker='redis://192.168.18.147',
             backend='redis://192.168.18.147',
             include=['proj.tasks'])  # must match the package name (proj)
 
# Optional configuration, see the application user guide.
app.conf.update(
    result_expires=3600,
)
 
if __name__ == '__main__':
    app.start()

Contents of proj/tasks.py

from __future__ import absolute_import, unicode_literals
from .celery import app
 
@app.task
def add(x, y):
    return x + y
 
@app.task
def mul(x, y):
    return x * y
 
@app.task
def xsum(numbers):
    return sum(numbers)

Starting the worker, option 1

$ celery -A proj worker -l info

Starting the worker, option 2: in the background

celery multi start w1 -A proj -l info    # w1 is a node name of your choice
celery multi restart w1 -A proj -l info
celery multi stop w1 -A proj -l info

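III. Periodic tasks with Celery

Celery supports periodic tasks: once you define when a task should run, Celery triggers it automatically on schedule. The scheduler component is called celery beat.

periodic_task.py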

from celery import Celery
from celery.schedules import crontab
 
app = Celery()
 
@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Calls test('hello') every 10 seconds.
    sender.add_periodic_task(10.0, test.s('hello'), name='add every 10')
 
    # Calls test('world') every 30 seconds
    sender.add_periodic_task(30.0, test.s('world'), expires=10)
 
    # Executes every Monday morning at 7:30 a.m.
    sender.add_periodic_task(
        crontab(hour=7, minute=30, day_of_week=1),
        test.s('Happy Mondays!'),
    )
 
@app.task
def test(arg):
    print(arg)

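add_periodic_task registers one periodic task.

The example above adds periodic tasks through a function call. They can also be declared as configuration; the entry below runs a task every 30 seconds: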

app.conf.beat_schedule = {
    'add-every-30-seconds': {
        'task': 'tasks.add',
        'schedule': 30.0,
        'args': (16, 16)
    },
}
app.conf.timezone = 'UTC'

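Once the tasks are defined, Celery needs a separate process that dispatches them on schedule.

Note that this process dispatches tasks; it does not execute them. It keeps checking the schedule, and whenever a task is due it sends a task message for a Celery worker to execute.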
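
This division of labor can be pictured with a toy model built only on the Python standard library. It is not how Celery is implemented; `beat`, `worker`, `TASKS`, and the in-process queue standing in for the broker are hypothetical names for illustration.

```python
import queue
import threading

broker = queue.Queue()   # stands in for the message broker
results = []             # stands in for the result backend

def add(x, y):
    return x + y

TASKS = {'tasks.add': add}  # toy task registry: name -> function

def beat(ticks):
    """Scheduler: only publishes task messages, never runs the task itself."""
    for _ in range(ticks):
        broker.put({'task': 'tasks.add', 'args': (16, 16)})
    broker.put(None)  # sentinel telling the worker to stop

def worker():
    """Worker: consumes messages and executes the named task."""
    while True:
        message = broker.get()
        if message is None:
            break
        func = TASKS[message['task']]
        results.append(func(*message['args']))

worker_thread = threading.Thread(target=worker)
worker_thread.start()
beat(ticks=3)          # three "due" moments; a real scheduler would wait between them
worker_thread.join()
print(results)
```

If the worker thread is never started, the messages simply pile up in the queue, which mirrors what happens when celery beat runs without a worker.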

Start the task scheduler, celery beat

$ celery -A periodic_task beat

Start a celery worker to execute the tasks

$ celery -A periodic_task worker

More complex schedules

from celery.schedules import crontab
 
app.conf.beat_schedule = {
    # Executes every Monday morning at 7:30 a.m.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}

The entry above runs the tasks.add task every Monday at 7:30 a.m.

More scheduling options:

Example	Meaning
crontab()	Execute every minute.
crontab(minute=0, hour=0)	Execute daily at midnight.
crontab(minute=0, hour='*/3')	Execute every three hours: midnight, 3am, 6am, 9am, noon, 3pm, 6pm, 9pm.
crontab(minute=0, hour='0,3,6,9,12,15,18,21')	Same as previous.
crontab(minute='*/15')	Execute every 15 minutes.
crontab(day_of_week='sunday')	Execute every minute (!) on Sundays.
crontab(minute='*', hour='*', day_of_week='sun')	Same as previous.
crontab(minute='*/10', hour='3,17,22', day_of_week='thu,fri')	Execute every ten minutes, but only between 3-4 am, 5-6 pm, and 10-11 pm on Thursdays or Fridays.
crontab(minute=0, hour='*/2,*/3')	Execute every even hour, and every hour divisible by three. This means: at every hour except 1am, 5am, 7am, 11am, 1pm, 5pm, 7pm, 11pm.
crontab(minute=0, hour='*/5')	Execute every hour divisible by 5. This means that it is triggered at 3pm, not 5pm (since 3pm equals the 24-hour clock value of "15", which is divisible by 5).
crontab(minute=0, hour='*/3,8-17')	Execute every hour divisible by 3, and every hour during office hours (8am-5pm).
crontab(0, 0, day_of_month='2')	Execute on the second day of every month.
crontab(0, 0, day_of_month='2-30/2')	Execute on every even numbered day.
crontab(0, 0, day_of_month='1-7,15-21')	Execute on the first and third weeks of the month.
crontab(0, 0, day_of_month='11', month_of_year='5')	Execute on the eleventh of May every year.
crontab(0, 0, month_of_year='*/3')	Execute on the first month of every quarter.

These cover the vast majority of scheduling needs. You can even schedule tasks by solar events such as sunrise and sunset.

For details see http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#solar-schedules

IV. Using Celery in a Django project

Directory layout:

CeleryTest/
    CeleryTest/__init__.py
              /celery.py
              /settings.py
    app01/tasks.py
        ....
         /views.py

Contents of celery.py

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
 
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'CeleryTest.settings')
 
app = Celery('CeleryTest')
 
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
 
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
 
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

Contents of __init__.py

from __future__ import absolute_import, unicode_literals
 
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
 
__all__ = ['celery_app']

Contents of settings.py

CELERY_BROKER_URL = 'redis://192.168.18.147'
CELERY_RESULT_BACKEND = 'redis://192.168.18.147'

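Contents of tasks.py (it must sit in the root directory of each app and must be named tasks.py, because autodiscover_tasks() looks for a module with that name by default)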

from __future__ import absolute_import, unicode_literals
from celery import shared_task
 
@shared_task
def add(x, y):
    return x + y
 
@shared_task
def mul(x, y):
    return x * y
 
@shared_task
def xsum(numbers):
    return sum(numbers)

Calling Celery tasks from views.py

from django.http import HttpResponse
from app01 import tasks
from celery.result import AsyncResult
 
def index(request):
 
    res = tasks.add.delay(9,8)
    print("start running task")
    return HttpResponse(res.task_id)
 
def get_data(request,task_id):
 
    result = AsyncResult(task_id)
    return HttpResponse(result.status)

AsyncResult looks up a task's state and result using the ID that was returned earlier.

Start the worker

$:~/..../CeleryTest$ celery -A CeleryTest worker -l info

V. Using scheduled tasks in Django

1. Install the package

$ pip install django-celery-beat

2. Register the app in settings

INSTALLED_APPS = (
        ...,
        'django_celery_beat',
)

3. Create the database tables

$ python manage.py migrate

4. Create tasks in the Django admin

5. Start the task scheduler

$ celery -A proj beat -l info -S django

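The admin page shows 3 tables.

[Figure: the admin tables after configuration]

Now start celery beat and the worker. Every 2 minutes, beat sends a task message and the worker executes the scp_task task.

Note: in testing, celery beat had to be restarted after every added or modified task; otherwise the beat process did not pick up the new configuration.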