I. A brief introduction to ceilometer

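In short, ceilometer collects the value of each OpenStack resource at a given point in time, for example the size of a cloud volume. The official documentation has the current architecture diagram.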

Besides ceilometer itself, the architecture diagram shows three other components:

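  • Panko stores events. Using it to implement per-second, event-based billing in cloudkitty is also part of my work. That is partly done so far, and I will write a separate post about it when I have time.
  • gnocchi stores ceilometer's metering data. Earlier releases kept this data in MongoDB, but as metering data accumulated, query performance became very poor, so OpenStack later introduced the gnocchi project. gnocchi's storage backends include redis, file, ceph and others. I am responsible for this part as well and it is already implemented; it may get its own article when I have time.
  • Aodh handles alarming.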

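Note that ceilometer has two main agents:

  • the polling agent, which periodically calls the relevant OpenStack APIs to fetch metering data;
  • the notification agent, which listens for OpenStack event messages and converts them into the corresponding metering data.

The metering data from both agents is passed through the defined pipeline to the REST API exposed by gnocchi, which then aggregates and stores it.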

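The official documentation also has a detailed architecture diagram of data collection, processing and forwarding, and another one of the data processing flow.

I think those diagrams explain this very well, so I won't say more here.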

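II. Source code analysis

  To be honest, OpenStack is probably the largest Python project around, a real behemoth. The first time I touched it I had no idea where to start. It gets better the more you read, although many parts still confuse me. When reading an OpenStack project, some files are especially important, for example setup.cfg, which declares the project's entry points. This post focuses on how polling is implemented; the other parts work similarly.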

  • Where the ceilometer polling agent starts

    # ceilometer/cmd/polling.py
    def main():
        conf = cfg.ConfigOpts()
        conf.register_cli_opts(CLI_OPTS)
        service.prepare_service(conf=conf)
        sm = cotyledon.ServiceManager()
        sm.add(create_polling_service, args=(conf,))
        oslo_config_glue.setup(sm, conf)
        sm.run()

    # The first lines load the configuration. A polling service is then added
    # through the cotyledon library and finally run. From a quick look,
    # cotyledon is a library for launching and managing worker processes.

    def create_polling_service(worker_id, conf):
        return manager.AgentManager(worker_id,
                                    conf,
                                    conf.polling_namespaces,
                                    conf.pollster_list)

    # create_polling_service returns a polling agent. polling-namespaces
    # accepts choices=['compute', 'central', 'ipmi']; when nothing is passed,
    # AgentManager falls back to ['compute', 'central'].
    
    
  • polling AgentManager

    # ceilometer/agent/manager.py
    class AgentManager(service_base.PipelineBasedService):

        def __init__(self, worker_id, conf, namespaces=None, pollster_list=None):
            namespaces = namespaces or ['compute', 'central']
            pollster_list = pollster_list or []
            group_prefix = conf.polling.partitioning_group_prefix

            # features of using coordination and pollster-list are exclusive, and
            # cannot be used at one moment to avoid both samples duplication and
            # samples being lost
            if pollster_list and conf.coordination.backend_url:
                raise PollsterListForbidden()

            super(AgentManager, self).__init__(worker_id, conf)

            def _match(pollster):
                """Find out if pollster name matches to one of the list."""
                return any(fnmatch.fnmatch(pollster.name, pattern) for
                           pattern in pollster_list)

            if type(namespaces) is not list:
                namespaces = [namespaces]

            # we'll have default ['compute', 'central'] here if no namespaces will
            # be passed
            extensions = (self._extensions('poll', namespace, self.conf).extensions
                          for namespace in namespaces)
            # get the extensions from pollster builder
            extensions_fb = (self._extensions_from_builder('poll', namespace)
                             for namespace in namespaces)
            if pollster_list:
                extensions = (moves.filter(_match, exts)
                              for exts in extensions)
                extensions_fb = (moves.filter(_match, exts)
                                 for exts in extensions_fb)

            self.extensions = list(itertools.chain(*list(extensions))) + list(
                itertools.chain(*list(extensions_fb)))

            if self.extensions == []:
                raise EmptyPollstersList()

            discoveries = (self._extensions('discover', namespace,
                                            self.conf).extensions
                           for namespace in namespaces)
            self.discoveries = list(itertools.chain(*list(discoveries)))
            self.polling_periodics = None

            self.partition_coordinator = coordination.PartitionCoordinator(
                self.conf)
            self.heartbeat_timer = utils.create_periodic(
                target=self.partition_coordinator.heartbeat,
                spacing=self.conf.coordination.heartbeat,
                run_immediately=True)

            # Compose coordination group prefix.
            # We'll use namespaces as the basement for this partitioning.
            namespace_prefix = '-'.join(sorted(namespaces))
            self.group_prefix = ('%s-%s' % (namespace_prefix, group_prefix)
                                 if group_prefix else namespace_prefix)

            self.notifier = oslo_messaging.Notifier(
                messaging.get_transport(self.conf),
                driver=self.conf.publisher_notifier.telemetry_driver,
                publisher_id="ceilometer.polling")

            self._keystone = None
            self._keystone_last_exception = None

        def run(self):
            super(AgentManager, self).run()
            self.polling_manager = pipeline.setup_polling(self.conf)
            self.join_partitioning_groups()
            self.start_polling_tasks()
            self.init_pipeline_refresh()

    1. The initializer uses ExtensionManager to load the entry points of every meter declared in setup.cfg, both discover and poll:
       discover calls the OpenStack APIs to get resources;
       poll converts the resources returned by discover into samples (the value of a meter at a given moment).
    2. When several agents share the work, a timer is also created for the coordination heartbeat.
    3. It defines where the collected data is forwarded through the message queue (oslo_messaging.Notifier).
    4. The run method then starts the polling agent.

    # setup.cfg
    ceilometer.discover.compute =
        local_instances = ceilometer.compute.discovery:InstanceDiscovery

    ceilometer.poll.compute =
        disk.read.requests = ceilometer.compute.pollsters.disk:ReadRequestsPollster
        disk.write.requests = ceilometer.compute.pollsters.disk:WriteRequestsPollster
        disk.read.bytes = ceilometer.compute.pollsters.disk:ReadBytesPollster
        disk.write.bytes = ceilometer.compute.pollsters.disk:WriteBytesPollster
        disk.read.requests.rate = ceilometer.compute.pollsters.disk:ReadRequestsRatePollster
        ......
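To make the loading step more concrete, here is a minimal sketch of what "resolve the entry points, then filter by pollster_list" amounts to. It is not ceilometer's code: the real agent delegates the loading to ExtensionManager, while this stand-in parses the "name = module:attribute" lines by hand. The ENTRY_POINTS table, the demo.poll.compute namespace and the helper names are all hypothetical, and standard-library classes stand in for pollster classes so the sketch runs without ceilometer installed. Only the fnmatch-based filter mirrors the _match helper shown above.

```python
import fnmatch
import importlib

# Hypothetical entry-point table in the "name = module:attribute" form used by
# setup.cfg. Standard-library targets stand in for real pollster classes.
ENTRY_POINTS = {
    "demo.poll.compute": [
        "disk.read.requests = collections:OrderedDict",
        "disk.write.requests = collections:Counter",
        "cpu = fractions:Fraction",
    ],
}


def parse_entry_point(line):
    """Split 'name = module:attr' into (name, module, attr)."""
    name, _, target = line.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module, attr


def load_extensions(namespace, pollster_list=None):
    """Return {name: loaded object} for one namespace.

    When pollster_list is given, keep only the names that match at least one
    of its shell-style patterns, as the _match helper in AgentManager does.
    """
    loaded = {}
    for line in ENTRY_POINTS.get(namespace, []):
        name, module, attr = parse_entry_point(line)
        if pollster_list and not any(
                fnmatch.fnmatch(name, pattern) for pattern in pollster_list):
            continue
        # Import the module lazily and fetch the named attribute from it.
        loaded[name] = getattr(importlib.import_module(module), attr)
    return loaded


everything = load_extensions("demo.poll.compute")
disk_only = load_extensions("demo.poll.compute", pollster_list=["disk.*"])
print(sorted(everything))
print(sorted(disk_only))
```

With pollster_list=["disk.*"] only the two disk entries survive, which is the same effect the pollster-list option has on the agent's extensions.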
  • Set up polling, for example the interval at which resource meters are fetched

    def setup_polling(conf):
        """Setup polling manager according to yaml config file."""
        cfg_file = conf.polling.cfg_file
        return PollingManager(conf, cfg_file)


    class PollingManager(ConfigManagerBase):
        """Polling Manager

        Polling manager sets up polling according to config file.
        """

        def __init__(self, conf, cfg_file):
            """Setup the polling according to config.

            The configuration is supported as follows:

            {"sources": [{"name": source_1,
                          "interval": interval_time,
                          "meters" : ["meter_1", "meter_2"],
                          "resources": ["resource_uri1", "resource_uri2"],
                         },
                         {"name": source_2,
                          "interval": interval_time,
                          "meters" : ["meter_3"],
                         },
                        ]}
            }

            The interval determines the cadence of sample polling

            Valid meter format is '*', '!meter_name', or 'meter_name'.
            '*' is wildcard symbol means any meters; '!meter_name' means
            "meter_name" will be excluded; 'meter_name' means 'meter_name'
            will be included.

            Valid meters definition is all "included meter names", all
            "excluded meter names", wildcard and "excluded meter names", or
            only wildcard.

            The resources is list of URI indicating the resources from where
            the meters should be polled. It's optional and it's up to the
            specific pollster to decide how to use it.

            """
            super(PollingManager, self).__init__(conf)
            try:
                cfg = self.load_config(cfg_file)
            except (TypeError, IOError):
                LOG.warning(_LW('Unable to locate polling configuration, falling '
                                'back to pipeline configuration.'))
                cfg = self.load_config(conf.pipeline_cfg_file)
            self.sources = []
            if 'sources' not in cfg:
                raise PollingException("sources required", cfg)
            for s in cfg.get('sources'):
                self.sources.append(PollingSource(s))

    # The configuration is initialized from etc/ceilometer/polling.yaml:
    ---
    sources:
        - name: all_pollsters
          interval: 600
          meters:
            - "*"
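The meter rules in the docstring ('*', '!meter_name', 'meter_name') are easier to follow with a small example. The function below is my own sketch of those rules, not the support_meter implementation from ceilometer. The name supports_meter is hypothetical, and two behaviours are assumptions on my part: exclusions are checked before inclusions, and a list made only of exclusions accepts every other meter.

```python
import fnmatch


def supports_meter(meters, meter_name):
    """Decide whether meter_name is selected by a source's meters list.

    '*' is a wildcard, '!name' excludes a meter, a plain 'name' includes it.
    Assumption: exclusions win over inclusions.
    """
    excluded = [m[1:] for m in meters if m.startswith("!")]
    included = [m for m in meters if not m.startswith("!")]

    if any(fnmatch.fnmatch(meter_name, pattern) for pattern in excluded):
        return False
    if any(fnmatch.fnmatch(meter_name, pattern) for pattern in included):
        return True
    # Assumption: a list with only exclusions accepts everything else.
    return bool(excluded) and not included


print(supports_meter(["*"], "cpu"))
print(supports_meter(["*", "!disk.read.bytes"], "disk.read.bytes"))
print(supports_meter(["!cpu"], "memory"))
```

With the polling.yaml above, whose only meter entry is "*", every loaded pollster is selected and polled every 600 seconds.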
  • Join the partitioning group of each discovery, based on its group id

    def join_partitioning_groups(self):
        self.groups = set([self.construct_group_id(d.obj.group_id)
                           for d in self.discoveries])
        # let each set of statically-defined resources have its own group
        static_resource_groups = set([
            self.construct_group_id(utils.hash_of_set(p.resources))
            for p in self.polling_manager.sources
            if p.resources
        ])
        self.groups.update(static_resource_groups)

        if not self.groups and self.partition_coordinator.is_active():
            self.partition_coordinator.stop()
            self.heartbeat_timer.stop()

        if self.groups and not self.partition_coordinator.is_active():
            self.partition_coordinator.start()
            utils.spawn_thread(self.heartbeat_timer.start)

        for group in self.groups:
            self.partition_coordinator.join_group(group)
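As a rough illustration of how the group names could be put together, here is a small sketch. compose_group_prefix follows the prefix logic visible in __init__ (sorted namespaces joined by '-', plus the optional configured prefix). The body of construct_group_id is not shown in this post, so the '<prefix>-<group id>' scheme below is only my assumption, and groups_to_join is a hypothetical helper that also drops empty group ids, which the real method may not do.

```python
def compose_group_prefix(namespaces, group_prefix=None):
    """Join the sorted namespaces, optionally followed by a configured prefix."""
    namespace_prefix = "-".join(sorted(namespaces))
    if group_prefix:
        return "%s-%s" % (namespace_prefix, group_prefix)
    return namespace_prefix


def construct_group_id(prefix, discovery_group_id):
    """Assumed naming scheme: '<prefix>-<discovery group id>'."""
    if not discovery_group_id:
        return None
    return "%s-%s" % (prefix, discovery_group_id)


def groups_to_join(prefix, discovery_group_ids):
    """Collect the distinct group names, as set() does in the method above."""
    ids = (construct_group_id(prefix, g) for g in discovery_group_ids)
    return {g for g in ids if g is not None}


prefix = compose_group_prefix(["compute", "central"])
print(prefix)
print(groups_to_join(prefix, ["local_instances", "local_instances", None]))
```

Because the groups are kept in a set, two discoveries that report the same group id end up in a single group, so the coordinator is asked to join it only once.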
  • Start the polling tasks

    def start_polling_tasks(self):
        # allow time for coordination if necessary
        delay_start = self.partition_coordinator.is_active()

        # set shuffle time before polling task if necessary
        delay_polling_time = random.randint(
            0, self.conf.shuffle_time_before_polling_task)

        data = self.setup_polling_tasks()

        # Don't start useless threads if no task will run
        if not data:
            return

        # One thread per polling tasks is enough
        self.polling_periodics = periodics.PeriodicWorker.create(
            [], executor_factory=lambda:
            futures.ThreadPoolExecutor(max_workers=len(data)))

        for interval, polling_task in data.items():
            delay_time = (interval + delay_polling_time if delay_start
                          else delay_polling_time)

            @periodics.periodic(spacing=interval, run_immediately=False)
            def task(running_task):
                self.interval_task(running_task)

            utils.spawn_thread(utils.delayed, delay_time,
                               self.polling_periodics.add, task, polling_task)

        utils.spawn_thread(self.polling_periodics.start, allow_empty=True)

    # Build the tasks from the polling.yaml shown earlier and the extensions
    # loaded dynamically from setup.cfg
    def setup_polling_tasks(self):
        polling_tasks = {}
        for source in self.polling_manager.sources:
            polling_task = None
            for pollster in self.extensions:
                if source.support_meter(pollster.name):
                    polling_task = polling_tasks.get(source.get_interval())
                    if not polling_task:
                        polling_task = self.create_polling_task()
                        polling_tasks[source.get_interval()] = polling_task
                    polling_task.add(pollster, source)
        return polling_tasks

    # The tasks are then run periodically by periodics, at the interval
    # defined in polling.yaml
    def interval_task(self, task):
        # NOTE(sileht): remove the previous keystone client
        # and exception to get a new one in this polling cycle.
        self._keystone = None
        self._keystone_last_exception = None

        task.poll_and_notify()
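Two ideas in this step are worth isolating: sources with the same interval share one task, and every task starts after a delay. The sketch below shows both with plain data structures. Source, group_by_interval and start_delays are hypothetical names of mine, the sources match exact meter names only, and nothing here touches periodics or threads, so treat it as an illustration of the bookkeeping rather than of the scheduling itself.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for a polling.yaml source: a name, an interval in
# seconds, and the exact meter names it selects (no wildcards in this sketch).
Source = namedtuple("Source", ["name", "interval", "meters"])


def group_by_interval(sources, pollster_names):
    """Return {interval: [(pollster, source name), ...]}.

    As in setup_polling_tasks, sources sharing an interval share one task.
    """
    tasks = {}
    for source in sources:
        for pollster in pollster_names:
            if pollster in source.meters:
                tasks.setdefault(source.interval, []).append(
                    (pollster, source.name))
    return tasks


def start_delays(intervals, shuffle_max, coordinated, rng=random):
    """Return {interval: seconds to wait before the task is scheduled}.

    One random shuffle is drawn and shared by all tasks. When coordination
    is active, each task waits one extra interval on top of the shuffle.
    """
    shuffle = rng.randint(0, shuffle_max)
    return {i: (i + shuffle if coordinated else shuffle) for i in intervals}


sources = [
    Source("fast", 60, ["cpu"]),
    Source("slow", 600, ["disk.read.bytes"]),
    Source("fast2", 60, ["memory"]),
]
tasks = group_by_interval(sources, ["cpu", "memory", "disk.read.bytes"])
print(tasks)
print(start_delays(tasks, shuffle_max=0, coordinated=True))
```

Here the two 60-second sources collapse into one task, which is why start_polling_tasks can size its thread pool with len(data): one worker per distinct interval.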
    # interval_task hands over to the polling task's poll_and_notify:
    def poll_and_notify(self):
        """Polling sample and notify."""
        cache = {}
        discovery_cache = {}
        poll_history = {}
        for source_name in self.pollster_matches:
            for pollster in self.pollster_matches[source_name]:
                key = Resources.key(source_name, pollster)
                candidate_res = list(
                    self.resources[key].get(discovery_cache))
                if not candidate_res and pollster.obj.default_discovery:
                    candidate_res = self.manager.discover(
                        [pollster.obj.default_discovery], discovery_cache)

                # Remove duplicated resources and black resources. Using
                # set() requires well defined __hash__ for each resource.
                # Since __eq__ is defined, 'not in' is safe here.
                polling_resources = []
                black_res = self.resources[key].blacklist
                history = poll_history.get(pollster.name, [])
                for x in candidate_res:
                    if x not in history:
                        history.append(x)
                        if x not in black_res:
                            polling_resources.append(x)
                poll_history[pollster.name] = history

                # If no resources, skip for this pollster
                if not polling_resources:
                    p_context = 'new ' if history else ''
                    LOG.info(_LI("Skip pollster %(name)s, no %(p_context)s"
                                 "resources found this cycle"),
                             {'name': pollster.name, 'p_context': p_context})
                    continue

                LOG.info(_LI("Polling pollster %(poll)s in the context of "
                             "%(src)s"),
                         dict(poll=pollster.name, src=source_name))
                try:
                    polling_timestamp = timeutils.utcnow().isoformat()
                    samples = pollster.obj.get_samples(
                        manager=self.manager,
                        cache=cache,
                        resources=polling_resources
                    )
                    sample_batch = []

                    for sample in samples:
                        # Note(yuywz): Unify the timestamp of polled samples
                        sample.set_timestamp(polling_timestamp)
                        sample_dict = (
                            publisher_utils.meter_message_from_counter(
                                sample, self._telemetry_secret
                            ))
                        if self._batch:
                            sample_batch.append(sample_dict)
                        else:
                            self._send_notification([sample_dict])

                    if sample_batch:
                        self._send_notification(sample_batch)

                except plugin_base.PollsterPermanentError as err:
                    LOG.error(_LE(
                        'Prevent pollster %(name)s from '
                        'polling %(res_list)s on source %(source)s anymore!')
                        % ({'name': pollster.name, 'source': source_name,
                            'res_list': err.fail_res_list}))
                    self.resources[key].blacklist.extend(err.fail_res_list)
                except Exception as err:
                    LOG.error(_LE(
                        'Continue after error from %(name)s: %(error)s')
                        % ({'name': pollster.name, 'error': err}),
                        exc_info=True)

    # The loop calls the discover method of the discovery extensions to get
    # resources, then calls get_samples on the polling extensions to turn those
    # resources into sample objects. The samples are sent to the message queue,
    # picked up by ceilometer's notification agent, transformed further and
    # finally sent to gnocchi.
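The resource filtering and the batching inside poll_and_notify can be tried out in isolation. The sketch below uses plain strings as resources and a list's append method as the sender. select_resources and notify are hypothetical helper names, and the real method also signs each sample and handles pollster errors, which is left out here.

```python
def select_resources(candidates, history, blacklist):
    """Return the candidates not yet seen this cycle and not blacklisted.

    history is updated in place so later sources skip the same resources.
    'not in' is used instead of set() so resources only need __eq__.
    """
    selected = []
    for res in candidates:
        if res not in history:
            history.append(res)
            if res not in blacklist:
                selected.append(res)
    return selected


def notify(samples, send, batch=True):
    """Send all samples in one call when batching, else one call per sample."""
    if batch:
        samples = list(samples)
        if samples:
            send(samples)
    else:
        for sample in samples:
            send([sample])


history = []
first = select_resources(
    ["vm-1", "vm-2", "vm-1", "vm-3"], history, blacklist=["vm-3"])
print(first)

sent = []
notify(first, sent.append, batch=True)
print(sent)
```

Note that a blacklisted resource still lands in the history, so it is neither polled nor reconsidered by another source in the same cycle.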

That is the basic flow of the polling agent. Everything after that is handled by the notification agent.
