Splunk Enterprise architecture and processes

This topic discusses the internal architecture and processes of Splunk Enterprise at a high level. If you're looking for information about third-party components used in Splunk Enterprise, see the credits section in the Release notes.

Splunk Enterprise Processes

A Splunk Enterprise server installs a process on your host, splunkd.

splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data. It also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors.

  • Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML.
  • Processors are individual, reusable C or C++ functions that act on the stream of IT data that passes through a pipeline. Pipelines can pass data to one another through queues.

Architecture diagram

注意:负载均衡,副本!

Splunk Architecture

A Bit About Architecture

Splunk is a high performance, scalable software server written in C/C++ and Python. It indexes and searches logs and other IT data in real time. Splunk works with data generated by any application, server or device. The Splunk Developer API is accessible via REST, SOAP or the command line. After downloading, installing and starting Splunk, you'll find two Splunk Server processes running on your host, splunkd and splunkweb.

    • splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data and also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors. Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML. Processors are individual, reusable C/C++ or Python functions that act on the stream of IT data passing through a pipeline. Pipelines can pass data to one another via queues. splunkd supports a command line interface for searching and viewing results.
  • splunkweb is a Python-based application server providing the Splunk Web user interface. It allows users to search and navigate IT data stored by Splunk servers and to manage your Splunk deployment through the browser interface. splunkweb communicates with your web browser via REST and communicates with splunkd via SOAP.

    • Splunk's Data Store manages the original raw data in compressed format as well as the indexes into the data. Data can be deleted or archived based on retention period or maximum data store size.
    • Splunk Servers can communicate with one another via Splunk-2-Splunk, a TCP-based protocol, to forward data from one server to another and to distribute searches across multiple servers.
    • Bundles are files that contain configuration settings including, user accounts, Splunks, Live Splunks, Data Inputs and Processing Properties to easily create specific Splunk environments.
  • Modules are files that add new functionality to Splunk by adding to or modifying existing processors and pipelines.

About forwarding and receiving

You can forward data from one Splunk instance to another Splunk server or even to a non-Splunk system. The Splunk instance that performs theforwarding is typically a smaller footprint version of Splunk, called a forwarder.

A Splunk instance that receives data from one or more forwarders is called a receiver. The receiver is usually a Splunk indexer, but can also be another forwarder, as described here.

This diagram shows three forwarders sending data to a single Splunk receiver (an indexer), which then indexes the data and makes it available for searching:

Splunk Enterprise architecture——转发器本质上是日志收集client附加负载均衡,indexer是分布式索引,外加一个集中式管理协调的中心节点的更多相关文章

  1. Jexus是一款Linux平台上的高性能WEB服务器和负载均衡网关

    什么是Jexus Jexus是一款Linux平台上的高性能WEB服务器和负载均衡网关,以支持ASP.NET.ASP.NET CORE.PHP为特色,同时具备反向代理.入侵检测等重要功能.可以这样说,J ...

  2. 动态DNS——本质上是IP变化,将任意变换的IP地址绑定给一个固定的二级域名。不管这个线路的IP地址怎样变化,因特网用户还是可以使用这个固定的域名 这样看的话,p2p可以用哇

    动态域名是因应网络远程访问的需要而产生的一项应用技术.因为没有固定IP,只能运用二级域名来应对经常变化的IP,动态域名的由来因此而产生. 它当前主要应用在:路由器.网络摄像机.带网络监控的硬盘录像机. ...

  3. Linux下Rsyslog日志远程集中式管理

    Rsyslog简介 Rsyslog的全称是 rocket-fast system for log,它提供了高性能,高安全功能和模块化设计.rsyslog能够接受从各种各样的来源,将其输入,输出的结果到 ...

  4. Dubbo入门到精通学习笔记(二十):MyCat在MySQL主从复制的基础上实现读写分离、MyCat 集群部署(HAProxy + MyCat)、MyCat 高可用负载均衡集群Keepalived

    文章目录 MyCat在MySQL主从复制的基础上实现读写分离 一.环境 二.依赖课程 三.MyCat 介绍 ( MyCat 官网:http://mycat.org.cn/ ) 四.MyCat 的安装 ...

  5. Centos7搭建集中式日志系统

    在CentOS7中,Rsyslong是一个集中式的日志收集系统,可以运行在TCP或者UDP的514端口上.   目录 开始之前 配置接收日志的主机 配置发送日志的主机 日志回滚 附件:创建日志接收模板 ...

  6. flume集群日志收集

    一.Flume简介 Flume是一个分布式的.高可用的海量日志收集.聚合和传输日志收集系统,支持在日志系统中定制各类数据发送方(如:Kafka,HDFS等),便于收集数据.其核心为agent,agen ...

  7. ELK日志收集平台部署

    需求背景 由于公司的后台服务有三台,每当后台服务运行异常,需要看日志排查错误的时候,都必须开启3个ssh窗口进行查看,研发们觉得很不方便,于是便有了统一日志收集与查看的需求. 这里,我用ELK集群,通 ...

  8. 理解OpenShift(6):集中式日志处理

    理解OpenShift(1):网络之 Router 和 Route 理解OpenShift(2):网络之 DNS(域名服务) 理解OpenShift(3):网络之 SDN 理解OpenShift(4) ...

  9. SpringBoot+kafka+ELK分布式日志收集

    一.背景 随着业务复杂度的提升以及微服务的兴起,传统单一项目会被按照业务规则进行垂直拆分,另外为了防止单点故障我们也会将重要的服务模块进行集群部署,通过负载均衡进行服务的调用.那么随着节点的增多,各个 ...

随机推荐

  1. P4180 【模板】严格次小生成树[BJWC2010]

    P4180 [模板]严格次小生成树[BJWC2010] 倍增(LCA)+最小生成树 施工队挖断学校光缆导致断网1天(大雾) 考虑直接枚举不在最小生成树上的边.但是边权可能与最小生成树上的边相等,这样删 ...

  2. 01: 企业微信API开发前准备

    目录:企业微信API其他篇 01: 企业微信API开发前准备 02:消息推送 03: 通讯录管理 04:应用管理 目录: 1.1 术语介绍 1.2 开发步骤 1.1 术语介绍返回顶部 参考文档:htt ...

  3. 01: requests模块

    目录: 1.1 requests模块简介 1.2 使用requests模块发送get请求 1.3 使用requests模块发送post请求 1.4 requests.request()参数介绍 1.1 ...

  4. C++ compile Microsoft Visual C++ Static and Dynamic Libraries

    出处:http://www.codeproject.com/Articles/85391/Microsoft-Visual-C-Static-and-Dynamic-Libraries 出处:http ...

  5. VC++ 删除一个文件目录下的所有文件以及目录

    BOOL DoRemoveDirectory(CString chrDirName); BOOL ReleaseDirectory(CString chrDirName) { BOOL bRemove ...

  6. P4013 数字梯形问题 网络流二十四题

    P4013 数字梯形问题 题目描述 给定一个由 nn 行数字组成的数字梯形如下图所示. 梯形的第一行有 m 个数字.从梯形的顶部的 m 个数字开始,在每个数字处可以沿左下或右下方向移动,形成一条从梯形 ...

  7. How do I update a GitHub forked repository?

    I recently forked a project and applied several fixes. I then created a pull request which was then ...

  8. 用python + hadoop streaming 编写分布式程序(一) -- 原理介绍,样例程序与本地调试

    相关随笔: Hadoop-1.0.4集群搭建笔记 用python + hadoop streaming 编写分布式程序(二) -- 在集群上运行与监控 用python + hadoop streami ...

  9. Ubuntu16.04 无法连接WiFi

    在安装完 ns-3.25 之后,着手开始准备 Eclipse 的安装,打开了 Firefox游览器 准备上网的时候,发现网络没有正常连接. 刚刚开始怀疑的是,并没有连接上网络. 于是打开了终端,pin ...

  10. Linux——bash应用技巧简单学习笔记

    本人是看的lamp兄弟连的视频,学习的知识做一下简单,如有错误尽情拍砖. 命令补齐 命令补齐允许用户输入文件名起始的若干个字 母后,按<Tab>键补齐文件名. 命令历史 命令历史允许用户浏 ...