Setting up a ClickHouse cluster with Docker

// A ZooKeeper cluster must be set up first.
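
Before starting the ClickHouse containers, it can help to confirm that each ZooKeeper node answers. The following is a minimal sketch, not part of the original setup. It assumes `nc` (netcat) is installed and that the ZooKeeper four-letter command `ruok` is enabled (depending on the ZooKeeper version, it may have to be allowed via `4lw.commands.whitelist`). The function name `zk_is_ok` is hypothetical.

```shell
# Hypothetical helper: succeeds only if the ZooKeeper node at host:port
# replies "imok" to the "ruok" four-letter command.
zk_is_ok() {
    host="$1"
    port="$2"
    # -w 2: give up after 2 seconds so an unreachable node does not hang the check
    reply="$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)"
    [ "$reply" = "imok" ]
}
```

For example, `zk_is_ok 172.16.12.130 2181 && echo "ZooKeeper is up"` checks the first node listed in the `<zookeeper>` section of config.xml.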

Machine 1:

sudo docker run -d \
--name clickhouse --ulimit nofile=262144:262144 \
-p 8123:8123 -p 9000:9000 -p 9009:9009 \
-v /etc/clickhouse-server/config.xml:/etc/clickhouse-server/config.xml \
-v /var/log/clickhouse-server/:/var/log/clickhouse-server/ \
--add-host pandora-cpu-1.novalocal:172.16.12.130 \
--add-host pandora-cpu-2.novalocal:172.16.12.154 \
--add-host pandora-cpu-3.novalocal:172.16.12.163 \
--hostname pandora-cpu-1.novalocal \
yandex/clickhouse-server

Machine 2:

sudo docker run -d \
--name clickhouse --ulimit nofile=262144:262144 \
-p 8123:8123 -p 9000:9000 -p 9009:9009 \
-v /etc/clickhouse-server/config.xml:/etc/clickhouse-server/config.xml \
-v /var/log/clickhouse-server/:/var/log/clickhouse-server/ \
--add-host pandora-cpu-1.novalocal:172.16.12.130 \
--add-host pandora-cpu-2.novalocal:172.16.12.154 \
--add-host pandora-cpu-3.novalocal:172.16.12.163 \
--hostname pandora-cpu-2.novalocal \
yandex/clickhouse-server

Machine 3:

sudo docker run -d \
--name clickhouse --ulimit nofile=262144:262144 \
-p 8123:8123 -p 9000:9000 -p 9009:9009 \
-v /etc/clickhouse-server/config.xml:/etc/clickhouse-server/config.xml \
-v /var/log/clickhouse-server/:/var/log/clickhouse-server/ \
--add-host pandora-cpu-1.novalocal:172.16.12.130 \
--add-host pandora-cpu-2.novalocal:172.16.12.154 \
--add-host pandora-cpu-3.novalocal:172.16.12.163 \
--hostname pandora-cpu-3.novalocal \
yandex/clickhouse-server
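
The three commands above differ only in the `--hostname` value. As a rough sketch of how that could be kept in one place, the hypothetical helper `clickhouse_run_cmd` below prints (but does not run) the command for a given machine number. All flags, hostnames and IP addresses are copied from the commands above; only the function name is an assumption.

```shell
# Hypothetical helper: prints (does not run) the docker command for node 1, 2 or 3.
# All flags, hostnames and IP addresses are copied from the commands above.
clickhouse_run_cmd() {
    node="$1"
    case "$node" in
        1|2|3) ;;
        *) echo "node must be 1, 2 or 3" >&2; return 1 ;;
    esac
    # echo joins its arguments with single spaces, producing a one-line command
    echo "sudo docker run -d" \
        "--name clickhouse --ulimit nofile=262144:262144" \
        "-p 8123:8123 -p 9000:9000 -p 9009:9009" \
        "-v /etc/clickhouse-server/config.xml:/etc/clickhouse-server/config.xml" \
        "-v /var/log/clickhouse-server/:/var/log/clickhouse-server/" \
        "--add-host pandora-cpu-1.novalocal:172.16.12.130" \
        "--add-host pandora-cpu-2.novalocal:172.16.12.154" \
        "--add-host pandora-cpu-3.novalocal:172.16.12.163" \
        "--hostname pandora-cpu-${node}.novalocal" \
        "yandex/clickhouse-server"
}

# Print the command for machine 2 so it can be reviewed before running it.
clickhouse_run_cmd 2
```

Printing the command instead of executing it keeps the sketch safe to try; the output can be reviewed and then pasted into the shell on the matching machine.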

Command to stop and remove the container:

sudo docker stop clickhouse && sudo docker rm clickhouse

/etc/clickhouse-server/config.xml

<?xml version="1.0"?>

<!--

  NOTE: User and query level settings are set up in "users.xml" file.

  If you have accidentally specified user-level settings here, server won't start.

  You can either move the settings to the right place inside "users.xml" file

   or add <skip_check_for_incorrect_settings>1</skip_check_for_incorrect_settings> here.

-->

<yandex>

    <logger>

        <!-- Possible levels: https://github.com/pocoproject/poco/blob/poco-1.9.4-release/Foundation/include/Poco/Logger.h#L105 -->

        <level>trace</level>

        <log>/var/log/clickhouse-server/clickhouse-server.log</log>

        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>

        <size>1000M</size>

        <count>10</count>

        <!-- <console>1</console> --> <!-- Default behavior is autodetection (log to console if not daemon mode and is tty) -->

        <!-- Per level overrides (legacy):

        For example to suppress logging of the ConfigReloader you can use:

        NOTE: levels.logger is reserved, see below.

        -->

        <!--

        <levels>

          <ConfigReloader>none</ConfigReloader>

        </levels>

        -->

        <!-- Per level overrides:

        For example to suppress logging of the RBAC for default user you can use:

        (But please note that the logger name may change from version to version, even after a minor upgrade)

        -->

        <!--

        <levels>

          <logger>

            <name>ContextAccess (default)</name>

            <level>none</level>

          </logger>

          <logger>

            <name>DatabaseOrdinary (test)</name>

            <level>none</level>

          </logger>

        </levels>

        -->

    </logger>

    <send_crash_reports>

        <!-- Changing <enabled> to true allows sending crash reports to -->

        <!-- the ClickHouse core developers team via Sentry https://sentry.io -->

        <!-- Doing so at least in pre-production environments is highly appreciated -->

        <enabled>false</enabled>

        <!-- Change <anonymize> to true if you don't feel comfortable attaching the server hostname to the crash report -->

        <anonymize>false</anonymize>

        <!-- Default endpoint should be changed to different Sentry DSN only if you have -->

        <!-- some in-house engineers or hired consultants who're going to debug ClickHouse issues for you -->

        <endpoint>https://6f33034cfe684dd7a3ab9875e57b1c8d@o388870.ingest.sentry.io/5226277</endpoint>

    </send_crash_reports>

    <!--display_name>production</display_name--> <!-- It is the name that will be shown in the client -->

    <http_port>8123</http_port>

    <tcp_port>9000</tcp_port>

    <mysql_port>9004</mysql_port>

    <!-- For HTTPS and SSL over native protocol. -->

    <!--

    <https_port>8443</https_port>

    <tcp_port_secure>9440</tcp_port_secure>

    -->

    <!-- Used with https_port and tcp_port_secure. Full ssl options list: https://github.com/ClickHouse-Extras/poco/blob/master/NetSSL_OpenSSL/include/Poco/Net/SSLManager.h#L71 -->

    <openSSL>

        <server> <!-- Used for https server AND secure tcp port -->

            <!-- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt -->

            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>

            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>

            <!-- openssl dhparam -out /etc/clickhouse-server/dhparam.pem 4096 -->

            <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>

            <verificationMode>none</verificationMode>

            <loadDefaultCAFile>true</loadDefaultCAFile>

            <cacheSessions>true</cacheSessions>

            <disableProtocols>sslv2,sslv3</disableProtocols>

            <preferServerCiphers>true</preferServerCiphers>

        </server>

        <client> <!-- Used for connecting to https dictionary source and secured Zookeeper communication -->

            <loadDefaultCAFile>true</loadDefaultCAFile>

            <cacheSessions>true</cacheSessions>

            <disableProtocols>sslv2,sslv3</disableProtocols>

            <preferServerCiphers>true</preferServerCiphers>

            <!-- Use for self-signed: <verificationMode>none</verificationMode> -->

            <invalidCertificateHandler>

                <!-- Use for self-signed: <name>AcceptCertificateHandler</name> -->

                <name>RejectCertificateHandler</name>

            </invalidCertificateHandler>

        </client>

    </openSSL>

    <!-- Default root page on http[s] server. For example load UI from https://tabix.io/ when opening http://localhost:8123 -->

    <!--

    <http_server_default_response><![CDATA[<html ng-app="SMI2"><head><base href="http://ui.tabix.io/"></head><body><div ui-view="" class="content-ui"></div><script src="http://loader.tabix.io/master.js"></script></body></html>]]></http_server_default_response>

    -->

    <!-- Port for communication between replicas. Used for data exchange. -->

    <interserver_http_port>9009</interserver_http_port>

    <!-- Hostname that is used by other replicas to request this server.

         If not specified, then it is determined analogously to the 'hostname -f' command.

         This setting could be used to switch replication to another network interface.

      -->

    <!--

    <interserver_http_host>example.yandex.ru</interserver_http_host>

    -->

    <!-- Listen on the specified host. Use :: (wildcard IPv6 address) if you want to accept connections over both IPv4 and IPv6 from everywhere. -->

    <listen_host>::</listen_host>

    <!-- Same for hosts with disabled ipv6: -->

    <!-- <listen_host>0.0.0.0</listen_host> -->

    <!-- Default values - try listen localhost on ipv4 and ipv6: -->

    <!--

    <listen_host>::1</listen_host>

    <listen_host>127.0.0.1</listen_host>

    -->

    <!-- Don't exit if ipv6 or ipv4 unavailable, but listen_host with this protocol specified -->

    <!-- <listen_try>0</listen_try> -->

    <!-- Allow listen on same address:port -->

    <!-- <listen_reuse_port>0</listen_reuse_port> -->

    <!-- <listen_backlog>64</listen_backlog> -->

    <max_connections>4096</max_connections>

    <keep_alive_timeout>3</keep_alive_timeout>

    <!-- Maximum number of concurrent queries. -->

    <max_concurrent_queries>100</max_concurrent_queries>

    <!-- Maximum memory usage (resident set size) for server process.

         Zero value or unset means default. Default is "max_server_memory_usage_to_ram_ratio" of available physical RAM.

         If the value is larger than "max_server_memory_usage_to_ram_ratio" of available physical RAM, it will be cut down.

         The constraint is checked on query execution time.

         If a query tries to allocate memory and the current memory usage plus allocation is greater

          than specified threshold, exception will be thrown.

         It is not practical to set this constraint to small values like just a few gigabytes,

          because memory allocator will keep this amount of memory in caches and the server will deny service of queries.

      -->

    <max_server_memory_usage>0</max_server_memory_usage>

    <!-- Maximum number of threads in the Global thread pool.

    This will default to a maximum of 10000 threads if not specified.

    This setting will be useful in scenarios where there are a large number

    of distributed queries that are running concurrently but are idling most

    of the time, in which case a higher number of threads might be required.

    -->

    <max_thread_pool_size>10000</max_thread_pool_size>

    <!-- On memory constrained environments you may have to set this to value larger than 1.

      -->

    <max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>

    <!-- Simple server-wide memory profiler. Collect a stack trace at every peak allocation step (in bytes).

         Data will be stored in system.trace_log table with query_id = empty string.

         Zero means disabled.

      -->

    <total_memory_profiler_step>4194304</total_memory_profiler_step>

    <!-- Collect random allocations and deallocations and write them into system.trace_log with 'MemorySample' trace_type.

         The probability is for every alloc/free regardless to the size of the allocation.

         Note that sampling happens only when the amount of untracked memory exceeds the untracked memory limit,

          which is 4 MiB by default but can be lowered if 'total_memory_profiler_step' is lowered.

         You may want to set 'total_memory_profiler_step' to 1 for extra fine grained sampling.

      -->

    <total_memory_tracker_sample_probability>0</total_memory_tracker_sample_probability>

    <!-- Set limit on number of open files (default: maximum). This setting makes sense on Mac OS X because getrlimit() fails to retrieve

         correct maximum value. -->

    <!-- <max_open_files>262144</max_open_files> -->

    <!-- Size of cache of uncompressed blocks of data, used in tables of MergeTree family.

         In bytes. Cache is single for server. Memory is allocated only on demand.

         Cache is used when 'use_uncompressed_cache' user setting turned on (off by default).

         Uncompressed cache is advantageous only for very short queries and in rare cases.

      -->

    <uncompressed_cache_size>8589934592</uncompressed_cache_size>

    <!-- Approximate size of mark cache, used in tables of MergeTree family.

         In bytes. Cache is single for server. Memory is allocated only on demand.

         You should not lower this value.

      -->

    <mark_cache_size>5368709120</mark_cache_size>

    <!-- Path to data directory, with trailing slash. -->

    <path>/var/lib/clickhouse/</path>

    <!-- Path to temporary data for processing hard queries. -->

    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>

    <!-- Policy from the <storage_configuration> for the temporary files.

         If not set <tmp_path> is used, otherwise <tmp_path> is ignored.

         Notes:

         - move_factor              is ignored

         - keep_free_space_bytes    is ignored

         - max_data_part_size_bytes is ignored

         - you must have exactly one volume in that policy

    -->

    <!-- <tmp_policy>tmp</tmp_policy> -->

    <!-- Directory with user provided files that are accessible by 'file' table function. -->

    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>

    <!-- LDAP server definitions. -->

    <ldap_servers>

        <!-- List LDAP servers with their connection parameters here to later 1) use them as authenticators for dedicated local users,

              who have 'ldap' authentication mechanism specified instead of 'password', or to 2) use them as remote user directories.

             Parameters:

                host - LDAP server hostname or IP, this parameter is mandatory and cannot be empty.

                port - LDAP server port, default is 636 if enable_tls is set to true, 389 otherwise.

                auth_dn_prefix, auth_dn_suffix - prefix and suffix used to construct the DN to bind to.

                        Effectively, the resulting DN will be constructed as auth_dn_prefix + escape(user_name) + auth_dn_suffix string.

                        Note, that this implies that auth_dn_suffix should usually have comma ',' as its first non-space character.

                enable_tls - flag to trigger use of secure connection to the LDAP server.

                        Specify 'no' for plain text (ldap://) protocol (not recommended).

                        Specify 'yes' for LDAP over SSL/TLS (ldaps://) protocol (recommended, the default).

                        Specify 'starttls' for legacy StartTLS protocol (plain text (ldap://) protocol, upgraded to TLS).

                tls_minimum_protocol_version - the minimum protocol version of SSL/TLS.

                        Accepted values are: 'ssl2', 'ssl3', 'tls1.0', 'tls1.1', 'tls1.2' (the default).

                tls_require_cert - SSL/TLS peer certificate verification behavior.

                        Accepted values are: 'never', 'allow', 'try', 'demand' (the default).

                tls_cert_file - path to certificate file.

                tls_key_file - path to certificate key file.

                tls_ca_cert_file - path to CA certificate file.

                tls_ca_cert_dir - path to the directory containing CA certificates.

                tls_cipher_suite - allowed cipher suite (in OpenSSL notation).

             Example:

                <my_ldap_server>

                    <host>localhost</host>

                    <port>636</port>

                    <auth_dn_prefix>uid=</auth_dn_prefix>

                    <auth_dn_suffix>,ou=users,dc=example,dc=com</auth_dn_suffix>

                    <enable_tls>yes</enable_tls>

                    <tls_minimum_protocol_version>tls1.2</tls_minimum_protocol_version>

                    <tls_require_cert>demand</tls_require_cert>

                    <tls_cert_file>/path/to/tls_cert_file</tls_cert_file>

                    <tls_key_file>/path/to/tls_key_file</tls_key_file>

                    <tls_ca_cert_file>/path/to/tls_ca_cert_file</tls_ca_cert_file>

                    <tls_ca_cert_dir>/path/to/tls_ca_cert_dir</tls_ca_cert_dir>

                    <tls_cipher_suite>ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:AES256-GCM-SHA384</tls_cipher_suite>

                </my_ldap_server>

        -->

    </ldap_servers>

    <!-- Sources to read users, roles, access rights, profiles of settings, quotas. -->

    <user_directories>

        <users_xml>

            <!-- Path to configuration file with predefined users. -->

            <path>users.xml</path>

        </users_xml>

        <local_directory>

            <!-- Path to folder where users created by SQL commands are stored. -->

            <path>/var/lib/clickhouse/access/</path>

        </local_directory>

        <!-- To add an LDAP server as a remote user directory of users that are not defined locally, define a single 'ldap' section

              with the following parameters:

                server - one of LDAP server names defined in 'ldap_servers' config section above.

                        This parameter is mandatory and cannot be empty.

                roles - section with a list of locally defined roles that will be assigned to each user retrieved from the LDAP server.

                        If no roles are specified, user will not be able to perform any actions after authentication.

                        If any of the listed roles is not defined locally at the time of authentication, the authentication attempt

                         will fail as if the provided password was incorrect.

             Example:

                <ldap>

                    <server>my_ldap_server</server>

                    <roles>

                        <my_local_role1 />

                        <my_local_role2 />

                    </roles>

                </ldap>

        -->

    </user_directories>

    <!-- Default profile of settings. -->

    <default_profile>default</default_profile>

    <!-- Comma-separated list of prefixes for user-defined settings. -->

    <custom_settings_prefixes></custom_settings_prefixes>

    <!-- System profile of settings. These settings are used by internal processes (Buffer storage, Distributed DDL worker and so on). -->

    <!-- <system_profile>default</system_profile> -->

    <!-- Default database. -->

    <default_database>default</default_database>

    <!-- Server time zone could be set here.

         Time zone is used when converting between String and DateTime types,

          when printing DateTime in text formats and parsing DateTime from text,

          it is used in date and time related functions, if specific time zone was not passed as an argument.

         Time zone is specified as identifier from IANA time zone database, like UTC or Africa/Abidjan.

         If not specified, system time zone at server startup is used.

         Please note, that server could display time zone alias instead of specified name.

         Example: W-SU is an alias for Europe/Moscow and Zulu is an alias for UTC.

    -->

    <!-- <timezone>Europe/Moscow</timezone> -->

    <!-- You can specify umask here (see "man umask"). Server will apply it on startup.

         Number is always parsed as octal. Default umask is 027 (other users cannot read logs, data files, etc; group can only read).

    -->

    <!-- <umask>022</umask> -->

    <!-- Perform mlockall after startup to lower first queries latency

          and to prevent clickhouse executable from being paged out under high IO load.

         Enabling this option is recommended but will lead to increased startup time for up to a few seconds.

    -->

    <mlock_executable>true</mlock_executable>

    <!-- Reallocate memory for machine code ("text") using huge pages. Highly experimental. -->

    <remap_executable>false</remap_executable>

    <!-- Configuration of clusters that could be used in Distributed tables.

         https://clickhouse.tech/docs/en/operations/table_engines/distributed/

      -->

    <remote_servers incl="clickhouse_remote_servers" >

        <!-- Test only shard config for testing distributed storage -->

        <cluster_1>

            <!-- Inter-server per-cluster secret for Distributed queries

                 default: no secret (no authentication will be performed)

                 If set, then Distributed queries will be validated on shards, so at least:

                 - such cluster should exist on the shard,

                 - such cluster should have the same secret.

                 And also (and which is more important), the initial_user will

                 be used as current user for the query.

                 Right now the protocol is pretty simple and it only takes into account:

                 - cluster name

                 - query

                 Also it would be nice if the following were implemented:

                 - source hostname (see interserver_http_host), but then it will depend on DNS;

                   it can use an IP address instead, but then you need to get it correct on the initiator node.

                 - target hostname / ip address (same notes as for source hostname)

                 - time-based security tokens

            -->

            <!-- <secret></secret> -->

            <shard>

                <!-- Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas). -->

                <!-- <internal_replication>false</internal_replication> -->

                <!-- Optional. Shard weight when writing data. Default: 1. -->

                <!-- <weight>1</weight> -->

                <replica>

                    <host>pandora-cpu-1.novalocal</host>

                    <port>9000</port>

                    <user>default</user>

                    <password></password>

                    <!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->

                    <!-- <priority>1</priority> -->

                </replica>

            </shard>

            <shard>

                <replica>

                    <host>pandora-cpu-2.novalocal</host>

                    <port>9000</port>

                    <user>default</user>

                    <password></password>

                    <!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->

                    <!-- <priority>1</priority> -->

                </replica>

            </shard>

            <shard>

                <replica>

                    <host>pandora-cpu-3.novalocal</host>

                    <port>9000</port>

                    <user>default</user>

                    <password></password>

                </replica>

            </shard>

        </cluster_1>

    </remote_servers>

    <!-- The list of hosts allowed to use in URL-related storage engines and table functions.

        If this section is not present in configuration, all hosts are allowed.

    -->

    <remote_url_allow_hosts>

        <!-- Host should be specified exactly as in URL. The name is checked before DNS resolution.

            Example: "yandex.ru", "yandex.ru." and "www.yandex.ru" are different hosts.

                    If port is explicitly specified in URL, the host:port is checked as a whole.

                    If host specified here without port, any port with this host allowed.

                    "yandex.ru" -> "yandex.ru:443", "yandex.ru:80" etc. is allowed, but "yandex.ru:80" -> only "yandex.ru:80" is allowed.

            If the host is specified as IP address, it is checked as specified in URL. Example: "[2a02:6b8:a::a]".

            If there are redirects and support for redirects is enabled, every redirect (the Location field) is checked.

        -->

        <!-- Regular expression can be specified. RE2 engine is used for regexps.

            Regexps are not aligned: don't forget to add ^ and $. Also don't forget to escape dot (.) metacharacter

            (forgetting to do so is a common source of error).

        -->

    </remote_url_allow_hosts>

    <!-- If an element has the 'incl' attribute, then the corresponding substitution from another file will be used as its value.

         By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.

         Values for substitutions are specified in /yandex/name_of_substitution elements in that file.

      -->

    <!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.

         Optional. If you don't use replicated tables, you could omit that.

         See https://clickhouse.yandex/docs/en/table_engines/replication/

      -->

<!--    <zookeeper incl="zookeeper-servers" optional="true" /> -->

    <zookeeper>

        <node index="1">

            <host>172.16.12.130</host>

            <port>2181</port>

        </node>

        <node index="2">

            <host>172.16.12.130</host>

            <port>2182</port>

        </node>

        <node index="3">

            <host>172.16.16.130</host>

            <port>2183</port>

        </node>

    </zookeeper>

    <!-- Substitutions for parameters of replicated tables.

          Optional. If you don't use replicated tables, you could omit that.

         See https://clickhouse.yandex/docs/en/table_engines/replication/#creating-replicated-tables

      -->

    <macros incl="macros" optional="true" />

    <!-- Reloading interval for embedded dictionaries, in seconds. Default: 3600. -->

    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>

    <!-- Maximum session timeout, in seconds. Default: 3600. -->

    <max_session_timeout>3600</max_session_timeout>

    <!-- Default session timeout, in seconds. Default: 60. -->

    <default_session_timeout>60</default_session_timeout>

    <!-- Sending data to Graphite for monitoring. Several sections can be defined. -->

    <!--

        interval - send every X second

        root_path - prefix for keys

        hostname_in_path - append hostname to root_path (default = true)

        metrics - send data from table system.metrics

        events - send data from table system.events

        asynchronous_metrics - send data from table system.asynchronous_metrics

    -->

    <!--

    <graphite>

        <host>localhost</host>

        <port>42000</port>

        <timeout>0.1</timeout>

        <interval>60</interval>

        <root_path>one_min</root_path>

        <hostname_in_path>true</hostname_in_path>

        <metrics>true</metrics>

        <events>true</events>

        <events_cumulative>false</events_cumulative>

        <asynchronous_metrics>true</asynchronous_metrics>

    </graphite>

    <graphite>

        <host>localhost</host>

        <port>42000</port>

        <timeout>0.1</timeout>

        <interval>1</interval>

        <root_path>one_sec</root_path>

        <metrics>true</metrics>

        <events>true</events>

        <events_cumulative>false</events_cumulative>

        <asynchronous_metrics>false</asynchronous_metrics>

    </graphite>

    -->

    <!-- Serve endpoint for Prometheus monitoring. -->

    <!--

        endpoint - metrics path (relative to root, starting with "/")

        port - port to set up the server on. If not defined or 0, then http_port is used

        metrics - send data from table system.metrics

        events - send data from table system.events

        asynchronous_metrics - send data from table system.asynchronous_metrics

        status_info - send data from different component from CH, ex: Dictionaries status

    -->

    <!--

    <prometheus>

        <endpoint>/metrics</endpoint>

        <port>9363</port>

        <metrics>true</metrics>

        <events>true</events>

        <asynchronous_metrics>true</asynchronous_metrics>

        <status_info>true</status_info>

    </prometheus>

    -->

    <!-- Query log. Used only for queries with setting log_queries = 1. -->

    <query_log>

        <!-- Which table to insert data into. If the table does not exist, it will be created.

             When query log structure is changed after system update,

              then old table will be renamed and new table will be created automatically.

        -->

        <database>system</database>

        <table>query_log</table>

        <!--

            PARTITION BY expr https://clickhouse.yandex/docs/en/table_engines/custom_partitioning_key/

            Example:

                event_date

                toMonday(event_date)

                toYYYYMM(event_date)

                toStartOfHour(event_time)

        -->

        <partition_by>toYYYYMM(event_date)</partition_by>

        <!-- Instead of partition_by, you can provide full engine expression (starting with ENGINE = ) with parameters,

             Example: <engine>ENGINE = MergeTree PARTITION BY toYYYYMM(event_date) ORDER BY (event_date, event_time) SETTINGS index_granularity = 1024</engine>

          -->

        <!-- Interval of flushing data. -->

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

    </query_log>

    <!-- Trace log. Stores stack traces collected by query profilers.

         See query_profiler_real_time_period_ns and query_profiler_cpu_time_period_ns settings. -->

    <trace_log>

        <database>system</database>

        <table>trace_log</table>

        <partition_by>toYYYYMM(event_date)</partition_by>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

    </trace_log>

    <!-- Query thread log. Has information about all threads that participated in query execution.

         Used only for queries with setting log_query_threads = 1. -->

    <query_thread_log>

        <database>system</database>

        <table>query_thread_log</table>

        <partition_by>toYYYYMM(event_date)</partition_by>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

    </query_thread_log>

    <!-- Uncomment to use the part log.

         Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).

    <part_log>

        <database>system</database>

        <table>part_log</table>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

    </part_log>

    -->

    <!-- Uncomment to write text log into table.

         Text log contains all information from usual server log but stores it in structured and efficient way.

         The level of the messages that go to the table can be limited (<level>); if not specified, all messages will go to the table.

    <text_log>

        <database>system</database>

        <table>text_log</table>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

        <level></level>

    </text_log>

    -->

    <!-- Metric log contains rows with current values of ProfileEvents, CurrentMetrics collected with "collect_interval_milliseconds" interval. -->

    <metric_log>

        <database>system</database>

        <table>metric_log</table>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

        <collect_interval_milliseconds>1000</collect_interval_milliseconds>

    </metric_log>

    <!--

        Asynchronous metric log contains values of metrics from

        system.asynchronous_metrics.

    -->

    <asynchronous_metric_log>

        <database>system</database>

        <table>asynchronous_metric_log</table>

        <!--

            Asynchronous metrics are updated once a minute, so there is

            no need to flush more often.

        -->

        <flush_interval_milliseconds>60000</flush_interval_milliseconds>

    </asynchronous_metric_log>

    <!--

        OpenTelemetry log contains OpenTelemetry trace spans.

    -->

    <opentelemetry_span_log>

        <!--

            The default table creation code is insufficient, this <engine> spec

            is a workaround. There is no 'event_time' for this log, but two times,

            start and finish. It is sorted by finish time, to avoid inserting

            data too far away in the past (probably we can sometimes insert a span

            that is seconds earlier than the last span in the table, due to a race

            between several spans inserted in parallel). This gives the spans a

            global order that we can use to e.g. retry insertion into some external

            system.

        -->

        <engine>

            engine MergeTree

            partition by toYYYYMM(finish_date)

            order by (finish_date, finish_time_us, trace_id)

        </engine>

        <database>system</database>

        <table>opentelemetry_span_log</table>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>

    </opentelemetry_span_log>

    <!-- Crash log. Stores stack traces for fatal errors.

         This table is normally empty. -->

    <crash_log>

        <database>system</database>

        <table>crash_log</table>

        <partition_by />

        <flush_interval_milliseconds>1000</flush_interval_milliseconds>

    </crash_log>

    <!-- Parameters for embedded dictionaries, used in Yandex.Metrica.

         See https://clickhouse.yandex/docs/en/dicts/internal_dicts/

    -->

    <!-- Path to file with region hierarchy. -->

    <!-- <path_to_regions_hierarchy_file>/opt/geo/regions_hierarchy.txt</path_to_regions_hierarchy_file> -->

    <!-- Path to directory with files containing names of regions -->

    <!-- <path_to_regions_names_files>/opt/geo/</path_to_regions_names_files> -->

    <!-- Configuration of external dictionaries. See:

         https://clickhouse.yandex/docs/en/dicts/external_dicts/

    -->

    <dictionaries_config>*_dictionary.xml</dictionaries_config>

    <!-- Uncomment if you want data to be compressed 30-100% better.

         Don't do that if you just started using ClickHouse.

      -->

    <compression incl="clickhouse_compression">

    <!--

        <!- - Set of variants. Checked in order. Last matching case wins. If nothing matches, lz4 will be used. - ->

        <case>

            <!- - Conditions. All must be satisfied. Some conditions may be omitted. - ->

            <min_part_size>10000000000</min_part_size>        <!- - Min part size in bytes. - ->

            <min_part_size_ratio>0.01</min_part_size_ratio>   <!- - Min size of part relative to whole table size. - ->

            <!- - What compression method to use. - ->

            <method>zstd</method>

        </case>

    -->

    </compression>

    <!-- Allows distributed DDL queries (CREATE, DROP, ALTER, RENAME) to be executed on the cluster.

         Works only if ZooKeeper is enabled. Comment it out if this functionality isn't required. -->

    <distributed_ddl>

        <!-- Path in ZooKeeper to queue with DDL queries -->

        <path>/clickhouse/task_queue/ddl</path>

        <!-- Settings from this profile will be used to execute DDL queries -->

        <!-- <profile>default</profile> -->

        <!-- Controls how many ON CLUSTER queries can run simultaneously. -->

        <!-- <pool_size>1</pool_size> -->

    </distributed_ddl>

    <!-- Settings to fine tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h -->

    <!--

    <merge_tree>

        <max_suspicious_broken_parts>5</max_suspicious_broken_parts>

    </merge_tree>

    -->

    <!-- Protection from accidental DROP.

         If the size of a MergeTree table is greater than max_table_size_to_drop (in bytes), the table cannot be dropped with any DROP query.

         If you want to delete one table without changing the clickhouse-server config, you can create the special file <clickhouse-path>/flags/force_drop_table and run DROP once.

         By default max_table_size_to_drop is 50GB; max_table_size_to_drop=0 allows any table to be dropped.

         The same for max_partition_size_to_drop.

         Uncomment to disable protection.

    -->

    <!-- <max_table_size_to_drop>0</max_table_size_to_drop> -->

    <!-- <max_partition_size_to_drop>0</max_partition_size_to_drop> -->

    <!-- Example of parameters for GraphiteMergeTree table engine -->

    <graphite_rollup_example>

        <pattern>

            <regexp>click_cost</regexp>

            <function>any</function>

            <retention>

                <age>0</age>

                <precision>3600</precision>

            </retention>

            <retention>

                <age>86400</age>

                <precision>60</precision>

            </retention>

        </pattern>
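
        <!-- Illustrative sketch, not part of the original config: a second
             pattern could be added here, before <default>. The regexp "^cpu\."
             is a hypothetical metric prefix and "avg" is an assumed choice of
             aggregate function; the retention values are placeholders. Roughly,
             each retention pair says: once data is at least <age> seconds old,
             round its timestamps to <precision> seconds when rolling up.

        <pattern>
            <regexp>^cpu\.</regexp>
            <function>avg</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>3600</precision>
            </retention>
        </pattern>
        -->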

        <default>

            <function>max</function>

            <retention>

                <age>0</age>

                <precision>60</precision>

            </retention>

            <retention>

                <age>3600</age>

                <precision>300</precision>

            </retention>

            <retention>

                <age>86400</age>

                <precision>3600</precision>

            </retention>

        </default>

    </graphite_rollup_example>

    <!-- Directory in <clickhouse-path> containing schema files for various input formats.

         The directory will be created if it doesn't exist.

      -->

    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

    <!-- Default query masking rules, matching lines would be replaced with something else in the logs

        (both text logs and system.query_log).

        name - name for the rule (optional)

        regexp - RE2 compatible regular expression (mandatory)

        replace - substitution string for sensitive data (optional, by default - six asterisks)

    -->

    <query_masking_rules>

        <rule>

            <name>hide encrypt/decrypt arguments</name>

            <regexp>((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:'(?:\\'|.)+'|.*?)\s*\)</regexp>

            <!-- or more secure, but also more invasive:

                (aes_\w+)\s*\(.*\)

            -->

            <replace>\1(???)</replace>

        </rule>
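
        <!-- Illustrative sketch, not part of the original config: an additional
             rule would go here, inside <query_masking_rules>. The rule name,
             pattern and replacement below are assumptions chosen for the example
             (masking digit groups shaped like a US SSN); adapt them before
             uncommenting.

        <rule>
            <name>hide SSN</name>
            <regexp>(^|\D)\d{3}-\d{2}-\d{4}($|\D)</regexp>
            <replace>000-00-0000</replace>
        </rule>
        -->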

    </query_masking_rules>

    <!-- Uncomment to use custom http handlers.

        Rules are checked from top to bottom; the first match runs the handler.

            url - matches the request URL; use the 'regex:' prefix for a regex match (optional)

            methods - matches the request method; separate multiple methods with commas (optional)

            headers - matches request headers, one child element per header (the element name is the header name); use the 'regex:' prefix for a regex match (optional)

        handler is the request handler

            type - supported types: static, dynamic_query_handler, predefined_query_handler

            query - used with the predefined_query_handler type; the query to execute when the handler is called

            query_param_name - used with the dynamic_query_handler type; extracts and executes the value of the HTTP request parameter named by <query_param_name>

            status - used with the static type; the response status code

            content_type - used with the static type; the response content-type

            response_content - used with the static type; the content sent to the client. With the prefix 'file://' or 'config://', the content is read from the file or the configuration and sent to the client.

    <http_handlers>

        <rule>

            <url>/</url>

            <methods>POST,GET</methods>

            <headers><pragma>no-cache</pragma></headers>

            <handler>

                <type>dynamic_query_handler</type>

                <query_param_name>query</query_param_name>

            </handler>

        </rule>

        <rule>

            <url>/predefined_query</url>

            <methods>POST,GET</methods>

            <handler>

                <type>predefined_query_handler</type>

                <query>SELECT * FROM system.settings</query>

            </handler>

        </rule>

        <rule>

            <handler>

                <type>static</type>

                <status>200</status>

                <content_type>text/plain; charset=UTF-8</content_type>

                <response_content>config://http_server_default_response</response_content>

            </handler>

        </rule>

    </http_handlers>

    -->

    <!-- Uncomment to disable ClickHouse internal DNS caching. -->

    <!-- <disable_internal_dns_cache>1</disable_internal_dns_cache> -->

</yandex>
