Questions to consider
1. How do you set the memory available to a Storm worker?
2. When there is no detailed documentation for the configuration file, how do you find out which parameters can be set?
3. Which parameter controls a worker's memory?

In Storm, the real work is done by the worker processes, which are launched by the supervisor; during topology startup, the supervisor log shows each worker being started.
Each worker is a separate JVM process, and by default Storm launches it with a maximum heap of 768 MB.
In my case, however, a Bolt loads a large amount of data, and 768 MB is not enough: the worker runs out of memory and the program crashes.
After some digging, I found that the worker launch options can be set in Storm's configuration file, storm.yaml:

worker.childopts: "-Xmx2048m" 
This option is passed to the JVM when the worker starts, so each worker can then use up to 2048 MB of heap.
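
If you only need the larger heap for one particular topology rather than for every worker on the cluster, the Config class shown further below also defines topology.worker.childopts (TOPOLOGY_WORKER_CHILDOPTS), which is applied in addition to worker.childopts. A minimal sketch, assuming you build the topology configuration in Java (the class name and the 2048m value here are just illustrative):

import backtype.storm.Config;

public class MemoryHeavyTopologyConf {
    public static Config build() {
        Config conf = new Config();
        // topology.worker.childopts is passed to the worker JVM in addition to
        // worker.childopts from storm.yaml, so only this topology's workers get the larger heap
        conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx2048m");
        return conf;
    }
}

The resulting conf is the map you hand to StormSubmitter.submitTopology(...) (or LocalCluster.submitTopology(...) in local mode) when submitting the topology.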
As far as I can tell, Storm does not yet have detailed documentation for its configuration file (for example, which parameters can be set, and how).
For now, you can refer to Config.java in the Storm source code:

package backtype.storm;

import backtype.storm.ConfigValidation;
import backtype.storm.serialization.IKryoDecorator;
import backtype.storm.serialization.IKryoFactory;
import com.esotericsoftware.kryo.Serializer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Topology configs are specified as a plain old map. This class provides a
 * convenient way to create a topology config map by providing setter methods for
 * all the configs that can be set. It also makes it easier to do things like add
 * serializations.
 *
 * <p>This class also provides constants for all the configurations possible on
 * a Storm cluster and Storm topology. Each constant is paired with a schema
 * that defines the validity criterion of the corresponding field. Default
 * values for these configs can be found in defaults.yaml.</p>
 *
 * <p>Note that you may put other configurations in any of the configs. Storm
 * will ignore anything it doesn't recognize, but your topologies are free to make
 * use of them by reading them in the prepare method of Bolts or the open method of
 * Spouts.</p>
 */
public class Config extends HashMap<String, Object> {

    /**
     * The transporter for communication among Storm tasks
     */
    public static final String STORM_MESSAGING_TRANSPORT = "storm.messaging.transport";
    public static final Object STORM_MESSAGING_TRANSPORT_SCHEMA = String.class;

    /**
     * Netty based messaging: The buffer size for send/recv buffer
     */
    public static final String STORM_MESSAGING_NETTY_BUFFER_SIZE = "storm.messaging.netty.buffer_size";
    public static final Object STORM_MESSAGING_NETTY_BUFFER_SIZE_SCHEMA = Number.class;

    /**
     * Netty based messaging: The max # of retries that a peer will perform when a remote is not accessible
     */
    public static final String STORM_MESSAGING_NETTY_MAX_RETRIES = "storm.messaging.netty.max_retries";
    public static final Object STORM_MESSAGING_NETTY_MAX_RETRIES_SCHEMA = Number.class;

    /**
     * Netty based messaging: The min # of milliseconds that a peer will wait.
     */
    public static final String STORM_MESSAGING_NETTY_MIN_SLEEP_MS = "storm.messaging.netty.min_wait_ms";
    public static final Object STORM_MESSAGING_NETTY_MIN_SLEEP_MS_SCHEMA = Number.class;

    /**
     * Netty based messaging: The max # of milliseconds that a peer will wait.
     */
    public static final String STORM_MESSAGING_NETTY_MAX_SLEEP_MS = "storm.messaging.netty.max_wait_ms";
    public static final Object STORM_MESSAGING_NETTY_MAX_SLEEP_MS_SCHEMA = Number.class;

    /**
     * Netty based messaging: The # of worker threads for the server.
     */
    public static final String STORM_MESSAGING_NETTY_SERVER_WORKER_THREADS = "storm.messaging.netty.server_worker_threads";
    public static final Object STORM_MESSAGING_NETTY_SERVER_WORKER_THREADS_SCHEMA = Number.class;

    /**
     * Netty based messaging: The # of worker threads for the client.
     */
    public static final String STORM_MESSAGING_NETTY_CLIENT_WORKER_THREADS = "storm.messaging.netty.client_worker_threads";
    public static final Object STORM_MESSAGING_NETTY_CLIENT_WORKER_THREADS_SCHEMA = Number.class;

    /**
     * A list of hosts of ZooKeeper servers used to manage the cluster.
     */
    public static final String STORM_ZOOKEEPER_SERVERS = "storm.zookeeper.servers";
    public static final Object STORM_ZOOKEEPER_SERVERS_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * The port Storm will use to connect to each of the ZooKeeper servers.
     */
    public static final String STORM_ZOOKEEPER_PORT = "storm.zookeeper.port";
    public static final Object STORM_ZOOKEEPER_PORT_SCHEMA = Number.class;

    /**
     * A directory on the local filesystem used by Storm for any local
     * filesystem usage it needs. The directory must exist and the Storm daemons must
     * have permission to read/write from this location.
     */
    public static final String STORM_LOCAL_DIR = "storm.local.dir";
    public static final Object STORM_LOCAL_DIR_SCHEMA = String.class;

    /**
     * A global task scheduler used to assign topologies's tasks to supervisors' wokers.
     *
     * If this is not set, a default system scheduler will be used.
     */
    public static final String STORM_SCHEDULER = "storm.scheduler";
    public static final Object STORM_SCHEDULER_SCHEMA = String.class;

    /**
     * The mode this Storm cluster is running in. Either "distributed" or "local".
     */
    public static final String STORM_CLUSTER_MODE = "storm.cluster.mode";
    public static final Object STORM_CLUSTER_MODE_SCHEMA = String.class;

    /**
     * The hostname the supervisors/workers should report to nimbus. If unset, Storm will
     * get the hostname to report by calling <code>InetAddress.getLocalHost().getCanonicalHostName()</code>.
     *
     * You should set this config when you dont have a DNS which supervisors/workers
     * can utilize to find each other based on hostname got from calls to
     * <code>InetAddress.getLocalHost().getCanonicalHostName()</code>.
     */
    public static final String STORM_LOCAL_HOSTNAME = "storm.local.hostname";
    public static final Object STORM_LOCAL_HOSTNAME_SCHEMA = String.class;

    /**
     * The transport plug-in for Thrift client/server communication
     */
    public static final String STORM_THRIFT_TRANSPORT_PLUGIN = "storm.thrift.transport";
    public static final Object STORM_THRIFT_TRANSPORT_PLUGIN_SCHEMA = String.class;

    /**
     * The serializer class for ListDelegate (tuple payload).
     * The default serializer will be ListDelegateSerializer
     */
    public static final String TOPOLOGY_TUPLE_SERIALIZER = "topology.tuple.serializer";
    public static final Object TOPOLOGY_TUPLE_SERIALIZER_SCHEMA = String.class;

    /**
     * Whether or not to use ZeroMQ for messaging in local mode. If this is set
     * to false, then Storm will use a pure-Java messaging system. The purpose
     * of this flag is to make it easy to run Storm in local mode by eliminating
     * the need for native dependencies, which can be difficult to install.
     *
     * Defaults to false.
     */
    public static final String STORM_LOCAL_MODE_ZMQ = "storm.local.mode.zmq";
    public static final Object STORM_LOCAL_MODE_ZMQ_SCHEMA = Boolean.class;

    /**
     * The root location at which Storm stores data in ZooKeeper.
     */
    public static final String STORM_ZOOKEEPER_ROOT = "storm.zookeeper.root";
    public static final Object STORM_ZOOKEEPER_ROOT_SCHEMA = String.class;

    /**
     * The session timeout for clients to ZooKeeper.
     */
    public static final String STORM_ZOOKEEPER_SESSION_TIMEOUT = "storm.zookeeper.session.timeout";
    public static final Object STORM_ZOOKEEPER_SESSION_TIMEOUT_SCHEMA = Number.class;

    /**
     * The connection timeout for clients to ZooKeeper.
     */
    public static final String STORM_ZOOKEEPER_CONNECTION_TIMEOUT = "storm.zookeeper.connection.timeout";
    public static final Object STORM_ZOOKEEPER_CONNECTION_TIMEOUT_SCHEMA = Number.class;

    /**
     * The number of times to retry a Zookeeper operation.
     */
    public static final String STORM_ZOOKEEPER_RETRY_TIMES = "storm.zookeeper.retry.times";
    public static final Object STORM_ZOOKEEPER_RETRY_TIMES_SCHEMA = Number.class;

    /**
     * The interval between retries of a Zookeeper operation.
     */
    public static final String STORM_ZOOKEEPER_RETRY_INTERVAL = "storm.zookeeper.retry.interval";
    public static final Object STORM_ZOOKEEPER_RETRY_INTERVAL_SCHEMA = Number.class;

    /**
     * The ceiling of the interval between retries of a Zookeeper operation.
     */
    public static final String STORM_ZOOKEEPER_RETRY_INTERVAL_CEILING = "storm.zookeeper.retry.intervalceiling.millis";
    public static final Object STORM_ZOOKEEPER_RETRY_INTERVAL_CEILING_SCHEMA = Number.class;

    /**
     * The Zookeeper authentication scheme to use, e.g. "digest". Defaults to no authentication.
     */
    public static final String STORM_ZOOKEEPER_AUTH_SCHEME = "storm.zookeeper.auth.scheme";
    public static final Object STORM_ZOOKEEPER_AUTH_SCHEME_SCHEMA = String.class;

    /**
     * A string representing the payload for Zookeeper authentication. It gets serialized using UTF-8 encoding during authentication.
     */
    public static final String STORM_ZOOKEEPER_AUTH_PAYLOAD = "storm.zookeeper.auth.payload";
    public static final Object STORM_ZOOKEEPER_AUTH_PAYLOAD_SCHEMA = String.class;

    /**
     * The id assigned to a running topology. The id is the storm name with a unique nonce appended.
     */
    public static final String STORM_ID = "storm.id";
    public static final Object STORM_ID_SCHEMA = String.class;

    /**
     * The host that the master server is running on.
     */
    public static final String NIMBUS_HOST = "nimbus.host";
    public static final Object NIMBUS_HOST_SCHEMA = String.class;

    /**
     * Which port the Thrift interface of Nimbus should run on. Clients should
     * connect to this port to upload jars and submit topologies.
     */
    public static final String NIMBUS_THRIFT_PORT = "nimbus.thrift.port";
    public static final Object NIMBUS_THRIFT_PORT_SCHEMA = Number.class;

    /**
     * This parameter is used by the storm-deploy project to configure the
     * jvm options for the nimbus daemon.
     */
    public static final String NIMBUS_CHILDOPTS = "nimbus.childopts";
    public static final Object NIMBUS_CHILDOPTS_SCHEMA = String.class;

    /**
     * How long without heartbeating a task can go before nimbus will consider the
     * task dead and reassign it to another location.
     */
    public static final String NIMBUS_TASK_TIMEOUT_SECS = "nimbus.task.timeout.secs";
    public static final Object NIMBUS_TASK_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * How often nimbus should wake up to check heartbeats and do reassignments. Note
     * that if a machine ever goes down Nimbus will immediately wake up and take action.
     * This parameter is for checking for failures when there's no explicit event like that
     * occuring.
     */
    public static final String NIMBUS_MONITOR_FREQ_SECS = "nimbus.monitor.freq.secs";
    public static final Object NIMBUS_MONITOR_FREQ_SECS_SCHEMA = Number.class;

    /**
     * How often nimbus should wake the cleanup thread to clean the inbox.
     * @see NIMBUS_INBOX_JAR_EXPIRATION_SECS
     */
    public static final String NIMBUS_CLEANUP_INBOX_FREQ_SECS = "nimbus.cleanup.inbox.freq.secs";
    public static final Object NIMBUS_CLEANUP_INBOX_FREQ_SECS_SCHEMA = Number.class;

    /**
     * The length of time a jar file lives in the inbox before being deleted by the cleanup thread.
     *
     * Probably keep this value greater than or equal to NIMBUS_CLEANUP_INBOX_JAR_EXPIRATION_SECS.
     * Note that the time it takes to delete an inbox jar file is going to be somewhat more than
     * NIMBUS_CLEANUP_INBOX_JAR_EXPIRATION_SECS (depending on how often NIMBUS_CLEANUP_FREQ_SECS
     * is set to).
     * @see NIMBUS_CLEANUP_FREQ_SECS
     */
    public static final String NIMBUS_INBOX_JAR_EXPIRATION_SECS = "nimbus.inbox.jar.expiration.secs";
    public static final Object NIMBUS_INBOX_JAR_EXPIRATION_SECS_SCHEMA = Number.class;

    /**
     * How long before a supervisor can go without heartbeating before nimbus considers it dead
     * and stops assigning new work to it.
     */
    public static final String NIMBUS_SUPERVISOR_TIMEOUT_SECS = "nimbus.supervisor.timeout.secs";
    public static final Object NIMBUS_SUPERVISOR_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * A special timeout used when a task is initially launched. During launch, this is the timeout
     * used until the first heartbeat, overriding nimbus.task.timeout.secs.
     *
     * <p>A separate timeout exists for launch because there can be quite a bit of overhead
     * to launching new JVM's and configuring them.</p>
     */
    public static final String NIMBUS_TASK_LAUNCH_SECS = "nimbus.task.launch.secs";
    public static final Object NIMBUS_TASK_LAUNCH_SECS_SCHEMA = Number.class;

    /**
     * Whether or not nimbus should reassign tasks if it detects that a task goes down.
     * Defaults to true, and it's not recommended to change this value.
     */
    public static final String NIMBUS_REASSIGN = "nimbus.reassign";
    public static final Object NIMBUS_REASSIGN_SCHEMA = Boolean.class;

    /**
     * During upload/download with the master, how long an upload or download connection is idle
     * before nimbus considers it dead and drops the connection.
     */
    public static final String NIMBUS_FILE_COPY_EXPIRATION_SECS = "nimbus.file.copy.expiration.secs";
    public static final Object NIMBUS_FILE_COPY_EXPIRATION_SECS_SCHEMA = Number.class;

    /**
     * A custom class that implements ITopologyValidator that is run whenever a
     * topology is submitted. Can be used to provide business-specific logic for
     * whether topologies are allowed to run or not.
     */
    public static final String NIMBUS_TOPOLOGY_VALIDATOR = "nimbus.topology.validator";
    public static final Object NIMBUS_TOPOLOGY_VALIDATOR_SCHEMA = String.class;

    /**
     * Class name for authorization plugin for Nimbus
     */
    public static final String NIMBUS_AUTHORIZER = "nimbus.authorizer";
    public static final Object NIMBUS_AUTHORIZER_SCHEMA = String.class;

    /**
     * Storm UI binds to this port.
     */
    public static final String UI_PORT = "ui.port";
    public static final Object UI_PORT_SCHEMA = Number.class;

    /**
     * HTTP UI port for log viewer
     */
    public static final String LOGVIEWER_PORT = "logviewer.port";
    public static final Object LOGVIEWER_PORT_SCHEMA = Number.class;

    /**
     * Childopts for log viewer java process.
     */
    public static final String LOGVIEWER_CHILDOPTS = "logviewer.childopts";
    public static final Object LOGVIEWER_CHILDOPTS_SCHEMA = String.class;

    /**
     * Appender name used by log viewer to determine log directory.
     */
    public static final String LOGVIEWER_APPENDER_NAME = "logviewer.appender.name";
    public static final Object LOGVIEWER_APPENDER_NAME_SCHEMA = String.class;

    /**
     * Childopts for Storm UI Java process.
     */
    public static final String UI_CHILDOPTS = "ui.childopts";
    public static final Object UI_CHILDOPTS_SCHEMA = String.class;

    /**
     * List of DRPC servers so that the DRPCSpout knows who to talk to.
     */
    public static final String DRPC_SERVERS = "drpc.servers";
    public static final Object DRPC_SERVERS_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * This port is used by Storm DRPC for receiving DPRC requests from clients.
     */
    public static final String DRPC_PORT = "drpc.port";
    public static final Object DRPC_PORT_SCHEMA = Number.class;

    /**
     * DRPC thrift server worker threads
     */
    public static final String DRPC_WORKER_THREADS = "drpc.worker.threads";
    public static final Object DRPC_WORKER_THREADS_SCHEMA = Number.class;

    /**
     * DRPC thrift server queue size
     */
    public static final String DRPC_QUEUE_SIZE = "drpc.queue.size";
    public static final Object DRPC_QUEUE_SIZE_SCHEMA = Number.class;

    /**
     * This port on Storm DRPC is used by DRPC topologies to receive function invocations and send results back.
     */
    public static final String DRPC_INVOCATIONS_PORT = "drpc.invocations.port";
    public static final Object DRPC_INVOCATIONS_PORT_SCHEMA = Number.class;

    /**
     * The timeout on DRPC requests within the DRPC server. Defaults to 10 minutes. Note that requests can also
     * timeout based on the socket timeout on the DRPC client, and separately based on the topology message
     * timeout for the topology implementing the DRPC function.
     */
    public static final String DRPC_REQUEST_TIMEOUT_SECS = "drpc.request.timeout.secs";
    public static final Object DRPC_REQUEST_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * Childopts for Storm DRPC Java process.
     */
    public static final String DRPC_CHILDOPTS = "drpc.childopts";
    public static final Object DRPC_CHILDOPTS_SCHEMA = String.class;

    /**
     * the metadata configed on the supervisor
     */
    public static final String SUPERVISOR_SCHEDULER_META = "supervisor.scheduler.meta";
    public static final Object SUPERVISOR_SCHEDULER_META_SCHEMA = Map.class;

    /**
     * A list of ports that can run workers on this supervisor. Each worker uses one port, and
     * the supervisor will only run one worker per port. Use this configuration to tune
     * how many workers run on each machine.
     */
    public static final String SUPERVISOR_SLOTS_PORTS = "supervisor.slots.ports";
    public static final Object SUPERVISOR_SLOTS_PORTS_SCHEMA = ConfigValidation.NumbersValidator;

    /**
     * This parameter is used by the storm-deploy project to configure the
     * jvm options for the supervisor daemon.
     */
    public static final String SUPERVISOR_CHILDOPTS = "supervisor.childopts";
    public static final Object SUPERVISOR_CHILDOPTS_SCHEMA = String.class;

    /**
     * How long a worker can go without heartbeating before the supervisor tries to
     * restart the worker process.
     */
    public static final String SUPERVISOR_WORKER_TIMEOUT_SECS = "supervisor.worker.timeout.secs";
    public static final Object SUPERVISOR_WORKER_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * How long a worker can go without heartbeating during the initial launch before
     * the supervisor tries to restart the worker process. This value override
     * supervisor.worker.timeout.secs during launch because there is additional
     * overhead to starting and configuring the JVM on launch.
     */
    public static final String SUPERVISOR_WORKER_START_TIMEOUT_SECS = "supervisor.worker.start.timeout.secs";
    public static final Object SUPERVISOR_WORKER_START_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * Whether or not the supervisor should launch workers assigned to it. Defaults
     * to true -- and you should probably never change this value. This configuration
     * is used in the Storm unit tests.
     */
    public static final String SUPERVISOR_ENABLE = "supervisor.enable";
    public static final Object SUPERVISOR_ENABLE_SCHEMA = Boolean.class;

    /**
     * how often the supervisor sends a heartbeat to the master.
     */
    public static final String SUPERVISOR_HEARTBEAT_FREQUENCY_SECS = "supervisor.heartbeat.frequency.secs";
    public static final Object SUPERVISOR_HEARTBEAT_FREQUENCY_SECS_SCHEMA = Number.class;

    /**
     * How often the supervisor checks the worker heartbeats to see if any of them
     * need to be restarted.
     */
    public static final String SUPERVISOR_MONITOR_FREQUENCY_SECS = "supervisor.monitor.frequency.secs";
    public static final Object SUPERVISOR_MONITOR_FREQUENCY_SECS_SCHEMA = Number.class;

    /**
     * The jvm opts provided to workers launched by this supervisor. All "%ID%" substrings are replaced
     * with an identifier for this worker.
     */
    public static final String WORKER_CHILDOPTS = "worker.childopts";
    public static final Object WORKER_CHILDOPTS_SCHEMA = String.class;

    /**
     * How often this worker should heartbeat to the supervisor.
     */
    public static final String WORKER_HEARTBEAT_FREQUENCY_SECS = "worker.heartbeat.frequency.secs";
    public static final Object WORKER_HEARTBEAT_FREQUENCY_SECS_SCHEMA = Number.class;

    /**
     * How often a task should heartbeat its status to the master.
     */
    public static final String TASK_HEARTBEAT_FREQUENCY_SECS = "task.heartbeat.frequency.secs";
    public static final Object TASK_HEARTBEAT_FREQUENCY_SECS_SCHEMA = Number.class;

    /**
     * How often a task should sync its connections with other tasks (if a task is
     * reassigned, the other tasks sending messages to it need to refresh their connections).
     * In general though, when a reassignment happens other tasks will be notified
     * almost immediately. This configuration is here just in case that notification doesn't
     * come through.
     */
    public static final String TASK_REFRESH_POLL_SECS = "task.refresh.poll.secs";
    public static final Object TASK_REFRESH_POLL_SECS_SCHEMA = Number.class;

    /**
     * True if Storm should timeout messages or not. Defaults to true. This is meant to be used
     * in unit tests to prevent tuples from being accidentally timed out during the test.
     */
    public static final String TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS = "topology.enable.message.timeouts";
    public static final Object TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS_SCHEMA = Boolean.class;

    /**
     * When set to true, Storm will log every message that's emitted.
     */
    public static final String TOPOLOGY_DEBUG = "topology.debug";
    public static final Object TOPOLOGY_DEBUG_SCHEMA = Boolean.class;

    /**
     * Whether or not the master should optimize topologies by running multiple
     * tasks in a single thread where appropriate.
     */
    public static final String TOPOLOGY_OPTIMIZE = "topology.optimize";
    public static final Object TOPOLOGY_OPTIMIZE_SCHEMA = Boolean.class;

    /**
     * How many processes should be spawned around the cluster to execute this
     * topology. Each process will execute some number of tasks as threads within
     * them. This parameter should be used in conjunction with the parallelism hints
     * on each component in the topology to tune the performance of a topology.
     */
    public static final String TOPOLOGY_WORKERS = "topology.workers";
    public static final Object TOPOLOGY_WORKERS_SCHEMA = Number.class;

    /**
     * How many instances to create for a spout/bolt. A task runs on a thread with zero or more
     * other tasks for the same spout/bolt. The number of tasks for a spout/bolt is always
     * the same throughout the lifetime of a topology, but the number of executors (threads) for
     * a spout/bolt can change over time. This allows a topology to scale to more or less resources
     * without redeploying the topology or violating the constraints of Storm (such as a fields grouping
     * guaranteeing that the same value goes to the same task).
     */
    public static final String TOPOLOGY_TASKS = "topology.tasks";
    public static final Object TOPOLOGY_TASKS_SCHEMA = Number.class;

    /**
     * How many executors to spawn for ackers.
     *
     * <p>If this is set to 0, then Storm will immediately ack tuples as soon
     * as they come off the spout, effectively disabling reliability.</p>
     */
    public static final String TOPOLOGY_ACKER_EXECUTORS = "topology.acker.executors";
    public static final Object TOPOLOGY_ACKER_EXECUTORS_SCHEMA = Number.class;

    /**
     * The maximum amount of time given to the topology to fully process a message
     * emitted by a spout. If the message is not acked within this time frame, Storm
     * will fail the message on the spout. Some spouts implementations will then replay
     * the message at a later time.
     */
    public static final String TOPOLOGY_MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";
    public static final Object TOPOLOGY_MESSAGE_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * A list of serialization registrations for Kryo ( http://code.google.com/p/kryo/ ),
     * the underlying serialization framework for Storm. A serialization can either
     * be the name of a class (in which case Kryo will automatically create a serializer for the class
     * that saves all the object's fields), or an implementation of com.esotericsoftware.kryo.Serializer.
     *
     * See Kryo's documentation for more information about writing custom serializers.
     */
    public static final String TOPOLOGY_KRYO_REGISTER = "topology.kryo.register";
    public static final Object TOPOLOGY_KRYO_REGISTER_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * A list of classes that customize storm's kryo instance during start-up.
     * Each listed class name must implement IKryoDecorator. During start-up the
     * listed class is instantiated with 0 arguments, then its 'decorate' method
     * is called with storm's kryo instance as the only argument.
     */
    public static final String TOPOLOGY_KRYO_DECORATORS = "topology.kryo.decorators";
    public static final Object TOPOLOGY_KRYO_DECORATORS_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * Class that specifies how to create a Kryo instance for serialization. Storm will then apply
     * topology.kryo.register and topology.kryo.decorators on top of this. The default implementation
     * implements topology.fall.back.on.java.serialization and turns references off.
     */
    public static final String TOPOLOGY_KRYO_FACTORY = "topology.kryo.factory";
    public static final Object TOPOLOGY_KRYO_FACTORY_SCHEMA = String.class;

    /**
     * Whether or not Storm should skip the loading of kryo registrations for which it
     * does not know the class or have the serializer implementation. Otherwise, the task will
     * fail to load and will throw an error at runtime. The use case of this is if you want to
     * declare your serializations on the storm.yaml files on the cluster rather than every single
     * time you submit a topology. Different applications may use different serializations and so
     * a single application may not have the code for the other serializers used by other apps.
     * By setting this config to true, Storm will ignore that it doesn't have those other serializations
     * rather than throw an error.
     */
    public static final String TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS = "topology.skip.missing.kryo.registrations";
    public static final Object TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS_SCHEMA = Boolean.class;

    /*
     * A list of classes implementing IMetricsConsumer (See storm.yaml.example for exact config format).
     * Each listed class will be routed all the metrics data generated by the storm metrics API.
     * Each listed class maps 1:1 to a system bolt named __metrics_ClassName#N, and it's parallelism is configurable.
     */
    public static final String TOPOLOGY_METRICS_CONSUMER_REGISTER = "topology.metrics.consumer.register";
    public static final Object TOPOLOGY_METRICS_CONSUMER_REGISTER_SCHEMA = ConfigValidation.MapsValidator;

    /**
     * The maximum parallelism allowed for a component in this topology. This configuration is
     * typically used in testing to limit the number of threads spawned in local mode.
     */
    public static final String TOPOLOGY_MAX_TASK_PARALLELISM = "topology.max.task.parallelism";
    public static final Object TOPOLOGY_MAX_TASK_PARALLELISM_SCHEMA = Number.class;

    /**
     * The maximum number of tuples that can be pending on a spout task at any given time.
     * This config applies to individual tasks, not to spouts or topologies as a whole.
     *
     * A pending tuple is one that has been emitted from a spout but has not been acked or failed yet.
     * Note that this config parameter has no effect for unreliable spouts that don't tag
     * their tuples with a message id.
     */
    public static final String TOPOLOGY_MAX_SPOUT_PENDING = "topology.max.spout.pending";
    public static final Object TOPOLOGY_MAX_SPOUT_PENDING_SCHEMA = Number.class;

    /**
     * A class that implements a strategy for what to do when a spout needs to wait. Waiting is
     * triggered in one of two conditions:
     *
     * 1. nextTuple emits no tuples
     * 2. The spout has hit maxSpoutPending and can't emit any more tuples
     */
    public static final String TOPOLOGY_SPOUT_WAIT_STRATEGY = "topology.spout.wait.strategy";
    public static final Object TOPOLOGY_SPOUT_WAIT_STRATEGY_SCHEMA = String.class;

    /**
     * The amount of milliseconds the SleepEmptyEmitStrategy should sleep for.
     */
    public static final String TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS = "topology.sleep.spout.wait.strategy.time.ms";
    public static final Object TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS_SCHEMA = Number.class;

    /**
     * The maximum amount of time a component gives a source of state to synchronize before it requests
     * synchronization again.
     */
    public static final String TOPOLOGY_STATE_SYNCHRONIZATION_TIMEOUT_SECS = "topology.state.synchronization.timeout.secs";
    public static final Object TOPOLOGY_STATE_SYNCHRONIZATION_TIMEOUT_SECS_SCHEMA = Number.class;

    /**
     * The percentage of tuples to sample to produce stats for a task.
     */
    public static final String TOPOLOGY_STATS_SAMPLE_RATE = "topology.stats.sample.rate";
    public static final Object TOPOLOGY_STATS_SAMPLE_RATE_SCHEMA = Number.class;

    /**
     * The time period that builtin metrics data in bucketed into.
     */
    public static final String TOPOLOGY_BUILTIN_METRICS_BUCKET_SIZE_SECS = "topology.builtin.metrics.bucket.size.secs";
    public static final Object TOPOLOGY_BUILTIN_METRICS_BUCKET_SIZE_SECS_SCHEMA = Number.class;

    /**
     * Whether or not to use Java serialization in a topology.
     */
    public static final String TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION = "topology.fall.back.on.java.serialization";
    public static final Object TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION_SCHEMA = Boolean.class;

    /**
     * Topology-specific options for the worker child process. This is used in addition to WORKER_CHILDOPTS.
     */
    public static final String TOPOLOGY_WORKER_CHILDOPTS = "topology.worker.childopts";
    public static final Object TOPOLOGY_WORKER_CHILDOPTS_SCHEMA = String.class;

    /**
     * This config is available for TransactionalSpouts, and contains the id ( a String) for
     * the transactional topology. This id is used to store the state of the transactional
     * topology in Zookeeper.
     */
    public static final String TOPOLOGY_TRANSACTIONAL_ID = "topology.transactional.id";
    public static final Object TOPOLOGY_TRANSACTIONAL_ID_SCHEMA = String.class;

    /**
     * A list of task hooks that are automatically added to every spout and bolt in the topology. An example
     * of when you'd do this is to add a hook that integrates with your internal
     * monitoring system. These hooks are instantiated using the zero-arg constructor.
     */
    public static final String TOPOLOGY_AUTO_TASK_HOOKS = "topology.auto.task.hooks";
    public static final Object TOPOLOGY_AUTO_TASK_HOOKS_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * The size of the Disruptor receive queue for each executor. Must be a power of 2.
     */
    public static final String TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE = "topology.executor.receive.buffer.size";
    public static final Object TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE_SCHEMA = ConfigValidation.PowerOf2Validator;

    /**
     * The maximum number of messages to batch from the thread receiving off the network to the
     * executor queues. Must be a power of 2.
     */
    public static final String TOPOLOGY_RECEIVER_BUFFER_SIZE = "topology.receiver.buffer.size";
    public static final Object TOPOLOGY_RECEIVER_BUFFER_SIZE_SCHEMA = ConfigValidation.PowerOf2Validator;

    /**
     * The size of the Disruptor send queue for each executor. Must be a power of 2.
     */
    public static final String TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE = "topology.executor.send.buffer.size";
    public static final Object TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE_SCHEMA = ConfigValidation.PowerOf2Validator;

    /**
     * The size of the Disruptor transfer queue for each worker.
     */
    public static final String TOPOLOGY_TRANSFER_BUFFER_SIZE = "topology.transfer.buffer.size";
    public static final Object TOPOLOGY_TRANSFER_BUFFER_SIZE_SCHEMA = Number.class;

    /**
     * How often a tick tuple from the "__system" component and "__tick" stream should be sent
     * to tasks. Meant to be used as a component-specific configuration.
     */
    public static final String TOPOLOGY_TICK_TUPLE_FREQ_SECS = "topology.tick.tuple.freq.secs";
    public static final Object TOPOLOGY_TICK_TUPLE_FREQ_SECS_SCHEMA = Number.class;

    /**
     * Configure the wait strategy used for internal queuing. Can be used to tradeoff latency
     * vs. throughput
     */
    public static final String TOPOLOGY_DISRUPTOR_WAIT_STRATEGY = "topology.disruptor.wait.strategy";
    public static final Object TOPOLOGY_DISRUPTOR_WAIT_STRATEGY_SCHEMA = String.class;

    /**
     * The size of the shared thread pool for worker tasks to make use of. The thread pool can be accessed
     * via the TopologyContext.
     */
    public static final String TOPOLOGY_WORKER_SHARED_THREAD_POOL_SIZE = "topology.worker.shared.thread.pool.size";
    public static final Object TOPOLOGY_WORKER_SHARED_THREAD_POOL_SIZE_SCHEMA = Number.class;

    /**
     * The interval in seconds to use for determining whether to throttle error reported to Zookeeper. For example,
     * an interval of 10 seconds with topology.max.error.report.per.interval set to 5 will only allow 5 errors to be
     * reported to Zookeeper per task for every 10 second interval of time.
     */
    public static final String TOPOLOGY_ERROR_THROTTLE_INTERVAL_SECS = "topology.error.throttle.interval.secs";
    public static final Object TOPOLOGY_ERROR_THROTTLE_INTERVAL_SECS_SCHEMA = Number.class;

    /**
     * See doc for TOPOLOGY_ERROR_THROTTLE_INTERVAL_SECS
     */
    public static final String TOPOLOGY_MAX_ERROR_REPORT_PER_INTERVAL = "topology.max.error.report.per.interval";
    public static final Object TOPOLOGY_MAX_ERROR_REPORT_PER_INTERVAL_SCHEMA = Number.class;

    /**
     * How often a batch can be emitted in a Trident topology.
     */
    public static final String TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS = "topology.trident.batch.emit.interval.millis";
    public static final Object TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS_SCHEMA = Number.class;

    /**
     * Name of the topology. This config is automatically set by Storm when the topology is submitted.
     */
    public static final String TOPOLOGY_NAME = "topology.name";
    public static final Object TOPOLOGY_NAME_SCHEMA = String.class;

    /**
     * Max pending tuples in one ShellBolt
     */
    public static final String TOPOLOGY_SHELLBOLT_MAX_PENDING = "topology.shellbolt.max.pending";
    public static final Object TOPOLOGY_SHELLBOLT_MAX_PENDING_SCHEMA = Number.class;

    /**
     * The root directory in ZooKeeper for metadata about TransactionalSpouts.
     */
    public static final String TRANSACTIONAL_ZOOKEEPER_ROOT = "transactional.zookeeper.root";
    public static final Object TRANSACTIONAL_ZOOKEEPER_ROOT_SCHEMA = String.class;

    /**
     * The list of zookeeper servers in which to keep the transactional state. If null (which is default),
     * will use storm.zookeeper.servers
     */
    public static final String TRANSACTIONAL_ZOOKEEPER_SERVERS = "transactional.zookeeper.servers";
    public static final Object TRANSACTIONAL_ZOOKEEPER_SERVERS_SCHEMA = ConfigValidation.StringsValidator;

    /**
     * The port to use to connect to the transactional zookeeper servers. If null (which is default),
     * will use storm.zookeeper.port
     */
    public static final String TRANSACTIONAL_ZOOKEEPER_PORT = "transactional.zookeeper.port";
    public static final Object TRANSACTIONAL_ZOOKEEPER_PORT_SCHEMA = Number.class;

    /**
     * The number of threads that should be used by the zeromq context in each worker process.
     */
    public static final String ZMQ_THREADS = "zmq.threads";
    public static final Object ZMQ_THREADS_SCHEMA = Number.class;

    /**
     * How long a connection should retry sending messages to a target host when
     * the connection is closed. This is an advanced configuration and can almost
     * certainly be ignored.
     */
    public static final String ZMQ_LINGER_MILLIS = "zmq.linger.millis";
    public static final Object ZMQ_LINGER_MILLIS_SCHEMA = Number.class;

    /**
     * The high water for the ZeroMQ push sockets used for networking. Use this config to prevent buffer explosion
     * on the networking layer.
     */
    public static final String ZMQ_HWM = "zmq.hwm";
    public static final Object ZMQ_HWM_SCHEMA = Number.class;

    /**
     * This value is passed to spawned JVMs (e.g., Nimbus, Supervisor, and Workers)
     * for the java.library.path value. java.library.path tells the JVM where
     * to look for native libraries. It is necessary to set this config correctly since
     * Storm uses the ZeroMQ and JZMQ native libs.
     */
    public static final String JAVA_LIBRARY_PATH = "java.library.path";
    public static final Object JAVA_LIBRARY_PATH_SCHEMA = String.class;

    /**
     * The path to use as the zookeeper dir when running a zookeeper server via
     * "storm dev-zookeeper". This zookeeper instance is only intended for development;
     * it is not a production grade zookeeper setup.
     */
    public static final String DEV_ZOOKEEPER_PATH = "dev.zookeeper.path";
    public static final Object DEV_ZOOKEEPER_PATH_SCHEMA = String.class;

    /**
     * A map from topology name to the number of machines that should be dedicated for that topology. Set storm.scheduler
     * to backtype.storm.scheduler.IsolationScheduler to make use of the isolation scheduler.
     */
    public static final String ISOLATION_SCHEDULER_MACHINES = "isolation.scheduler.machines";
    public static final Object ISOLATION_SCHEDULER_MACHINES_SCHEMA = Map.class;

    public static void setDebug(Map conf, boolean isOn) {
        conf.put(Config.TOPOLOGY_DEBUG, isOn);
    }

    public void setDebug(boolean isOn) {
        setDebug(this, isOn);
    }

    @Deprecated
    public void setOptimize(boolean isOn) {
        put(Config.TOPOLOGY_OPTIMIZE, isOn);
    }

    public static void setNumWorkers(Map conf, int workers) {
        conf.put(Config.TOPOLOGY_WORKERS, workers);
    }

    public void setNumWorkers(int workers) {
        setNumWorkers(this, workers);
    }

    public static void setNumAckers(Map conf, int numExecutors) {
        conf.put(Config.TOPOLOGY_ACKER_EXECUTORS, numExecutors);
    }

    public void setNumAckers(int numExecutors) {
        setNumAckers(this, numExecutors);
    }

    public static void setMessageTimeoutSecs(Map conf, int secs) {
        conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, secs);
    }

    public void setMessageTimeoutSecs(int secs) {
        setMessageTimeoutSecs(this, secs);
    }

    public static void registerSerialization(Map conf, Class klass) {
        getRegisteredSerializations(conf).add(klass.getName());
    }

    public void registerSerialization(Class klass) {
        registerSerialization(this, klass);
    }

    public static void registerSerialization(Map conf, Class klass, Class<? extends Serializer> serializerClass) {
        Map<String, String> register = new HashMap<String, String>();
        register.put(klass.getName(), serializerClass.getName());
        getRegisteredSerializations(conf).add(register);
    }

    public void registerSerialization(Class klass, Class<? extends Serializer> serializerClass) {
        registerSerialization(this, klass, serializerClass);
    }

    public void registerMetricsConsumer(Class klass, Object argument, long parallelismHint) {
        HashMap m = new HashMap();
        m.put("class", klass.getCanonicalName());
        m.put("parallelism.hint", parallelismHint);
        m.put("argument", argument);

        List l = (List) this.get(TOPOLOGY_METRICS_CONSUMER_REGISTER);
        if (l == null) { l = new ArrayList(); }
        l.add(m);
        this.put(TOPOLOGY_METRICS_CONSUMER_REGISTER, l);
    }

    public void registerMetricsConsumer(Class klass, long parallelismHint) {
        registerMetricsConsumer(klass, null, parallelismHint);
    }

    public void registerMetricsConsumer(Class klass) {
        registerMetricsConsumer(klass, null, 1L);
    }

    public static void registerDecorator(Map conf, Class<? extends IKryoDecorator> klass) {
        getRegisteredDecorators(conf).add(klass.getName());
    }

    public void registerDecorator(Class<? extends IKryoDecorator> klass) {
        registerDecorator(this, klass);
    }

    public static void setKryoFactory(Map conf, Class<? extends IKryoFactory> klass) {
        conf.put(Config.TOPOLOGY_KRYO_FACTORY, klass.getName());
    }

    public void setKryoFactory(Class<? extends IKryoFactory> klass) {
        setKryoFactory(this, klass);
    }

    public static void setSkipMissingKryoRegistrations(Map conf, boolean skip) {
        conf.put(Config.TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS, skip);
    }

    public void setSkipMissingKryoRegistrations(boolean skip) {
        setSkipMissingKryoRegistrations(this, skip);
    }

    public static void setMaxTaskParallelism(Map conf, int max) {
        conf.put(Config.TOPOLOGY_MAX_TASK_PARALLELISM, max);
    }

    public void setMaxTaskParallelism(int max) {
        setMaxTaskParallelism(this, max);
    }

    public static void setMaxSpoutPending(Map conf, int max) {
        conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, max);
    }

    public void setMaxSpoutPending(int max) {
        setMaxSpoutPending(this, max);
    }

    public static void setStatsSampleRate(Map conf, double rate) {
        conf.put(Config.TOPOLOGY_STATS_SAMPLE_RATE, rate);
    }

    public void setStatsSampleRate(double rate) {
        setStatsSampleRate(this, rate);
    }

    public static void setFallBackOnJavaSerialization(Map conf, boolean fallback) {
        conf.put(Config.TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION, fallback);
    }

    public void setFallBackOnJavaSerialization(boolean fallback) {
        setFallBackOnJavaSerialization(this, fallback);
    }

    private static List getRegisteredSerializations(Map conf) {
        List ret;
        if (!conf.containsKey(Config.TOPOLOGY_KRYO_REGISTER)) {
            ret = new ArrayList();
        } else {
            ret = new ArrayList((List) conf.get(Config.TOPOLOGY_KRYO_REGISTER));
        }
        conf.put(Config.TOPOLOGY_KRYO_REGISTER, ret);
        return ret;
    }

    private static List getRegisteredDecorators(Map conf) {
        List ret;
        if (!conf.containsKey(Config.TOPOLOGY_KRYO_DECORATORS)) {
            ret = new ArrayList();
        } else {
            ret = new ArrayList((List) conf.get(Config.TOPOLOGY_KRYO_DECORATORS));
        }
        conf.put(Config.TOPOLOGY_KRYO_DECORATORS, ret);
        return ret;
    }
}
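
For per-topology settings, the setter methods at the end of the class are usually more convenient than spelling out the raw keys yourself. Below is a small sketch that uses only methods shown above; the concrete numbers and the MyEvent class are placeholders for illustration, not values recommended by the original post:

import backtype.storm.Config;

public class TopologyConfExample {

    // Placeholder class standing in for an application type carried inside tuples
    public static class MyEvent {}

    public static Config build() {
        Config conf = new Config();
        conf.setDebug(false);           // topology.debug
        conf.setNumWorkers(4);          // topology.workers
        conf.setNumAckers(2);           // topology.acker.executors
        conf.setMessageTimeoutSecs(60); // topology.message.timeout.secs
        conf.setMaxSpoutPending(1000);  // topology.max.spout.pending
        // Register MyEvent with Kryo so tuples carrying it use a generated field serializer
        conf.registerSerialization(MyEvent.class);
        return conf;
    }
}

Anything not covered by a setter can still be put into the map directly, since Config is just a HashMap<String, Object>.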

Reposted from http://www.aboutyun.com/thread-8109-1-1.html
