Storm(2) - Log Stream Processing
Introduction
This chapter presents an implementation recipe for an enterprise log storage, search, and analysis solution based on the Storm processor. Log data processing isn't necessarily a problem that needs solving again; it is, however, a good analogy.
Stream processing is a key architectural concern in the modern enterprise; however, streams of data are often semi-structured at best. By presenting an approach to enterprise log processing, this chapter is designed to give the reader all the key elements needed to achieve this level of capability on any kind of data. Log data is also extremely convenient in an academic setting, given its sheer abundance; a key success factor for any stream processing or analytics effort is a deep understanding of the actual data, and sourcing suitable data can often be difficult.
It is, therefore, important that the reader considers how the architectural blueprint could be applied to other forms of data within the enterprise.
The following diagram illustrates all the elements that we will develop in this chapter:
You will learn how to create a log agent that can be distributed across all the nodes in your environment. You will also learn to collect these log entries centrally using Storm and Redis, and then to analyze, index, and count the logs so that we can search them later and display basic statistics for them.
Creating a log agent
1. Download logstash, which will be used to stream each local node's logs into the topology:

   wget https://logstash.objects.dreamhost.com/release/logstash-1.1.7-monolithic.jar

2. Create the file shipper.conf with the following content:

   input {
     file {
       type => "syslog"
       path => ["/var/log/messages", "/var/log/system.*", "/var/log/*.log"]
     }
   }
   output {
     # output events to stdout for debugging; feel free to remove it
     stdout { }
     redis {
       host => "localhost"
       data_type => "list"
       key => "rawLogs"
     }
   }

3. Start a local instance of Redis, and then start logstash:

   java -jar logstash-1.1.7-monolithic.jar -f shipper.conf
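For reference, each event that logstash pushes onto the rawLogs list in Redis is a JSON document. The sample below is purely illustrative, based on the fields that the LogEntry class in the next recipe parses; the exact values depend on your nodes and inputs:

   {
     "@source": "file://localhost/var/log/messages",
     "@source_host": "localhost",
     "@source_path": "/var/log/messages",
     "@timestamp": "2013-01-01T10:00:00.000",
     "@type": "syslog",
     "@tags": [],
     "@fields": {},
     "@message": "Jan  1 10:00:00 localhost sshd[1234]: Accepted publickey for user"
   }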
Creating the log spout
Start by creating the project directory and the standard Maven folder structure (http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html).

1. Create the POM as per the Creating a "Hello World" topology recipe in Chapter 1, Setting Up Your Development Environment, updating the <artifactId> and <name> tag values and including the following dependencies:

   <dependency>
     <groupId>junit</groupId>
     <artifactId>junit</artifactId>
     <version>4.11</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>org.slf4j</groupId>
     <artifactId>slf4j-log4j12</artifactId>
     <version>1.6.1</version>
   </dependency>
   <dependency>
     <groupId>org.jmock</groupId>
     <artifactId>jmock-legacy</artifactId>
     <version>2.5.1</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>storm</groupId>
     <artifactId>storm</artifactId>
     <version>0.8.1</version>
     <!-- keep storm out of the jar-with-dependencies -->
     <scope>provided</scope>
     <exclusions>
       <exclusion>
         <artifactId>slf4j-api</artifactId>
         <groupId>org.slf4j</groupId>
       </exclusion>
     </exclusions>
   </dependency>
   <dependency>
     <groupId>com.googlecode.json-simple</groupId>
     <artifactId>json-simple</artifactId>
     <version>1.1</version>
   </dependency>
   <dependency>
     <groupId>redis.clients</groupId>
     <artifactId>jedis</artifactId>
     <version>2.1.0</version>
   </dependency>
   <dependency>
     <groupId>commons-httpclient</groupId>
     <artifactId>commons-httpclient</artifactId>
     <version>3.1</version>
   </dependency>
   <dependency>
     <groupId>org.jmock</groupId>
     <artifactId>jmock-junit4</artifactId>
     <version>2.5.1</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>com.github.ptgoetz</groupId>
     <artifactId>storm-cassandra</artifactId>
     <version>0.3.1-SNAPSHOT</version>
   </dependency>
   <dependency>
     <groupId>org.elasticsearch</groupId>
     <artifactId>elasticsearch</artifactId>
     <version>0.20.2</version>
   </dependency>
   <dependency>
     <groupId>org.drools</groupId>
     <artifactId>drools-core</artifactId>
     <version>5.5.0.Final</version>
   </dependency>
   <dependency>
     <groupId>org.drools</groupId>
     <artifactId>drools-compiler</artifactId>
     <version>5.5.0.Final</version>
   </dependency>

2. Import the project into Eclipse after generating the Eclipse project files as follows:

   mvn eclipse:eclipse

3. Tuples in the log topology will carry a log domain object that encapsulates the data and parsing logic for a single log record, that is, an entry in a logfile.
In the created project, create this domain object:

   public class LogEntry {

       public static Logger LOG = Logger.getLogger(LogEntry.class);

       private String source;
       private String type;
       private List<String> tags = new ArrayList<String>();
       private Map<String, String> fields = new HashMap<String, String>();
       private Date timestamp;
       private String sourceHost;
       private String sourcePath;
       private String message = "";
       private boolean filter = false;
       private NotificationDetails notifyAbout = null;

       private static String[] FORMATS = new String[] {
           "yyyy-MM-dd'T'HH:mm:ss.SSS",
           "yyyy.MM.dd G 'at' HH:mm:ss z",
           "yyyyy.MMMMM.dd GGG hh:mm aaa",
           "EEE, d MMM yyyy HH:mm:ss Z",
           "yyMMddHHmmssZ"
       };

       @SuppressWarnings("unchecked")
       public LogEntry(JSONObject json) {
           source = (String) json.get("@source");
           timestamp = parseDate((String) json.get("@timestamp"));
           sourceHost = (String) json.get("@source_host");
           sourcePath = (String) json.get("@source_path");
           message = (String) json.get("@message");
           type = (String) json.get("@type");
           JSONArray array = (JSONArray) json.get("@tags");
           tags.addAll(array);
           // use a distinct local name so the fields member is not shadowed
           JSONObject fieldsJson = (JSONObject) json.get("@fields");
           fields.putAll(fieldsJson);
       }

       public Date parseDate(String value) {
           Date temp;
           for (int i = 0; i < FORMATS.length; i++) {
               SimpleDateFormat format = new SimpleDateFormat(FORMATS[i]);
               try {
                   temp = format.parse(value);
                   if (temp != null) {
                       return temp;
                   }
               } catch (ParseException e) {
                   // try the next format
               }
           }
           LOG.error("Could not parse timestamp for log");
           return null;
       }

       @SuppressWarnings("unchecked")
       public JSONObject toJSON() {
           JSONObject json = new JSONObject();
           json.put("@source", source);
           json.put("@timestamp", DateFormat.getDateInstance().format(timestamp));
           json.put("@source_host", sourceHost);
           json.put("@source_path", sourcePath);
           json.put("@message", message);
           json.put("@type", type);
           JSONArray temp = new JSONArray();
           temp.addAll(tags);
           json.put("@tags", temp);
           JSONObject fieldTemp = new JSONObject();
           fieldTemp.putAll(fields);
           json.put("@fields", fieldTemp);
           return json;
       }
   }

   The getter, setter, and equals methods have been excluded from this code snippet; however, they must still be implemented. The equals method is vital for unit testing purposes.

4. Then create the LogSpout class, which extends BaseRichSpout and implements the same pattern as described in Chapter 1, Setting Up Your Development Environment, declaring a single output field as follows:

   outputFieldsDeclarer.declare(new Fields(FieldNames.LOG_ENTRY));

   Then emit the received log entries into the topology as follows:

   public void nextTuple() {
       String content = jedis.rpop(LOG_CHANNEL);
       if (content == null || "nil".equals(content)) {
           // sleep to prevent starving other threads
           try {
               Thread.sleep(300);
           } catch (InterruptedException e) {
           }
       } else {
           JSONObject obj = (JSONObject) JSONValue.parse(content);
           LogEntry entry = new LogEntry(obj);
           collector.emit(new Values(entry));
       }
   }
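The spout's open method is omitted above. The sketch below shows one minimal way to wire up the Jedis connection it relies on; the Conf.REDIS_PORT_KEY entry is populated by the topology recipe later in this chapter, while the hard-coded localhost default, the fallback port, and the LOG_CHANNEL constant value are illustrative assumptions:

   import java.util.Map;
   import backtype.storm.spout.SpoutOutputCollector;
   import backtype.storm.task.TopologyContext;
   import redis.clients.jedis.Jedis;

   // inside LogSpout
   public static final String LOG_CHANNEL = "rawLogs"; // must match the redis key in shipper.conf

   private Jedis jedis;
   private SpoutOutputCollector collector;

   public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
       this.collector = collector;
       // Conf.REDIS_PORT_KEY is set by the LogTopology class; fall back to the default
       // Redis port if it is absent (illustrative handling).
       Object port = conf.get(Conf.REDIS_PORT_KEY);
       int redisPort = (port == null) ? 6379 : Integer.parseInt(port.toString());
       jedis = new Jedis("localhost", redisPort);
   }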
Rule-based analysis of the log stream
1. Within Eclipse, create a class called LogRulesBolt, which extends BaseRichBolt, within the storm.cookbook.log package. As with the LogSpout class, the LogRulesBolt class will emit a single value containing a LogEntry instance:

   declarer.declare(new Fields(FieldNames.LOG_ENTRY));

2. Add a private member-level variable, ksession, of type StatelessKnowledgeSession and initialize it within the bolt's prepare method:

   private StatelessKnowledgeSession ksession;
   private OutputCollector collector;

   public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
       // keep a reference to the collector for use in execute
       this.collector = collector;
       KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
       kbuilder.add(ResourceFactory.newClassPathResource("/Syslog.drl", getClass()),
               ResourceType.DRL);
       if (kbuilder.hasErrors()) {
           LOG.error(kbuilder.getErrors().toString());
       }
       KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
       kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
       ksession = kbase.newStatelessKnowledgeSession();
   }

3. In the bolt's execute method, pass the LogEntry object from the tuple into the knowledge session:

   public void execute(Tuple input) {
       LogEntry entry = (LogEntry) input.getValueByField(FieldNames.LOG_ENTRY);
       if (entry == null) {
           LOG.fatal("Received null or incorrect value from tuple");
           return;
       }
       ksession.execute(entry);
       if (!entry.isFilter()) {
           collector.emit(new Values(entry));
       }
   }

4. You next need to create the rules resource file; this can be done with a plain text editor or with the Eclipse plugin available from the Drools update site (http://download.jboss.org/drools/release/5.5.0.Final/org.drools.updatesite/). The rules resource file must be on the root of the classpath; create a file named Syslog.drl in src/main/resources and add this folder to the build path within Eclipse by right-clicking on the folder and navigating to Build Path | Use as Source Folder.

5. Add the following content to the rules resource:

   package storm.cookbook.log.rules

   import storm.cookbook.log.model.LogEntry;
   import java.util.regex.Matcher;
   import java.util.regex.Pattern;

   rule "Host Correction"
   when
       l: LogEntry(sourceHost == "localhost")
   then
       l.setSourceHost("localhost.example.com");
   end

   rule "Filter By Type"
   when
       l: LogEntry(type != "syslog")
   then
       l.setFilter(true);
   end

   rule "Extract Fields"
   salience 100 // higher salience rules are given higher priority
   when
       l: LogEntry(filter != true)
   then
       String logEntryPattern = "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\d+)\"([^\"]+)\" \"([^\"]+)\"";
       Matcher matcher = Pattern.compile(logEntryPattern).matcher(l.getMessage());
       if (matcher.find()) {
           l.addField("_pid", matcher.group(1));
           l.addField("_src", matcher.group(2));
       }
   end
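To sanity-check the rules outside of Storm, you can load the same Syslog.drl into a StatelessKnowledgeSession and execute a hand-built LogEntry against it. The following is a minimal sketch; the class name, use of a main method, and all sample field values are illustrative, and the LogEntry package follows the import declared in Syslog.drl:

   import org.drools.KnowledgeBase;
   import org.drools.KnowledgeBaseFactory;
   import org.drools.builder.KnowledgeBuilder;
   import org.drools.builder.KnowledgeBuilderFactory;
   import org.drools.builder.ResourceType;
   import org.drools.io.ResourceFactory;
   import org.drools.runtime.StatelessKnowledgeSession;
   import org.json.simple.JSONArray;
   import org.json.simple.JSONObject;
   import storm.cookbook.log.model.LogEntry;

   public class SyslogRulesSmokeTest {

       public static void main(String[] args) {
           // build the knowledge base exactly as the bolt does
           KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
           kbuilder.add(ResourceFactory.newClassPathResource("/Syslog.drl",
                   SyslogRulesSmokeTest.class), ResourceType.DRL);
           if (kbuilder.hasErrors()) {
               throw new IllegalStateException(kbuilder.getErrors().toString());
           }
           KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
           kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
           StatelessKnowledgeSession ksession = kbase.newStatelessKnowledgeSession();

           // build a minimal logstash-style event; values are illustrative
           JSONObject json = new JSONObject();
           json.put("@source", "file://localhost/var/log/messages");
           json.put("@timestamp", "2013-01-01T10:00:00.000");
           json.put("@source_host", "localhost");
           json.put("@source_path", "/var/log/messages");
           json.put("@message", "test message");
           json.put("@type", "apache");
           json.put("@tags", new JSONArray());
           json.put("@fields", new JSONObject());

           LogEntry entry = new LogEntry(json);
           ksession.execute(entry);

           // "Filter By Type" should mark anything that is not syslog for filtering
           System.out.println("filtered: " + entry.isFilter());
           // "Host Correction" should have rewritten the source host
           System.out.println("host: " + entry.getSourceHost());
       }
   }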
Indexing and persisting the log data
1. Create a new BaseRichBolt class called IndexerBolt and declare an org.elasticsearch.client.Client instance as a private member variable. You must initialize it as follows within the prepare method:

   import org.elasticsearch.client.Client;

   public class IndexerBolt extends BaseRichBolt {

       private Client client;
       private Node node;
       private OutputCollector collector;

       public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
           this.collector = collector;
           if ((Boolean) stormConf.get(backtype.storm.Config.TOPOLOGY_DEBUG) == true) {
               node = NodeBuilder.nodeBuilder().local(true).node();
           } else {
               String clusterName = (String) stormConf.get(Conf.ELASTIC_CLUSTER_NAME);
               if (clusterName == null) {
                   clusterName = Conf.DEFAULT_ELASTIC_CLUSTER;
               }
               node = NodeBuilder.nodeBuilder().clusterName(clusterName).node();
           }
           client = node.client();
       }
   }

2. The LogEntry object can then be indexed during the execute method of the bolt:

   public void execute(Tuple input) {
       LogEntry entry = (LogEntry) input.getValueByField(FieldNames.LOG_ENTRY);
       if (entry == null) {
           LOG.fatal("Received null or incorrect value from tuple");
           return;
       }
       String toBeIndexed = entry.toJSON().toJSONString();
       IndexResponse response = client.prepareIndex(INDEX_NAME, INDEX_TYPE)
               .setSource(toBeIndexed)
               .execute().actionGet();
       if (response == null) {
           LOG.error("Failed to index Tuple: " + input.toString());
       } else {
           if (response.getId() == null) {
               LOG.error("Failed to index Tuple: " + input.toString());
           } else {
               LOG.debug("Indexing success on Tuple: " + input.toString());
               collector.emit(new Values(entry, response.getId()));
           }
       }
   }

3. The unit test of this bolt is not obvious; it is therefore worthwhile to give some explanation here. Create a new JUnit 4 unit test in your test source folder under the storm.cookbook.log package. Add a private inner class called StoringMatcher as follows:

   private static class StoringMatcher extends BaseMatcher<Values> {

       private final List<Values> objects = new ArrayList<Values>();

       @Override
       public boolean matches(Object item) {
           if (item instanceof Values) {
               objects.add((Values) item);
               return true;
           }
           return false;
       }

       @Override
       public void describeTo(Description description) {
           description.appendText("any Values instance");
       }

       public Values getLastValue() {
           return objects.remove(0);
       }
   }

4. Then implement the actual test method as follows:

   @Test
   public void testIndexing() throws IOException {
       // Config: ensure we are in debug mode so the bolt starts a local ElasticSearch node
       Map config = new HashMap();
       config.put(backtype.storm.Config.TOPOLOGY_DEBUG, true);
       Node node = NodeBuilder.nodeBuilder().local(true).node();
       Client client = node.client();
       final OutputCollector collector = context.mock(OutputCollector.class);
       IndexerBolt bolt = new IndexerBolt();
       bolt.prepare(config, null, collector);
       final LogEntry entry = getEntry();
       final Tuple tuple = getTuple();
       final StoringMatcher matcher = new StoringMatcher();
       context.checking(new Expectations() {{
           oneOf(tuple).getValueByField(FieldNames.LOG_ENTRY);
           will(returnValue(entry));
           oneOf(collector).emit(with(matcher));
       }});
       bolt.execute(tuple);
       context.assertIsSatisfied();
       // get the ID of the indexed document
       String id = (String) matcher.getLastValue().get(1);
       // check that the indexing worked
       GetResponse response = client.prepareGet(IndexerBolt.INDEX_NAME,
               IndexerBolt.INDEX_TYPE, id).execute().actionGet();
       assertTrue(response.exists());
   }

   Here, context is the test's jMock Mockery instance, and getEntry() and getTuple() are helper methods of the test class (not shown) that build a sample LogEntry and a mocked Tuple.
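To confirm what has been stored, you can query the index back through the same ElasticSearch client API used in the bolt. A brief sketch, assuming the node and client setup from the prepare method above; the match-all query is just for illustration:

   import org.elasticsearch.action.search.SearchResponse;
   import org.elasticsearch.index.query.QueryBuilders;
   import org.elasticsearch.search.SearchHit;

   // fetch every indexed log entry and print its JSON source
   SearchResponse response = client.prepareSearch(IndexerBolt.INDEX_NAME)
           .setTypes(IndexerBolt.INDEX_TYPE)
           .setQuery(QueryBuilders.matchAllQuery())
           .execute().actionGet();
   for (SearchHit hit : response.getHits()) {
       System.out.println(hit.sourceAsString());
   }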
Counting and persisting log statistics
1. Download and install the storm-cassandra contrib project into your local Maven repository:

   git clone https://github.com/quintona/storm-cassandra
   cd storm-cassandra
   mvn clean install

2. Create a new BaseRichBolt class called VolumeCountingBolt in the storm.cookbook.log package. The bolt must declare three output fields:

   public class VolumeCountingBolt extends BaseRichBolt {

       public void declareOutputFields(OutputFieldsDeclarer declarer) {
           declarer.declare(new Fields(FIELD_ROW_KEY, FIELD_COLUMN, FIELD_INCREMENT));
       }
   }

3. Then implement a static utility method to derive the minute representation of the log's time:

   public static Long getMinuteForTime(Date time) {
       Calendar c = Calendar.getInstance();
       c.setTime(time);
       c.set(Calendar.SECOND, 0);
       c.set(Calendar.MILLISECOND, 0);
       return c.getTimeInMillis();
   }

4. Implement the execute method (yes, it is that short):

   public void execute(Tuple input) {
       LogEntry entry = (LogEntry) input.getValueByField(FieldNames.LOG_ENTRY);
       collector.emit(new Values(getMinuteForTime(entry.getTimestamp()),
               entry.getSource(), 1L));
   }

5. Finally, create the LogTopology class as per the pattern presented in Chapter 1, Setting Up Your Development Environment, and wire the topology together as follows (a sketch of the surrounding scaffolding follows this recipe):

   public class LogTopology {

       public static void main(String[] args) {
           builder.setSpout("logSpout", new LogSpout(), 10);
           builder.setBolt("logRules", new LogRulesBolt(), 10).shuffleGrouping("logSpout");
           builder.setBolt("indexer", new IndexerBolt(), 10).shuffleGrouping("logRules");
           builder.setBolt("counter", new VolumeCountingBolt(), 10).shuffleGrouping("logRules");
           CassandraCounterBatchingBolt logPersistenceBolt = new CassandraCounterBatchingBolt(
                   Conf.COUNT_CF_NAME,
                   VolumeCountingBolt.FIELD_ROW_KEY,
                   VolumeCountingBolt.FIELD_INCREMENT);
           logPersistenceBolt.setAckStrategy(AckStrategy.ACK_ON_RECEIVE);
           builder.setBolt("countPersistor", logPersistenceBolt, 10).shuffleGrouping("counter");
           conf.put(Conf.REDIS_PORT_KEY, Conf.DEFAULT_JEDIS_PORT);
           conf.put(CassandraBolt.CASSANDRA_KEYSPACE, Conf.LOGGING_KEYSPACE);
       }
   }
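The builder and conf variables above come from the Chapter 1 topology pattern. As a reminder, a minimal local-mode sketch of that scaffolding is shown below; the topology name and debug setting are illustrative, and for cluster deployment you would use StormSubmitter as in Chapter 1:

   import backtype.storm.Config;
   import backtype.storm.LocalCluster;
   import backtype.storm.topology.TopologyBuilder;

   TopologyBuilder builder = new TopologyBuilder();
   Config conf = new Config();
   conf.setDebug(true);

   // ... the setSpout/setBolt wiring and conf.put calls shown above ...

   LocalCluster cluster = new LocalCluster();
   cluster.submitTopology("log-topology", conf, builder.createTopology());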
Creating a log analytics dashboard
1. Create a new project called log-web using the standard Maven archetype command:

   mvn archetype:generate -DgroupId=storm.cookbook -DartifactId=log-web -DarchetypeArtifactId=maven-archetype-webapp

   This will generate a standard project structure and Maven POM file for you.

2. Open the pom.xml file and remove the default dependencies, replacing them with the following dependencies:

   <dependency>
     <groupId>junit</groupId>
     <artifactId>junit</artifactId>
     <version>4.8.1</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>org.hectorclient</groupId>
     <artifactId>hector-core</artifactId>
     <version>1.1-2</version>
   </dependency>
   <dependency>
     <groupId>org.slf4j</groupId>
     <artifactId>slf4j-log4j12</artifactId>
     <version>1.6.1</version>
   </dependency>
   <dependency>
     <groupId>com.sun.jersey</groupId>
     <artifactId>jersey-server</artifactId>
     <version>1.16</version>
   </dependency>
   <dependency>
     <groupId>com.sun.jersey</groupId>
     <artifactId>jersey-grizzly2</artifactId>
     <version>1.16</version>
   </dependency>
   <dependency>
     <groupId>com.sun.jersey</groupId>
     <artifactId>jersey-servlet</artifactId>
     <version>1.16</version>
   </dependency>
   <dependency>
     <groupId>com.sun.jersey</groupId>
     <artifactId>jersey-json</artifactId>
     <version>1.16</version>
   </dependency>
   <dependency>
     <groupId>com.sun.jersey.contribs</groupId>
     <artifactId>jersey-multipart</artifactId>
     <version>1.16</version>
   </dependency>
   <dependency>
     <groupId>org.jmock</groupId>
     <artifactId>jmock-junit4</artifactId>
     <version>2.5.1</version>
     <scope>test</scope>
   </dependency>
   <dependency>
     <groupId>com.googlecode.json-simple</groupId>
     <artifactId>json-simple</artifactId>
     <version>1.1</version>
   </dependency>

3. Then add the following build plugins to the build section of the POM:

   <plugin>
     <groupId>org.mortbay.jetty</groupId>
     <artifactId>jetty-maven-plugin</artifactId>
   </plugin>
   <plugin>
     <groupId>org.codehaus.mojo</groupId>
     <artifactId>exec-maven-plugin</artifactId>
     <executions>
       <execution>
         <goals>
           <goal>java</goal>
         </goals>
       </execution>
     </executions>
   </plugin>
   <plugin>
     <artifactId>maven-compiler-plugin</artifactId>
     <version>2.3</version>
     <configuration>
       <source>1.6</source>
       <target>1.6</target>
       <optimize>true</optimize>
       <showDeprecation>true</showDeprecation>
       <showWarnings>true</showWarnings>
     </configuration>
   </plugin>
   <plugin>
     <groupId>org.codehaus.mojo</groupId>
     <artifactId>cassandra-maven-plugin</artifactId>
   </plugin>

4. Then import the project into Eclipse using the mvn eclipse:eclipse command and the Eclipse project import process.

5. The excellent Twitter Bootstrap GUI library will be used to create the user interface. Start by downloading it to a separate location on your drive and expanding it:

   wget http://twitter.github.com/bootstrap/assets/bootstrap.zip
   unzip bootstrap.zip

6. Bootstrap gives us a rapid start by providing many practical examples; we will simply copy one and adapt it:

   cp bootstrap/docs/examples/hero.html log-web/src/main/webapp/index.html
   cp bootstrap/docs/about log-web/src/main/webapp/about.html
   cp -r bootstrap/docs/assets log-web/src/main/webapp/
   cp -r bootstrap/docs/templates log-web/src/main/webapp/

7. While there is much HTML to update, we will focus on the important elements: the central content and the graph.
   Update the index.html file, replacing the existing <div class="container"> tag and its contents with the following:

   <div class="container">
     <div class="hero-unit">
       <div id="chart">
         <svg style="height: 300px;"></svg>
       </div>
     </div>
     <div class="row">
       <div class="span4">
         <h2>Timeseries</h2>
         <p>This graph shows a view of the log volumes of a given time period by day</p>
         <button id="updateToggleButton" type="button" class="btn btn-primary">Toggle Updates</button>
       </div>
     </div>
   </div>

8. For the graph, we will use the excellent data-visualization library D3 (http://d3js.org/) and some preconfigured models based on D3, called NVD3 (http://nvd3.org/), by adding their compiled JavaScript into our webapp's assets folder:

   wget https://github.com/novus/nvd3/zipball/master
   unzip novus-nvd3-4e12985.zip
   cp novus-nvd3-4e12985/nv.d3.js log-web/src/main/webapp/assets/js/
   cp novus-nvd3-4e12985/lib/d3.v2.js log-web/src/main/webapp/assets/js/
   cp novus-nvd3-4e12985/src/nv.d3.css log-web/src/main/webapp/assets/css/

9. Next, we include these in the HTML file and write the client-side JavaScript to retrieve the data and update the graph.

10. Add the following script includes at the bottom of the HTML file, after the other <script> tags:

   <script src="assets/js/d3.v2.js"></script>
   <script src="assets/js/nv.d3.js"></script>

11. And add the CSS import to the HTML header:

   <link type="text/css" rel="stylesheet" href="assets/css/nv.d3.css">

12. Then add our custom JavaScript into a <script></script> tag below the other script imports, towards the bottom of the file:

   <script type="text/javascript">
     var chart;
     var continueUpdates = true;

     nv.addGraph(function () {
       chart = nv.models.stackedAreaChart()
         .x(function (d) { return d[0] })
         .y(function (d) { return d[1] })
         .clipEdge(true);
       chart.xAxis
         .tickFormat(function (d) { return d3.time.format('%X')(new Date(d)) })
         .axisLabel('Time')
         .showMaxMin(false);
       chart.yAxis
         .axisLabel('Volume')
         .tickFormat(d3.format(',.2f'));
       d3.select('#chart svg').datum(getdata()).transition().duration(500).call(chart);
       nv.utils.windowResize(chart.update);
       chart.dispatch.on('stateChange', function (e) {
         nv.log('New State:', JSON.stringify(e));
       });
       return chart;
     });

     function update() {
       fetch();
       if (continueUpdates) {
         setTimeout(update, 60000);
       }
     }

     update();

     $(document).ready(function () {
       $('#updateToggleButton').bind('click', function () {
         if (continueUpdates) {
           continueUpdates = false;
         } else {
           continueUpdates = true;
           update();
         }
       });
     });
   </script>
13. And then add the code to fetch the data from the server:

   var alreadyFetched = {};

   function getUrl() {
     var today = new Date();
     today.setSeconds(0);
     today.setMilliseconds(0);
     var timestamp = today.valueOf();
     var dataurl = "http://localhost:8080/services/LogCount/TotalsForMinute/" + timestamp + "/";
     return dataurl;
   }

   function fetch() {
     // build the URL for the current minute
     var dataurl = getUrl();
     // then fetch the data with jQuery
     function onDataReceived(series) {
       // append to the existing data
       for (var i = 0; i < series.length; i++) {
         if (alreadyFetched[series[i].FileName] == null) {
           alreadyFetched[series[i].FileName] = {
             FileName: series[i].FileName,
             values: [{
               Minute: series[i].Minute,
               Total: series[i].Total
             }]
           };
         } else {
           alreadyFetched[series[i].FileName].values.push({
             Minute: series[i].Minute,
             Total: series[i].Total
           });
           if (alreadyFetched[series[i].FileName].values.length > 30) {
             // drop the oldest point so the window stays at 30 minutes
             alreadyFetched[series[i].FileName].values.shift();
           }
         }
       }
       // update the graph
       d3.select('#chart svg').datum(getdata()).transition().duration(500).call(chart);
     }

     function onError(request, status, error) {
       console.log("Received Error from AJAX: " + request.responseText);
     }

     $.ajax({
       url: dataurl,
       type: 'GET',
       dataType: 'json',
       crossDomain: true,
       xhrFields: {
         withCredentials: true
       },
       success: onDataReceived,
       error: onError
     });
   }

   function getdata() {
     var series = [];
     var keys = [];
     for (var key in alreadyFetched) {
       keys.push(key);
     }
     for (var i = 0; i < keys.length; i++) {
       var newValues = [];
       for (var j = 0; j < alreadyFetched[keys[i]].values.length; j++) {
         newValues.push([alreadyFetched[keys[i]].values[j].Minute,
                         alreadyFetched[keys[i]].values[j].Total]);
       }
       series.push({
         key: alreadyFetched[keys[i]].FileName,
         values: newValues
       });
     }
     return series;
   }

14. This completes the client-side part of the implementation. In order to expose the data to the client layer, we need to expose services that retrieve it.

15. Start by creating a utility class called CassandraUtils in the storm.cookbook.services.resources package and add the following content:

   public class CassandraUtils {

       public static Cluster cluster;
       public static Keyspace keyspace;
       protected static Properties properties;

       public static boolean initCassandra() {
           properties = new Properties();
           try {
               properties.load(Main.class.getResourceAsStream("/cassandra.properties"));
           } catch (IOException ioe) {
               ioe.printStackTrace();
           }
           cluster = HFactory.getOrCreateCluster(
                   properties.getProperty("cluster.name", "DefaultCluster"),
                   properties.getProperty("cluster.hosts", "127.0.0.1:9160"));
           ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
           ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.ONE);
           String keyspaceName = properties.getProperty("logging.keyspace", "Logging");
           keyspace = HFactory.createKeyspace(keyspaceName, cluster, ccl);
           return (cluster.describeKeyspace(keyspaceName) != null);
       }
   }
16. Then create the LogCount class in the same package, which essentially exposes a RESTful lookup service:

   @Path("/LogCount")
   public class LogCount {

       @GET
       @Path("/TotalsForMinute/{timestamp}")
       @Produces("application/json")
       public String getMinuteTotals(@PathParam("timestamp") String timestamp) {
           SliceCounterQuery<String, String> query = HFactory.createCounterSliceQuery(
                   CassandraUtils.keyspace, StringSerializer.get(), StringSerializer.get());
           query.setColumnFamily("LogVolumeByMinute");
           query.setKey(timestamp);
           query.setRange("", "", false, 100);
           QueryResult<CounterSlice<String>> result = query.execute();
           Iterator<HCounterColumn<String>> it = result.get().getColumns().iterator();
           JSONArray content = new JSONArray();
           while (it.hasNext()) {
               HCounterColumn<String> column = it.next();
               JSONObject fileObject = new JSONObject();
               fileObject.put("FileName", column.getName());
               fileObject.put("Total", column.getValue());
               fileObject.put("Minute", Long.parseLong(timestamp));
               content.add(fileObject);
           }
           return content.toJSONString();
       }
   }

17. Finally, you expose the service by creating the LogServices class:

   @ApplicationPath("/")
   public class LogServices extends Application {

       public LogServices() {
           CassandraUtils.initCassandra();
       }

       @Override
       public Set<Class<?>> getClasses() {
           final Set<Class<?>> classes = new HashSet<Class<?>>();
           // register the root resource
           classes.add(LogCount.class);
           return classes;
       }
   }

18. Then configure the web.xml file:

   <web-app>
     <display-name>Log-Web</display-name>
     <servlet>
       <servlet-name>storm.cookbook.services.LogServices</servlet-name>
       <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
       <init-param>
         <param-name>javax.ws.rs.Application</param-name>
         <param-value>storm.cookbook.services.LogServices</param-value>
       </init-param>
       <load-on-startup>1</load-on-startup>
     </servlet>
     <servlet-mapping>
       <servlet-name>storm.cookbook.services.LogServices</servlet-name>
       <url-pattern>/services/*</url-pattern>
     </servlet-mapping>
   </web-app>

19. You can now run your project using the following command from the root of your log-web project:

   mvn jetty:run

   Your dashboard will then be available at localhost:8080.
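The CassandraUtils class in step 15 loads a cassandra.properties file from the root of the classpath; a minimal example matching the defaults used above (place it in src/main/resources):

   cluster.name=DefaultCluster
   cluster.hosts=127.0.0.1:9160
   logging.keyspace=Logging

Once the topology has written some counts and jetty:run is up, you can also exercise the REST service directly before relying on the dashboard. The timestamp and the returned values below are purely illustrative:

   curl http://localhost:8080/services/LogCount/TotalsForMinute/1357034400000

   [{"FileName":"/var/log/messages","Total":42,"Minute":1357034400000}]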