现在非常喜欢IDEA,之前在mac 上用的eclipse 经常出现无缘无故的错误。所以转为IDEA.  不过新工具需要学习成本,手头上的项目就遇到了很多问题,现列举如下:

背景描述

在hadoop 开发时,经常在mr阶段将清洗后的数据入库到Hbase. 在这个过程中,需要编译、打jar包,然后上传到服务器,执行hadoop jar   *.jar 命令。每次清洗后需要手动4步操作。农民阿姨天生喜欢取巧,故这几天一直研究如何简化此过程。

思路描述

1.之前项目自动化打包上传都用ant ,不过是在window系统下eclipse开发的。但是在mac的IDEA中,屡次失败。总是出现如下错误信息

Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:287)
at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:240)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:317)
at java.util.jar.JarVerifier.update(JarVerifier.java:228)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:348)
at java.util.jar.JarFile.getInputStream(JarFile.java:415)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:101)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

网上搜了很多资料,解决方案都是失败。原因是 项目中引用的jar包签名有问题。(之前用eclipse导出jar包没有出现过这种情况)

2.用IDEA导出jar包,操作步骤如下
Fiel –>Project Structure

然后出现如下:

选好Manifest File ,Main Class

之后Build->Build Artifacts 可以编译打包了。去项目中 out 文件下可以找到。

但是通过这种打包方式,在运行时报错。因为hbase的lib相关包没有包含进去。这个一种笨的解决方案是在hadoop 中每个节点上的lib包下都拷贝一份hbase的lib包。这种方案有弊端,我没有选择。如何解决这个问题?我用maven 打包,上传jar包,运行,成功。

最终方案:

通过maven可以正常打包,那如何自动化上传并运行呢?我是用maven 打包,用ant 上传并执行的。

maven  pom 文件内容:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion> <groupId>groupId</groupId>
<artifactId>antdemo</artifactId>
<version>1</version> <dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>2.6.0</version>
</dependency> <dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>com.jcraft</groupId>
<artifactId>jsch</artifactId>
<version>0.1.51</version>
</dependency>
<dependency>
<groupId>org.apache.ant</groupId>
<artifactId>ant-jsch</artifactId>
<version>1.9.5</version>
</dependency> </dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<!--这里要替换成jar包main方法所在类-->
<mainClass>HBaseImport</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id> <!-- this is used for inheritance merges -->
<phase>package</phase> <!-- 指定在打包节点执行jar包合并操作 -->
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build> </project>

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

ant 中的 build.xml的内容:

<?xml version="1.0" encoding="UTF-8"?>

<project name="项目名称" basedir="." default="sshexec">
<description>本配置文件供ANT编译项目、自动进行单元测试、打包并部署之用。</description>
<description>默认操作(输入命令:ant)为编译源程序并发布运行。</description> <!--属性设置-->
<property environment="env" />
<!--<property file="build.xml" />-->
<property name="build.dir" location="build.xml"/>
<property name="src.dir" value="${basedir}/src" />
<!--<property name="java.lib.dir" location="lib"/>-->
<property name="java.lib.dir" value="/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home/lib" />
<property name="classes.dir" value="${basedir}/classes" />
<property name="dist.dir" value="${basedir}/dist" />
<property name="third.lib.dir" value="/Users/zzy/Kclouds/01_mac/doc/workspace/other_workspace/hbase-0.98.8-hadoop2/lib" />
<!--<property name="third.lib.dir" value="{basedir}/lib" />--> <property name="localpath.dir" value="${basedir}" />
<property name="remote.host" value="192.168.122.211"/>
<property name="remote.username" value="root"/>
<property name="remote.password" value="lg"/>
<property name="remote.home" value="/test"/>
<!--每次需要知道的main类,写到这里-->
<property name="main.class" value="HBaseImport"/> <!-- 基本编译路径设置 --> <path id="compile.classpath">
<fileset dir="${java.lib.dir}">
<include name="tools.jar" />
</fileset>
<fileset dir="${third.lib.dir}">
<include name="*.jar"/>
</fileset> </path> <!-- 运行路径设置 -->
<path id="run.classpath">
<path refid="compile.classpath" />
<pathelement location="${classes.dir}" />
</path>
<!-- 清理,删除临时目录 -->
<target name="clean" description="清理,删除临时目录">
<!--delete dir="${build.dir}" /-->
<delete dir="${dist.dir}" />
<delete dir="${classes.dir}" />
<echo level="info">清理完毕</echo> </target>
<!-- 初始化,建立目录,复制文件 -->
<target name="init" depends="clean" description="初始化,建立目录,复制文件">
<mkdir dir="${classes.dir}" />
<mkdir dir="${dist.dir}" />
</target>
<!-- 编译源文件-->
<target name="compile" depends="init" description="编译源文件">
<javac srcdir="${src.dir}" destdir="${classes.dir}" source="1.7" target="1.7" includeAntRuntime="false" debug="false" verbose="false">
<compilerarg line="-encoding UTF-8 "/>
<classpath refid="compile.classpath" />
</javac>
</target>
<!-- 打包类文件 -->
<target name="jar" depends="compile" description="打包类文件">
<jar jarfile="${dist.dir}/jar.jar">
<fileset dir="${classes.dir}" includes="**/*.*" />
</jar>
</target>
<!--上传到服务器
**需要把lib目录下的jsch-0.1.51拷贝到$ANT_HOME/lib下,如果是Eclipse下的Ant环境必须在Window->Preferences->Ant->Runtime->Classpath中加入jsch-0.1.51。
-->
<!--<target name="ssh" depends="jar">-->
<!--<scp file="${dist.dir}/jar.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>-->
<!--</target>-->
<!-- -->
<!--<target name="sshexec" depends="ssh">-->
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/jar.jar ${main.class}"/>-->
<!--&lt;!&ndash;<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>&ndash;&gt;-->
<!--</target>--> <target name="ssh" depends="jar">
<scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target> <target name="sshexec" depends="ssh">
<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target> </project>
 
 
最关键的是:
<target name="ssh" depends="jar">
<scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target> <target name="sshexec" depends="ssh">
<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target>

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

这步是上传并执行,好好研究一下这步代码。

MR清洗后入库Hbase 代码:

 

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

package demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
//import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date; /**
* Created by zzy on 15/11/17.
*/ /**
* mr 中操作hbase
*/
public class HBaseImport {
static class BatchImportMapper extends Mapper<LongWritable,Text,LongWritable,Text>{
@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
super.map(key, value, context);
String line = value.toString();
String[] splited = line.split("\t");
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
String time =sdf.format(new Date(Long.parseLong(splited[0].trim()))); String rowkey = splited[1]+"_"+time;
Text v2s = new Text();
v2s.set(rowkey+"\t" +line);
context.write(key,v2s); }
}
static class BatchImportReducer extends TableReducer<LongWritable,Text,NullWritable> {
private byte[] family = "cf".getBytes(); @Override
protected void reduce(LongWritable key, Iterable<Text> v2s, Context context) throws IOException, InterruptedException {
// super.reduce(key, values, context);
for (Text v2 :v2s){
String [] splited = v2.toString().split("\t");
String rowKey = splited[0];
Put put = new Put(rowKey.getBytes());
put.add(family,"raw".getBytes(),v2.toString().getBytes());
put.add(family,"reportTime".getBytes(),splited[1].getBytes());
put.add(family,"msisdn".getBytes(),splited[2].getBytes());
put.add(family,"apmac".getBytes(),splited[3].getBytes());
put.add(family,"acmac".getBytes(),splited[4].getBytes());
}
}
}
private static final String tableName = "logs";
private static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
Configuration conf = new Configuration();
conf.set("hbase.zookeeper.quorum","192.168.122.213:2181");
conf.set("hbase.rootdir","hdfs://192.168.122.211:9000/hbase");
conf.set(TableOutputFormat.OUTPUT_TABLE, tableName); Job job = Job.getInstance(conf,HBaseImport.class.getSimpleName()); TableMapReduceUtil.addDependencyJars(job);
job.setJarByClass(HBaseImport.class);
job.setMapperClass(BatchImportMapper.class);
job.setReducerClass(BatchImportReducer.class); job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(Text.class); job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TableOutputFormat.class); FileInputFormat.setInputPaths(job, "hdfs://192.168.122.211:9000/user/hbase");
job.waitForCompletion(true);
// FileInputFormat.setInputPaths(job,""); }
}

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

本文章如何对阁下有帮助,希望您高举贵手,支持一下。

Mac 下用IDEA时maven,ant打包 (mr 入库hbase)的更多相关文章

  1. 解决Mac下AndroidStudio内容时卡顿

    Mac下AndroidStudio在写代码的时候出现卡顿,小圆圈会一直转,此时我们应该检查下AndroidStudio的内存使用情况了. 1.点击左上角 AndroidStudio -- Prefer ...

  2. 解决mac下安装yeoman时没有权限问题

    在mac下安装yeoman经常会出现如下图错误: 解决办法:在命令行执行-- sudo chown -R $USER /usr/local/lib/node_modules 回车就OK

  3. Mac下配置Idea的Maven

    环境版本: Mac OS: 10.13.4 JDK: 1.8 Idea: 2018.3 Maven: 3.6.0 Maven 相关配置: Maven 下载: http://maven.apache.o ...

  4. 【Vegas原创】MAC下,idea手动maven jar包的方法

    1,到自己的项目目录下 Vegass-MacBook-Air:gms-boyol Vegas$ pwd/Users/Vegas/SynologyDrive/Coding/workspace/gms-b ...

  5. Mac下安装和配置Maven

    1.下载Maven 官网:http://maven.apache.org/download.cgi 下载版本:apache-maven-3.5.3-bin.tar.gz 2.配置环境变量 打开term ...

  6. mac下配置eclipse的maven环境

    转自:http://www.cnblogs.com/yqskj/archive/2013/03/30/2990292.html 1.下载maven的bin包,解压,配置到环境变量里面去 1). 首先到 ...

  7. mac下的应用程序发布 及 打包(Python写的脚本,可打包第三方库)

    其实这个问题在网上能搜到大把的解决方案.大家的统一答案都是 otool -L yourapp.app/Contents/MacOS/yourapp 根据输出信息在运行 install_name_too ...

  8. Mac下eclipse 启动时出现An error has occurred. See the log file的问题

    eclipse原来可以使用的好好的,装了多个版本的jdk后,打开eclipse出现An error has occurred. See the log file的问题,经过查找,可能原因之一是机子装了 ...

  9. Mac下通过homebrew安装maven

    1.安装Homebrew 将以下命令粘贴至终端 /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebr ...

随机推荐

  1. 使用uWSGI部署django项目

    先说说什么是uWSGI吧,他是实现了WSGI协议.uwsgi.http等协议的一个web服务器,那什么是WSGI呢? WSGI是一种Web服务器网关接口.它是一个Web服务器(如nginx)与应用服务 ...

  2. 【Beta】Daily Scrum 第一天

    [目录] 1.任务进度 2.困难及解决 3.燃尽图 4.代码check-in 5.总结 1. 任务进度 学号 已完成 接下来要完成的 612 添加计时界面返回按键事件,添加SharePreferenc ...

  3. 数论 Note.

    1. $ax+by=1 \rightarrow gcd(a,b)=1$ 2.如果一个数的后n位能被$2^n$整除,那么这个数能被$2^n$整除. 3.如果一个数的各位数之和能被3,9整除,那么这个数能 ...

  4. python的正则表达式 re-------可以在字符串前加上 r 这个前缀来避免部分疑惑,因为 r 开头的python字符串是 raw 字符串,所以里面的所有字符都不会被转义

    正则表达式使用反斜杆(\)来转义特殊字符,使其可以匹配字符本身,而不是指定其他特殊的含义.这可能会和python字面意义上的字符串转义相冲突,这也许有些令人费解.比如,要匹配一个反斜杆本身,你也许要用 ...

  5. GMap.NET使用一

    https://greatmaps.codeplex.com/releases/view/20235 从上面网站下载需要的组件dll,也可以下载源码研究,解压后有两个文件夹,如图1所示,根据不同的fr ...

  6. IO多路复用及ThreadingTCPServer源码阅读

    IO多路复用 socket模块是阻塞的,通过socket建立的服务端可以接收多个请求,但只能同时处理一个请求,其他请求都被阻塞.可以通过IO多路复用解决这个问题,socketserver内部使用的就是 ...

  7. SpringMVC处理请求流程

    SpringMVC核心处理流程: 1.DispatcherServlet前端控制器接收发过来的请求,交给HandlerMapping处理器映射器 2.HandlerMapping处理器映射器,根据请求 ...

  8. Linux学习笔记<三>

    <1>查看本机的IP地址 命令:ifconfig -a 机器的ip地址是:(inet 地址:172.16.163.57 ) <2>单独查看内存使用情况的命令:free -m 查 ...

  9. JavaScript学习笔记——对象知识点

    javascript对象的遍历.内存分布和封装特性 一.javascript对象遍历 1.javascript属性访问 对象.属性 对象[属性] //字符串格式 //javascript属性的访问方法 ...

  10. [Unity] Unity3D研究院编辑器之独立Inspector属性

    本文转自: http://www.xuanyusong.com/archives/3680雨松MOMO Unity提供了强大的Editor功能, 我们可以很轻易的在EditorGUI中绘制任意的属性. ...