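I have grown very fond of IDEA. The Eclipse I used on the Mac kept throwing errors for no apparent reason, so I switched to IDEA. A new tool has a learning curve, though, and my current project ran into quite a few problems, which I list below:

Background

In Hadoop development, I often load cleaned data into HBase during the MR stage. Doing so means compiling, building a jar, uploading it to the server, and running the hadoop jar *.jar command: four manual steps after every change to the cleaning job. Being naturally fond of shortcuts, I have spent the past few days working out how to simplify this process.

Approach

1. My earlier projects automated packaging and upload with Ant, but they were developed in Eclipse on Windows. In IDEA on the Mac, the same approach failed repeatedly, always with the following error: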

Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:287)
at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:240)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:317)
at java.util.jar.JarVerifier.update(JarVerifier.java:228)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:348)
at java.util.jar.JarFile.getInputStream(JarFile.java:415)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:101)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)


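I searched online at length, but none of the suggested fixes worked. The cause is a problem with the signatures of the jars the project references. (Exporting the jar from Eclipse never produced this error.)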
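One common way this error arises is that signature files (META-INF/*.SF, *.DSA, *.RSA) from signed dependency jars end up inside a merged jar whose contents no longer match those signatures. The sketch below is not from the original setup; SignatureFileScanner is a hypothetical helper that lists such entries in a jar so you can check whether this is what happened. It assumes the jar path is passed as the first argument.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: lists signature-related entries inside a jar. */
final class SignatureFileScanner {

    /** True for signature files that sit directly under META-INF/. */
    static boolean isSignatureEntry(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        String prefix = "META-INF/";
        if (!upper.startsWith(prefix)) {
            return false;
        }
        // Ignore entries in sub-directories of META-INF.
        if (upper.indexOf('/', prefix.length()) >= 0) {
            return false;
        }
        return upper.endsWith(".SF") || upper.endsWith(".DSA")
                || upper.endsWith(".RSA") || upper.endsWith(".EC");
    }

    /** Returns the names of all signature entries found in the given jar. */
    static List<String> findSignatureEntries(String jarPath) throws IOException {
        List<String> found = new ArrayList<>();
        // verify=false so that a jar with a broken signature can still be opened.
        try (JarFile jar = new JarFile(new File(jarPath), false)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isSignatureEntry(name)) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        for (String name : findSignatureEntries(args[0])) {
            System.out.println(name);
        }
    }
}
```

If the scan prints any entries for the jar passed to hadoop jar, leftover signature files are a likely culprit, and excluding them when the jar is assembled is the usual remedy.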

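2. Exporting the jar from IDEA. The steps are:
File -> Project Structure

A dialog then appears; select the Manifest File and the Main Class in it.

After that, Build -> Build Artifacts compiles and packages the project. The jar ends up under the project's out directory.

However, a jar built this way fails at runtime, because the HBase lib jars are not included in it. One clumsy fix is to copy the HBase lib jars into the lib directory on every Hadoop node. That approach has drawbacks, so I did not choose it. How to solve the problem, then? I built the jar with Maven, uploaded it, ran it, and it succeeded.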

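Final solution:

Maven packages the project correctly, so how do we automate the upload and the run? I package with Maven and use Ant to upload and execute.

Contents of the Maven pom file: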

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>groupId</groupId>
    <artifactId>antdemo</artifactId>
    <version>1</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>com.jcraft</groupId>
            <artifactId>jsch</artifactId>
            <version>0.1.51</version>
        </dependency>
        <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant-jsch</artifactId>
            <version>1.9.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <!-- Replace with the class that contains the jar's main method -->
                            <mainClass>HBaseImport</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>package</phase> <!-- merge the jars during the package phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>


Contents of the Ant build.xml:

<?xml version="1.0" encoding="UTF-8"?>

<project name="project-name" basedir="." default="sshexec">
    <description>This build file lets Ant compile the project, run unit tests automatically, package, and deploy.</description>
    <description>The default action (command: ant) compiles the sources, then deploys and runs them.</description>

    <!-- Property settings -->
    <property environment="env" />
    <!--<property file="build.xml" />-->
    <property name="build.dir" location="build.xml"/>
    <property name="src.dir" value="${basedir}/src" />
    <!--<property name="java.lib.dir" location="lib"/>-->
    <property name="java.lib.dir" value="/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home/lib" />
    <property name="classes.dir" value="${basedir}/classes" />
    <property name="dist.dir" value="${basedir}/dist" />
    <property name="third.lib.dir" value="/Users/zzy/Kclouds/01_mac/doc/workspace/other_workspace/hbase-0.98.8-hadoop2/lib" />
    <!--<property name="third.lib.dir" value="{basedir}/lib" />-->
    <property name="localpath.dir" value="${basedir}" />
    <property name="remote.host" value="192.168.122.211"/>
    <property name="remote.username" value="root"/>
    <property name="remote.password" value="lg"/>
    <property name="remote.home" value="/test"/>
    <!-- Set the main class to run here -->
    <property name="main.class" value="HBaseImport"/>

    <!-- Compile classpath -->
    <path id="compile.classpath">
        <fileset dir="${java.lib.dir}">
            <include name="tools.jar" />
        </fileset>
        <fileset dir="${third.lib.dir}">
            <include name="*.jar"/>
        </fileset>
    </path>

    <!-- Runtime classpath -->
    <path id="run.classpath">
        <path refid="compile.classpath" />
        <pathelement location="${classes.dir}" />
    </path>

    <!-- Clean up: delete the temporary directories -->
    <target name="clean" description="Clean up: delete the temporary directories">
        <!--delete dir="${build.dir}" /-->
        <delete dir="${dist.dir}" />
        <delete dir="${classes.dir}" />
        <echo level="info">Clean finished</echo>
    </target>

    <!-- Initialize: create directories, copy files -->
    <target name="init" depends="clean" description="Initialize: create directories, copy files">
        <mkdir dir="${classes.dir}" />
        <mkdir dir="${dist.dir}" />
    </target>

    <!-- Compile the sources -->
    <target name="compile" depends="init" description="Compile the sources">
        <javac srcdir="${src.dir}" destdir="${classes.dir}" source="1.7" target="1.7" includeAntRuntime="false" debug="false" verbose="false">
            <compilerarg line="-encoding UTF-8 "/>
            <classpath refid="compile.classpath" />
        </javac>
    </target>

    <!-- Package the class files -->
    <target name="jar" depends="compile" description="Package the class files">
        <jar jarfile="${dist.dir}/jar.jar">
            <fileset dir="${classes.dir}" includes="**/*.*" />
        </jar>
    </target>

    <!-- Upload to the server.
         ** jsch-0.1.51 from the lib directory must be copied to $ANT_HOME/lib. When using the Ant environment inside Eclipse,
         jsch-0.1.51 must be added under Window->Preferences->Ant->Runtime->Classpath.
    -->
    <!-- Earlier version, which uploaded and ran the jar built by Ant: -->
    <!--<target name="ssh" depends="jar">-->
    <!--<scp file="${dist.dir}/jar.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>-->
    <!--</target>-->
    <!--<target name="sshexec" depends="ssh">-->
    <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/jar.jar ${main.class}"/>-->
    <!--</target>-->

    <!-- Current version: upload the jar built by Maven -->
    <target name="ssh" depends="jar">
        <scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
    </target>

    <!-- Run the uploaded jar on the server -->
    <target name="sshexec" depends="ssh">
        <sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
        <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
    </target>
</project>
 
 
The key part is:
<target name="ssh" depends="jar">
    <scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target>
<target name="sshexec" depends="ssh">
    <sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
    <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target>


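This step uploads the jar and executes it: the scp task copies the Maven-built jar to the server, and the sshexec task then runs hadoop jar on it remotely. It is worth studying closely.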
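To make the two Ant tasks concrete, here is a minimal sketch of roughly equivalent command lines. DeployCommands is a hypothetical helper, not part of the original build: it only assembles the argument lists and does not execute anything. It also assumes the command-line scp and ssh tools as stand-ins, whereas the Ant tasks go through JSch and authenticate with the password property.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds command lines that mirror the Ant scp and sshexec targets. */
final class DeployCommands {

    /** Mirrors the scp task: copy the local jar to user@host:remoteDir. */
    static List<String> scpCommand(String localJar, String user, String host, String remoteDir) {
        return Arrays.asList("scp", localJar, user + "@" + host + ":" + remoteDir);
    }

    /** Mirrors the sshexec task: load the profile, then run the jar with hadoop. */
    static List<String> sshCommand(String user, String host, String remoteDir,
                                   String jarName, String mainClass) {
        String remoteCommand = "source /etc/profile;hadoop jar "
                + remoteDir + "/" + jarName + " " + mainClass;
        return Arrays.asList("ssh", user + "@" + host, remoteCommand);
    }

    public static void main(String[] args) {
        String jarName = "antdemo-1-jar-with-dependencies.jar";
        // Values taken from the properties in build.xml.
        System.out.println(String.join(" ",
                scpCommand("target/" + jarName, "root", "192.168.122.211", "/test")));
        System.out.println(String.join(" ",
                sshCommand("root", "192.168.122.211", "/test", jarName, "HBaseImport")));
    }
}
```

The source /etc/profile prefix matters because a non-interactive SSH session may not have the hadoop command on its PATH otherwise.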

Code for the MR job that cleans the data and loads it into HBase:

 


// Note: with this package declaration the fully qualified main class is demo.HBaseImport.
package demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Created by zzy on 15/11/17.
 *
 * Writes to HBase from an MR job.
 */
public class HBaseImport {

    static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Do not call super.map(): the default implementation would also emit the raw (key, value) pair.
            String line = value.toString();
            String[] splited = line.split("\t");
            // Field 0 is a timestamp in milliseconds; field 1 becomes the row key prefix.
            SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
            String time = sdf.format(new Date(Long.parseLong(splited[0].trim())));
            String rowkey = splited[1] + "_" + time;
            // Prepend the row key so the reducer can read it back as the first field.
            Text v2s = new Text();
            v2s.set(rowkey + "\t" + line);
            context.write(key, v2s);
        }
    }

    static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
        private final byte[] family = Bytes.toBytes("cf");

        @Override
        protected void reduce(LongWritable key, Iterable<Text> v2s, Context context) throws IOException, InterruptedException {
            for (Text v2 : v2s) {
                String[] splited = v2.toString().split("\t");
                String rowKey = splited[0];
                Put put = new Put(Bytes.toBytes(rowKey));
                put.add(family, Bytes.toBytes("raw"), Bytes.toBytes(v2.toString()));
                put.add(family, Bytes.toBytes("reportTime"), Bytes.toBytes(splited[1]));
                put.add(family, Bytes.toBytes("msisdn"), Bytes.toBytes(splited[2]));
                put.add(family, Bytes.toBytes("apmac"), Bytes.toBytes(splited[3]));
                put.add(family, Bytes.toBytes("acmac"), Bytes.toBytes(splited[4]));
                // Emit the Put; without this call nothing is written to the table.
                context.write(NullWritable.get(), put);
            }
        }
    }

    private static final String tableName = "logs";

    // main must be public, otherwise the JVM cannot launch the class.
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("hbase.zookeeper.quorum", "192.168.122.213:2181");
        conf.set("hbase.rootdir", "hdfs://192.168.122.211:9000/hbase");
        conf.set(TableOutputFormat.OUTPUT_TABLE, tableName);

        Job job = Job.getInstance(conf, HBaseImport.class.getSimpleName());
        TableMapReduceUtil.addDependencyJars(job);
        job.setJarByClass(HBaseImport.class);
        job.setMapperClass(BatchImportMapper.class);
        job.setReducerClass(BatchImportReducer.class);

        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TableOutputFormat.class);

        FileInputFormat.setInputPaths(job, "hdfs://192.168.122.211:9000/user/hbase");
        job.waitForCompletion(true);
    }
}
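The row key built in the mapper is the second field of the input line, an underscore, and the first field (a millisecond timestamp) formatted as yyyyMMddHHmmss. The sketch below isolates that step so it can be tried without a cluster. RowKeyDemo and the sample line are hypothetical, and the formatter is pinned to UTC here so the result is reproducible; the job itself uses the JVM's default time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical stand-alone version of the mapper's row key construction. */
final class RowKeyDemo {

    /** Builds "<field 1>_<field 0 formatted as yyyyMMddHHmmss>" from a tab-separated line. */
    static String buildRowKey(String line) {
        String[] fields = line.split("\t");
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
        // Assumption for this sketch only: fix the time zone so the output does not depend on the machine.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        String time = sdf.format(new Date(Long.parseLong(fields[0].trim())));
        return fields[1] + "_" + time;
    }

    public static void main(String[] args) {
        // Made-up sample record: timestamp, msisdn, apmac, acmac.
        String sample = "1447718400000\t13800000000\taa:bb:cc:dd:ee:ff\t11:22:33:44:55:66";
        System.out.println(buildRowKey(sample));
    }
}
```

Putting the msisdn first groups all records of one subscriber together in HBase, with the timestamp suffix ordering them chronologically within that group.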


If this article helped you, I hope you will kindly show some support.
