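I have grown very fond of IDEA. I used to use Eclipse on the Mac, where errors kept appearing for no apparent reason, so I switched to IDEA. A new tool has a learning cost, though, and I ran into quite a few problems on my current project. They are listed below: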

Background

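In Hadoop development, the MR stage often loads cleaned data into HBase. Doing so means compiling, building a jar, uploading it to the server, and then running the hadoop jar *.jar command. Every cleaning run takes these four manual steps. Being naturally inclined to look for shortcuts, I have spent the last few days working out how to simplify the process.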

Approach

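1. My earlier projects used Ant for automated packaging and upload, but they were developed in Eclipse on Windows. In IDEA on the Mac the same setup failed repeatedly, always with the following error: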

Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:287)
at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:240)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:317)
at java.util.jar.JarVerifier.update(JarVerifier.java:228)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:348)
at java.util.jar.JarFile.getInputStream(JarFile.java:415)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:101)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

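I searched through a lot of material online, but none of the suggested fixes worked. The cause is a problem with the signatures of the jars that the project references. (Exporting the jar from Eclipse never produced this error.)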
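A common way to end up with this SecurityException is a merged jar that still carries signature files (META-INF/*.SF, *.DSA, *.RSA) copied from a signed dependency. The digests in those files no longer match the merged manifest. The following is a minimal diagnostic sketch, not the fix used in this post: it only lists such entries in a given jar. The class name SignatureFileCheck and its method names are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists signature-related entries inside a jar.
class SignatureFileCheck {

    // True for entries directly under META-INF/ ending in .SF, .DSA or .RSA.
    static boolean isSignatureEntry(String entryName) {
        String upper = entryName.toUpperCase(Locale.ROOT);
        String prefix = "META-INF/";
        if (!upper.startsWith(prefix)) {
            return false;
        }
        String rest = upper.substring(prefix.length());
        if (rest.contains("/")) {
            return false; // nested directories are not signature files
        }
        return rest.endsWith(".SF") || rest.endsWith(".DSA") || rest.endsWith(".RSA");
    }

    // Opens the jar as a zip archive and collects the matching entry names.
    static List<String> findSignatureEntries(String jarPath) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isSignatureEntry(name)) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SignatureFileCheck <path-to-jar>");
            return;
        }
        for (String name : findSignatureEntries(args[0])) {
            System.out.println(name);
        }
    }
}
```

If the tool prints any entries for the jar passed to hadoop jar, leftover signature files are a likely suspect. Whether removing them is enough for a particular build is something to verify case by case.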

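2. Exporting the jar from IDEA. The steps are:
File -> Project Structure

In the dialog that appears, choose the Manifest File and the Main Class.

Then run Build -> Build Artifacts to compile and package. The jar is written to the project's out directory.

A jar packaged this way fails at run time, however, because the HBase lib jars are not included. One clumsy workaround is to copy HBase's lib jars into the lib directory on every Hadoop node. That approach has drawbacks, so I did not use it. What worked instead: I packaged with Maven, uploaded the jar, and ran it successfully.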

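Final solution:

Maven packages the jar correctly, so how do we automate the upload and the run? I package with Maven, then upload and execute with Ant.

Contents of the Maven pom file: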

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>groupId</groupId>
    <artifactId>antdemo</artifactId>
    <version>1</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>0.98.15-hadoop2</version>
        </dependency>
        <dependency>
            <groupId>com.jcraft</groupId>
            <artifactId>jsch</artifactId>
            <version>0.1.51</version>
        </dependency>
        <dependency>
            <groupId>org.apache.ant</groupId>
            <artifactId>ant-jsch</artifactId>
            <version>1.9.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <!-- Replace with the class that contains the jar's main method
                                 (use the fully qualified name if the class is in a package) -->
                            <mainClass>HBaseImport</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>package</phase> <!-- merge the jars during the package phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Contents of the Ant build.xml:

<?xml version="1.0" encoding="UTF-8"?>

<project name="project-name" basedir="." default="sshexec">
    <description>This build file lets Ant compile the project, run unit tests automatically, package, and deploy.</description>
    <description>The default action (command: ant) compiles the sources, then deploys and runs them.</description>

    <!-- Property settings -->
    <property environment="env" />
    <!--<property file="build.xml" />-->
    <property name="build.dir" location="build.xml"/>
    <property name="src.dir" value="${basedir}/src" />
    <!--<property name="java.lib.dir" location="lib"/>-->
    <property name="java.lib.dir" value="/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home/lib" />
    <property name="classes.dir" value="${basedir}/classes" />
    <property name="dist.dir" value="${basedir}/dist" />
    <property name="third.lib.dir" value="/Users/zzy/Kclouds/01_mac/doc/workspace/other_workspace/hbase-0.98.8-hadoop2/lib" />
    <!--<property name="third.lib.dir" value="{basedir}/lib" />-->
    <property name="localpath.dir" value="${basedir}" />
    <property name="remote.host" value="192.168.122.211"/>
    <property name="remote.username" value="root"/>
    <property name="remote.password" value="lg"/>
    <property name="remote.home" value="/test"/>
    <!-- Set the main class to run here each time -->
    <property name="main.class" value="HBaseImport"/>

    <!-- Basic compile classpath -->
    <path id="compile.classpath">
        <fileset dir="${java.lib.dir}">
            <include name="tools.jar" />
        </fileset>
        <fileset dir="${third.lib.dir}">
            <include name="*.jar"/>
        </fileset>
    </path>

    <!-- Run classpath -->
    <path id="run.classpath">
        <path refid="compile.classpath" />
        <pathelement location="${classes.dir}" />
    </path>

    <!-- Clean: delete temporary directories -->
    <target name="clean" description="Clean: delete temporary directories">
        <!--delete dir="${build.dir}" /-->
        <delete dir="${dist.dir}" />
        <delete dir="${classes.dir}" />
        <echo level="info">Clean finished</echo>
    </target>

    <!-- Init: create directories, copy files -->
    <target name="init" depends="clean" description="Init: create directories, copy files">
        <mkdir dir="${classes.dir}" />
        <mkdir dir="${dist.dir}" />
    </target>

    <!-- Compile the sources -->
    <target name="compile" depends="init" description="Compile the sources">
        <javac srcdir="${src.dir}" destdir="${classes.dir}" source="1.7" target="1.7" includeAntRuntime="false" debug="false" verbose="false">
            <compilerarg line="-encoding UTF-8 "/>
            <classpath refid="compile.classpath" />
        </javac>
    </target>

    <!-- Package the class files -->
    <target name="jar" depends="compile" description="Package the class files">
        <jar jarfile="${dist.dir}/jar.jar">
            <fileset dir="${classes.dir}" includes="**/*.*" />
        </jar>
    </target>

    <!-- Upload to the server.
         ** Copy jsch-0.1.51 from the lib directory into $ANT_HOME/lib. When using Ant inside Eclipse,
         jsch-0.1.51 must be added under Window->Preferences->Ant->Runtime->Classpath.
    -->
    <!--<target name="ssh" depends="jar">-->
    <!--<scp file="${dist.dir}/jar.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>-->
    <!--</target>-->
    <!-- -->
    <!--<target name="sshexec" depends="ssh">-->
    <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/jar.jar ${main.class}"/>-->
    <!--&lt;!&ndash;<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>&ndash;&gt;-->
    <!--</target>-->

    <!-- Upload the jar built by Maven (target/...), not dist/jar.jar -->
    <target name="ssh" depends="jar">
        <scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
    </target>

    <!-- Run the uploaded jar on the remote host -->
    <target name="sshexec" depends="ssh">
        <sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
        <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
    </target>
</project>
 
 
The key part is:
<target name="ssh" depends="jar">
    <scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target>
<target name="sshexec" depends="ssh">
    <sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
    <!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target>

This step uploads the jar and runs it. It is worth studying closely.
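Because the jar is built by Maven and then uploaded and run by Ant, the two tools have to be invoked in order. A minimal sketch of chaining them from Java follows. It assumes that mvn and ant are on the PATH and that it is started from the project directory. The class name BuildAndDeploy is made up for illustration; the sshexec target is the one defined in the build.xml above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver: runs "mvn package" and then "ant sshexec", stopping at the first failure.
class BuildAndDeploy {

    // The two commands, in the order they must run.
    static List<List<String>> steps() {
        return Arrays.asList(
                Arrays.asList("mvn", "package"),   // builds target/antdemo-1-jar-with-dependencies.jar
                Arrays.asList("ant", "sshexec"));  // uploads the jar and runs it remotely
    }

    // Runs one command in the current directory and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.inheritIO(); // show the tool's output in this console
        return builder.start().waitFor();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        for (List<String> step : steps()) {
            int exitCode = run(step);
            if (exitCode != 0) {
                System.err.println("Step failed: " + step);
                System.exit(exitCode);
            }
        }
    }
}
```

A two-line shell script would do the same job. The point is only that the Maven step has to finish successfully before the Ant targets run.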

MR code that loads the cleaned data into HBase:

package demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Created by zzy on 15/11/17.
 *
 * Writes to HBase from an MR job.
 */
public class HBaseImport {

    static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Do not call super.map(): the default implementation would also emit the raw input record.
            String line = value.toString();
            String[] splited = line.split("\t");
            SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
            String time = sdf.format(new Date(Long.parseLong(splited[0].trim())));

            // Row key: second field + "_" + formatted timestamp
            String rowkey = splited[1] + "_" + time;
            Text v2s = new Text();
            v2s.set(rowkey + "\t" + line);
            context.write(key, v2s);
        }
    }

    static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
        private byte[] family = "cf".getBytes();

        @Override
        protected void reduce(LongWritable key, Iterable<Text> v2s, Context context) throws IOException, InterruptedException {
            for (Text v2 : v2s) {
                String[] splited = v2.toString().split("\t");
                String rowKey = splited[0];
                Put put = new Put(rowKey.getBytes());
                put.add(family, "raw".getBytes(), v2.toString().getBytes());
                put.add(family, "reportTime".getBytes(), splited[1].getBytes());
                put.add(family, "msisdn".getBytes(), splited[2].getBytes());
                put.add(family, "apmac".getBytes(), splited[3].getBytes());
                put.add(family, "acmac".getBytes(), splited[4].getBytes());
                // Emit the Put; without this write nothing reaches the table.
                context.write(NullWritable.get(), put);
            }
        }
    }

    private static final String tableName = "logs";

    // main must be public, otherwise the JVM cannot launch the class.
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("hbase.zookeeper.quorum", "192.168.122.213:2181");
        conf.set("hbase.rootdir", "hdfs://192.168.122.211:9000/hbase");
        conf.set(TableOutputFormat.OUTPUT_TABLE, tableName);

        Job job = Job.getInstance(conf, HBaseImport.class.getSimpleName());
        TableMapReduceUtil.addDependencyJars(job);
        job.setJarByClass(HBaseImport.class);
        job.setMapperClass(BatchImportMapper.class);
        job.setReducerClass(BatchImportReducer.class);

        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TableOutputFormat.class);

        FileInputFormat.setInputPaths(job, "hdfs://192.168.122.211:9000/user/hbase");
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
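The row key logic in the mapper can be tried without a cluster. The sketch below repeats only that step: the first tab-separated field is parsed as epoch milliseconds and formatted as yyyyMMddHHmmss, then appended to the second field. The class name RowKeySketch and the sample line are invented for illustration. The time zone is fixed to UTC here so that the result is reproducible, which is an assumption; the mapper above uses the JVM default time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-alone version of the mapper's row key construction.
class RowKeySketch {

    // Builds "<second field>_<yyyyMMddHHmmss>" from a tab-separated line
    // whose first field is a timestamp in epoch milliseconds.
    static String buildRowKey(String line) {
        String[] fields = line.split("\t");
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
        sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone for reproducibility
        String time = sdf.format(new Date(Long.parseLong(fields[0].trim())));
        return fields[1] + "_" + time;
    }

    public static void main(String[] args) {
        // Invented sample record: timestamp, msisdn, apmac, acmac
        String line = "1447718400000\t13800000000\tap-01\tac-01";
        System.out.println(buildRowKey(line)); // prints 13800000000_20151117000000
    }
}
```

Putting the msisdn first means all rows for one number sort together in HBase, ordered by time within that number.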

If this article helped you, please show your support.
