今天,在做canopy算法实例时,遇到这个问题,所以记录下来。下面是源码:

  

package czx.com.mahout;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.common.AbstractJob;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.VectorWritable; public class TextVecWrite extends AbstractJob { public static void main(String[] args) throws Exception {
ToolRunner.run(new Configuration(), new TextVecWrite(), args);
} /**
* TextVecWriterMapper
* @author czx
*;
*/
public static class TextVecWriterMapper extends Mapper<LongWritable, Text, LongWritable,VectorWritable >{
@SuppressWarnings("unchecked")
@Override
protected void map(LongWritable key, Text value,
@SuppressWarnings("rawtypes") org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException, InterruptedException {
String[] split = value.toString().split("\\s{1,}");
RandomAccessSparseVector vector = new RandomAccessSparseVector(split.length);
for(int i=0;i<split.length;++i){
vector.set(i, Double.parseDouble(split[i]));
}
VectorWritable vectorWritable = new VectorWritable(vector);
context.write(key, vectorWritable);
}
} /**
* TextVectorWritableReducer
* @author czx
*
*/
public static class TextVectorWritableReducer extends Reducer<LongWritable, VectorWritable, LongWritable , VectorWritable >{
@Override
protected void reduce(LongWritable arg0, Iterable<VectorWritable> arg1,
Context arg2)
throws IOException, InterruptedException {
for(VectorWritable v:arg1){
arg2.write(arg0, v);
}
}
} @Override
public int run(String[] arg0) throws Exception {
addInputOption();
addOutputOption();
if(parseArguments(arg0)==null){
return -1;
}
Path input = getInputPath();
Path output = getOutputPath();
Configuration conf = getConf(); Job job = new Job(conf,"textvectorWritable with input:"+input.getName());
job.setOutputFormatClass(SequenceFileOutputFormat.class);
job.setMapperClass(TextVecWriterMapper.class);
job.setReducerClass(TextVectorWritableReducer.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setOutputValueClass(VectorWritable.class);
job.setOutputKeyClass(LongWritable.class);
job.setOutputValueClass(VectorWritable.class);
job.setJarByClass(TextVecWrite.class); FileInputFormat.addInputPath(job, input);
SequenceFileOutputFormat.setOutputPath(job, output); if(!job.waitForCompletion(true)){
throw new InterruptedException("Canopy Job failed processing "+input);
}
return 0;
}
}

  将程序编译打包成JAR并运行如下:

  

 hadoop jar ClusteringUtils.jar czx.com.mahout.TextVecWrite -i /user/hadoop/testdata/synthetic_control.data -o /home/czx/1

  但出现如下错误:

  

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli2/Option
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli2.Option
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 3 more

  最后,发现是将mahout根目录下的相应的jar包复制到hadoop-2.4.1/share/hadoop/common/lib文件夹下时,少复制了mahout-core-0.9-job.jar,于是复制mahout-core-0.9-job.jar后,重新启动hadoop即可。

  

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli2/Option的更多相关文章

  1. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory

    学习架构探险,从零开始写Java Web框架时,在学习到springAOP时遇到一个异常: "C:\Program Files\Java\jdk1.7.0_40\bin\java" ...

  2. MyEclipse8.5集成Tomcat7时的启动错误:Exception in thread “main” java.lang.NoClassDefFoundError org/apache/commons/logging/LogFactory

    今天,安装Tomcat7.0.21后,单独用D:\apache-tomcat-7.0.21\bin\startup.bat启动web服务正常.但在MyEclipse8.5中集成配置Tomcat7后,在 ...

  3. MyEclipse8.5集成Tomcat7时的启动错误:Exception in thread “main” java.lang.NoClassDefFoundError org/apache/commons/logging/LogFactory

    今天,安装Tomcat7.0.21后,单独用D:\apache-tomcat-7.0.21\bin\startup.bat启动web服务正常.但 在MyEclipse8.5中集成配置Tomcat7后, ...

  4. 【转】MyEclipse8.5集成Tomcat7时的启动错误:Exception in thread “main” java.lang.NoClassDefFoundError org/apache/commons/logging/LogFactory

    http://www.cnblogs.com/newsouls/p/4021198.html 今天,安装Tomcat7.0.21后,单独用D:\apache-tomcat-7.0.21\bin\sta ...

  5. shiro报错SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".和Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory

    未能加载类"org.slf4j.impl.StaticLoggerBinder" 解决方案: <dependency> <groupId>org.slf4j ...

  6. spark-shell报错:Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream

    环境: openSUSE42.2 hadoop2.6.0-cdh5.10.0 spark1.6.0-cdh5.10.0 按照网上的spark安装教程安装完之后,启动spark-shell,出现如下报错 ...

  7. 报错:Exception in thread "main" java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/FileSystem

    报错现象: Exception in thread "main" java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/Fil ...

  8. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/CanUnbuffer

    在执行spark on hive 的时候在  sql.show()处报错 : Exception in thread "main" java.lang.NoClassDefFoun ...

  9. Exception in thread main java.lang.NoClassDefFoundError: org/apache/juli/logging/LogFacto

    报错: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/juli/logging/Log ...

随机推荐

  1. java常用类与包装类--包装类

    2.基本数据类型数据的包装类 局部变量中基本数据类型直接分配在栈中,而对象分配在堆中 将基本数据类型封装成对象的好处在于可以在对象中定义更多的功能方法来操作该数据 包装类主要功能:用于基本数据类型与字 ...

  2. thinkphp之session操作

    原理机制 配置部分 代码部分 助手函数 借助第三方介质存入session 从负载均衡角度考虑----最好放在memocache,redis

  3. 在UIScrollView、UICollectionView和UITableView中添加UIRefreshControl实现下拉刷新

    Apple在iOS 6中添加了UIRefreshControl,但只能在UITableViewController中使用,不能在UIScrollView和UICollectionView中使用. 从i ...

  4. maven项目使用自己创建的jar包--maven without test code

    eclipse版本为2018-12(4.10.0) 1.创建一个jar包 首先自己建立了一个maven project,名为jweb.GAV坐标: <groupId>amberai< ...

  5. 可持久化Trie模板

    如果你了解过 01 Trie 和 可持久化线段树(例如 : 主席树 ).那么就比较好去可持久化 Trie 可持久化 Trie 当 01 Trie 用的时候能很方便解决一些原本 01 Trie 不能解决 ...

  6. sh_08_打印小星星

    sh_08_打印小星星 # 在控制台连续输出五行 *,每一行星号的数量依次递增 # * # ** # *** # **** # ***** # 1. 定义一个计数器变量,从数字1开始,循环会比较方便 ...

  7. 文件的读写过程open read write close

    在python中,读写文件有3个步骤: 调用open()函数,返回一个File对象. 调用File对象的read()或write()方法. 调用File对象的close()方法,关闭该文件. 在读取或 ...

  8. [CSP-S模拟测试]:最大异或和(数学)

    题目传送门(内部题81) 输入格式 第一行一个整数$T(T\leqslant 20)$,表示测试数据组数 接下来$T$组,对于每一组,第一行一个整数$n$ 第二行有$n$个整数,为$w_1,w_2.. ...

  9. C# DataTable 增加行与列

    亲测有用的方法 DataTable AllInfos = new DataTable();//生成一个表格 DataColumn typeColumn = new DataColumn();//建一个 ...

  10. nginx请求转发配置

    以下为无ssl证书配置的请求转发 server { listen ; server_name api.****.com; #以下为指定请求域名匹配到某一个端口 #location ~* /union ...