Hadoop MapReduce
Published: 2019-06-13


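Before you start:

Make sure the Hadoop version and the versions of all related jars are consistent; otherwise you may run into all kinds of baffling errors!

Maven dependencies: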

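<project>
    <modelVersion>4.0.0</modelVersion>
    <packaging>jar</packaging>
    <groupId>BaseTecLearn</groupId>
    <artifactId>BaseTecLearn</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.thrift</groupId>
            <artifactId>libthrift</artifactId>
            <version>0.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.7.4</version>
        </dependency>
    </dependencies>
</project>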
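
The advice about keeping versions consistent can be checked mechanically. The following is only a minimal sketch in plain Java: the class name `HadoopVersionCheck` and the method `distinctVersions` are hypothetical names made up for illustration, and the artifact versions are simply copied by hand from the dependency list in this post. Note that the list declares hadoop-common 2.7.1 alongside hadoop-mapreduce-client-core 2.7.4; the post does not say whether this particular mix caused problems, but it is the kind of mismatch the warning is about.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: not part of Hadoop or Maven.
class HadoopVersionCheck {

    // Returns the distinct version strings declared across the given artifacts.
    static Set<String> distinctVersions(Map<String, String> versionsByArtifact) {
        return new TreeSet<>(versionsByArtifact.values());
    }

    public static void main(String[] args) {
        // Hadoop artifacts and versions, copied by hand from the pom in this post.
        Map<String, String> hadoop = new LinkedHashMap<>();
        hadoop.put("hadoop-common", "2.7.1");
        hadoop.put("hadoop-mapreduce-client-core", "2.7.4");

        Set<String> versions = distinctVersions(hadoop);
        if (versions.size() > 1) {
            System.out.println("Mixed Hadoop versions: " + versions);
        } else {
            System.out.println("Hadoop versions are consistent: " + versions);
        }
    }
}
```

In a real project the usual way to avoid this is to declare the Hadoop version once as a Maven property and reference it from every Hadoop dependency.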

 

Configure logging in the resources directory (this is important: it lets you see warnings and similar messages):

log4j.rootLogger=WARN,stdout,logfile
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=hadoop.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n

 

package top.letsgogo;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line on whitespace and emits (word, 1) for every token.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums all counts received for one word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input file and output directory are hard-coded local paths; the output directory must not exist yet.
        FileInputFormat.addInputPath(job, new Path("/home/panteng/IdeaProjects/pushscore-sdk/baseTecLearn/target/classes/regular.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/home/panteng/IdeaProjects/pushscore-sdk/baseTecLearn/target/classes/regular"));

        // Prints true if the job succeeded, false otherwise.
        System.out.println(job.waitForCompletion(true));
        //System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
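
To see what the job computes without setting up Hadoop, the same logic can be mimicked in plain Java. This is only a rough sketch, not how Hadoop executes the job: the class `LocalWordCount` and its methods are hypothetical names invented here, everything runs in a single thread in memory, and the sample input lines are made up. The `map` method plays the role of `TokenizerMapper`, and `reduce` stands in for the shuffle plus `IntSumReducer`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory imitation of the WordCount job; uses no Hadoop classes.
class LocalWordCount {

    // "Map" phase: split each line on whitespace and emit one (word, 1) pair per token.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" phases: group the pairs by word and sum their counts.
    // A TreeMap keeps the words sorted, similar to the sorted keys a reducer sees.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    static Map<String, Integer> countWords(List<String> lines) {
        return reduce(map(lines));
    }

    public static void main(String[] args) {
        // Made-up sample input, standing in for the contents of the input file.
        List<String> lines = Arrays.asList("hello hadoop", "hello mapreduce");
        // Print one "word<TAB>count" line per word.
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

For the sample input this yields hadoop 1, hello 2 and mapreduce 1. Because summing is associative and commutative, the same reducer class can safely double as the combiner in the real job, which is what `job.setCombinerClass(IntSumReducer.class)` relies on.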

 

Reposted from: https://www.cnblogs.com/tengpan-cn/p/7553495.html
