java - Hadoop error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

I am trying to write a MapReduce program that computes an inverted index.

My mapper is:

public class InvertdIdxMapper extends Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable ikey, Text ivalue, Context context, Reporter reporter)
            throws IOException, InterruptedException {
        Text word = new Text();
        Text location = new Text();
        FileSplit filespilt = (FileSplit) reporter.getInputSplit();
        String fileName = filespilt.getPath().getName();
        location.set(fileName);
        String line = ivalue.toString();
        StringTokenizer itr = new StringTokenizer(line.toLowerCase());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            // System.out.println("Key is " + word + " value is " + location);
            context.write(word, location);
        }
    }
}

My reducer code is:

public class InvertedIdxReducer extends Reducer<Text, Text, Text, Text> {
    public void reduce(Text _key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        boolean first = true;
        StringBuilder toReturn = new StringBuilder();
        // process values
        Iterator<Text> itr = values.iterator();
        while (itr.hasNext()) {
            if (!first)
                toReturn.append(",");
            first = false;
            toReturn.append(itr.next().toString());
        }
        context.write(_key, new Text(toReturn.toString()));
    }
}

And my driver code is:

public class InvertedIdxDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "JobName");
        job.setJarByClass(InvertedIdxDriver.class);
        // TODO: specify a mapper
        job.setMapperClass(InvertdIdxMapper.class);
        // TODO: specify a reducer
        job.setReducerClass(InvertedIdxReducer.class);
        // TODO: specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // TODO: specify input and output DIRECTORIES (not files)
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        if (!job.waitForCompletion(true))
            return;
    }
}

When I run the above code, I get the following error:

 15/08/18 13:27:04 INFO mapreduce.Job: Task Id : attempt_1439870445298_0019_m_000000_2, Status : FAILED
Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
 at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
 at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
 at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
 at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The input to this program is a simple text file with only a few lines. I followed ... for this project and also tried what was suggested in this post, but my problem still persists. Am I missing some important MapReduce programming caveat?

Please advise.

Thank you.

2 Answers


I think you have not overridden the map method correctly, so the default map method is being called instead, which is why you get this error: the default Mapper.map is an identity function that writes the (LongWritable, Text) input pair straight through, and that LongWritable key clashes with the Text map output key your driver declares. Check that the signature of your map method is correct. I think it should look like this:

protected void map(LongWritable iKey, Text iValue, Context context) throws IOException, InterruptedException

You also need to replace this line:

FileSplit filespilt=(FileSplit)reporter.getInputSplit();

with:

FileSplit filespilt=(FileSplit)context.getInputSplit();
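Putting both fixes together, here is a minimal sketch of what the corrected mapper could look like. It assumes the new-API classes your driver already uses (org.apache.hadoop.mapreduce.Mapper, and org.apache.hadoop.mapreduce.lib.input.FileSplit for the split produced by the default TextInputFormat); the class and variable names are kept from your original code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class InvertdIdxMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override  // the compiler now rejects any signature that does not actually override map
    protected void map(LongWritable ikey, Text ivalue, Context context)
            throws IOException, InterruptedException {
        Text word = new Text();
        Text location = new Text();
        // the new API exposes the input split through the Context, not a Reporter
        FileSplit fileSplit = (FileSplit) context.getInputSplit();
        location.set(fileSplit.getPath().getName());
        StringTokenizer itr = new StringTokenizer(ivalue.toString().toLowerCase());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, location);  // key and value are both Text, matching the driver
        }
    }
}

The @Override annotation is worth keeping: if the signature ever drifts from what Mapper declares, the build fails instead of silently falling back to the identity map.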
...