Hadoop java.lang.ArrayIndexOutOfBoundsException: 3

2015-11-12T12:30:04

The input is a list of house data where each record contains information about a single house: (address,city,state,zip,value). The five fields in each record are delimited by commas (,). The output should be the average house value for each zip code.
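For example, a record might look like this (these values are made up for illustration):

    10 Main St,Springfield,IL,62704,250000
    22 Oak Ave,Springfield,IL,62704,310000

For those two records the expected output would be 62704 with an average value of 280000.0. Here is my current code: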

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class ziphousevalue1 {

    public static class ZipHouseValueMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final Text zip = new Text();
        private static final IntWritable value = new IntWritable();

        protected void map(LongWritable offset, Text line, Context context) throws IOException, InterruptedException {
            String[] tokens = value.toString().split(",");
            zip.set(tokens[3]);
            value.set(Integer.parseInt(tokens[4]));
            context.write(new Text(zip), value);
        }
    }

    public static class ZipHouseValueReducer extends Reducer<Text, IntWritable, Text, DoubleWritable> {

        private DoubleWritable average = new DoubleWritable();

        protected void reduce(Text zip, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int count = 0;
            int sum = 0;
            for (IntWritable value: values) {
                sum += value.get();
                count++;
            }
            average.set(sum / count);
            context.write(zip, average);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: ziphousevalue <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "ziphousevalue");
        job.setJarByClass(ziphousevalue1.class);
        job.setMapperClass(ZipHouseValueMapper.class);
        job.setReducerClass(ZipHouseValueReducer.class);

        job.setNumReduceTasks(3);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        configure(conf);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    public static void configure(Configuration conf) {
        System.out.println("Test+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

    }
}

However, it produces the following error. I have looked at similar questions on this site, but none of them seems to solve the problem. I have made sure the input files are correct. Is there something else I should check to fix this error? Thank you for your time.

java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 3
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
at ziphousevalue1$ZipHouseValueMapper.map(ziphousevalue1.java:29)
at ziphousevalue1$ZipHouseValueMapper.map(ziphousevalue1.java:24)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/11/11 22:10:42 INFO mapreduce.Job: Job job_local112498506_0001 running in uber mode : false
15/11/11 22:10:42 INFO mapreduce.Job:  map 0% reduce 0%
15/11/11 22:10:42 INFO mapreduce.Job: Job job_local112498506_0001 failed with state FAILED due to: NA
15/11/11 22:10:42 INFO mapreduce.Job: Counters: 0
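For what it's worth, the trace points into map(), and the only array access there is tokens[3] after String[] tokens = value.toString().split(","). Could the problem be that this splits the IntWritable field value instead of the line parameter? An IntWritable's toString() is just its number (here the default "0", with no commas), so the split would produce a single token and tokens[3] would be out of bounds, which matches ArrayIndexOutOfBoundsException: 3. If that is the cause, a corrected map() with a guard against short records might look like this (untested sketch):

    protected void map(LongWritable offset, Text line, Context context) throws IOException, InterruptedException {
        // Split the actual input line, not the IntWritable field named "value".
        String[] tokens = line.toString().split(",");
        // Skip blank or malformed records instead of indexing past the end.
        if (tokens.length < 5) {
            return;
        }
        zip.set(tokens[3]);                      // zip is the 4th field
        value.set(Integer.parseInt(tokens[4]));  // house value is the 5th field
        context.write(zip, value);
    }

Assuming that fixes the map phase, two more mismatches would probably surface next: the reducer emits DoubleWritable while the job declares setOutputValueClass(IntWritable.class), so the map output value class would need to be set separately via job.setMapOutputValueClass(IntWritable.class) with setOutputValueClass(DoubleWritable.class) for the reducer; and sum / count is integer division, so the average would be truncated unless one operand is cast to double first.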
