Read sequencefile in Hadoop 2.0

2014-05-29T14:42:30

I am trying to read a SequenceFile in Hadoop 2.0, but I am unable to get it working. I am using the code below, which works perfectly fine in Hadoop 1.0. Please let me know if I am missing something with respect to 2.0.

Configuration conf = new Configuration();
try {
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path("/Users/xxx/git/xxx/src/test/cntr-20140527104344-r-00172");
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, p, conf);
    Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
    while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
    }
    reader.close();
} catch (IOException e) {
    e.printStackTrace();
}

I get the following error when I try to debug:

2014-05-28 23:30:31,567 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(52)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-05-28 23:30:31,572 INFO  compress.CodecPool (CodecPool.java:getDecompressor(121)) - Got brand-new decompressor
java.io.EOFException
    at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:264)
    at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:254)
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:163)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:78)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:90)
    at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
    at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
    at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:169)
    at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:179)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at com.xxx.bis.social.feedbehavior.cdl.Debuger.testSpliter(Debuger.java:30)

Kindly help.

Note: I referred to this link — Reading and Writing Sequencefile using Hadoop 2.0 Apis — but it did not work.
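For reference, Hadoop 2.x deprecates the `(FileSystem, Path, Configuration)` constructor used above in favor of a varargs `Reader.Option` constructor. A minimal sketch of reading with the 2.x-style API (the file path is taken from the command line and is illustrative; this assumes the Hadoop client jars are on the classpath) might look like:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SequenceFileReader2x {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path p = new Path(args[0]); // path to the sequence file

        // Hadoop 2.x style: pass Reader.Option values instead of a FileSystem;
        // the reader resolves the FileSystem from the Configuration itself.
        try (SequenceFile.Reader reader =
                new SequenceFile.Reader(conf, SequenceFile.Reader.file(p))) {
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        }
    }
}
```

`SequenceFile.Reader` implements `Closeable`, so try-with-resources closes it even if reading fails partway through. Note that switching constructors will not by itself fix the `EOFException` above if the underlying problem is codec-related.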

Copyright License:
Author: Arjun, reproduced under the CC 4.0 BY-SA copyright license with link to original source & disclaimer.
Link: https://stackoverflow.com/questions/23927289/read-sequencefile-in-hadoop-2-0

