Hadoop append to Sequencefile

2015-02-04T00:53:31

Currently I use the following code to append to an existing SequenceFile:

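import java.util.EnumSet;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Options.CreateOpts;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.SequenceFile.Metadata;
import org.apache.hadoop.io.SequenceFile.Writer;
import org.apache.hadoop.io.Text;

// initialize sequence writer: create the file, or open it for append if it already exists
Writer writer = SequenceFile.createWriter(
        FileContext.getFileContext(this.conf),
        this.conf,
        new Path("/tmp/sequencefile"),
        Text.class,
        BytesWritable.class,
        CompressionType.NONE,
        null, // no compression codec
        new Metadata(),
        EnumSet.of(CreateFlag.CREATE, CreateFlag.APPEND),
        CreateOpts.blockSize(64 * 1024 * 1024));

// append one record (key is a Text, value is a BytesWritable)
writer.append(key, value);

// sync buffered data to the DataNodes, then close the writer
writer.hsync();
writer.close();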

Everything works if the SequenceFile does not exist yet. But when the file already exists, Hadoop writes the SequenceFile header (SEQ ...) again in the middle of the file, and the file becomes unreadable for Hadoop.
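To confirm that a second header really was written, the raw bytes of the file can be scanned for the SequenceFile magic. The following is only a diagnostic sketch in plain Java: the class `SeqHeaderScan` and the method `findHeaderOffsets` are hypothetical names, not part of Hadoop. It assumes the header begins with the bytes 'S', 'E', 'Q' followed by the version byte 6, which is what Hadoop 2.x writes. Record data could contain the same four bytes by coincidence, so a hit at an offset other than 0 is a hint, not proof.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of the Hadoop API.
final class SeqHeaderScan {
    // Assumed header magic: 'S', 'E', 'Q' plus version byte 6 (Hadoop 2.x).
    static final byte[] MAGIC = {'S', 'E', 'Q', 6};

    // Returns every offset in data at which the 4-byte magic starts.
    static List<Integer> findHeaderOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + MAGIC.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < MAGIC.length; j++) {
                if (data[i + j] != MAGIC[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Made-up sample bytes: a header, three payload bytes, then a second header.
        byte[] sample = {'S', 'E', 'Q', 6, 1, 2, 3, 'S', 'E', 'Q', 6, 4, 5};
        System.out.println(findHeaderOffsets(sample)); // prints [0, 7]
    }
}
```

For a real check, the bytes could come from a local copy of the file (for example one fetched with `hdfs dfs -get`). An intact file should report only offset 0; any further offset would be consistent with the repeated header described above.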

I use Hadoop 2.6.0.

Copyright License:
Author: Christian D. Reproduced under the CC BY-SA 4.0 license with link to original source and disclaimer.
Link to: https://stackoverflow.com/questions/28304406/hadoop-append-to-sequencefile
