SequenceFile and MapFile, Checksumming, codecs and Writables

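SequenceFile and MapFile are two Hadoop file formats for storing binary key-value pairs. A SequenceFile is a flat file that is written by appending records and read sequentially; it can be compressed per record or per block. A MapFile is a sorted SequenceFile accompanied by an index, so a key can be looked up without scanning the whole file. In exchange, keys must be appended to a MapFile in sorted order.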
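To illustrate why an index makes random reads cheap, here is a minimal sketch of the idea behind MapFile: records kept in key order plus a sparse index holding every Nth key. It uses only the JDK and in-memory lists; the class name SparseIndexSketch, its methods and the interval parameter are hypothetical, and it does not reproduce Hadoop's MapFile API or on-disk format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, JDK-only illustration of a sorted file with a sparse index.
// It is NOT Hadoop's MapFile class; the lists stand in for the data and index files.
class SparseIndexSketch {
    private final List<String> keys = new ArrayList<>();            // sorted "data file" keys
    private final List<String> values = new ArrayList<>();          // matching values
    private final List<String> indexKeys = new ArrayList<>();       // every Nth key
    private final List<Integer> indexPositions = new ArrayList<>(); // position of that key in the data
    private final int interval;

    SparseIndexSketch(int interval) {
        this.interval = interval;
    }

    // Records must be appended in increasing key order.
    // (This sketch also rejects duplicate keys, purely to keep it simple.)
    void append(String key, String value) {
        if (!keys.isEmpty() && keys.get(keys.size() - 1).compareTo(key) >= 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        if (keys.size() % interval == 0) {
            indexKeys.add(key);
            indexPositions.add(keys.size());
        }
        keys.add(key);
        values.add(value);
    }

    // Binary-search the index for the last indexed key <= the target,
    // then scan forward through at most `interval` records.
    String get(String key) {
        int lo = 0;
        int hi = indexKeys.size() - 1;
        int start = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                start = indexPositions.get(mid);
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (start < 0) {
            return null; // target sorts before the first record (or the file is empty)
        }
        for (int i = start; i < keys.size() && i < start + interval; i++) {
            int cmp = keys.get(i).compareTo(key);
            if (cmp == 0) {
                return values.get(i);
            }
            if (cmp > 0) {
                break; // passed the place where the key would be
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SparseIndexSketch file = new SparseIndexSketch(2);
        file.append("apple", "1");
        file.append("banana", "2");
        file.append("cherry", "3");
        System.out.println(file.get("cherry"));
    }
}
```

Only the sparse index has to be searched in full; the data itself is touched for a short forward scan, which is the trade-off that makes keyed lookups cheap while keeping the index small.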

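Checksumming is a technique for detecting data corruption during transmission or storage: a checksum is computed when data is written and verified again when it is read. Hadoop uses CRC-32 for its client-side checksummed file systems (such as LocalFileSystem) and CRC-32C in HDFS, and it computes a separate checksum for each fixed-size chunk of data (512 bytes by default) rather than one checksum for the whole file.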
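As a rough illustration of chunked checksumming, the sketch below computes one CRC-32 value per fixed-size chunk with the JDK's java.util.zip.CRC32 and then reports which chunk was damaged. The class ChunkChecksums, its method names and the 512-byte chunk size used in main are assumptions made for this example; this is not Hadoop's own checksum code.

```java
import java.util.zip.CRC32;

// Hypothetical, JDK-only sketch of per-chunk checksumming; not Hadoop's implementation.
class ChunkChecksums {

    // Compute one CRC-32 per chunk of bytesPerChecksum bytes (the last chunk may be shorter).
    static long[] checksums(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int offset = i * bytesPerChecksum;
            int length = Math.min(bytesPerChecksum, data.length - offset);
            crc.reset();
            crc.update(data, offset, length);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Return the index of the first chunk whose checksum differs from the stored one,
    // or -1 if every chunk still matches.
    static int firstCorruptChunk(byte[] data, int bytesPerChecksum, long[] expected) {
        long[] actual = checksums(data, bytesPerChecksum);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("chunk count changed");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1300];            // 3 chunks: 512 + 512 + 276 bytes
        long[] stored = checksums(data, 512);    // computed at "write" time
        data[600] ^= 1;                          // simulate a single flipped bit in chunk 1
        System.out.println("first corrupt chunk: " + firstCorruptChunk(data, 512, stored));
    }
}
```

Checksumming per chunk means a reader can pinpoint the damaged region and verify a partial read without processing the whole file, at the cost of a few extra bytes of checksum per chunk.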

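Codecs compress and decompress data in Hadoop. Each codec implements the CompressionCodec interface; Hadoop ships with codecs such as Gzip, bzip2 and Snappy, while LZO is distributed separately because of its GPL license.

Writable is the interface at the core of Hadoop's serialization framework: a Writable type writes its fields to a binary stream and reads them back, so that keys and values can be stored or transmitted in a compact binary form. Hadoop provides Writable implementations for common types, such as IntWritable, Text and BooleanWritable.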
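The following sketch shows how the two ideas fit together, using only the JDK so that it runs without a Hadoop installation. The PageView class is hypothetical and merely copies the shape of the Writable contract (a write(DataOutput) method and a readFields(DataInput) method); GZIPOutputStream stands in for a Hadoop codec. None of the names below are Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical record type that mimics the Writable pattern without depending on Hadoop.
class PageView {
    int userId;
    String url;

    PageView() { }                       // no-arg constructor so readFields can fill it in

    PageView(int userId, String url) {
        this.userId = userId;
        this.url = url;
    }

    // Serialize the fields in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeInt(userId);
        out.writeUTF(url);
    }

    // Deserialize the fields in the same order they were written.
    void readFields(DataInput in) throws IOException {
        userId = in.readInt();
        url = in.readUTF();
    }
}

class WritableCodecSketch {

    // Write a record count followed by the records, compressing the stream with gzip.
    static byte[] serializeCompressed(PageView[] records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(records.length);
            for (PageView record : records) {
                record.write(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decompress the stream and rebuild the records.
    static PageView[] deserializeCompressed(byte[] compressed) {
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(compressed)))) {
            PageView[] records = new PageView[in.readInt()];
            for (int i = 0; i < records.length; i++) {
                records[i] = new PageView();
                records[i].readFields(in);
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PageView[] views = { new PageView(7, "/index.html"), new PageView(42, "/about") };
        PageView[] copy = deserializeCompressed(serializeCompressed(views));
        System.out.println(copy[1].userId + " " + copy[1].url);
    }
}
```

The record only knows how to write and read its own fields; compression is layered on by wrapping the stream, which is why serialization format and codec can be chosen independently.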
