Blog Archive, Redshift, Scanner, Streamer, Use Case / 24 March 2015 / Ben

One File or Thousands: changing compression type has never been easier

When it comes to big data, compression is key. However, many popular analysis tools can’t handle Zip-compressed files. Each Zip file must be converted to another compression type (such as Gzip) before these tools can analyze it. That is costly: the converted copies have to be stored, and the conversion itself takes a long time. That is, if you’re not using Essentia.

Essentia dramatically improves on this in two ways: it natively supports both Zip and Gzip compression, and it can unzip Zip files as a stream and recompress them into Gzip format on the fly. Thus you can select exactly the Zip files you need from wherever your data is stored using the Essentia Scanner, convert them to Gzip format as they stream, and then output them wherever you want. They can be sent directly into Redshift or other analysis tools, saved to file, or sent to S3 for later loading.
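To make the idea of streaming conversion concrete, here is a minimal sketch in plain Python. It is not Essentia's implementation or its command syntax; it only uses the standard library, and the function name `zip_to_gzip` and its parameters are hypothetical names chosen for illustration. Each Zip member is read in chunks and written straight into a Gzip file, so no uncompressed copy is ever stored on disk.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def zip_to_gzip(zip_path, out_dir, chunk_size=1024 * 1024):
    """Re-compress every file inside a Zip archive as its own .gz file.

    Hypothetical helper for illustration only; not part of Essentia.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            target = out_dir / (Path(info.filename).name + ".gz")
            # Decompress and recompress chunk by chunk; the uncompressed
            # data only ever exists in a small in-memory buffer.
            with archive.open(info) as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst, chunk_size)
            written.append(target)
    return written
```

Note that Python's `zipfile` needs a seekable input, so this sketch works on local files rather than on a raw network stream.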

It doesn’t matter how many files you want to convert. By changing a single line only slightly, you can convert one file or thousands. We’ve found our process to be over 20 times faster than conventional conversion methods, and it does the conversion without any need to store additional files. If you have compressed files you need converted or analyzed, start benefiting from Essentia today!
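The "one line" idea can be illustrated with a file pattern: the same call handles one archive or every archive that matches, and only the pattern changes. The sketch below is again plain Python rather than Essentia syntax, and `convert_matching` along with the example paths are assumptions made up for this illustration.

```python
import glob
import gzip
import shutil
import zipfile
from pathlib import Path


def convert_matching(pattern, out_dir):
    """Convert every Zip archive matching `pattern` to Gzip files.

    Hypothetical helper for illustration only; returns the number of
    .gz files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for zip_path in sorted(glob.glob(pattern)):
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Prefix with the archive name so members from different
                # archives do not overwrite each other.
                name = f"{Path(zip_path).stem}_{Path(info.filename).name}.gz"
                with archive.open(info) as src, gzip.open(out_dir / name, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                count += 1
    return count


# One file (hypothetical path):
#   convert_matching("logs/2015-03-24.zip", "converted")
# Thousands of files, changing only the pattern:
#   convert_matching("logs/*.zip", "converted")
```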
