Great, you have a lot of data, but how do you know what to do with it? Being able to explore your data without first loading everything into a database is a useful, often necessary, step in deciding which data to analyze and how to analyze it.
With Essentia you can take a subset of your data and run SQL queries on it without loading anything into a database table. By streaming the data from S3 or a local directory directly into the query, you can quickly explore a subset of the files you plan to process before you commit a large amount of your resources to analyzing the full set of data.
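
To make the idea concrete, here is a minimal Python sketch of exploring a subset of files by streaming them into an aggregate, roughly what a `SELECT column, COUNT(*) ... GROUP BY column` query would return. It is not Essentia's implementation or API. The function names, the `max_files` parameter, the file pattern, and the `country` column are all hypothetical and chosen only for illustration, and it assumes local CSV files with a header row.

```python
import csv
import glob
from collections import Counter


def stream_rows(pattern, max_files=2):
    """Yield rows as dicts from the first `max_files` CSV files matching `pattern`."""
    # Sorting makes the chosen subset deterministic across runs.
    paths = sorted(glob.glob(pattern))[:max_files]
    for path in paths:
        with open(path, newline="") as handle:
            # DictReader reads lazily, so rows are consumed one at a time
            # instead of being loaded into a table first.
            yield from csv.DictReader(handle)


def count_by(rows, column):
    """Count rows per distinct value of `column` (a GROUP BY / COUNT(*) analogue)."""
    counts = Counter()
    for row in rows:
        counts[row[column]] += 1
    return counts
```

A call such as `count_by(stream_rows("logs/*.csv"), "country").most_common(5)` would then show the most frequent values in the sampled files, which is often enough to decide whether the full data set is worth processing.
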
Although Essentia is available as a standalone software product, we also use it daily in our marketing analytics service, which is offered under a SaaS model.
Like a lot of other people, we like using the machine learning and data mining libraries available in R to explore and analyze data. But like everyone else, we face the burden of first having to clean, parse, and reduce large amounts of data before we can use those analytic tools.