Apache server logs hold a wealth of valuable insights, but they are typically buried in S3 directories alongside many other logs in entirely different formats. The correct logs must not only be extracted from their datastore but also converted into a format that can be properly analyzed.
This is where Essentia comes in. First we scan the S3 directory to select exactly the access logs we want to analyze. Then we use the Essentia Log Converter to convert these access logs, on the fly, into a form readable by our Preprocessor (i.e., a single-delimiter format).
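To make the conversion step concrete, here is a minimal Python sketch of what turning an access log into a single-delimiter format involves. It is not the Essentia Log Converter and does not use any Essentia API; it assumes the logs follow Apache's standard "combined" format, and the names `COMBINED`, `FIELDS`, and `to_delimited` are illustrative only.

```python
import csv
import io
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"'
)

FIELDS = ["host", "ident", "user", "time", "request",
          "status", "bytes", "referer", "agent"]


def to_delimited(lines, delimiter=","):
    """Convert combined-format lines to single-delimiter text.

    Lines that do not match the assumed format are skipped.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # not an access-log line in the assumed format
        writer.writerow([match.group(name) for name in FIELDS])
    return out.getvalue()


sample = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.08 [en] (Win98; I ;Nav)"',
    'not an access log line',
]
print(to_delimited(sample), end="")
```

The `csv` writer quotes any field that happens to contain the delimiter, so each output row can still be split unambiguously by a downstream reader.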
In one step we ignore the irrelevant columns in the Apache logs so we can focus on processing only the most relevant data. Then we use a custom C module to bolster Essentia’s analysis and extract location and system information from the users’ IP addresses.
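The column selection and IP lookup described above can be illustrated with a short Python sketch. It stands in for neither the Essentia Preprocessor nor the custom C module: the lookup table is hypothetical (it uses reserved documentation address ranges with made-up labels), and `locate`, `project_and_enrich`, and the column positions are assumptions chosen for the example.

```python
import csv
import io
import ipaddress

# Hypothetical lookup table standing in for a real geolocation database.
# The networks are reserved documentation ranges; the labels are made up.
NETWORK_LABELS = {
    ipaddress.ip_network("192.0.2.0/24"): "example-region-a",
    ipaddress.ip_network("198.51.100.0/24"): "example-region-b",
}


def locate(ip_text):
    """Return a label for the address, or 'unknown' if unparseable or unlisted."""
    try:
        addr = ipaddress.ip_address(ip_text)
    except ValueError:
        return "unknown"
    for network, label in NETWORK_LABELS.items():
        if addr in network:
            return label
    return "unknown"


def project_and_enrich(delimited_text, keep=(0, 3, 5)):
    """Keep only the chosen column positions and append a label for column 0."""
    rows = []
    for row in csv.reader(io.StringIO(delimited_text)):
        if len(row) <= max(keep):
            continue  # row is too short to contain every requested column
        rows.append([row[i] for i in keep] + [locate(row[0])])
    return rows


text = (
    "192.0.2.10,-,-,10/Oct/2000:13:55:36 -0700,GET / HTTP/1.0,200,2326\n"
    "203.0.113.5,-,-,10/Oct/2000:13:55:40 -0700,GET /a HTTP/1.0,404,512\n"
)
for row in project_and_enrich(text):
    print(row)
```

A real deployment would replace the in-memory table with a proper IP database; the point here is only the shape of the step: drop unneeded columns, then derive a new one from the client address.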
Apache logs are some of the most troublesome logs in use today. They come in all shapes and sizes and are often riddled with errors and missing data. While the benefits of analyzing these logs are enormous, most powerful analysis tools lack the versatility to handle them. Essentia fixes that.
With the Essentia Log Converter you can transform any Apache log into a format ready for analysis. You can use it to convert your logs and then stream them directly into the Essentia Preprocessor to ignore the irrelevant columns, clean the data, and perform simple analysis on it.
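As a rough picture of what "clean the data and perform simple analysis" can mean on a stream of converted rows, the Python sketch below sums response bytes per status code while discarding malformed rows. It is an illustration only, not Essentia's Preprocessor; the two-column `status,bytes` input and the function name `bytes_by_status` are assumptions made for the example.

```python
import csv
from collections import Counter


def bytes_by_status(rows):
    """Sum response bytes per status code from (status, bytes) rows."""
    totals = Counter()
    for row in rows:
        if len(row) != 2:
            continue  # drop rows with missing or extra columns
        status, size = row
        if not (status.isdecimal() and len(status) == 3):
            continue  # drop rows with a malformed status code
        # Apache writes "-" when no bytes were sent; count that as zero.
        totals[status] += int(size) if size.isdecimal() else 0
    return totals


# Rows arrive one at a time, so the whole log never has to sit in memory.
stream = iter(["200,2326", "404,-", "200,100", "oops", "20x,5"])
totals = bytes_by_status(csv.reader(stream))
print(dict(totals))
```

Because the function consumes an iterator, the same code works whether the rows come from a list, a file object, or a pipe.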