You’ve probably heard of them, but if you haven’t: Elasticsearch and Kibana are data management products created by Elastic. Elasticsearch is a search engine based on Lucene, and Kibana provides a user interface for exploring and visualizing data. Both are part of the Elastic Stack, a collection of open source tools for collecting and analyzing data that has enjoyed a rise in popularity in recent years and now boasts a number of notable users, including Facebook and Netflix.
Recently, I’ve spent some time fiddling around with open data sets in Elasticsearch and Kibana. I’ve found that they work well with Essentia, creating a seamless workflow for data organization and analysis. Continue reading to see how it all comes together.
Let’s look at some examples:
These are records from NOAA’s Severe Weather Events database, specifically Tornado Vortex Signatures, which indicate an increased likelihood of tornadoes. I wrote more about this data set in my last blog post, where I cleaned the raw data in Essentia and then displayed it in Google Earth. That process was a lengthier one: I streamed the data from Essentia into a CSV, then used an online tool to generate a KML file for use in Google Earth, which I then downloaded and installed. Kibana’s tile map visualization makes things easier. I created my Essentia category and Elasticsearch index in the same script, and converted the latitude and longitude coordinates into geopoints. Then, after doing some exploration in the “Discover” tab, I created the visualization above.
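For readers curious what the geopoint conversion involves, here is a minimal Python sketch. It is not the script I used: the field names (`lat`, `lon`, `location`, `event_id`) and the sample row are made up for illustration, and the mapping layout follows recent Elasticsearch versions (older releases nest it under a type name). It only builds the JSON you would send and does not talk to a cluster.

```python
import json

# Hypothetical index body: "location" is declared as a geo_point so that
# Kibana can place each record on a map. Field names are illustrative only.
INDEX_BODY = {
    "mappings": {
        "properties": {
            "location": {"type": "geo_point"},
        }
    }
}


def to_geo_document(row, lat_key="lat", lon_key="lon"):
    """Return a copy of row with its lat/lon columns folded into one field."""
    doc = {k: v for k, v in row.items() if k not in (lat_key, lon_key)}
    # A geo_point can be given as an object with "lat" and "lon" keys.
    doc["location"] = {"lat": float(row[lat_key]), "lon": float(row[lon_key])}
    return doc


# Made-up sample row standing in for one cleaned CSV record.
row = {"event_id": "tvs-001", "lat": "35.33", "lon": "-97.28"}
print(json.dumps(to_geo_document(row)))
```

The important part is that the field is mapped as `geo_point` before any documents are indexed. Otherwise Elasticsearch will likely treat the coordinates as plain numbers, and the map visualization will have nothing to plot.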
This is another familiar public data set. Here, I used a Dashboard to place my two visualizations side by side. It’s interesting, to say the least. We can see a fairly similar upward trend in both, with a slight spike around 2005 in the percent of adults receiving disability benefits. Kibana also makes it easy to search your index. For example, the charts above originally showed overall information for the United States. I typed “state:California” into the search bar, and Kibana instantly adjusted to reflect California data, which is what you see here.
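If you want to run the same kind of filter outside Kibana, a search-bar entry like “state:California” corresponds roughly to an Elasticsearch `query_string` query. The sketch below just builds that request body in Python. The helper name `field_query` is mine, and it assumes simple values with no special characters that would need escaping.

```python
import json


def field_query(field, value):
    """Build a search body similar to typing `field:value` in the search bar."""
    # Quote values containing whitespace so they are matched as a phrase.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return {"query": {"query_string": {"query": f"{field}:{value}"}}}


print(json.dumps(field_query("state", "California")))
print(json.dumps(field_query("state", "New York")))
```

The resulting JSON is what you would send as the body of a search request against your index.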
This data comes from an AWS Public Data Set (read more about it here). I created an Essentia category, streamed the contents into an Elasticsearch index, and pulled up Kibana. The visualization here shows the number of border crossings to and from Mexico between 1995 and 2007.
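To give a feel for what streaming rows into an index looks like on the Elasticsearch side, here is a small Python sketch that turns CSV text into the newline-delimited JSON expected by the bulk API. It is an illustration rather than what Essentia does internally: the index name, column names, and sample values are all invented, and the action line uses the form accepted by recent Elasticsearch versions.

```python
import csv
import io
import json


def bulk_lines(csv_text, index_name):
    """Yield NDJSON lines for the Elasticsearch bulk API from CSV text."""
    # Each document is preceded by an action line naming the target index.
    action = json.dumps({"index": {"_index": index_name}})
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield action
        yield json.dumps(row)


# Made-up sample data; the real data set has its own columns and values.
sample = "year,port,crossings\n1995,Example Port,1200\n1996,Example Port,1350\n"

# The bulk API requires the payload to end with a newline.
payload = "\n".join(bulk_lines(sample, "border-crossings")) + "\n"
print(payload, end="")
```

Note that `csv.DictReader` leaves every value as a string. In practice you would convert numeric and date columns, or define a mapping up front, so that Kibana can aggregate on them.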
Overall, I really enjoyed using Elasticsearch and Kibana in conjunction with Essentia. Essentia is perfect for cleaning and categorizing raw data, and its output can be streamed directly into an Elasticsearch index, making for an easy transition from organization to analysis. Kibana plays well with all kinds of data, providing auto-analysis results for easy exploration. As is often the case with open data, you don’t really know what it looks like until you begin to analyze it. The “Discover” feature in Kibana helps by providing a list of available fields, showing an overview of popular values for each field, and allowing you to filter by time period. It is generally extremely convenient for exploring data before you know what you want to do with it. “Visualize” is very user-friendly as well, with an aesthetically pleasing and intuitive interface that allows for numerous types of analysis. As we start to accumulate more data than we know what to do with, it’s important that data tools are flexible and powerful, and able to integrate smoothly into different workflows. Essentia, Elasticsearch, and Kibana all play well together: try it out!
Sample scripts and more information can be found in the Git repository.