Any Essentia file can be approached in the following steps:
1. Tell Essentia whether to run on your local machine or on EC2 worker instances.
2. Pick the bucket containing the data you want to analyze and scan it for files.
3. Organize these files into categories of your choosing, then either have Essentia examine them to determine their
column specifications or enter the specifications manually. This step is only required the first time you access your bucket.
4. Define a database and what you want to store in it.
5. Start udbd so data can be stored in the database you just created.
6. Import data from one of your categories into your database using ess task stream.
7. Export your modified data from the database and save it to a file.
The categorization step (step 3) ONLY HAS TO BE RUN ONCE.
To start using Essentia you need to scan your bucket and categorize your S3 files. This involves steps 2 and 3.
Typically, categories don't change much once you have completed the initial setup, so all you have to repeat each time you want to access your data is step 2:
ess datastore select s3://*YourBucket* --aws_access_key=*YourAccessKey* --aws_secret_access_key=*YourSecretAccessKey*
ess datastore scan
Thus you can skip step 3 after your first run.
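If you repeat step 2 often, the two commands above could be wrapped in a small script. The following is only a sketch: the variable names (ESS_BUCKET, AWS_ACCESS_KEY, AWS_SECRET_KEY, DRY_RUN) and the helper functions run and select_and_scan are hypothetical and not part of Essentia. Only the two ess commands themselves come from this page. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical wrapper around step 2 (select the bucket, then scan it).
# All variable and function names here are illustrative assumptions.
ESS_BUCKET="${ESS_BUCKET:-YourBucket}"
AWS_ACCESS_KEY="${AWS_ACCESS_KEY:-YourAccessKey}"
AWS_SECRET_KEY="${AWS_SECRET_KEY:-YourSecretAccessKey}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
# Note: dry-run output includes the keys, so avoid logging it with real credentials.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: select the S3 bucket, then scan it for files.
# The scan is skipped if the select command fails.
select_and_scan() {
    run ess datastore select "s3://$ESS_BUCKET" \
        "--aws_access_key=$AWS_ACCESS_KEY" \
        "--aws_secret_access_key=$AWS_SECRET_KEY" &&
    run ess datastore scan
}

select_and_scan
```

Setting DRY_RUN=0 in the environment would execute the commands for real, which assumes ess is installed and on your PATH.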
Learn more about how to Scan Your S3 Bucket.