AWS Athena: Serverless, Interactive Data Manipulation Service by Amazon

November 9, 2022

TABLE OF CONTENTS

1. Overview
2. Benefits of Athena
3. Workflow of Athena
4. Demo on Athena
5. Conclusion
6. About CloudThat
7. FAQs

Overview

Amazon Athena is an interactive query service from Amazon that makes it easy to analyze data directly in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and we pay only for the queries we run. Athena is easy to use: point it at the data in an S3 bucket, define the schema, and start querying with standard SQL. Most results are delivered within seconds.

Athena is used to analyze data that is already available in an Amazon S3 bucket. It works with structured, semi-structured, and unstructured data in formats such as CSV, JSON, and ORC. If you want to run interactive, ad hoc SQL queries against data stored in Amazon S3, Athena is the easiest way to do it.

Amazon Athena can run interactive queries on data stored in Amazon S3 without first loading or transforming it. For example, if you need to quickly check web server logs to investigate a problem with your website, Athena can be helpful.

Benefits of Athena

  • Flexible
  • Serverless
  • Cost-Effective
  • Widely accessible
  • Fast performance
  • Secure
  • Easy integrations with other AWS services

Workflow of Athena

[Figure: Athena workflow diagram]

Demo on Athena

Step 1: Open the Amazon S3 console.

Step 2: Click Create bucket and enter the bucket name.

Step 3: Click Create bucket to finish creating it.

Step 4: Create a Test folder in the bucket and upload one CSV file to it.

Step 5: Our CSV file contains 3 columns of data.

Step 6: Open the Amazon Athena console.

Step 7: Click Settings and, for the query result location, select the S3 bucket created in the previous steps.

Step 8: Run a CREATE DATABASE query in the query editor to create a database, and then select that database.
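
As a rough illustration of this step, here is a minimal Python sketch that builds the kind of CREATE DATABASE statement you could paste into the query editor. The database name demo_db is a hypothetical placeholder, since the name used in this demo is not stated, and the helper function is my own illustration rather than part of any AWS library.

```python
import re


def create_database_sql(name: str) -> str:
    """Build a CREATE DATABASE statement for the Athena query editor.

    The name is limited to letters, digits, and underscores so that it can
    be used without quoting.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsupported database name: {name!r}")
    return f"CREATE DATABASE IF NOT EXISTS {name};"


# "demo_db" is a hypothetical name chosen only for this example.
print(create_database_sql("demo_db"))
```

IF NOT EXISTS makes the statement safe to run again if the database was already created.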

Step 9: The dropdown for creating a table offers several options. Select S3 bucket data.

Step 10: Enter the table name and choose our existing database.  

Step 11: Select the location where your data is stored.

Step 12: Select the format of the file uploaded to the bucket (CSV in our case).

Step 13: Enter each column name and select its column type. We have only 3 columns; if your file has many columns, you can use the bulk column feature instead.

Step 14: Click Create table.
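
Steps 9 to 14 can also be expressed as a single DDL statement. The sketch below builds a CREATE EXTERNAL TABLE statement that is roughly equivalent to what the form does; it is not necessarily the exact statement the console generates. The table name demo and the Name column come from this demo, while the database name, bucket name, the other two column names, and the header-skipping property are assumptions made for illustration.

```python
def create_csv_table_sql(database, table, columns, location, skip_header=True):
    """Build a CREATE EXTERNAL TABLE statement for comma-separated files.

    columns is a list of (name, athena_type) pairs. location is the S3
    folder (not a single file) that holds the data; it should end with "/".
    """
    if not location.startswith("s3://") or not location.endswith("/"):
        raise ValueError("location must look like s3://bucket/folder/")
    column_lines = ",\n".join(
        f"  `{name}` {col_type}" for name, col_type in columns
    )
    statement = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )
    if skip_header:
        # Assumes the first line of each file holds column headings.
        statement += "\nTBLPROPERTIES ('skip.header.line.count' = '1')"
    return statement + ";"


# Hypothetical names: only "demo" and the "name" column appear in this post.
ddl = create_csv_table_sql(
    database="demo_db",
    table="demo",
    columns=[("id", "int"), ("name", "string"), ("city", "string")],
    location="s3://my-athena-demo-bucket/Test/",
)
print(ddl)
```

The table is external, so dropping it later removes only the table definition and leaves the CSV file in S3 untouched.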

Step 15: Now we will query the data which is in the file using standard SQL.

  • I am running the query below to display only the rows where the name is “Kashyap”.
  • select * from demo where Name = 'Kashyap';
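
The same query can also be run from code instead of the console. The following is a minimal sketch, assuming a client object that behaves like boto3.client("athena"); the helper name run_query, the database name, and the result location are hypothetical. It starts the query, polls until it finishes, and returns the first page of results.

```python
import time


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, and return rows as lists.

    client is expected to behave like boto3.client("athena").
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query leaves the QUEUED/RUNNING states.
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {state}")

    # Only the first page of results is read here; use NextToken for more.
    result = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]


# Example usage (requires boto3 and AWS credentials; names are hypothetical):
# import boto3
# rows = run_query(
#     boto3.client("athena"),
#     "select * from demo where Name = 'Kashyap'",
#     database="demo_db",
#     output_location="s3://my-athena-demo-bucket/results/",
# )
```

For a SELECT query, the first returned row holds the column headings, so the data starts at the second row.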

Conclusion

As this blog shows, Amazon Athena is not a complex service. It is easy to use and makes our workflow simpler. We only need to write proper queries to get accurate results within seconds. I have covered the basics of Amazon Athena here. If you want to learn more about Amazon Athena, you can refer to this blog on Amazon Athena.

About CloudThat

CloudThat is an official AWS (Amazon Web Services) Advanced Consulting Partner and Training Partner and a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.

Drop a query if you have any questions regarding Athena and I will get back to you quickly.

To get started, go through our Consultancy page and Managed Services Package to explore CloudThat’s offerings.

FAQs

Q1: What can be done with Amazon Athena?

A. With Amazon Athena, you can analyze data stored in Amazon S3. You can use ANSI SQL to run interactive analytics without aggregating or loading the data into Athena. Athena can process unstructured, semi-structured, and structured data sets. Examples include CSV, JSON, and Avro, as well as columnar data formats like Apache Parquet and Apache ORC. For simple visualization, Amazon Athena integrates with Amazon QuickSight. Additionally, you can use an ODBC or JDBC driver to connect to Amazon Athena and generate reports or analyze data using SQL clients or business intelligence software.

Q2: Are there any additional charges associated with Amazon Athena?

A. Amazon Athena reads data directly from Amazon S3, executes the query, and stores the results in the S3 bucket of your choice. You are therefore billed at standard S3 rates for these result sets, in addition to the charges for the queries themselves. Use lifecycle policies to limit how long results are kept in S3.
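
As an illustration of such a lifecycle policy, the sketch below builds an S3 lifecycle configuration that expires objects under a results prefix after a number of days. The prefix, the 30-day retention period, the rule ID, and the bucket name in the usage comment are all assumptions chosen for the example.

```python
def results_expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under prefix."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-athena-results",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Hypothetical prefix and retention period.
config = results_expiration_rule("results/", 30)
print(config)

# Applying it (requires boto3 and AWS credentials; bucket name is made up):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-athena-demo-bucket",
#     LifecycleConfiguration=config,
# )
```

Scoping the rule to the results prefix keeps it from deleting the source data that lives elsewhere in the same bucket.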

Q3: What data formats does Amazon Athena support?

A. Amazon Athena supports a wide range of data formats, including CSV, TSV, JSON, and text files. It also supports open-source columnar formats like Apache ORC and Apache Parquet, as well as data compressed with Snappy, Zlib, LZO, and GZIP. You can boost performance and cut costs by partitioning, compressing, and using columnar formats.
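
To see why these techniques cut costs, recall that Athena bills queries by the amount of data they scan. The sketch below is simple arithmetic under stated assumptions: the price of 5.0 USD per TB is an illustrative figure (check the current pricing page for your region), a TB is treated as 1024^4 bytes, and the data sizes are hypothetical.

```python
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabyte


def estimate_scan_cost(bytes_scanned: int, price_per_tb_usd: float) -> float:
    """Estimate the cost of a query from the number of bytes it scans."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must not be negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb_usd


ASSUMED_PRICE = 5.0  # illustrative USD per TB scanned, not an official quote

# Hypothetical case: a query over raw CSV has to read the full 1 TB.
full_scan = estimate_scan_cost(BYTES_PER_TB, ASSUMED_PRICE)

# Hypothetical case: after partitioning and converting to a columnar format,
# the same query reads only a quarter of that data.
reduced_scan = estimate_scan_cost(BYTES_PER_TB // 4, ASSUMED_PRICE)

print(full_scan, reduced_scan)  # 5.0 1.25
```

The price per TB stays the same in both cases; the saving comes entirely from reading fewer bytes.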

