Serverless Migration of Data Using DynamoDB Import from S3 – Part 1

November 17, 2022

TABLE OF CONTENTS

1. Overview
2. Introduction to DynamoDB import from S3
3. Steps to import data from S3 to DynamoDB
4. Troubleshooting the errors
5. Conclusion
6. About CloudThat
7. FAQs

 

Overview  

Before DynamoDB import from S3 was available, you had only a few options for bulk importing data into a DynamoDB table, such as using a data pipeline. Bulk imports often required a custom data loader, which costs money to build and maintain. Loading terabytes of data could take days or weeks unless the solution was deployed across a fleet of virtual instances.

Introduction to DynamoDB import from S3

DynamoDB import from S3 is a fully serverless feature that lets you bulk import terabytes of data from Amazon S3 into a new DynamoDB table. The source data can be a single Amazon S3 object or multiple Amazon S3 objects that share the same prefix. Every record in S3 must include the partition key, and the sort key if the table defines one, so that it matches the key schema of the target table. Because you can import application data staged in CSV, DynamoDB JSON, or Amazon Ion format, the feature speeds up the migration of legacy applications to the AWS Cloud. You can start an import from the AWS Management Console, the AWS CLI, or an AWS SDK.

You do not need to provision extra capacity when defining the new table, because DynamoDB import from S3 does not consume write capacity. To import data across AWS accounts, the requester must have permission to list and get objects from the source S3 bucket. The S3 bucket policy must also grant the requester access.
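
For a cross-account import, the bucket policy on the source bucket might look roughly like the sketch below. The bucket name, the account ID, and the choice to grant access to the whole requester account are placeholder assumptions, not values from this walkthrough. Only the two actions (listing the bucket and getting its objects) come from the requirement described above.

```python
import json


def build_import_bucket_policy(bucket_name: str, requester_account_id: str) -> dict:
    """Return a bucket policy that lets another account list and read the bucket."""
    # Assumption: access is granted to the whole requester account (its root principal).
    principal = {"AWS": f"arn:aws:iam::{requester_account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListForImport",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Sid": "AllowGetForImport",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                # GetObject applies to the objects inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Placeholder bucket name and account ID, for illustration only.
policy = build_import_bucket_policy("example-import-bucket", "111122223333")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket's permissions tab after replacing the placeholders. In practice you would usually narrow the principal to a specific role or user.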

Benefits

  • Moves data with a few clicks in the AWS console
  • Supports cross-account and cross-Region imports
  • Is simple and easy to use

Steps to import data from S3 to DynamoDB

Creating an S3 bucket

  1. Log in to the AWS Management Console and search for S3.
  2. Click Create bucket. Provide a unique bucket name and select the Region.
  3. Upload your .csv files to the bucket.


Note: Only the CSV, DynamoDB JSON, and Amazon Ion formats are supported for importing data into DynamoDB.
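
If you prefer to script the upload instead of using the console, the sketch below builds a small CSV in memory and shows how it could be uploaded with boto3. The column names, sample rows, bucket name, and object key are all made up for illustration. The upload function is defined but not called here, so nothing is sent to AWS.

```python
import csv
import io


def build_sample_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header line as the first row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def upload_csv(csv_text, bucket, key):
    """Upload the CSV text to S3. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))


# Hypothetical sample data; customer_id would be the partition key column.
sample_rows = [
    {"customer_id": "C001", "order_date": "2022-11-01", "total": "19.99"},
    {"customer_id": "C002", "order_date": "2022-11-02", "total": "5.00"},
]
csv_text = build_sample_csv(sample_rows, ["customer_id", "order_date", "total"])
print(csv_text)

# To upload for real (placeholder bucket and key):
# upload_csv(csv_text, "example-import-bucket", "imports/orders.csv")
```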

Importing S3 data to DynamoDB

  1. In the search bar, search for DynamoDB and select the service. Choose Imports from S3 in the navigation pane.
  2. Click Import from S3.


3. Provide the source details as follows:

  • Select the S3 bucket created earlier
  • Select the AWS account that owns the source S3 bucket
  • Choose the compression type that matches your source S3 data
  • Select the appropriate file format
  • Choose the CSV delimiter character used in the source file
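
The same source options map onto the parameters of the DynamoDB ImportTable API. The helper below is a minimal sketch that assembles them as a plain dictionary in the shape boto3's import_table expects. The helper name, bucket, prefix, and account ID are hypothetical.

```python
ALLOWED_FORMATS = {"CSV", "DYNAMODB_JSON", "ION"}
ALLOWED_COMPRESSION = {"NONE", "GZIP", "ZSTD"}


def build_source_settings(bucket, key_prefix="", bucket_owner=None,
                          input_format="CSV", compression="NONE",
                          csv_delimiter=","):
    """Assemble the source-side arguments for an import request."""
    if input_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported input format: {input_format}")
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported compression type: {compression}")

    source = {"S3Bucket": bucket}
    if key_prefix:
        source["S3KeyPrefix"] = key_prefix
    if bucket_owner:
        # Account that owns the bucket; relevant for cross-account imports.
        source["S3BucketOwner"] = bucket_owner

    settings = {
        "S3BucketSource": source,
        "InputFormat": input_format,
        "InputCompressionType": compression,
    }
    if input_format == "CSV":
        # The delimiter option only applies to CSV input.
        settings["InputFormatOptions"] = {"Csv": {"Delimiter": csv_delimiter}}
    return settings


# Placeholder values, for illustration only.
source_settings = build_source_settings(
    bucket="example-import-bucket",
    key_prefix="imports/",
    bucket_owner="111122223333",
    compression="GZIP",
)
print(source_settings)
```

Validating the format and compression values up front is a design choice for this sketch: it surfaces typos locally before any request is made.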


Click Next to go to the next page.

  • Provide the name of the table where you want to store the data
  • Provide the partition key, which must match an attribute in the source data
  • Provide a sort key if one is required
  • Keep the default table settings, or select customized settings to see additional options
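
In API terms, these table details correspond to the TableCreationParameters of the import request. The sketch below builds that structure. The table name and key names are hypothetical, and on-demand billing is an assumption chosen to keep the example short; it is not necessarily what the console's default settings select.

```python
def build_table_parameters(table_name, partition_key, sort_key=None, key_types=None):
    """Build the table definition for an import.

    key_types maps an attribute name to "S", "N", or "B"; keys default to "S".
    """
    key_types = key_types or {}
    attribute_definitions = [
        {"AttributeName": partition_key,
         "AttributeType": key_types.get(partition_key, "S")},
    ]
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]

    if sort_key:
        attribute_definitions.append(
            {"AttributeName": sort_key,
             "AttributeType": key_types.get(sort_key, "S")}
        )
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "KeySchema": key_schema,
        # Assumption: on-demand billing, so no throughput values are needed.
        "BillingMode": "PAY_PER_REQUEST",
    }


# Hypothetical table keyed on customer_id with order_date as the sort key.
table_parameters = build_table_parameters("Orders", "customer_id", sort_key="order_date")
print(table_parameters)
```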


Click Next to go to the next page.

  • Review the options carefully before importing the data. Once the data is imported, you cannot change these options.
  • Click on Import
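
Clicking Import corresponds to a single ImportTable request. The sketch below shows the shape of that call through boto3's import_table. To keep it runnable without AWS credentials, it uses a small stand-in client that only records the request; the stand-in class, the sample settings, and the returned ARN are all made up for illustration.

```python
class DryRunClient:
    """Stand-in for a boto3 DynamoDB client; records the request instead of sending it."""

    def __init__(self):
        self.last_request = None

    def import_table(self, **kwargs):
        self.last_request = kwargs
        # Placeholder response with a made-up ARN.
        return {
            "ImportTableDescription": {
                "ImportArn": "arn:aws:dynamodb:us-east-1:111122223333:table/Orders/import/EXAMPLE",
                "ImportStatus": "IN_PROGRESS",
            }
        }


def start_import(client, source_settings, table_parameters):
    """Start an import and return its ARN."""
    response = client.import_table(
        **source_settings,
        TableCreationParameters=table_parameters,
    )
    return response["ImportTableDescription"]["ImportArn"]


# Hypothetical settings; see the console fields described in the steps above.
source_settings = {
    "S3BucketSource": {"S3Bucket": "example-import-bucket", "S3KeyPrefix": "imports/"},
    "InputFormat": "CSV",
    "InputCompressionType": "NONE",
}
table_parameters = {
    "TableName": "Orders",
    "AttributeDefinitions": [{"AttributeName": "customer_id", "AttributeType": "S"}],
    "KeySchema": [{"AttributeName": "customer_id", "KeyType": "HASH"}],
    "BillingMode": "PAY_PER_REQUEST",
}

client = DryRunClient()
import_arn = start_import(client, source_settings, table_parameters)
print(import_arn)
```

To run this against AWS, pass boto3.client("dynamodb") instead of the stand-in. Keep the returned ARN, since it identifies the import job when you check its status later.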


4. Check the status of your import on the Imports from S3 page. This page shows all import jobs from the last 90 days.
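
The status can also be read programmatically with the DescribeImport API. The sketch below summarizes the ImportTableDescription part of such a response. The sample dictionary is invented for illustration, and the fetch function is defined but not called, so no AWS access is needed to try the summary.

```python
def fetch_import_description(import_arn):
    """Fetch the import description from AWS. Requires boto3 and credentials."""
    import boto3  # imported lazily so the summary below runs without it

    client = boto3.client("dynamodb")
    return client.describe_import(ImportArn=import_arn)["ImportTableDescription"]


def summarize_import(description):
    """Turn an import description into a one-line status summary."""
    status = description.get("ImportStatus", "UNKNOWN")
    processed = description.get("ProcessedItemCount", 0)
    imported = description.get("ImportedItemCount", 0)
    errors = description.get("ErrorCount", 0)
    summary = f"{status}: {imported} of {processed} processed items imported, {errors} errors"
    if status == "FAILED" and "FailureMessage" in description:
        summary += f" ({description['FailureMessage']})"
    return summary


# Invented sample, shaped like the ImportTableDescription of a finished import.
sample_description = {
    "ImportStatus": "COMPLETED",
    "ProcessedItemCount": 1000,
    "ImportedItemCount": 998,
    "ErrorCount": 2,
}
print(summarize_import(sample_description))
```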


5. To check the results of the import, navigate to the Tables page.


Troubleshooting the errors

Common import errors include syntax errors, formatting issues, and records that are missing the required primary key. Error information is recorded in CloudWatch Logs for later examination. Logging stops once it reaches a threshold of 10,000 errors, but the import itself continues.

  1. In the CloudWatch console, choose Log groups in the navigation pane.
  2. Look for the log group named /aws-dynamodb/imports. Its log streams indicate whether each import succeeded or failed, along with metadata.
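
The same log group can be read with the CloudWatch Logs FilterLogEvents API. The sketch below collects messages across result pages through boto3's filter_log_events. It runs against a small fake client so it works offline; the fake class and the messages it returns are placeholders and do not reflect the real log format.

```python
LOG_GROUP = "/aws-dynamodb/imports"


def collect_import_log_messages(logs_client, max_events=100):
    """Collect up to max_events log messages from the import log group."""
    messages = []
    kwargs = {"logGroupName": LOG_GROUP}
    while len(messages) < max_events:
        page = logs_client.filter_log_events(**kwargs)
        for event in page.get("events", []):
            messages.append(event["message"])
            if len(messages) >= max_events:
                break
        token = page.get("nextToken")
        if not token:
            break  # no more pages
        kwargs["nextToken"] = token
    return messages


class FakeLogsClient:
    """Offline stand-in that serves pre-baked pages in order."""

    def __init__(self, pages):
        self._pages = list(pages)
        self._index = 0

    def filter_log_events(self, **kwargs):
        page = self._pages[self._index]
        self._index += 1
        return page


# Placeholder pages; real messages carry error details and import metadata.
fake_pages = [
    {"events": [{"message": "placeholder error 1"}, {"message": "placeholder error 2"}],
     "nextToken": "page-2"},
    {"events": [{"message": "placeholder error 3"}]},
]
messages = collect_import_log_messages(FakeLogsClient(fake_pages))
print(messages)
```

With real credentials, pass boto3.client("logs") in place of the fake client.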


Conclusion

DynamoDB import from S3 provides an easy way to import a large amount of data from S3 into a new DynamoDB table. It is integrated with CloudWatch, which creates a log entry for each error. Because DynamoDB import from S3 does not require any additional services to migrate the data, it reduces maintenance cost and speeds up the process.

About CloudThat

CloudThat is an official AWS (Amazon Web Services) Advanced Consulting Partner and Training Partner and a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.

Drop a query if you have any questions regarding DynamoDB, and I will get back to you quickly.

To get started, go through CloudThat's offerings on our Consultancy page and Managed Services Package.

FAQs

  1. How much does it cost to import data from S3?

A. The cost of running an import is based on the uncompressed size of the source data in S3, multiplied by a per-GB cost, which is $0.15 per GB in the US East Region.

Items that are processed but fail to load into the table due to some formatting issues in the source data are also billed as part of the import process.
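
As a quick worked example of that pricing rule, the sketch below multiplies the uncompressed source size by the per-GB rate quoted above. The 500 GB figure is an arbitrary example, and the rate should be checked against current pricing for your Region.

```python
PRICE_PER_GB_USD = 0.15  # rate quoted above for the US East Region


def estimate_import_cost(uncompressed_gb, price_per_gb=PRICE_PER_GB_USD):
    """Estimate import cost in USD from the uncompressed source size in GB."""
    if uncompressed_gb < 0:
        raise ValueError("size must not be negative")
    return round(uncompressed_gb * price_per_gb, 2)


# Arbitrary example: 500 GB of uncompressed source data.
print(estimate_import_cost(500))
```

At the quoted rate, 500 GB of uncompressed data works out to about 75 USD, regardless of how small the compressed files in S3 are.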

  2. What are the limitations of the Import from S3 feature?

A. Data cannot be imported into existing DynamoDB tables; each import creates a new table.
