Import data to DynamoDB from CSV


I have been working with AWS DynamoDB for a while now, and I usually upload JSON data to it with a Python script that makes use of batch updates. In many situations, I first have to convert a CSV file to JSON and then upload it to DynamoDB. In this blog, I am going to show you exactly that.

If you are new to DynamoDB, try dipping your toes using this link.

First of all, you need the access key and secret key of your IAM user if you want to access AWS services from outside the AWS environment. Make sure you have these ready.
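
Rather than pasting the keys into the script, one common option is to read them from environment variables. The sketch below assumes the variables are named AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION (the names boto3 itself looks for); the helper name load_aws_credentials is hypothetical and not part of any library.

```python
import os

def load_aws_credentials(env=os.environ):
    """Return (access_key, secret_key, region) read from environment variables."""
    try:
        access_key = env["AWS_ACCESS_KEY_ID"]
        secret_key = env["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None
    # The region is optional here; None leaves the choice to boto3's own configuration.
    region = env.get("AWS_DEFAULT_REGION")
    return access_key, secret_key, region
```

The returned values can then be passed to boto3.resource in place of hard-coded strings, which keeps the keys out of the source file.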

There are two ways to do this:

  • Using pandas library
  • Without using the pandas library

Using pandas library

Install pandas (and boto3, which the upload code needs) using the following command:

pip install pandas boto3

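Read the CSV file and convert it to a list of JSON records using the following code. parse_float=Decimal is used because boto3 does not accept Python floats for DynamoDB items:

import json
from decimal import Decimal
import pandas as pd

json_data = json.loads(
    pd.read_csv('./data.csv').to_json(orient='records'),
    parse_float=Decimal
)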
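
As a small illustration of the number handling boto3 expects, the sketch below parses a hand-written JSON string (a stand-in for the to_json output; the sample values and the helper name parse_records are made up) and shows that numbers with a fractional part arrive as Decimal while whole numbers stay int.

```python
import json
from decimal import Decimal

# Hand-written stand-in for the text returned by to_json(orient='records')
sample_json = '[{"id": 1, "Name": "Asha", "score": 7.5}]'

def parse_records(json_text):
    """Parse JSON records, turning fractional numbers into Decimal."""
    return json.loads(json_text, parse_float=Decimal)

records = parse_records(sample_json)
print(records)  # [{'id': 1, 'Name': 'Asha', 'score': Decimal('7.5')}]
```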

Pair the records with the name of the target table, as shown below. Here demo-table is the table name; replace it with your own:

json_list = [{'item': json_data , 'table':'demo-table'}]

Enter your access key, secret key and region, then upload the records to DynamoDB as shown below:

import boto3

access_key = ""  # your IAM access key
secret_key = ""  # your IAM secret key
region = ""  # AWS region of the table

dynamodb = boto3.resource('dynamodb', aws_access_key_id=access_key, aws_secret_access_key=secret_key, region_name=region)

def insert_item_to_dynamodb(tablename, items):
    dynamoTable = dynamodb.Table(tablename)

    # put_item writes one record per request
    for record in items:
        dynamoTable.put_item(Item=record)

    print('Success')

for element in json_list:
    insert_item_to_dynamodb(element['table'], element['item'])
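
For larger files, writing one record per request can be slow. A possible alternative, sketched here under the assumption that the table object is a boto3 DynamoDB Table resource, is its batch_writer() context manager, which buffers the writes and sends them in batches. The function name insert_items_in_batches is hypothetical.

```python
def insert_items_in_batches(table, items):
    """Write items through the table's batch writer and return how many were queued."""
    count = 0
    # batch_writer() buffers put requests and sends them in batches;
    # leaving the with-block flushes whatever is still buffered.
    with table.batch_writer() as batch:
        for record in items:
            batch.put_item(Item=record)
            count += 1
    return count

# Usage, assuming the dynamodb resource and json_list from the code above:
# for element in json_list:
#     table = dynamodb.Table(element['table'])
#     print(insert_items_in_batches(table, element['item']))
```

The function only relies on batch_writer() and put_item(Item=...), so it can also be exercised with a stand-in object before pointing it at a real table.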

Without using the pandas library

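In this method, you have to specify the column names yourself when uploading to DynamoDB. Since no keys are passed here, boto3 picks up credentials from its default configuration (for example, environment variables or the ~/.aws/credentials file).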

import csv
import boto3

table_name = "TABLE_NAME"  # enter your table name

db_table = boto3.resource('dynamodb').Table(table_name)

def save_to_dynamodb(item_id, name, age):
    # Write one row; id and age are stored as numbers, Name as a string
    return db_table.put_item(
        Item={
            'id': int(item_id),
            'Name': name,
            'age': int(age)
        })


csv_file = ""  # path to csv
with open(csv_file, 'r', newline='') as f:
    reader = csv.reader(f)  # handles quoted fields that contain commas
    next(reader)  # skip header
    for item_id, name, age in reader:
        result = save_to_dynamodb(item_id, name, age)
        print(result)
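
If you would rather not depend on the column order, a possible variation is to let csv.DictReader key each row by its header name. The sketch below assumes the header row reads id,Name,age and uses a hypothetical COLUMN_TYPES mapping to convert each value; it runs on an in-memory sample instead of a real file.

```python
import csv
import io

# Hypothetical mapping from column name to the converter for that column
COLUMN_TYPES = {'id': int, 'Name': str, 'age': int}

def rows_to_items(csv_text_stream, column_types=COLUMN_TYPES):
    """Yield one dict per CSV row, keyed by header name and type-converted."""
    for row in csv.DictReader(csv_text_stream):
        yield {name: convert(row[name]) for name, convert in column_types.items()}

# Small in-memory CSV standing in for a real file
sample = io.StringIO('id,Name,age\n1,"Doe, Jane",30\n2,Ravi,41\n')
items = list(rows_to_items(sample))
print(items)
# [{'id': 1, 'Name': 'Doe, Jane', 'age': 30}, {'id': 2, 'Name': 'Ravi', 'age': 41}]
```

Each resulting dict can be passed as the Item argument of put_item.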

Happy Programming!!
