Connect a PostgreSQL RDS instance and a Python AWS Lambda function

I recently needed to write from a Lambda function into a PostgreSQL RDS instance. Normally, I would just copy all my Python dependencies from my virtual env into a “dist” folder, zip that folder up with the lambda_function.py file, and deploy that to S3, then Lambda.

For reasons well beyond the scope of this post, however, that method doesn’t work with the psycopg2 library. Lucky for us, this repo contains precompiled versions of psycopg2 for both Python 2.* and Python 3.6. The instructions in the repo are clear, so we’ll follow them: make a directory and copy in the appropriate psycopg2 library for use with Python 3.6. (Don’t forget to rename it from psycopg2-3.6 to psycopg2.)

Next, we’ll fire up an RDS instance of PostgreSQL.

aws rds create-db-instance \
    --db-subnet-group-name default \
    --db-instance-identifier LambdaPGConnect \
    --db-instance-class db.t2.micro \
    --engine postgres \
    --allocated-storage 5 \
    --publicly-accessible \
    --db-name LambdaPGConnectDB \
    --master-username lambdapgconnect \
    --master-user-password lambdapgconnect1234 \
    --backup-retention-period 3 \
    --vpc-security-group-ids sg-########

I chose a subnet-group and security-group in my default VPC, which is fine for this tutorial but not recommended for production work. Remember the security-group id; we’ll need it in a few minutes. I set the --publicly-accessible flag for this instance so that I can access it via my SQL client. (It can take a bit of time for AWS to create your DB.)

After the RDS instance is created, we’ll make some edits to the Security Group’s inbound rules, then log in to the instance via our PostgreSQL client of choice to create a table and populate it.

From the AWS console, go to RDS > Databases and click on the database you just created. Under ‘Connectivity’, look under Security > VPC Security Groups and click on that security group. Under the Inbound tab, click ‘Edit’ and add a ‘PostgreSQL’ rule. The ‘Protocol’ and ‘Port Range’ will self-populate; under ‘Source’, select ‘My IP’. This allows you to reach resources in that VPC from your location’s IP address.

[Screenshot: editing the security group’s inbound rules in the console]

Then, add another rule of type ‘PostgreSQL’. This time set ‘Source’ to ‘Custom’ and add the Security Group’s own id, i.e. we’re telling the Security Group it can access resources from within itself. This rule is necessary because the Lambda function and the RDS instance are in the same group, and their relationship must be made explicit.
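Both rules can also be added from the CLI with `aws ec2 authorize-security-group-ingress`. Here’s a sketch; the group id and IP below are placeholders, and the calls are wrapped in a function so you can source this and run it once you’ve filled in real values:

```shell
# Placeholders: substitute your security group id and your public IP.
sg="sg-0123456789abcdef0"
my_ip="203.0.113.10/32"

add_pg_inbound_rules() {
  # Rule 1: allow your workstation's IP to reach Postgres (port 5432).
  aws ec2 authorize-security-group-ingress --group-id "$sg" \
    --protocol tcp --port 5432 --cidr "$my_ip"
  # Rule 2: allow the group to reach itself, so Lambda can talk to RDS.
  aws ec2 authorize-security-group-ingress --group-id "$sg" \
    --protocol tcp --port 5432 --source-group "$sg"
}
```

Calling `add_pg_inbound_rules` creates the same two inbound rules as the console steps above.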

From the command line, run aws rds describe-db-instances; from that JSON we’ll need the Endpoint.Address and Endpoint.Port, as well as the master-username and master-user-password from the create statement. We should now be able to log in to the instance from our SQL client. Create the table and populate it with the SQL statements below.
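As an aside, if you want to script that lookup, the fields can be picked out of the describe-db-instances JSON easily. The sample below is a trimmed, made-up stand-in for the real output:

```python
import json

# Trimmed, made-up stand-in for `aws rds describe-db-instances` output.
sample = """
{"DBInstances": [
  {"DBInstanceIdentifier": "lambdapgconnect",
   "Endpoint": {"Address": "lambdapgconnect.example.us-east-1.rds.amazonaws.com",
                "Port": 5432}}
]}
"""

instance = json.loads(sample)["DBInstances"][0]
host = instance["Endpoint"]["Address"]
port = instance["Endpoint"]["Port"]
print("{}:{}".format(host, port))
# -> lambdapgconnect.example.us-east-1.rds.amazonaws.com:5432
```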

CREATE TABLE "public"."estabs_tbl" (
    "name"      varchar(255),
    "estab_id"  varchar(255),
    "address"   varchar(255),
    "latitude"  float4,
    "longitude" float4
);
INSERT INTO estabs_tbl (address, estab_id, latitude, longitude, name) VALUES
('1222 24TH ST NW San Antonio, TX 78207','1021','29.4418','-98.5409','CASA DOS LAREDOS'),
('1150 AUSTIN HWY San Antonio, TX 78209','1023','29.487','-98.4476','BUN & BARREL'),
('1306 BASSE RD San Antonio, TX 78212','10235','29.4879','-98.5047','SHOP N SHOP');

Let’s create the Lambda package. We’ll write the whole function in a single file.

lambda_function.py
#!/usr/bin/python
import psycopg2

db_host = "lambdapgconnect.ctxoyiwkbtuq.us-east-1.rds.amazonaws.com"
db_port = 5432
db_name = "LambdaPGConnectDB"
db_user = "lambdapgconnect"
db_pass = "lambdapgconnect1234"
db_table = "estabs_tbl"


def create_conn():
    conn = None
    try:
        conn = psycopg2.connect(
            "dbname={} user={} host={} password={}".format(db_name, db_user, db_host, db_pass))
    except psycopg2.Error as e:
        print("Cannot connect: {}".format(e))
    return conn


def fetch(conn, query):
    result = []
    print("Now executing: {}".format(query))
    cursor = conn.cursor()
    cursor.execute(query)

    for line in cursor.fetchall():
        result.append(line)

    return result


def lambda_handler(event, context):
    query_cmd = "select count(*) from estabs_tbl"
    print(query_cmd)

    # get a connection; create_conn() returns None if the connection fails
    conn = create_conn()

    result = fetch(conn, query_cmd)
    conn.close()

    return result

Next, we’ll create the appropriate role to assign to the Lambda function.

create_role.sh
role_name="lambda-vpc-execution-role"
role_policy_arn="arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"

aws iam create-role --role-name "${role_name}" --assume-role-policy-document file://role_policy.txt
aws iam attach-role-policy --role-name "${role_name}" --policy-arn "${role_policy_arn}"

This role allows Lambda to function within the VPC: the attached AWSLambdaVPCAccessExecutionRole policy grants the privileges, while role_policy.txt is the trust policy that lets the Lambda service assume the role.

role_policy.txt
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

I’ve already generated this role for previous Lambda work, so I’ll just need to generate the Lambda function. The role_arn is created when you generate the role. I’m using the command-line JSON processor jq to pull the subnet_ids and the sec_group_id from the output of the AWS CLI ec2 commands. The function generation itself is straightforward: all the communication and permissions between the RDS instance and the Lambda function were covered when we edited the security group inbound rules and attached the appropriate role and role policy.

create_function.sh
#!/bin/zsh
lambda_name="LambdaPGConnectDB"
zip_file="${lambda_name}.zip"
role_arn="arn:aws:iam::nnnnnnnnnnnn:role/lambda-vpc-execution-role"
subnet_ids=$(aws ec2 describe-subnets |
    jq -r '.Subnets | map(.SubnetId) | join(",")')
sec_group_id=$(aws ec2 describe-security-groups --group-names "default" |
    jq -r '.SecurityGroups[].GroupId')

files="lambda_function.py"
chmod -R 755 ${files}
zip -r "${zip_file}" psycopg2 $files

aws lambda create-function \
    --region "us-east-1" \
    --function-name "${lambda_name}" \
    --zip-file "fileb://${zip_file}" \
    --role "${role_arn}" \
    --handler "lambda_function.lambda_handler" \
    --runtime python3.6 \
    --timeout 60 \
    --vpc-config SubnetIds="${subnet_ids}",SecurityGroupIds="${sec_group_id}"

If you find you need to edit your function, it will probably be small enough to edit directly in the Lambda console, but that’s not a best practice. Instead, use a small shell script like this one to rezip your edited function and push it to AWS.

update_function.sh
#!/bin/zsh
lambda_name="LambdaPGConnectDB"
zip_file="${lambda_name}.zip"

files="lambda_function.py"
chmod -R 755 ${files}
zip -r "${zip_file}" psycopg2 $files

aws lambda update-function-code --region "us-east-1" --function-name "${lambda_name}" --zip-file "fileb://${zip_file}"
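The invoke step itself isn’t shown above; it’s a one-liner, sketched here as a function so the placeholder region and function name are easy to spot and swap:

```shell
invoke_fn() {
  # Run the function and write its return value to output.txt;
  # the invocation status JSON is printed to stdout.
  aws lambda invoke --region "us-east-1" \
    --function-name "LambdaPGConnectDB" output.txt
}
```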

A successful run will yield the following JSON, plus a cheerful [[3]] (the row count) in the output.txt file.

success response
{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}