Tweet Ex Machina: A Serverless Twitter Bot

June 24, 2024

Pure Cloud Excess

Ever had a cool tweet idea at 3 AM, only to forget it by morning? Or perhaps you're like me, prone to bursts of epiphanies at the most inconvenient times. Maybe you’re having coffee with a friend or you're in a meeting.

I want to quickly log the thought before it evaporates, and to prevent it from nagging me later. Also, hey, a short window for editing the tweet would be nice, all without paying for premium X.

As an engineer I decided to address this "problem" through the only logical means: constructing an enterprise-grade, serverless CI/CD pipeline for automated tweet posting. 😆

Project Overview: The Tweet Pipeline

[Diagram of the overall architecture here]

At its core, the Tweet Pipeline is a serverless, CI/CD pipeline for posting tweets to Twitter using AWS SAM and AWS Step Functions. It allows you to:

  1. Compose tweets as inspiration strikes
  2. Automatically stage them to a database
  3. Post them to Twitter on a schedule mimicking human behavior
  4. Run entirely within the AWS free tier

Check out the project on GitHub if you need something like this, or if you're just a fan of overengineering.

To see the pipeline in action, check my weirdo Twitter account Nous Machina, where I attempt to get into character as an all-powerful superintelligent AI who transcends time and space.

Now, let's dissect this overengineered monstrosity.

The Heart of the Operation: Our Lambda Function

Our Lambda function, post_tweet.py, is the pulsating core of this digital construct. It handles two primary tasks: posting tweets and calculating the wait time between posts. Let's peer into its inner workings:

Posting Tweets

First, we check if it's an appropriate time to post:

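from datetime import datetime, timedelta, timezone

def is_within_posting_hours():
    now_utc = datetime.now(timezone.utc)
    # Fixed UTC-8 offset (Pacific Standard Time); daylight saving time is ignored
    now_pst = now_utc - timedelta(hours=8)
    return 6 <= now_pst.hour < 24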

This function ensures we're not tweeting during the wee hours of the morning. While posting at 4 AM is a solid flex, it could signal psycho if you do it too often. Let’s just make sure this never happens.
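The fixed eight-hour offset above matches Pacific Standard Time but drifts by an hour during daylight saving time. As a hedged alternative, here is a minimal sketch using the standard-library zoneinfo module. The America/Los_Angeles zone and the optional now parameter are assumptions made for this illustration, not part of the project code.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the account "lives" in US Pacific time.
PACIFIC = ZoneInfo("America/Los_Angeles")


def is_within_posting_hours(now=None):
    """Return True between 06:00 and midnight Pacific time."""
    # `now` is a hypothetical parameter added so the check can be tested.
    if now is None:
        now = datetime.now(timezone.utc)
    local = now.astimezone(PACIFIC)
    return local.hour >= 6


# 13:30 UTC on July 1 is 06:30 PDT; a fixed -8 offset would read it as 05:30.
summer_morning = datetime(2024, 7, 1, 13, 30, tzinfo=timezone.utc)
can_post = is_within_posting_hours(summer_morning)
```

The trade-off is a dependency on the system's time zone database, which is normally present on Lambda's Linux runtimes but worth verifying before relying on it.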

Why Lambda?

You might wonder, "Couldn't we just use a simple cron job on a container or EC2?" Yep. But by using Lambda, we stay serverless and within AWS's generous free tier. In my personal account, where I’m trying to keep costs as low as possible, I prefer to use containers and EC2 sparingly and only as a last resort.

When it's time to post, we fetch a tweet from our DynamoDB table:

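from boto3.dynamodb.conditions import Attr

# Unposted tweets for this account; only the id and tweet text are returned
response = table.scan(
    FilterExpression=Attr('posted').eq(False) & Attr('twitter_account').eq(TWITTER_ACCOUNT),
    ProjectionExpression='id, tweet',
    Select='SPECIFIC_ATTRIBUTES'
)

# Lowest id first
tweets = sorted(response['Items'], key=lambda x: x['id'])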

We're only interested in unposted tweets for our specific Twitter account, sorted by ID to maintain chronological integrity.
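One thing to keep in mind: a single DynamoDB scan call reads at most 1 MB of data (measured before the filter is applied) and signals that more pages remain with LastEvaluatedKey. For a small tweet queue this rarely matters, but a minimal sketch of draining every page could look like this. The helper name scan_all and the FakeTable stand-in are hypothetical, added only so the loop can be run without AWS credentials.

```python
def scan_all(table, **scan_kwargs):
    """Collect items from every page of a DynamoDB scan.

    `table` is anything with a boto3-style `scan(**kwargs)` method.
    """
    items = []
    while True:
        response = table.scan(**scan_kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Hypothetical stand-in for a boto3 Table, used only to exercise the loop."""

    def __init__(self, pages):
        self.pages = pages

    def scan(self, **kwargs):
        index = kwargs.get("ExclusiveStartKey", {}).get("page", 0)
        response = {"Items": self.pages[index]}
        if index + 1 < len(self.pages):
            response["LastEvaluatedKey"] = {"page": index + 1}
        return response


table = FakeTable([[{"id": 2, "tweet": "b"}], [{"id": 1, "tweet": "a"}]])
tweets = sorted(scan_all(table), key=lambda x: x["id"])
```

With the real table, the call would take the same FilterExpression and ProjectionExpression arguments as the scan shown earlier.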

Why DynamoDB?

A traditional SQL database could work here, but DynamoDB offers a generous free tier and scales effortlessly.

Calculating Wait Time

To make our posting schedule more human-like, we calculate a random wait time:

import random

def calculate_wait_time():
    # Random wait of 1 to 6 hours, expressed in seconds
    wait_time = random.randint(60, 360) * 60
    now_utc = datetime.now(timezone.utc)
    now_pst = now_utc - timedelta(hours=8)
    next_run_time = now_pst + timedelta(seconds=wait_time)
    # If the next run lands between 01:00 and 06:00, add whole hours until hour 6
    if 1 <= next_run_time.hour < 6:
        extra_wait = (6 - next_run_time.hour) * 60 * 60
        wait_time += extra_wait
    return wait_time

This function not only randomizes the wait time but also adjusts it to avoid posting during our designated "sleep" hours.
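To see how the adjustment behaves, here is a small self-contained variant of the same calculation. The now_pst and rng parameters and the FixedRng stub are hypothetical additions that make the result deterministic for experimenting; the arithmetic otherwise mirrors the function above.

```python
import random
from datetime import datetime, timedelta


def calculate_wait_time(now_pst, rng=random):
    """Return a wait in seconds: 1 to 6 hours, pushed out of the 01:00-06:00 window."""
    wait_time = rng.randint(60, 360) * 60
    next_run_time = now_pst + timedelta(seconds=wait_time)
    if 1 <= next_run_time.hour < 6:
        # Add whole hours until the clock hour reaches 6.
        wait_time += (6 - next_run_time.hour) * 60 * 60
    return wait_time


class FixedRng:
    """Hypothetical stub that always 'rolls' the same number of minutes."""

    def __init__(self, minutes):
        self.minutes = minutes

    def randint(self, low, high):
        return self.minutes


# At 23:30, a 4-hour roll would land at 03:30, so 3 more hours are added.
late_night = datetime(2024, 6, 24, 23, 30)
wait = calculate_wait_time(late_night, FixedRng(240))
next_post = late_night + timedelta(seconds=wait)
```

Because the extra wait is a whole number of hours, the post lands somewhere between 06:00 and 06:59 rather than exactly at 06:00, which adds a little more jitter for free.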

Orchestrating the Process: Step Functions

To manage the flow of our tweet posting process, we use AWS Step Functions. Here's how we define our state machine in the template.yaml:

TweetPostingStateMachine:
  Type: AWS::StepFunctions::StateMachine
  Properties:
    DefinitionString:
      Fn::Sub: |
        {
          "StartAt": "CheckAndPostTweet",
          "States": {
            "CheckAndPostTweet": {
              "Type": "Task",
              "Resource": "${PostTweetFunction.Arn}",
              "Next": "CalculateWaitTime"
            },
            "CalculateWaitTime": {
              "Type": "Task",
              "Resource": "${PostTweetFunction.Arn}",
              "Parameters": {
                "task": "calculate_wait_time"
              },
              "Next": "LoopChoice"
            },
            # ... there's more, but you get the point

This state machine defines the flow of our tweet posting process:

  1. Check if we can post a tweet
  2. If yes, post it and calculate the next wait time
  3. Wait for the calculated time
  4. Repeat
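Both Task states invoke the same Lambda function, and the second one passes a task parameter to select the wait-time branch. A minimal sketch of how a handler might dispatch on that field follows. The names lambda_handler and post_next_tweet, the stubbed bodies, and the return shapes are assumptions for illustration; the actual handler in the project may differ.

```python
import random


def post_next_tweet():
    """Hypothetical stub standing in for the fetch-and-post logic."""
    return {"posted": False}


def calculate_wait_time():
    """Simplified stub: a random wait between 1 and 6 hours, in seconds."""
    return random.randint(60, 360) * 60


def lambda_handler(event, context):
    # The CalculateWaitTime state sends {"task": "calculate_wait_time"}.
    if event.get("task") == "calculate_wait_time":
        return {"wait_time": calculate_wait_time()}
    # Any other input is treated as a request to check and post.
    return post_next_tweet()


result = lambda_handler({"task": "calculate_wait_time"}, None)
```

A Wait state could then read the returned number through its SecondsPath field, which is how Step Functions supports waits whose duration comes from state data rather than a hard-coded value.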

Why Step Functions?

Lambda is billed by execution time, so you want each invocation to fire off and terminate quickly. If we kept it running in a while loop waiting for the timer to expire, that would be expensive and dumb, and a single invocation is capped at 15 minutes anyway. If your app needs to do that, use a container. Step Functions also let you pass state from one Lambda run to the next; in this case it's the “wait time”. Another viable scheduler would be EventBridge.

[Image: AWS Twitter Step Function]

Step Functions are like having a flowchart that actually runs your code.

Building and Deploying: The CI/CD Pipeline

Finally, we use AWS CodeBuild to automate our build and deployment process. Here are the key parts of our buildspec.yaml:

phases:
  install:
    runtime-versions:
      python: 3.12
    commands:
      - pip install aws-sam-cli
  build:
    commands:
      - cd lambda
      - pip install -r requirements.txt -t .
      - cd ..
      - sam build -t template.yaml
  post_build:
    commands:
      - sam package --template-file template.yaml --output-template-file packaged.yaml --s3-bucket $S3_BUCKET
      - sam deploy --template-file packaged.yaml --stack-name tweet-queue-posting-stack --capabilities CAPABILITY_IAM --no-fail-on-empty-changeset

This buildspec does the following:

  1. Sets up our Python environment
  2. Installs our Lambda function dependencies
  3. Builds our SAM template
  4. Packages and deploys our application to AWS

Why a CI/CD pipeline?

For a simple tweet machine, manual deployments would work fine. But by using CodeBuild, we're setting ourselves up for future expansion. Want to add a feature? Just push to GitHub and watch your changes automatically deploy. And it's easy to set up and included in the free tier, so why not?

In Defense Of Overengineering

One might ponder, "Couldn't a simple scheduling app suffice?" And they'd be correct. But where's the artistry in that? This project is a testament to the engineer's creed of "Why do something in 5 minutes when you can spend 5 hours automating it?"

By creating this enterprise-lite application for personal tweet automation, we're not just posting tweets; we're making a statement. A statement that says “Yes, I have an AWS 🔨 hammer, so yes, everything is an 🗡️ arn::.”

The Bigger Picture: AI-Powered Business Automation

While this project might seem like an exercise in serverless extravagance, there's actually a practical reason. I’m building out a larger project, and this is but one shard. I'm exploring the use of LLM function calls (and AI agents in some cases) to handle various aspects of automating a business, focusing on areas where human-like intuitive decision-making is particularly valuable and challenging to achieve with traditional algorithms.

In my next post, I’ll show you how I’m using an LLM to replace the Step Function and tweet scheduler. Instead of relying on predefined rules, the AI will analyze the tweet log and queue, determining optimal posting times based on complex factors like content, engagement patterns, and current events.

For now, check the full project code of Tweet Pipeline on GitHub.