Writing a Chat Bot Auto Responder

Chat bots are popular in the industry right now. They are used for customer service, devops, and even product management. In this post, I’ll dive into writing a very simple bot while dealing with an inconsistent chat service API.

The Problem

An organization that I belong to uses GroupMe as its group chat solution. When new members join the group chat (channel), someone from the leadership team sends them a direct message (DM) welcoming them and asking them to fill out a Google Forms survey. Since we’re not always active in the channel, we risk a slow turnaround between someone joining the channel and us reaching out to them (attrition is a problem).

I felt that this process could use some automation.

The Constraints

I wanted a lightweight solution (i.e. don’t change the process too much).
The solution, if it involved tech, should be cheap (a.k.a. cost $0).
The channel user activity was relatively low (mostly used for announcements and some bursts of chatter).
The solution should still feel “high-touch”. It should feel personal when user contact is made.

Solution: Make an Auto Responder

When new members join the channel, have something automatically DM that person, greeting them and asking them to fill out our survey.

The question then becomes, how?

GroupMe has a notion of chat bots, server-side configured entities that join and listen to all the messages and actions that happen in a given channel. For each event that happens, it sends a callback (via HTTP) for you to reason about.

A possible auto responder could work like this:

Straightforward. How do we deal with the constraints?

Lightweight: The process stays the same; user joins, we send them a message.
Cheap: We own the auto responder service, so we should host it somewhere free (GCP / AWS / Heroku micro tiers are all viable).
Scale: The cheapest cloud hosting tiers are sufficient for our throughput and response time needs.
High-Touch: If we can send them a message as one of us, instead of the bot, even better.

The first launched version of this setup is written in Go and runs as a CloudFunction in GCP.¹ The CloudFunction was estimated to be free given our traffic rates. I chose Go because CloudFunctions support only a few languages: JavaScript (via Node), Python, and Go. I find no joy in coding in JavaScript. I hadn’t written a lick of Python in many years. I didn’t know Go (still don’t), but thought it could be fun to learn a bit of it for a small side project.

Snags

The GroupMe bot sends a callback request for every bit of activity in the channel that it’s listening to. The callback payload from the GroupMe bot looks like the following:

{
  "attachments": [],
  "avatar_url": "https://i.groupme.com/123456789",
  "created_at": 1302623328,
  "group_id": "1234567890",
  "id": "1234567890",
  "name": "GroupMe",
  "sender_id": "2347890234",
  "sender_type": "system",
  "source_guid": "sdldsfv78978cvE23df",
  "system": true,
  "text": "Alice added Bob to the group.",
  "user_id": "1234567890"
}
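A minimal sketch of decoding that payload in Go. The type and function names (`Callback`, `parseCallback`) are hypothetical, and only the fields shown in the sample above are mapped.

```go
package main

import "encoding/json"

// Callback mirrors a subset of the bot callback payload shown above.
// The struct name and field selection are illustrative assumptions.
type Callback struct {
	ID         string `json:"id"`
	GroupID    string `json:"group_id"`
	Name       string `json:"name"`
	SenderID   string `json:"sender_id"`
	SenderType string `json:"sender_type"`
	System     bool   `json:"system"`
	Text       string `json:"text"`
	UserID     string `json:"user_id"`
	CreatedAt  int64  `json:"created_at"`
}

// parseCallback decodes the raw request body of one callback.
func parseCallback(body []byte) (Callback, error) {
	var cb Callback
	err := json.Unmarshal(body, &cb)
	return cb, err
}
```

In an HTTP handler, the request body would be read in full and passed to `parseCallback` before any further checks.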

I need enough information from this notification to:

deduce whether this is a “user joined the group” event
if so, get a unique user identifier so that I can message the user directly

There wasn’t an “event type” for the payload, so I used regular expressions on the text attribute to infer whether a payload corresponded to one of the two possible join events (a user joined the group on their own, or a set of users was invited to the group by an existing group member).
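The text matching could look roughly like the sketch below. The “added” pattern follows the sample payload; the “has joined” wording for self-joins is an assumption, since the post does not show that message, and the names `addedRe`, `joinedRe`, and `isJoinEvent` are hypothetical.

```go
package main

import "regexp"

var (
	// addedRe matches the invite wording from the sample payload,
	// e.g. "Alice added Bob to the group."
	addedRe = regexp.MustCompile(`^.+ added .+ to the group\.?$`)
	// joinedRe ASSUMES self-joins read like "Bob has joined the group";
	// that wording is a guess and is not taken from the post.
	joinedRe = regexp.MustCompile(`^.+ has joined the group\.?$`)
)

// isJoinEvent reports whether a message's text looks like a join event.
// Requiring system == true keeps a regular user who types the same words
// from triggering the responder.
func isJoinEvent(system bool, text string) bool {
	if !system {
		return false
	}
	return addedRe.MatchString(text) || joinedRe.MatchString(text)
}
```

Matching on display text is brittle: if GroupMe rewords its system messages, the patterns silently stop matching.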

I thought that the user_id was the id of the user that joined the group. I was wrong. In the wild, the user_id is the id of the user that created the text. So if a user sends a message to the channel, the id belongs to that user. For “join events” the user that wrote that “message” to the channel is the system (GroupMe) which has the special id of 0. There’s no point in sending a direct message to the system.

Without a user id, I could not send a message to that user through the GroupMe /direct_messages API. I needed to get the user id(s) another way.
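To show why the user id matters, here is a sketch of the JSON body a direct message request might carry. The `direct_message` envelope and the `source_guid`, `recipient_id`, and `text` field names are assumptions based on my reading of GroupMe’s public API docs and should be checked there; `directMessageBody` is a hypothetical helper.

```go
package main

import "encoding/json"

// directMessage holds the ASSUMED fields of a /direct_messages request.
type directMessage struct {
	SourceGUID  string `json:"source_guid"`  // client-generated id for the message
	RecipientID string `json:"recipient_id"` // the user id we are missing
	Text        string `json:"text"`
}

// directMessageBody builds the JSON body for one direct message,
// wrapped in the assumed "direct_message" envelope.
func directMessageBody(recipientID, guid, text string) ([]byte, error) {
	return json.Marshal(map[string]directMessage{
		"direct_message": {SourceGUID: guid, RecipientID: recipientID, Text: text},
	})
}
```

The recipient is addressed purely by id, so a display name from the join text is not enough to send the message.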

One option was to look up the group’s member list from the /groups/:id API. I would have to match up the user’s name against the list of members (though names are also mutable). That API also doesn’t support any member list filtering, sorting, or pagination. I didn’t want to use an API whose response body would grow with every user added to the group.

A second option would be to not rely on the GroupMe bot events at all. There exists a long-polled or websockets API for GroupMe. I could have listened to our channel on my own and reacted to its push messages. The problem with this approach is that the payload looks basically like the bot’s payload.

[
  {
    "id": "5",
    "clientId": "0w1hcbv0yv3puw0bptd6c0fq2i1c",
    "channel": "/meta/connect",
    "successful": true,
    "advice": { "reconnect": "retry", "interval": 0, "timeout": 30000 }
  },
  {
    "channel": "/user/185",
    "data": {
      "type": "line.create",
      "subject": {
        "name": "Andygv",
        "avatar_url": null,
        "location": {
          "name": null,
          "lng": null,
          "foursquare_checkin": false,
          "foursquare_venue_id": null,
          "lat": null
        },
        "created_at": 1322557919,
        "picture_url": null,
        "system": false,
        "text": "hey",
        "group_id": "1835",
        "id": "15717",
        "user_id": "162",
        "source_guid": "GUID 13225579210290"
      },
      "alert": "Andygv: hey"
    },
    "clientId": "1lhg38m0sk6b63080mpc71r9d7q1",
    "id": "4uso9uuv78tg4l7csica1kc4c",
    "authenticated": true
  }
]

Also, I didn’t want my app to be long-lived (hosting costs), since join events aren’t as common as other channel activity.

Note that there isn’t an API to get an individual user’s information (aside from your own).

I chose a third option. When a “join event” is sent from the bot, I would ask for the most recent N messages from that channel, match up the join event message id with the message id for that event in the channel (they’re the same!), and use the message data to get the user id.
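A sketch of building the “most recent N messages” request URL. The API root, the `limit` parameter, and passing the token as a query parameter are assumptions based on GroupMe’s public docs; `recentMessagesURL` is a hypothetical helper.

```go
package main

import (
	"net/url"
	"strconv"
)

// apiBase is ASSUMED to be GroupMe's public v3 API root.
const apiBase = "https://api.groupme.com/v3"

// recentMessagesURL builds the URL for fetching the latest `limit`
// messages of a group. "limit" and "token" are assumed parameter names.
func recentMessagesURL(groupID, token string, limit int) string {
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("token", token)
	return apiBase + "/groups/" + url.PathEscape(groupID) + "/messages?" + q.Encode()
}
```

N can stay small in a low-traffic channel, because the callback arrives shortly after the join message is posted.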

Take a look at a response from the :group_id/messages API:

{
  "response": {
    "count": 42,
    "messages": [
      {
        "attachments": [],
        "avatar_url": null,
        "created_at": 1554426108,
        "favorited_by": [],
        "group_id": "231412342314",
        "id": "155442610860071985",
        "name": "GroupMe",
        "sender_id": "system",
        "sender_type": "system",
        "source_guid": "5053cc60396c013725b922000b9ea952",
        "system": true,
        "text": "Bob added Alice to the group.",
        "user_id": "system",
        "event": {
          "type": "membership.announce.added",
          "data": {
            "added_users": [{ "id": 1231241235, "nickname": "Alice" }],
            "adder_user": { "id": 234234234, "nickname": "Bob" }
          }
        },
        "platform": "gm"
      }
    ],
    "meta": { "code": 200 }
  }
}

Surprisingly, each message has an optional event attribute with a type and applicable user ids! I wish the event was included in the callback from the bot.
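A sketch of pulling the user ids out of that response by matching on the message id. The type and function names are hypothetical. Only the `membership.announce.added` shape shown above is handled, because the post does not show the self-join event’s shape.

```go
package main

import (
	"encoding/json"
	"strconv"
)

// messagesEnvelope maps only the parts of the response used here.
type messagesEnvelope struct {
	Response struct {
		Messages []message `json:"messages"`
	} `json:"response"`
}

type message struct {
	ID    string `json:"id"`
	Event *struct { // nil when a message carries no event attribute
		Type string `json:"type"`
		Data struct {
			AddedUsers []struct {
				ID       int64  `json:"id"`
				Nickname string `json:"nickname"`
			} `json:"added_users"`
		} `json:"data"`
	} `json:"event"`
}

// addedUserIDs returns the ids of the users added by the message whose id
// equals messageID. It returns nil if no such message is found or if the
// message is not a "membership.announce.added" event.
func addedUserIDs(body []byte, messageID string) ([]string, error) {
	var env messagesEnvelope
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, err
	}
	for _, m := range env.Response.Messages {
		if m.ID != messageID || m.Event == nil {
			continue
		}
		if m.Event.Type != "membership.announce.added" {
			return nil, nil
		}
		ids := make([]string, 0, len(m.Event.Data.AddedUsers))
		for _, u := range m.Event.Data.AddedUsers {
			// The sample shows numeric ids; format them as strings for later use.
			ids = append(ids, strconv.FormatInt(u.ID, 10))
		}
		return ids, nil
	}
	return nil, nil
}
```

Returning a slice covers the case where one event adds several users at once.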

The updated sequence flow looks like:

Extra Bits

The GroupMe API requires a token for authentication. This token is stored as an environment variable on the CloudFunction and is not stored in version control. Basic stuff.
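A small sketch of reading the token at startup. The variable name `GROUPME_TOKEN` is hypothetical (the post does not name it), and `tokenFrom` takes the lookup function as a parameter only so it can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"os"
)

// tokenEnvVar is a HYPOTHETICAL variable name; the post does not name it.
const tokenEnvVar = "GROUPME_TOKEN"

// tokenFrom looks up the token with the given getenv-style function and
// fails loudly when it is missing, instead of sending unauthenticated calls.
func tokenFrom(getenv func(string) string) (string, error) {
	tok := getenv(tokenEnvVar)
	if tok == "" {
		return "", errors.New(tokenEnvVar + " is not set")
	}
	return tok, nil
}

// groupMeToken reads the token from the process environment.
func groupMeToken() (string, error) {
	return tokenFrom(os.Getenv)
}
```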

There is a single HTTP client used across invocations of the cloud function. This allows me to use connection pooling so that I can avoid multiple SSL handshakes when talking to the GroupMe API.
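One way to get that reuse is a package-level client, sketched below. The 10 second timeout and the `fetchStatus` helper are illustrative assumptions, not values from the original service.

```go
package main

import (
	"io"
	"net/http"
	"time"
)

// client is created once, at package initialization, so later invocations
// served by the same function instance reuse its pooled connections.
// The timeout value is an assumption.
var client = &http.Client{Timeout: 10 * time.Second}

// fetchStatus performs a GET with the shared client and returns the
// HTTP status code.
func fetchStatus(url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can return to the pool.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, err
	}
	return resp.StatusCode, nil
}
```

Creating a new client (and transport) inside the handler would give each invocation its own empty pool and bring the handshake back on every call.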

Intentional Holes

This setup works as intended, but there are cases that I purposefully don’t account for.

It may be possible for GroupMe to send duplicate events and the responder does not care. It does not store data on whether it has responded to the same event. I haven’t seen duplicate events yet, but even if they occurred, I deemed “users receiving dupe messages” as OK (low traffic channel).

It is also possible that GroupMe’s bot API may not send events at all. There is no reconciliation process to check that every join event has been handled.

Hope you enjoyed the writeup. Till next time.

¹ I originally wrote all of this in Elixir/Phoenix and ran it in GCP AppEngine. The problem was that in order to run Elixir code, I needed to run on AppEngine’s Flex Environment, which is not a free tier. Sad, because Elixir is my current favorite language.