My quest to track and publish everywhere I go is easily my longest-running and most mature side project. It’s a constant source of engineering motivation, tweaks, and improvements, and I’ve been meaning to spend some time describing its architecture and operation. It’s a lot to write about all at once, so I’ll break it into three parts:

Generating Location Data

An early requirement of this project was automation. I knew that if I had to manually register every place I'd been each time I came back from a trip, the project would devolve into an incomplete list of cities and quickly fizzle out.

The first obstacle was finding a way to track my location automatically. I initially considered scraping location data from the photos I take, but I don't take nearly enough photos to generate a dataset that's really interesting.

Fortunately, I found Owntracks, a free and open-source application for iOS and Android. I planned to eventually replace Owntracks with a bespoke application (and still might, one day…), but honestly, it's proven nearly perfect for my needs. It supports a bunch of configuration options and modes that I don't use, but it does let me specify an endpoint to send location updates to.

Owntracks' interface and its impact on my phone’s battery life

In Move mode, Owntracks sends a location update to the configured endpoint whenever I've moved about 200 meters. If I'm not moving, it sends an update every three minutes. If I'm in a situation where location data won't be useful (like getting on a plane), or when I need to conserve battery, I'll switch it to Quiet mode, which pauses data collection.

And while we're on the subject, managing my phone's battery has become a more frequent concern since I started using Owntracks. I don't think this is the fault of the app itself as much as it is the reality of using Location Services all day, but the screenshot above shows that Owntracks was responsible for 28% of battery use over the last 24 hours. It's a noticeable impact, but I'm willing to charge my phone more often to make it all work.

If my phone doesn’t have a network connection for any reason (like being deep in the woods, on a boat in the ocean, or if it’s in Airplane Mode), Owntracks will cache all the location data it’s collected locally until the phone reconnects. I’m sure there’s an upper limit, but I’ve seen the number of cached locations pass 10,000 before, no problem.

The app hooks into my phone’s location services to determine its location, bundles it with some other data, and shoots it over to my endpoint. It’s in JSON format with this schema:

{
    "_type": "location",
    "lat": "61.191928", // latitude 
    "lon": "-149.886639", // longitude 
    "acc": 12, // accuracy in meters 
    "alt": 35, // altitude in meters 
    "cog": 270, // heading, in degrees 
    "tid": "8D", // tracker ID
    "tst": 1583326398, // unix timestamp
    "vac": 3, // vertical accuracy in meters 
    "vel": 43, // speed in km/h
    "batt": 82, // battery percentage

    // ...and other fields I don't use.
}
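
For reference, here's a minimal Go struct that could decode the fields above. This is just a sketch based on the sample payload (where lat and lon arrive as strings), not code from the project itself:

package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// LocationUpdate mirrors the subset of the Owntracks payload shown above.
// Note that lat and lon arrive as strings in the sample.
type LocationUpdate struct {
	Type      string `json:"_type"`
	Latitude  string `json:"lat"`
	Longitude string `json:"lon"`
	Accuracy  int    `json:"acc"`  // meters
	Altitude  int    `json:"alt"`  // meters
	Heading   int    `json:"cog"`  // degrees
	TrackerID string `json:"tid"`
	Timestamp int64  `json:"tst"`  // unix timestamp
	VertAcc   int    `json:"vac"`  // meters
	Velocity  int    `json:"vel"`  // km/h
	Battery   int    `json:"batt"` // percent
}

func main() {
	sample := `{"_type":"location","lat":"61.191928","lon":"-149.886639","acc":12,"tid":"8D","tst":1583326398}`

	var loc LocationUpdate
	if err := json.Unmarshal([]byte(sample), &loc); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("tracker %s reported (%s, %s)\n", loc.TrackerID, loc.Latitude, loc.Longitude)
}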

Receiving Location Data

On the phone side, generating location information is completely automated and continuous. This means I need a system that’s ready to receive location updates from my phone 24/7. I turned to AWS and built the rest of the system in the cloud.

I used API Gateway to provide a stable endpoint I could plug into Owntracks: a super simple REST API that supports just one operation, POST /locations.
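
To illustrate, a raw update posted to that endpoint would look something like this. The URL below is a placeholder (not my real endpoint), since in practice Owntracks does the posting for me:

package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	payload := []byte(`{"_type":"location","lat":"61.191928","lon":"-149.886639","tid":"8D","tst":1583326398}`)

	// Placeholder endpoint; the real one is generated by API Gateway.
	url := "https://example.execute-api.us-west-2.amazonaws.com/prod/locations"

	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println("status:", resp.Status)
}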

Location data is then forwarded to the next service, the locations-receiver Lambda. I chose Lambda for a few reasons. First, my load is inconsistent: if I'm sitting at home sending a single location record every few minutes, I don't need a dedicated server waiting around all day. Updates are sent more frequently when I'm on the move, but even in the highest-load case (like when my phone reconnects to the network and flushes a batch of cached location records at once) the load never exceeds a couple of requests per second. Thanks to Lambda's elasticity, I don't need to worry about scaling when load increases. Second, Lambda is super, super cheap: its very generous free tier lets me run all of the Lambda resources this project requires for free.
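
For a sense of the shape of that function, here's a skeletal handler in Go against API Gateway's proxy integration, using the aws-lambda-go runtime. It's a sketch of the general structure, not the project's actual code (which is linked below):

package main

import (
	"context"
	"encoding/json"
	"net/http"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

// handler receives each POST /locations request proxied through API Gateway.
func handler(ctx context.Context, req events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
	var payload map[string]interface{}
	if err := json.Unmarshal([]byte(req.Body), &payload); err != nil {
		return events.APIGatewayProxyResponse{StatusCode: http.StatusBadRequest}, nil
	}

	// Owntracks can send other _type values; only location messages matter here.
	if payload["_type"] != "location" {
		return events.APIGatewayProxyResponse{StatusCode: http.StatusOK}, nil
	}

	// ...filter fields, compute the Geohash, and write to DynamoDB (sketched below).

	return events.APIGatewayProxyResponse{StatusCode: http.StatusOK}, nil
}

func main() {
	lambda.Start(handler)
}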

locations-receiver invocations over the last three days

The code running on the Lambda is written in Go and available here. It’s a tiny program that basically just parses the input, filters it for the values I care about, calculates the location’s Geohash encoding, then stores it in a database for persistence. I don’t actually use the Geohash yet, but it’s convenient to calculate it now because it becomes really important in the next step.
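
The parse-encode-store step might look roughly like the sketch below. I'm assuming the github.com/mmcloughlin/geohash package, the v1 AWS SDK for Go, and an illustrative table layout; the actual repository may differ on all three counts:

package main

import (
	"fmt"
	"log"
	"strconv"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/mmcloughlin/geohash"
)

// storeLocation computes the point's Geohash and persists the record.
// The table name and attribute names are assumptions for illustration.
func storeLocation(latStr, lonStr, trackerID string, timestamp int64) error {
	lat, err := strconv.ParseFloat(latStr, 64)
	if err != nil {
		return fmt.Errorf("bad latitude: %w", err)
	}
	lon, err := strconv.ParseFloat(lonStr, 64)
	if err != nil {
		return fmt.Errorf("bad longitude: %w", err)
	}

	gh := geohash.Encode(lat, lon) // full-precision (12-character) Geohash

	svc := dynamodb.New(session.Must(session.NewSession()))
	_, err = svc.PutItem(&dynamodb.PutItemInput{
		TableName: aws.String("locations"),
		Item: map[string]*dynamodb.AttributeValue{
			"tid":     {S: aws.String(trackerID)},
			"tst":     {N: aws.String(strconv.FormatInt(timestamp, 10))},
			"lat":     {N: aws.String(strconv.FormatFloat(lat, 'f', -1, 64))},
			"lon":     {N: aws.String(strconv.FormatFloat(lon, 'f', -1, 64))},
			"geohash": {S: aws.String(gh)},
		},
	})
	return err
}

func main() {
	// Sample values from the payload shown earlier.
	if err := storeLocation("61.191928", "-149.886639", "8D", 1583326398); err != nil {
		log.Fatal(err)
	}
}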

For persistence, I chose DynamoDB for its convenience, performance, and cost. DynamoDB is a managed NoSQL database service and a great fit for this project because I don’t use the kind of heavy queries that depend on SQL support. What I do require is what DynamoDB excels at: grabbing a lot of records, sorted by an index (like a timestamp), quickly.
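
As an example of that access pattern, pulling a full day of points in timestamp order is a single Query. The tid partition key and tst sort key below are my assumed key schema for illustration, not necessarily the table's real layout:

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

func main() {
	svc := dynamodb.New(session.Must(session.NewSession()))

	// Fetch all points for tracker "8D" on 2020-03-04 (UTC), oldest first.
	out, err := svc.Query(&dynamodb.QueryInput{
		TableName:              aws.String("locations"),
		KeyConditionExpression: aws.String("tid = :tid AND tst BETWEEN :start AND :end"),
		ExpressionAttributeValues: map[string]*dynamodb.AttributeValue{
			":tid":   {S: aws.String("8D")},
			":start": {N: aws.String("1583280000")},
			":end":   {N: aws.String("1583366400")},
		},
		ScanIndexForward: aws.Bool(true), // ascending by the tst sort key
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("fetched %d location records\n", len(out.Items))
}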

Day to day, my loads are really small on the DynamoDB side as well. My consumed write capacity (DynamoDB’s unit of resource consumption) usually hovers just above zero. In situations where I want to read a lot of records from the database, it will quickly auto-scale in response to my demands.

Write capacity consumption for the locations table over the last three days

With location records traveling from my phone via Owntracks, through API Gateway and Lambda, and finally coming to rest in DynamoDB, that concludes part one of this series. In the next post, I'll discuss how the data is compiled and prepared.