The tech stack I reach for first when working on a static site is always S3 for hosting and Cloudfront for distribution. Cloudfront provides a few basic analytics numbers for free, right out of the box. The three most interesting to me are the most requested objects (Popular Objects), which domains are sending the most traffic (Top Referrers), and the proportion of bots, desktop, and mobile users (Viewers). It’s limited, but it’s enough for me.

That is, until recently. I decided that a side project might be a great use case for QR codes. By placing a few in conspicuous public areas, some portion of the passing population might roll the dice and scan the code. The only thing I want to know is: how many people are doing that?

If you’re looking for fully-featured modern web analytics, there are plenty of services out there that will give you more features than you ever knew you wanted, and usually for free. I don’t need all those features. I’m happy enough just knowing how many times a page was loaded, maybe with some awareness of how many of those page loads weren’t bots. I’m a simple man.

A couple things I’m not doing

So why not reach for one of those off-the-shelf analytics platforms? First, it’s not as interesting. But they’re also overkill. I really only care about the raw number of views here. The project’s pretty geographically specific, so I’m not interested in knowing how many of the site’s viewers are from Germany. It’s just static text, so I don’t really care what portion are on mobile, which version of iOS they’re running, or the dimensions of their viewport.

I could instead build my own super basic analytics platform, one with only the features I care about. The QR code could lead to a URL with an appended query string, like ?source=qr, for example. This would then be scraped by the site’s front-end JavaScript and quickly sent to a Lambda that would stuff the record into a DynamoDB or Aurora Postgres table, all while existing comfortably within the AWS free tier at the view counts I’m expecting. I’d have to provision those services and write some code to make sense of the data. Easily doable over a weekend, but that sounds like a hassle.
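For a sense of what the front-end half of that rejected approach might look like, here’s a minimal sketch. The collection endpoint and the shape of the record are placeholders I made up for illustration; the Lambda and table on the other end aren’t shown.

```javascript
// Hypothetical endpoint: in practice this would be the URL fronting the Lambda.
const COLLECT_URL = "https://api.example.org/collect";

// Return the value of the `source` query parameter, or null if it is absent.
function extractSource(pageUrl) {
    return new URL(pageUrl).searchParams.get("source");
}

// Build the record to store (an assumed shape), or null if there is no source.
function buildRecord(pageUrl, now = new Date()) {
    const source = extractSource(pageUrl);
    if (source === null) return null;
    return {
        source,
        path: new URL(pageUrl).pathname,
        viewedAt: now.toISOString(),
    };
}

// In a browser, queue the record for sending without delaying the page.
if (typeof window !== "undefined" && typeof navigator.sendBeacon === "function") {
    const record = buildRecord(window.location.href);
    if (record !== null) {
        navigator.sendBeacon(COLLECT_URL, JSON.stringify(record));
    }
}
```

Even this small piece needs an endpoint, a table, and something to query it later, which is exactly the hassle I’m trying to avoid.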

Ultimately, the effort required by this solution should be proportional to the very mild curiosity it satisfies. A one-hour project. Promise.

The one thing I am doing

Here’s the plan:

  1. Write a basic HTML page that will do nothing more than redirect the viewer to the target domain
  2. Create a new Cloudfront distribution to serve this redirect page
  3. Add a CNAME DNS record to point the qr subdomain to the new Cloudfront distribution
  4. Create QR codes that point to the qr subdomain

When a user scans the QR image, they’ll first load the qr.example.org domain. Cloudfront serves the redirect page, which immediately sends them to www.example.org. The redirect happens within the browser rather than at the DNS level, so the Referer header on the request to www.example.org is set to qr.example.org, and the referrer is one of the few values Cloudfront’s analytics tracks.

Here are the updated DNS records, with the new subdomain CNAME at the bottom:

The redirect is simple HTML and JavaScript, and fast enough to be unnoticeable to users in most cases:

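<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>Redirecting...</title>
    </head>
    <body>
        <script>
            // Navigate from script so the request to the target site
            // identifies this page in its Referer header.
            window.location.href = "https://www.example.org/";
        </script>
        <noscript>
            <!-- Fallback link for visitors with JavaScript disabled -->
            <a href="https://www.example.org/">Continue to www.example.org</a>
        </noscript>
    </body>
</html>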

After testing the qr.example.org domain and being redirected a few times, success! Cloudfront’s Top Referrers list shows a few requests referred by the subdomain.

Now I have a solid grasp on how many people came to the site by scanning the QR code. This approach can be easily extended, too. With a catch-all DNS record pointing to the redirect distribution, I could add an ID to the subdomain each time a QR code is generated to track how many times each individual code has been scanned, or generate links on the fly to track the source of any page load.
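As a rough sketch of that extension, the two helpers below build a per-code URL and recover the ID from a referrer value. The <id>.qr.example.org naming scheme is my own assumption, as are the function names, and it presumes the wildcard record, the distribution, and its certificate all cover *.qr.example.org.

```javascript
// Assumed naming scheme: <id>.qr.example.org, covered by a wildcard DNS record.
const QR_BASE_DOMAIN = "qr.example.org";

// Build the URL to encode in the QR code for a given ID.
function qrUrlFor(id) {
    // The ID becomes a DNS label, so restrict it to lowercase letters,
    // digits, and interior hyphens, 63 characters at most.
    if (!/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/.test(id)) {
        throw new Error(`"${id}" is not a valid DNS label`);
    }
    return `https://${id}.${QR_BASE_DOMAIN}/`;
}

// Recover the ID from a referrer value, or null if it is not a per-code subdomain.
function codeIdFromReferrer(referrer) {
    let host;
    try {
        host = new URL(referrer).hostname;
    } catch {
        // Not a full URL: treat the value as a bare hostname.
        host = referrer.toLowerCase();
    }
    const suffix = `.${QR_BASE_DOMAIN}`;
    if (!host.endsWith(suffix)) return null;
    const id = host.slice(0, -suffix.length);
    // Reject anything nested more than one label deep.
    return id.includes(".") ? null : id;
}
```

Calling qrUrlFor("bus-stop-3") gives https://bus-stop-3.qr.example.org/, and passing that URL to codeIdFromReferrer returns "bus-stop-3" again, so scans can be grouped by code.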