Tracking user behavior via funnels
In the previous post I explained a bit about my side project. I want to help people avoid surprising expirations of their SSL certificates. While at it, I hope to learn a bit about building a helpful website, and perhaps more.
A quick recap: I created a website that tells you when the SSL certificate of your server will expire. I also present the user with a bunch of relevant hostnames for casual browsing within the website, checking on other hostnames that may be of interest. No ads, no subscriptions, no sign-in, no emails. Just one thing: tell me what hostname you are after, and I’ll tell you when its SSL certificate is going to expire. Simple!
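For readers curious what such a check involves, here is a minimal sketch in Python using only the standard library. It is an illustration under my own assumptions, not the site's actual code: the function names are made up, and the site may well do this differently.

```python
import socket
import ssl
from datetime import datetime, timezone


def expiry_from_cert(cert: dict) -> datetime:
    """Read the expiry time out of a certificate dict as returned by getpeercert()."""
    # 'notAfter' looks like "Jan  5 09:34:43 2018 GMT"; the ssl module parses it.
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_left(expiry: datetime, now: datetime) -> int:
    """Whole days between now and expiry (negative once the certificate has expired)."""
    return (expiry - now).days


def fetch_cert(hostname: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and return the server certificate (needs network access)."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

Calling `days_left(expiry_from_cert(fetch_cert("medium.com")), datetime.now(timezone.utc))` would give the remaining days for a live host. One caveat of this sketch: the default context verifies the certificate, so a host whose certificate has already expired raises an error instead of returning a negative number.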
Just a bit more ‘building’…
I started with SSL certificate expiration in mind, but there’s another problem I’ve seen (not experienced first-hand) where domain registrations expire without being renewed, causing their owners unnecessary headache. So I thought about stretching my non-existent brand, and chose to go after ‘keep you from facepalm-inducing expirations of things you knew would expire’ rather than just ‘SSL certificate expiration monitor’. A few hours of work, and my site now provides domain expiration information alongside the SSL certificate details.
A few more technical bits to complete, just because I have my habits:
- Logging. I’m using Papertrail as a Heroku addon, and write a single log line for every request served by my app. The User-Agent string, the URL that was requested, and some data on the response being sent all go in the log. This is a very blunt way to see what the app is handling, but I fully control the level of detail, and I notice whatever errors occur.
- Uptime. I’m signing up with updown.io and have their service poke my app once a minute so I can rest assured it is up and running. I expose a specific endpoint for this check, so that it doesn’t mess up my logging, and doesn’t masquerade as a user trying to get some value from my app. The free tier of updown.io can suffice for several apps at a high monitoring frequency, check them out!
- Security. I’m adding helmet to my app to gain a bunch of security best practices for a web app. CSP is driving me up the wall a bit, but I get around it with some more time. Security is table stakes, and gets harder to do well the longer you wait!
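The logging and uptime habits above fit together in one small rule: one log line per request, except for the monitoring probe. A rough Python sketch of that rule follows. The `/healthz` path and the log line format are my assumptions for illustration; the post names neither, and the real app (helmet is an Express middleware) is not written in Python.

```python
import logging
from typing import Optional

logger = logging.getLogger("requests")

# Hypothetical path for the uptime probe; the post does not name the real one.
HEALTH_CHECK_PATH = "/healthz"


def format_request_line(
    method: str, path: str, status: int, user_agent: str, millis: float
) -> Optional[str]:
    """Build the single log line for a request, or None for the uptime probe."""
    if path == HEALTH_CHECK_PATH:
        return None  # keep monitoring pings out of the request log
    return f'method={method} path={path} status={status} ms={millis:.0f} ua="{user_agent}"'


def log_request(method: str, path: str, status: int, user_agent: str, millis: float) -> None:
    """Emit the request line through the standard logging module, if there is one."""
    line = format_request_line(method, path, status, user_agent, millis)
    if line is not None:
        logger.info(line)
```

Keeping the formatting in a pure function makes it easy to check that the probe really stays out of the log.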
Enough building, let’s see what the users think
So you already know I used a bunch of popular websites to provide some ~light shaming~ advance warning to major brands running the risk of having an expired SSL certificate in production. I also started a tiny AdWords campaign, spending just a few dollars a day promoting my website. All this noise-making has brought around 20–30 unique visitors per day, each stepping through 1 to 5 pages before continuing with their Internet browsing plans.
My logs are showing me what people are searching for, and I notice one small thing. People don’t always act the way they are asked to! I know, big surprise here :) Specifically, some users would paste a whole URL (https://some.website.com/with/path) in the text field clearly marked with a ‘Hostname’ label. The nerve! :)
Noticing this, I have a ‘should have thought of that!’ moment, and implement some more delicate parsing of user input. After all, it’s better to assume the user is interested in the hostname of the URL they pasted, rather than telling the user ‘this is not what I asked for’. The difference between a consumer and an enterprise product, perhaps? Up to you to decide.
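A forgiving parser of this kind can be quite short. The sketch below is one way to do it in Python with the standard library; the function name is hypothetical and this is not the app's actual parsing code.

```python
from urllib.parse import urlsplit


def extract_hostname(user_input: str) -> str:
    """Return the lowercase hostname from a bare hostname or a pasted URL ('' if none)."""
    text = user_input.strip()
    # Without a scheme, urlsplit treats "example.com/path" as a path,
    # so prefix "//" to make it parse as a network location.
    if "://" not in text:
        text = "//" + text
    # .hostname drops any port and userinfo and lowercases the result.
    return urlsplit(text).hostname or ""
```

With this in place, `https://some.website.com/with/path` and `some.website.com` both lead to the same lookup, and an empty result can still trigger a friendly "please enter a hostname" message.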
I stall for a minute, thinking about other not-exactly-correct types of input that I can expect to see. For example, the app doesn’t support plain IP addresses, while these can be relevant to my users. I stop right there, deciding to act on actual signal (seeing IP addresses being used) rather than future-proofing the app for every use-case.
Paying tribute to the SEO gods
I mentioned SEO in the previous post, so here come the details of what I’ve done. While my app is very simple, and basically has a single page, it also has a point of strength: it can be seen as an app with as many pages as there are websites out there! After all, each public website served over SSL can be used to generate a unique page on my website. I tweak the app a bit to support URL-path-based browsing (as opposed to query params), so /?q=adrukh.medium.com results in the same page as /ssl/adrukh.medium.com. This is based on my obscure hunch that search engines would score URL-path-based pages slightly higher than query-param-based pages.
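To make the equivalence of the two URL shapes concrete, here is a small Python sketch that resolves both to the same hostname. Only the `/?q=` and `/ssl/` shapes come from the post; the function name and the handling of edge cases are my assumptions.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def hostname_from_request(url: str) -> Optional[str]:
    """Map both /?q=<host> and /ssl/<host> to the same hostname, or None if neither matches."""
    parts = urlsplit(url)
    if parts.path.startswith("/ssl/"):
        # Path-based form: everything after the /ssl/ prefix is the hostname.
        host = parts.path[len("/ssl/"):].strip("/")
        return host.lower() or None
    if parts.path in ("", "/"):
        # Query-based form: the hostname travels in the q parameter.
        values = parse_qs(parts.query).get("q")
        if values and values[0].strip():
            return values[0].strip().lower()
    return None
```

A router built on this could serve the page for the path form and redirect the query form to it, so that search engines only ever see one URL per hostname.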
I then create a long-ish sitemap based on this guide, populating it with a few thousand pages, each with a different hostname. I submit it to Google Search Console, and… wait for the crawlers to come! Apparently, Google have a lot of work to do besides crawling my website, and here is the rate at which they go through my sitemap:
You can see that they crawl some 1K pages once every 3–4 days.
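Generating a sitemap with one entry per hostname, as described above, takes only a few lines. The following Python sketch shows the idea; it is not the script I used, and the function name is hypothetical. The `/ssl/<hostname>` URL shape is the one from this post.

```python
import xml.etree.ElementTree as ET
from typing import Sequence

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # limit set by the sitemap protocol


def build_sitemap(base_url: str, hostnames: Sequence[str]) -> str:
    """Return sitemap XML with one <url> entry per hostname page."""
    if len(hostnames) > MAX_URLS_PER_SITEMAP:
        raise ValueError("too many URLs for one sitemap file; split it up")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for host in hostnames:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/ssl/{host}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

For example, `build_sitemap("https://www.haveibeenexpired.com", ["medium.com"])` yields a file with a single entry pointing at the medium.com results page.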
You may wonder what the 21 errors are about! Here’s another piece of learning for me to share with you. I was following my past experience, thinking that if a request was made with a valid hostname, but my app failed to figure out the SSL certificate details, the page presenting the error message should be served with an HTTP 500 response status. While this makes sense in many cases, it glows in red on the Google Search Console, and probably downgrades my app’s search rating. I used this indication to step away from my past practice, and now every successfully rendered page is served with HTTP 200. So no more such errors, and hopefully Google will re-scan the errored pages and report 0 errors sometime soon!
Analytics and funnels
Google Analytics stops working for some reason. Maybe I messed something up with CSP, maybe I just clicked a button I didn’t mean to click. But a day or two goes by with Google Analytics telling me I have 0 users, while my logging shows me otherwise. I decide to go after an alternative, and end up with Heap Analytics instead. They offer a much more product-management-oriented solution, tracking user engagement with specific features. This is better for me right now, as I don’t have a clear ‘conversion’ metric just yet.
Heap Analytics helps me think differently about my app, and I come up with a few funnels to help me see what users come for, versus what they actually do on my site.
There are two main entry points to the app: either / for the homepage (i.e. a user searched for terms relevant to my app and clicked the link), or /ssl/some.host.name (i.e. a user clicked a link I posted to warn a major brand of an impending SSL certificate or domain registration expiration). It’s reasonable to assume users of the two types will differ in their expectations. For example, the first type would be more likely to have their own website in mind, for which they want to know when their SSL certificate will expire. The second type would probably be more interested in browsing a few websites, more popular or less.
Here are the funnels I created:
While this is not exactly professional on my part, I’m surprised at the level of detail available when tracking user behavior on a site with basically a single page. This complexity is most likely working against me, but I’m learning!
So my idea is to see how ‘deep’ users would go, checking one hostname after the other. Say a user lands on a page of a specific hostname following a link I posted on Twitter. How many such users will click on a related link? How many will visit a page for a different hostname? And again? You get the point.
While at it, I also want to compare the difference between the two types of users I described above — those starting from the homepage versus those starting from a results page of a specific hostname.
I don’t have any expectations for what ‘good’ or ‘bad’ looks like. Instead, I want to see what the current user behavior is, and call it ‘baseline’. Then, any change I make can be measured in impact compared to the baseline.
Main takeaways for the current baseline:
- A user landing on the homepage is 55% likely to check on a specific hostname
- A user landing on a specific hostname check is 32% likely to check on another hostname
- Both types of users are almost equally likely to ‘click through’ 4 additional hostname pages — 9% of the users go that far
- The means I took to make it easier to stay on the site (presenting users with popular or somehow related hostnames for single-click navigation) contribute very little: between 2% and 5% of users actually click on those
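Heap computes these numbers for me, but the arithmetic behind such a funnel is simple enough to sketch. The Python below is an illustration only: it assumes each session is an ordered list of visited paths, which is my simplification and not Heap's data model, and all names are hypothetical.

```python
from typing import List, Sequence


def entry_kind(session: Sequence[str]) -> str:
    """Classify a session by its landing page: homepage or a hostname results page."""
    first = session[0]
    if first == "/":
        return "home"
    if first.startswith("/ssl/"):
        return "hostname"
    return "other"


def further_checks(session: Sequence[str]) -> int:
    """Count hostname pages viewed after the landing page."""
    return sum(1 for page in session[1:] if page.startswith("/ssl/"))


def funnel(sessions: Sequence[Sequence[str]], kind: str, max_depth: int = 4) -> List[float]:
    """Share of sessions of the given entry kind reaching at least 1..max_depth further checks."""
    cohort = [s for s in sessions if s and entry_kind(s) == kind]
    if not cohort:
        return [0.0] * max_depth
    return [
        sum(1 for s in cohort if further_checks(s) >= depth) / len(cohort)
        for depth in range(1, max_depth + 1)
    ]
```

Running `funnel(sessions, "home")` and `funnel(sessions, "hostname")` side by side gives the per-step comparison between the two types of users, and rerunning it after a change shows the movement against the baseline.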
So the interesting part now is to decide what ‘needle’ I want to move upwards, and come up with an experiment or two to see what I can do. This is on my todo list, please comment if you have suggestions, and you’ll appear in the next blog post!
Automating the reach-out function
Alerting major brands to the imminent expiration of their SSL certificate or domain registration is bringing in some traffic. I also hope it helps somewhat with my SEO rating (external links pointing towards my website), while being more helpful than irritating :)
To support this without much hassle, I implemented two API endpoints in my app, coupled with a list of relevant hostnames, and I crawl them periodically. The API is not sufficiently stable for me to promote it as part of the service, but I’ll leave a couple of breadcrumbs here should you want to experiment with them by yourself. Just try https://www.haveibeenexpired.com/api/ssl/medium.com or https://www.haveibeenexpired.com/api/domain/medium.com and take it from there. Not promising anything stable about these endpoints, buyer beware :)
I’m now scraping my own app using these endpoints every other day or so, generating CSV files that I later process manually with the help of Google Spreadsheets. Interesting pieces of trivia come from these exercises, as well as the ability to alert certain brands to the upcoming expiration of their assets.
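The CSV step can be sketched as below, in Python. I am leaving out the fetching part on purpose, since the API's response format is not stable and not described here. The column names, the 14-day alert threshold, and the function name are all assumptions made for this illustration.

```python
import csv
import io
from datetime import date
from typing import Iterable, Tuple


def expirations_csv(
    records: Iterable[Tuple[str, date]], today: date, warn_days: int = 14
) -> str:
    """Render (hostname, expiry date) pairs as CSV, soonest expiry first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["hostname", "expires_on", "days_left", "alert"])
    for host, expires_on in sorted(records, key=lambda record: record[1]):
        days = (expires_on - today).days
        # Flag anything expiring within the warning window (or already expired).
        alert = "yes" if days <= warn_days else "no"
        writer.writerow([host, expires_on.isoformat(), days, alert])
    return out.getvalue()
```

Sorting by expiry date puts the hostnames worth alerting about at the top of the spreadsheet.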
An interesting side effect is that I can extend the list of relevant hostnames for crawling by relying on the additional hostnames stored in most of the SSL certificates that I find. Starting with around 6K hostnames, I am now scraping through 72K! I’m sure there’s more interesting data to extract here…
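The additional hostnames in a certificate are presumably its Subject Alternative Name entries. Under that assumption, growing the list looks roughly like the Python sketch below. The certificate dict follows the shape Python's `getpeercert()` returns; the function names, the choice to skip wildcard entries, and the `limit` parameter are mine, not a description of my actual scraper.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def san_hostnames(cert: dict) -> List[str]:
    """Return DNS names from a getpeercert()-style dict, skipping wildcard entries."""
    names = []
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS" and not value.startswith("*."):
            names.append(value.lower())
    return names


def expand_hostnames(
    seeds: Iterable[str], fetch_cert: Callable[[str], dict], limit: int = 100_000
) -> Set[str]:
    """Breadth-first expansion: check each hostname, queue any new names its certificate lists."""
    seen: Set[str] = set()
    queue = deque(host.lower() for host in seeds)
    while queue and len(seen) < limit:
        host = queue.popleft()
        if host in seen:
            continue
        seen.add(host)
        try:
            cert = fetch_cert(host)
        except OSError:
            continue  # unreachable host or TLS failure: keep the name, learn nothing new
        for name in san_hostnames(cert):
            if name not in seen:
                queue.append(name)
    return seen
```

Passing the certificate fetcher in as a parameter keeps the expansion logic testable without touching the network.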
I’m enjoying the manual work around this, not in a rush to fully automate it yet. I learn a lot from looking at the data, and I feel like there’s more value to extract from this scraping before I settle on something sufficient to be automated.
What did I learn?
- Going wide to offer domain expiration monitoring was an interesting decision. It definitely increased the app’s ‘addressable audience’, but probably cost me a bit in diluting the message of what the app is about.
- Setting up the tech bits that give me peace of mind (monitoring, security) was a nice touch to leave the building stage aside for now. It helped me ‘turn the page’ in my mind to seeking user value with the existing offering. Don’t be afraid to ‘build just one more thing’ as long as you can see how it allows you to move forward to the next goal.
- Look closely at how your users actually interact with your app! Noticing that someone pasted a URL into the field where I expected hostnames has gained me a few more points, instead of users giving up because the app didn’t understand what they wanted.
- Analytics is a big deal! Many solutions, many different use-cases. Comparing the different alternatives out there was educational — I learnt what I can expect from analytics solutions. Having a reliable comparison benchmark in my own logs helped validate the analytics data I was seeing. Still a lot of ground for me to cover here, I’m sure I’m making rookie mistakes.
That’s it for now, thanks for reading through it all! Let’s see if I can keep a cadence of posting an update every other week :)