Funnels in experimentation: A perfect pair šŸ

Wed Sep 18 2024

Craig Sexauer

Data Scientist, Statsig

Funnel metrics are a common and powerful tool to track sequences of actions people take through tools and products.

In most analytics platforms, funnels are a table-stakes feature and can offer rich insight into how a productā€™s users behave and where people drop off in their usage.

Unfortunately, funnels havenā€™t been heavily used or invested in by experimentation platforms.

This is due to some mix of complexity, actual limitations, and perceived limitations. We recently invested more into the funnel functionality at Statsig, and Iā€™d like to explain whyā€”and make the case for funnels as a core part of the modern experimenterā€™s toolkit.

What do funnels solve?

Funnels allow you to measure complex relationships with a higher degree of clarity. For example, you see revenue flatten while product page views go up. You can infer that conversion has gone down, but at what stage?

You can get a good guess by looking at how intermediate steps have changed, but this process is prone to data leakage, and the results are usually fuzzy at best.

Funnels streamline this process and add clarity by specifying the order events have to take place in, per user or session, and measuring ā€œhow far they make it into the funnel.ā€ You can also measure stepwise conversion, making it easy to understand exactly where users dropped out of a product.
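To make this concrete, hereā€™s a minimal sketch of how an ordered, per-user funnel could be computed from a raw event log. This is illustrative only (not how Statsig computes funnels): the DataFrame columns (user_id, event, timestamp) and the step names are hypothetical, and a real implementation would also enforce a time-to-complete window.

```python
import pandas as pd

STEPS = ["view_product", "add_to_cart", "checkout", "purchase"]  # hypothetical steps

def user_funnel_depth(events: pd.DataFrame, steps=STEPS) -> pd.Series:
    """Count how many ordered funnel steps each user completed."""
    depths = {}
    for user_id, user_events in events.sort_values("timestamp").groupby("user_id"):
        depth = 0
        for event in user_events["event"]:
            if depth < len(steps) and event == steps[depth]:
                depth += 1  # only the *next* step in the sequence advances the funnel
        depths[user_id] = depth
    return pd.Series(depths, name="funnel_depth")

def stepwise_conversion(depths: pd.Series, steps=STEPS) -> pd.Series:
    """Of users who reached step i, what fraction reached step i+1?"""
    reached = [(depths >= i + 1).sum() for i in range(len(steps))]
    rates = [
        reached[i + 1] / reached[i] if reached[i] else float("nan")
        for i in range(len(steps) - 1)
    ]
    labels = [f"{steps[i]} -> {steps[i + 1]}" for i in range(len(steps) - 1)]
    return pd.Series(rates, index=labels)
```

Overall conversion is then just `(depths == len(STEPS)).mean()`, while the step-level rates show exactly where users drop off.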

This is true for most products, but especially for those with buyflows, subscriptions, or ā€œdaily habitsā€ users should haveā€”these behaviors will have a clear set of steps for users to complete, and growth teams are already usually thinking of things in terms of a ā€œfunnelā€ from landing-page users to successful conversions.

Setting up a funnel in Statsig and choosing metric sources and types

When we introduced our initial funnel product to Statsig Warehouse Native, I expected it to be a fairly niche tool. However, as weā€™ve added more advanced settings, weā€™ve seen broad adoption, and our most sophisticated users have adopted funnels as a way to explore results more deeply and to share a more intuitive view of whatā€™s happening inside the data with their teammates.

Drawbacks of funnels

In my career as a Data Scientist, Iā€™ve seen and run experiments across gaming, social media, feed ranking, growth buyflows, media marketing, and more.

Only in growth did I see regular usage of ā€œfunnel metrics,ā€ and it was often done poorly, grouping funnels by test variant in Mixpanel and eyeballing the per-unit ā€œCIā€ provided in the UI.

Based on this, I was initially fairly cynical about funnels. Itā€™s true that funnels do have some fundamental limitations:

  • A funnel rate in the context of an experiment can be tricky (or impossible) to extrapolate out to "topline impact" after launch.

  • Funnels can become fairly complex to calculate, and simple changes to how youā€™re treating them (count unique users? count sessions? does order matter?) can make two analyses of the same dataset quite different.

  • Funnels are rarely treated with statistical rigor: In my previous experience working with growth teams, the data team spent a lot of cycles trying to appropriately qualify funnel-based ship decisions, since low-volume, high-noise funnels were being used as decision criteria.

Iā€™ve seen experimentation teamsā€”based on the valid concerns aboveā€”treat funnels with a fairly dismissive attitude.

The result is that data teams relegate funnels, and all of their rich complexity, to the land of product analytics, deeming them not worthy of inclusion in experiment readouts.


How to use funnels properly

Funnels provide a super-rich and intuitive readout of ā€œwhat happened with our users?ā€ Using them well is mostly an exercise in risk mitigation. My (verbose) guide for healthy funnel usage is:

  • Funnels should never be your primary success metric. The bottom of the funnel is what experiments should aim to move. Funnels are to be used as powerful diagnostic tools to help you understand what drove that target behavior (or what didnā€™t)!

  • Use funnels as a powerful post-hoc tool. If you donā€™t understand the relationship between two related observations (e.g., click notification, send message) in an experiment readout, Iā€™d try creating a local funnel metric between the two to see if thereā€™s an obvious drop-off.

  • Carefully scope funnel metrics. Pick an appropriate ā€œtime to completeā€ window, whether for the full funnel or between steps. One common pitfall is an unbounded user-level funnel: over a long experiment, the success rate trends towards 1 because users get more ā€œchances.ā€

  • Consider when a user- vs. session-level funnel is appropriate. If you want to measure a userā€™s journey to subscription, you care about user-level data. If you care about improving your checkout flow for products, tracking this data at the session level is more powerful, measuring (successes / tries) instead of (successful users / users who tried).

  • Make sure your experiment tool makes it clear how a funnel is being calculated, and that settings can be standardized across the organization. Donā€™t calculate funnels in ad-hoc notebooks or reports unless thereā€™s a clear standard or process for doing so.

  • Make sure your experiment tool treats funnels correctly as ratio metrics, applying the appropriate corrections to variance. Measure the relative change in conversion rigorously, rather than just comparing two conversion rates and eyeballing them (a minimal sketch of the delta-method correction follows this list).
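For that last point, hereā€™s a minimal sketch of the delta-method variance for a funnel conversion treated as a ratio of means. It assumes per-unit NumPy arrays of successes and attempts (for example, completed vs. attempted sessions per user, matching the session-level framing above); itā€™s an illustration of the statistics, not Statsigā€™s implementation.

```python
import numpy as np

def ratio_mean_and_var(successes: np.ndarray, attempts: np.ndarray):
    """Delta-method approximation of Var(mean(successes) / mean(attempts))."""
    n = len(successes)
    mx, my = successes.mean(), attempts.mean()
    vx, vy = successes.var(ddof=1), attempts.var(ddof=1)
    cov = np.cov(successes, attempts, ddof=1)[0, 1]
    ratio = mx / my
    var = (vx / my**2 - 2 * mx * cov / my**3 + mx**2 * vy / my**4) / n
    return ratio, var

def conversion_lift_z(test_s, test_a, ctrl_s, ctrl_a):
    """z-statistic for the difference in funnel conversion between two groups."""
    rt, vt = ratio_mean_and_var(test_s, test_a)
    rc, vc = ratio_mean_and_var(ctrl_s, ctrl_a)
    return rt - rc, (rt - rc) / np.sqrt(vt + vc)
```

Naively treating the conversion rate as a simple proportion ignores the covariance between a unitā€™s successes and attempts, which is exactly what the delta method corrects for.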

With these steps taken, the downsides of funnels are addressed, and suddenly you have a very flexible and very powerful tool for digging deeper into experiment results.


Important features

Funnel quality varies drastically across platforms: some experimentation platforms only offer ā€œone-step,ā€ unordered funnelsā€”which are basically conversion metricsā€”and others offer basic ordering capabilities but not much else.

Hereā€™s what we recommend looking for:

Table stakes

  • Statistical rigor: Make sure funnel conversions have the delta method applied, and that event ordering is handled with sound, consistent logic.

  • Ordered events: For funnels to be really useful, you should be able to specify that users do events in a specific sequence over time.

  • Multiple-step funnels: Two-step funnels can be useful, but the ability to add intermediate steps as needed for richer understanding is critical.

  • Step-level and overall conversion changes: This is how you can identify where drop-offs happen.

  • Calculation windows: Being able to specify the maximum duration a user has to finish a funnel is critical to running longer experiments.

Important

  • Session breakdowns: Being able to specify session keys or a sessionization method and count multiple funnels per user allows you to examine a much larger variety of product use casesā€”particularly check-out flows, daily tasks, or other recurring flows.

  • Step-level conversion windows: Being able to say ā€œStep B needs to happen within an hour of step Aā€ cuts down meaningfully on noise, and reduces confusion about how a funnel conversion came to be.

  • Time-to-convert functionality: Being able to measure whether your changes made the funnel take longer or shorter to complete can help avoid buyflow bloat, or help you slim down your user journeys. The platform should also give context on success rate: Making the funnel shorter by eliminating the slow users is usually not good! (A small sketch of reporting timing alongside conversion is included below.)

Viewing a checkout funnel with a time to complete delta in Statsig
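As a rough illustration of the time-to-convert point above, hereā€™s a small sketch that reports timing percentiles alongside the conversion rate, assuming a hypothetical table with one row per funnel attempt and start_ts / finish_ts timestamps (finish_ts null for attempts that never converted).

```python
import pandas as pd

def time_to_convert_summary(attempts: pd.DataFrame) -> pd.Series:
    """Report conversion rate and time-to-convert together, never timing alone."""
    completed = attempts.dropna(subset=["finish_ts"])
    durations = completed["finish_ts"] - completed["start_ts"]
    return pd.Series({
        "conversion_rate": len(completed) / len(attempts),
        "median_time_to_convert": durations.median(),
        "p90_time_to_convert": durations.quantile(0.9),
    })
```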

Nice-to-have

  • Timestamp management: Being able to specify whether events can occur simultaneously, and whether thereā€™s an allowance for clock skew causing events to be logged slightly out of order, can be very important in system/performance use cases.
