You probably understand that whether you're testing a new feature, a marketing campaign, or a business process, the success of your experiment hinges on how well you measure and understand the results. That's why you're searching for information on how to build an experiment scorecard: it's the lens through which you read the results of your experiment.
In this blog, we'll explore the principles behind effective experiment scorecards, delve into best practices, and discuss some key metrics that should be on your radar. Initially, I'll do my best to avoid Statsig-specific solutions and focus on the governing principles.
Hopefully by the end, you'll have a solid foundation to apply these concepts to a wide range of experimentation scenarios. And for those curious about how technology can streamline this process, we'll include a bonus section at the end diving into how Statsig's scorecard approaches these challenges.
An experiment scorecard is more than just a collection of numbers; it's a narrative of your experiment's journey from hypothesis to conclusion.
To construct a scorecard that truly reflects the impact of your experiment, consider the following elements:
Before you can measure anything, you need to know what you're aiming for: a hypothesis. Define the goals of your experiment in clear, quantifiable terms. Are you looking to increase user engagement, boost sales, or reduce churn? Your objectives will shape the metrics you choose to track. Newer experimentation solutions usually provide user inputs for experiment objectives and context.
I also see a lot of customers do this in Google Docs, Notion or Jira.
Related resources:
Statsig's Experimentation Review Template
Optimizely's Experiment Plan and Results Template for Confluence
Choose metrics that directly relate to your objectives. These should be indicators that will move the needle on your goals. For example, if your objective is to improve user retention, metrics like daily active users (DAU) and churn rate are more relevant than page views.
🤖👉 Our article Picking Metrics 101 walks through some thoughts on how to approach picking metrics for an experiment.
To understand the impact of your experiment, you need a reference point. Establish baselines by measuring your metrics before the experiment begins. Set targets for what you consider success, whether it's a 5% increase in conversion rate or a 10% reduction in support tickets.
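To make this concrete, here's a minimal sketch in Python (the column names and the 5% target are illustrative assumptions, not recommendations) of deriving a baseline conversion rate from pre-experiment data and setting a target from it:

```python
import pandas as pd

# Pre-experiment data: one row per user, with a flag for the desired action.
# Column names here are illustrative assumptions.
pre_experiment = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6, 7, 8],
    "converted": [0, 1, 0, 0, 1, 0, 1, 0],
})

baseline_conversion_rate = pre_experiment["converted"].mean()

# Target: a 5% relative lift over the baseline (the number is your call).
target_conversion_rate = baseline_conversion_rate * 1.05

print(f"Baseline: {baseline_conversion_rate:.1%}")
print(f"Target:   {target_conversion_rate:.1%}")
```

The exact numbers matter less than writing them down before the experiment starts, so nobody argues about what "success" meant after the results come in.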
Related reading: A tale of three companies: Why you don’t need millions of users to succeed in experimentation.
I think everyone understands that it's important to identify the KPIs that will give you the quickest insight into whether your experiment is on track, and that these should be metrics that respond rapidly to changes and can be monitored in real-time or near-real-time.
What I see people fail to execute on is the “monitored in real-time or near-real-time” piece. This is a critical factor in the broad adoption of any experiment scorecard.
Whether it's a custom script, an experimentation tool, or a homegrown system, if it doesn't deliver automated results at least daily, you need to set very clear expectations with scorecard consumers about when fresh results will be available, or you risk losing their trust.
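As one way to keep that cadence honest, here's a rough sketch (the event schema and pandas-based aggregation are assumptions; in practice this runs as a scheduled job or lives inside your experimentation tool) of refreshing a leading indicator per variant:

```python
import pandas as pd

def refresh_leading_indicators(events: pd.DataFrame) -> pd.DataFrame:
    """Aggregate a fast-moving metric (click-through rate) per variant.

    Run this on a schedule, at least daily, so scorecard consumers
    always know how fresh the numbers are.
    """
    return (
        events.groupby("variant")
        .agg(users=("user_id", "nunique"), clicks=("clicked", "sum"))
        .assign(click_through_rate=lambda df: df["clicks"] / df["users"])
    )

# Illustrative event log: one row per exposure, with a click flag.
events = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "variant": ["A", "A", "A", "B", "B", "B"],
    "clicked": [1, 0, 1, 0, 1, 1],
})

print(refresh_leading_indicators(events))
```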
Lagging indicators are delayed events or metrics, which can help measure long-term trends.
A balanced scorecard considers lagging indicators, providing a comprehensive view of your experiment's performance. Ultimately, this depends on your type of business, but I’ll share an anecdote with you:
During a POC in the early days of Statsig, a customer wanted to test search functionality in their e-commerce marketplace. Another vendor suggested measuring immediate clicks (which I presume was because it was an easier path to implementation?), which favored variant A for its higher click rate. We recommended incorporating the conversion data, which ultimately landed in their data warehouse an hour or so after purchase, into the experiment scorecard metrics.
By analyzing both immediate and delayed outcomes, Statsig revealed that variant B, despite fewer clicks, led to higher purchase rates and customer lifetime value. We learned that variant A's increased clicks were not due to better search results but because users had to search more to find what they wanted.
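Here's a minimal sketch of that idea in general terms (the table and column names are made up, not the customer's actual schema): join the immediate click events with the purchase data that arrives in the warehouse later, so both the leading and lagging metrics land on the same scorecard:

```python
import pandas as pd

# Leading indicator: search clicks, available immediately.
clicks = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "variant": ["A", "A", "A", "B", "B", "B"],
    "search_clicks": [5, 4, 6, 2, 3, 2],
})

# Lagging indicator: purchases, landing in the warehouse hours later.
purchases = pd.DataFrame({
    "user_id": [1, 3, 4, 5, 6],
    "purchased": [0, 1, 1, 1, 1],
    "order_value": [0.0, 20.0, 35.0, 42.0, 28.0],
})

scorecard = (
    clicks.merge(purchases, on="user_id", how="left")
    .fillna({"purchased": 0, "order_value": 0.0})
    .groupby("variant")
    .agg(
        avg_search_clicks=("search_clicks", "mean"),
        purchase_rate=("purchased", "mean"),
        avg_order_value=("order_value", "mean"),
    )
)
print(scorecard)
```

The point isn't the code; it's that the scorecard waits for (or backfills) the lagging data instead of calling the experiment on clicks alone.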
Creating an effective scorecard is an iterative process. Here are some best practices to guide you:
Avoid cluttering your scorecard with too many metrics. Focus on a handful of measures that truly reflect the success of your experiment; this makes it easier to draw meaningful conclusions and take action. Irrelevant metrics will mostly just distract the people reading the scorecard.
Kinda obvious but… use charts, graphs, and color-coding to make your data easy to digest. Visual representations can help stakeholders quickly grasp the results and implications of your experiment. A lot of people I talk to prefer Bayesian in part for this reason; it’s easier to communicate what to make of the results.
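If it helps, here's a minimal illustration of why that is (a simple Beta-Binomial model with made-up counts): instead of a p-value, you can tell stakeholders "there's roughly an X% chance B beats A."

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up results: (conversions, users) per variant.
a_conversions, a_users = 120, 2400
b_conversions, b_users = 150, 2400

# Beta(1, 1) prior updated with the observed conversions.
a_posterior = rng.beta(1 + a_conversions, 1 + a_users - a_conversions, 100_000)
b_posterior = rng.beta(1 + b_conversions, 1 + b_users - b_conversions, 100_000)

prob_b_beats_a = (b_posterior > a_posterior).mean()
print(f"Chance that B beats A: {prob_b_beats_a:.0%}")
```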
An experiment is a living process, and your scorecard should be too. Review it regularly to ensure it remains aligned with your objectives, and don't be afraid to adjust as you learn more about what's working and what isn't.
Share your scorecard with all relevant stakeholders. Clear communication ensures that everyone understands the goals, progress, and outcomes of the experiment.
Use your scorecard not just as a report card but as a learning tool. Analyze the results to understand why certain changes occurred and how you can apply these learnings to future experiments.
While the specific metrics you track will depend on your experiment's objectives, here are some relatively universal measures I see experimenters use (there's a small sketch of computing the first one after the list):
Conversion rate: The percentage of users who take a desired action.
User engagement: Metrics like session duration, pages per session, or feature usage.
Revenue metrics: Sales, average order value, or lifetime value.
Customer satisfaction: Net promoter score (NPS), customer satisfaction score (CSAT), or support ticket trends.
Operational efficiency: Time to complete a process, error rates, or cost savings.
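To make the first of these concrete, here's a minimal sketch of comparing conversion rates between two variants with a two-proportion z-test (the counts are made up, and your experimentation tool will usually handle this for you):

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return the z statistic and two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Made-up counts: conversions and users per variant.
z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=150, n_b=2400)
print(f"Lift: {(150/2400 - 120/2400) / (120/2400):.1%}, z = {z:.2f}, p = {p:.3f}")
```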
For those seeking a streamlined approach to experiment scorecards, Statsig offers a robust solution. Statsig's scorecard is designed to simplify the measurement process, providing real-time insights and advanced statistical treatments like CUPED (Controlled-experiment Using Pre-Experiment Data) to reduce variance and pre-exposure bias.
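Statsig's exact implementation isn't reproduced here, but the core CUPED adjustment is simple enough to sketch: use each user's pre-experiment value of the metric as a covariate and subtract out the variance it explains (the data below is simulated):

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-user metric values: pre-experiment (covariate) and in-experiment.
pre = rng.normal(loc=10, scale=3, size=5000)               # e.g., last month's spend
post = pre * 0.8 + rng.normal(loc=2, scale=2, size=5000)   # correlated in-experiment spend

# CUPED: theta = cov(post, pre) / var(pre); remove the part explained by pre.
theta = np.cov(post, pre)[0, 1] / np.var(pre, ddof=1)
post_cuped = post - theta * (pre - pre.mean())

print(f"Variance before: {post.var():.2f}, after CUPED: {post_cuped.var():.2f}")
```

The adjusted metric has the same mean but much lower variance, which is what lets experiments reach significance with smaller samples or shorter durations.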
With Statsig, you can easily define objectives, select relevant metrics from your metrics catalog, and visualize your data with intuitive dashboards.
The platform encourages collaboration and learning, making it easier to share results and insights across teams.
By integrating best practices and essential metrics into its scorecard, Statsig provides a powerful tool for companies looking to foster a culture of experimentation, shared learning and data-driven decision-making.
Documentation: Statsig's experimentation scorecard
By adhering to best practices and focusing on essential metrics, you can ensure that your experiments deliver actionable insights and drive meaningful improvements.
Whether you're a startup or a Fortune 500 company, the principles outlined in this blog can help you measure what matters and turn experiments into strategic victories. And with solutions like Statsig's scorecard, you can harness the power of technology to make the process more efficient and effective.
Ready to take your experimentation to the next level? Embrace the art of the scorecard. 🤙