Contextual bandits are a lightweight form of reinforcement learning that gives teams an easy way to personalize user experiences.
Statsig customers have been using Autotune as a way to optimize, explore, and exploit a large number of potential treatments that change over time.
Autotune and Autotune AI both select the best variant when you have multiple potential ideas or treatments to show users. Autotune AI extends vanilla Autotune by personalizing the treatment a user gets in use cases where there’s no one-size-fits-all solution and different users prefer different experiences.
Contextual bandits work by scoring treatments for each user based on their individual attributes, then using the uncertainty of that prediction to serve the variant with the highest potential upside for that specific user. We see this as a fast first step for teams starting to dip their toes into, or evaluate, ML for personalization.
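To make "highest potential upside" concrete, here is a minimal sketch of one common way to combine a prediction with its uncertainty: an upper-confidence-bound rule. This is an illustration under stated assumptions, not a description of the model Statsig runs. The `predictions` array, its numbers, the `pickVariant` helper, and the `alpha` parameter are all hypothetical.

```javascript
// Hypothetical per-user predictions: an estimated reward (mean) and how
// uncertain that estimate is (stdErr). The numbers are made up for illustration.
const predictions = [
  { variant: "sports", mean: 0.12, stdErr: 0.01 },
  { variant: "science", mean: 0.10, stdErr: 0.04 },
  { variant: "celebrities", mean: 0.08, stdErr: 0.02 },
];

// Upper-confidence-bound rule: upside = mean + alpha * stdErr.
// A larger alpha explores uncertain variants more; alpha = 0 is pure exploitation.
function pickVariant(preds, alpha = 1.0) {
  let best = null;
  for (const p of preds) {
    const upside = p.mean + alpha * p.stdErr;
    if (best === null || upside > best.upside) {
      best = { variant: p.variant, upside };
    }
  }
  return best;
}

console.log(pickVariant(predictions).variant);    // "science"
console.log(pickVariant(predictions, 0).variant); // "sports"
```

With the default `alpha`, "science" wins even though its estimated reward is lower than that of "sports", because its larger uncertainty leaves more room for upside. As feedback arrives and the uncertainty shrinks, a rule like this naturally shifts from exploring to exploiting.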
This personalization approach is quick to prototype, performs well, and is a good first step towards building robust personalization in your apps, buyflows, and websites.
Contextual bandits are ideal for use cases where users’ characteristics (e.g., device, location, language) have a large influence on the best treatment for them. They also do best when they get quick feedback, ideally within days or less, to continuously tune the underlying model.
Contextual bandits require context, meaning information about the actor, so they also work best when you have quick access to rich metadata about a given user. They won’t have much to work with for a new user arriving at your website from marketing channels, but after that same user has answered five questions in a buy flow, the bandit should be able to effectively personalize the sixth.
Attribute | Autotune | Autotune AI |
---|---|---|
User Context | Globally optimizes | Personalizes product treatments based on user context |
Feedback Length | Requires short term feedback | Requires short term feedback |
User Age | Ideal for new users, since experiment is dynamic. Works for established users, but can lead to experiences changing over time | Ideal for existing users, or users for whom you’ve collected initial context. New users might not have enough context |
Output | Shifts allocation towards best variant, and can identify if there is a statistically “best” variant | Shifts allocation, but there’s no terminal state; there may be a best variant overall, but each variant is the best for the users it is shown to |
Contextual bandits do have some shortcomings. First, they aren’t full-fledged ranking algorithms. A bandit asks you to provide a set of predetermined options to choose between, and it can’t easily handle novel choices.
For example, a contextual bandit is a great choice for personalizing whether a user sees “Sports”, “Science”, or “Celebrities” as their top video unit, but it won’t be a good fit for determining which video to show them, since there are new candidates every day and potentially tens of thousands of options.
Second, contextual bandits aren’t a good choice if users will be returning over time and need a stable experience. For example, if your bandit controls your app layout, the model may change over time and decide to serve a returning user a new layout, leading to user confusion and churn.
You choose what Autotune AI will base its personalization on. All custom attributes attached to your Statsig user object will be considered in model training - though some may be discarded during training if the feature proves to be unimportant.
Statsig also determines whether each feature is numerical or categorical, so you can provide any kind of feature you want. Statsig will handle the appropriate encoding and choose the best model format based on the inputs and an internal scorecard.
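For readers curious what "handling the encoding" can look like, here is a minimal sketch that infers each attribute's type from sample data and one-hot encodes the categorical ones. It is an assumption-laden illustration, not Statsig's actual pipeline. The `buildEncoder` helper and the `device` and `sessions_7d` attributes are hypothetical.

```javascript
// Infer each attribute's type from sample users, then return an encoder that
// turns any user's attributes into a numeric feature vector.
// Illustrative only: this is not Statsig's implementation.
function buildEncoder(users) {
  const columns = [];
  const keys = [...new Set(users.flatMap((u) => Object.keys(u)))].sort();
  for (const key of keys) {
    const values = users.map((u) => u[key]).filter((v) => v !== undefined);
    if (values.every((v) => typeof v === "number")) {
      // Numerical attributes pass through as a single column.
      columns.push({ key, kind: "numerical" });
    } else {
      // Categorical attributes get one 0/1 column per observed category.
      const categories = [...new Set(values.map(String))].sort();
      for (const category of categories) {
        columns.push({ key, kind: "categorical", category });
      }
    }
  }
  const encode = (user) =>
    columns.map((c) =>
      c.kind === "numerical"
        ? (typeof user[c.key] === "number" ? user[c.key] : 0)
        : (String(user[c.key]) === c.category ? 1 : 0)
    );
  return { columns, encode };
}

// Hypothetical custom attributes.
const sample = [
  { device: "ios", sessions_7d: 4 },
  { device: "android", sessions_7d: 11 },
];
const encoder = buildEncoder(sample);
// Columns: device=android, device=ios, sessions_7d
console.log(encoder.encode({ device: "ios", sessions_7d: 2 })); // [ 0, 1, 2 ]
```

A category that was never observed (say, `device: "web"`) encodes as all zeros in this sketch, and a missing numerical value defaults to 0. A production system would likely need a more deliberate policy for both cases.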
This means you can get as specific as you want - you can start with the basics, like demographics or recent app usage - or integrate it with features that your own personalization models consume or create.
Additionally, since Statsig is choosing the models under the hood, you can choose between personalizing on binary outcomes (if a user clicked a button) or continuous outcomes in either direction (video watch time, spend, or minimizing latency).
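One way to think about "either direction" is that every outcome gets mapped to a reward where higher is always better. The sketch below shows that idea under simple assumptions. The `toReward` helper and its `type` and `direction` options are hypothetical names for illustration, not part of the Statsig SDK.

```javascript
// Map an observed outcome to a reward where higher is always better.
// Hypothetical helper for illustration; not a Statsig API.
function toReward(outcome, { type, direction = "maximize" }) {
  if (type === "binary") {
    // Binary outcome, e.g. whether the user clicked a button.
    return outcome ? 1 : 0;
  }
  if (type === "continuous") {
    // Continuous outcome: negate it when lower values are better (e.g. latency).
    return direction === "minimize" ? -outcome : outcome;
  }
  throw new Error(`Unknown outcome type: ${type}`);
}

console.log(toReward(true, { type: "binary" }));                            // 1
console.log(toReward(42.5, { type: "continuous" }));                        // 42.5
console.log(toReward(250, { type: "continuous", direction: "minimize" })); // -250
```

Negating the value is only one possible convention for "minimize" metrics. The point is that the bandit can then treat every objective as a quantity to maximize.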
Checking a bandit in code is as easy as checking a Statsig experiment.
Statsig.getExperiment('my_bandit')
This action returns the config attached to the best bandit variant, and logs a bandit exposure to Statsig with all of the metadata attached to your current user. If you want to add features just before checking, it’s also trivial:
Statsig.updateUser({custom: {key1: "value1", ...}});
That’s it! Unless you want to, there’s no need to log, train, or handle any of the real-time architecture for keeping your model always-updated.
Contextual bandits are a powerful tool for solving basic personalization problems in your product. This can be a final state or a first step into investing in a personalization platform.
We think bandits are an 80-20 solution for many use cases. They’re trivial to implement, cheap to use, and provide a robust form of personalization. At the same time, contextual bandits will never be able to produce the level of fidelity, use case coverage, and supervised training a dedicated ML team provides.
This makes Autotune AI a natural fit for companies who:
- Know personalization will provide value, but for whom it isn’t a core value proposition of their product. Instead of making a huge investment, they can quickly get most of the value on the table with Autotune AI.
- Don’t yet have the bandwidth to solve these problems, but want a placeholder for personalization while their teams build more mission-critical parts of their product.
- Are interested in learning how much personalization matters. Running a few tests with Autotune AI can quickly give signal on how much there is to gain from personalizing product surfaces, potentially justifying investment in a dedicated team.
Hundreds of customers already use Statsig to measure improvements to their personalization program. This tight integration means that it’s simple to measure both the short-term and long-term impact of using a contextual bandit. You can get all of the benefits of Statsig’s experimentation tools as you dive into the performance of your bandit.
Internally, we’ve been running this for use cases across the console and website. We’ve seen meaningful increases to downstream events - and sometimes learned a lot from studying which features “mattered” in our model for the different variants.
We’re beginning to open up Autotune AI to Statsig customers today. Reach out in Slack if you’re interested in trying it out as soon as possible!