Back in 2012, testing lived in landing pages and subject lines. Think Optimizely's founding story—testing every piece of marketing in the Obama presidential campaign.
It's always been my impression that marketers were the real experimentation pros because they had to be: testing was the only way to measure the largely intangible, qualitative things that make or break a marketing strategy. Their playbook included:
A/B testing landing page copy and shipping the winning variant sitewide
Experimenting with the length of forms to mitigate user dropoffs
Using 20% of an email audience to suss out the best email subject line and sending it to the remaining 80%
Testing virtually anything in a focus group
Implementing customer surveys to gauge reactions to potential upcoming initiatives
And so on
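The 20/80 subject-line tactic in the list above can be sketched in a few lines of Python. This is a minimal illustration, not any email vendor's API: the function names, the fixed seed, and the open counts are all hypothetical, and a real send would also check that the gap between variants is larger than chance.

```python
import random

def split_audience(recipients, test_fraction=0.2, n_variants=2, seed=42):
    """Shuffle recipients, carve out a test group, and split it evenly across variants."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(recipients)
    rng.shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    test_group, holdout = shuffled[:test_size], shuffled[test_size:]
    # Deal the test group round-robin into one list per subject line.
    variants = [test_group[i::n_variants] for i in range(n_variants)]
    return variants, holdout

def pick_winner(opens_by_variant, sends_by_variant):
    """Return the index of the variant with the highest open rate."""
    rates = [opens / sends for opens, sends in zip(opens_by_variant, sends_by_variant)]
    return max(range(len(rates)), key=rates.__getitem__)

# Hypothetical audience of 1,000 addresses and two subject lines.
recipients = [f"user{i}@example.com" for i in range(1000)]
variants, holdout = split_audience(recipients)

# Hypothetical open counts observed for each subject line.
winner = pick_winner([22, 31], [len(v) for v in variants])
print(f"Send subject line {winner} to the remaining {len(holdout)} recipients")
```

Shuffling before splitting matters: if the list is sorted by signup date or engagement, taking the first 20% would bias the test.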
The research you’re about to read chronicles how A/B testing escaped the marketing basement, sprinted through the product org, and is now rewiring entire companies—from Amazon’s Prime bet to Netflix’s artwork thumbnails. It details how feature flags, in‑house platforms, and off‑the‑shelf tools turned what used to be expensive science‑fair projects into everyday muscle memory.
Ultimately, the current experimentation surge reshapes the concept of growth itself: In the old model we “grew” by shouting louder: bigger budgets, bolder creative, luck. In the new model we grow by learning faster—hundreds of tiny, controlled bets that compound. Software isn’t merely built and shipped; it’s evolved in production, guided by live data and a culture that celebrates being wrong quickly.
That shift lights me up. It means "test and see" is no longer siloed; it’s inseparable from product, data, even finance. It means a junior PM can outperform a HiPPO just by letting a test run. And it means the next breakthrough at your company is probably hiding behind the next experiment queue, not the next synergistic, blue-sky, cross-functional brainstorm.
So dive in. Bookmark the case studies, steal the metrics, show your skeptics the billion‑dollar wins. Most of all, let it challenge how you decide. Because if this decade has taught us anything, it’s that the fastest learners—armed with cheap, relentless experimentation—will write the next decade’s success stories.
Now, let’s see how they did it.
By 2010, a handful of web companies like Google and Amazon had started running controlled tests, but few others had a true “experimentation culture.”
At the time, experimentation was often viewed as a marketing trick (“split testing” button colors on landing pages) rather than a core product strategy. That began to change as pioneers proved its value.
Google famously tested dozens of shades of blue for its link color and discovered that a slightly more purplish blue attracted more clicks. That seemingly trivial UI change earned an extra $200 million in annual ad revenue. Such results opened eyes across the industry, demonstrating how data-driven tweaks could translate into major business gains.
By the mid-2010s, continuous A/B testing was spreading. Companies like Facebook, Google, Amazon, Booking.com, LinkedIn, and Etsy all made experimentation fundamental to how they built products.
This shift was both cultural and technical. Leading tech firms invested in internal experimentation platforms to enable rapid, large-scale testing. Amazon ran 546 experiments in its platform's first year and grew to over 12,000 experiments annually within a few years.
Microsoft built a platform to run controlled experiments across Bing, Office, Xbox, and more, yielding “hundreds of millions of dollars of additional revenue annually” as it scaled. Facebook created robust feature flagging systems (such as Gatekeeper) so that new features could be gradually rolled out as experiments to small user cohorts.
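A gradual rollout to a small cohort usually rests on one simple idea: hash each user into a stable bucket, then enable the feature for buckets below a threshold. The sketch below illustrates that idea only. It is not how Gatekeeper or any named platform is implemented, and the flag name, function names, and bucket count are assumptions made for the example.

```python
import hashlib

def bucket(user_id: str, flag_name: str, buckets: int = 10_000) -> int:
    """Map a user to a stable bucket in [0, buckets) for a given flag."""
    # Salting with the flag name keeps cohorts independent across flags.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets

def is_enabled(user_id: str, flag_name: str, rollout_percent: float) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    # 10,000 buckets means each bucket is 0.01% of users.
    return bucket(user_id, flag_name) < rollout_percent * 100

# Hypothetical flag ramped from 1% to 10% of users.
users = [f"user-{i}" for i in range(10_000)]
at_1_percent = {u for u in users if is_enabled(u, "new_checkout", 1.0)}
at_10_percent = {u for u in users if is_enabled(u, "new_checkout", 10.0)}
print(len(at_1_percent), len(at_10_percent))
```

Because the bucket depends only on the user and the flag, a user who sees the feature at 1% keeps seeing it at 10%, which keeps the experience consistent as the ramp widens.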
Other companies—Airbnb, Netflix, Uber, LinkedIn, and Booking.com—followed suit, building internal tools to try ideas on a subset of users and measure impact. A 2017 industry paper noted, “controlled experimentation is becoming the norm in advanced software companies,” a striking evolution from the intuition-driven approaches of prior decades.
The rise of cloud and SaaS tools also democratized experimentation. Dan Siroker and Pete Koomen (ex-Googlers) launched Optimizely, making A/B testing easy for any website. By the late 2010s, a growing ecosystem (Optimizely, VWO, Adobe Target, Google Optimize, LaunchDarkly, Split.io) enabled even small teams to run experiments and toggle features with minimal effort.
Broader adoption spread to e-commerce, media, and enterprise software. What started as a Silicon Valley secret sauce became standard product-development practice worldwide.
As Booking.com’s experimentation director Lukas Vermeer put it, “we call this evidence-based, customer-centric product development... controlled experimentation is the most successful approach to building products that customers want.”
From 2010 onward, continuous experimentation evolved from a rarity to a core strategic practice for leading software companies, enabling them to innovate quickly and consistently.
No one has publicly championed experimentation more than Amazon’s founder, Jeff Bezos. He often credits Amazon’s success to its relentless trial-and-error approach. “Our success at Amazon is a function of how many experiments we do per year, per month, per week, per day,” he once remarked.
His philosophy—that more experiments lead to more innovation—became Amazon’s mantra. He pushed teams to test ideas (both large and small) and not fear the failures that come with experimentation. “If you double the number of experiments you do per year, you’re going to double your inventiveness,” he explained, pointing out that a high experiment rate drives a high innovation rate.
Bezos’s 2015 shareholder letter spelled out why experimentation matters to Amazon’s strategy. “I believe we are the best place in the world to fail (we have plenty of practice!), and failure and invention are inseparable twins,” he wrote. “To invent you have to experiment, and if you know in advance that it’s going to work, it’s not an experiment.”
This fail-fast approach allowed Amazon to discover big wins like Amazon Prime, AWS, or one-click ordering—risky bets that seemed crazy until data proved their value.
Bezos positioned experimentation as a core tenet of Amazon’s Day 1 culture. He contrasted “HiPPO” decision-making (highest paid person’s opinion) with data-driven decisions, suggesting he’d rather trust experiments than any single leader’s intuition. Under his guidance, Amazon tested everything—website design, search algorithms, services, even big strategic moves.
One notable example is the 16-month TV advertising experiment Amazon ran in a few markets, which showed that TV ads delivered only a modest sales bump. That conclusion saved the company from committing to a costly TV ad budget. “It was a long, expensive test, but we were determined to understand this... once and for all,” Bezos explained.
Bezos’s public statements all echo the same message: experimentation is the bedrock of innovation. His leadership made Amazon a place where running experiments is not only acceptable but expected.
Thousands of tests run each year, perpetually improving the customer experience and fueling new offerings. His influence has persuaded countless executives and founders that an experimental mindset is critical for competitiveness.
The rise of experimentation as a practice mirrors the growth of the experimentation tools market.
In 2010, this market was tiny—just a few consultancies and basic tools like Google Website Optimizer serving a small set of early adopters. As more companies recognized the value of A/B testing, the demand for software to manage and analyze experiments soared.
By 2015, venture capital was pouring into experimentation startups. Optimizely drew investor attention on the back of rapid growth, serving high-profile customers like CNN, Microsoft, and Airbnb.
Competitors such as VWO, Maxymiser (acquired by Oracle), and Adobe’s Target also established positions in this burgeoning space.
From the late 2010s into the 2020s, analysts attempted to size the market. While estimates vary, the overall trend is up. One analysis placed the global market for A/B testing software (not counting broader experimentation platforms) at around $2.3 billion in 2022 and predicted ~$6.4 billion by 2030.
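To put those two figures in perspective, the implied compound annual growth rate can be worked out directly. The snippet below uses only the $2.3 billion (2022) and ~$6.4 billion (2030) estimates quoted above. The function name is illustrative, and the result is only as reliable as the analyst estimate behind it.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that turns start_value into end_value over `years`."""
    return (end_value / start_value) ** (1 / years) - 1

# Market size in billions of dollars, as quoted in the text.
rate = implied_cagr(start_value=2.3, end_value=6.4, years=2030 - 2022)
print(f"Implied growth: {rate:.1%} per year")  # prints "Implied growth: 13.6% per year"
```

In other words, the forecast assumes the market nearly triples in eight years, which works out to roughly 13.6% growth compounded annually.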
The convergence of feature flagging and experimentation is blurring lines: newer platforms like Statsig combine flag management with built-in analytics, aiming to replicate the in-house systems of Facebook or Netflix for any company.
The total addressable market (TAM) for experimentation has exploded because the mindset has shifted from “we test UI tweaks” to “we can experiment with nearly any business decision.”
Che Sharma, a former experimentation leader at Webflow, pointed out that early tools covered only a “tiny sliver of the decision TAM,” whereas today companies want to test everything, from UI changes to pricing to offline decisions.
Because nearly any measurable decision can be tested, the real TAM is massive, measured in billions of dollars. Major enterprise software vendors (Adobe, Oracle, Google, Salesforce) have integrated experimentation into their clouds, and Optimizely was acquired in 2020 by Episerver, which then rebranded itself as Optimizely, to marry content management with experimentation.
Between 2010 and today, what began as a niche SaaS sector has grown into a core part of the software stack, with standalone vendors and major platforms alike vying to power the world’s experiments.
Experimentation can yield major business or user benefits. Below are a few famous examples:
A Bing program manager proposed a minor UI tweak for how ad headlines were displayed. It sat on the backlog as low priority until an engineer ran it as an A/B test. Results were dramatic: revenue jumped 12% within hours, triggering a “too good to be true” alert. Verified as genuine, the change was worth over $100 million in annual revenue.
It became “the best revenue-generating idea in Bing’s history, but until the test its value was underappreciated.” This single experiment validated Microsoft’s experimentation investment, showing how data can reveal substantial wins that intuition might overlook.
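A result that looks “too good to be true” is normally checked against chance before anyone celebrates. One common way to do that for click or conversion rates is a two-proportion z-test, sketched below with the standard library only. This is a generic illustration, not Bing's actual pipeline: the traffic and click counts are hypothetical, and revenue metrics like the one in the Bing story would typically need a different test.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare conversion rates of control (A) and treatment (B).

    Returns (relative_lift, z_score, two_sided_p_value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return (p_b - p_a) / p_a, z, p_value

# Hypothetical counts: 20,000 users per arm, 5.0% vs 5.6% click-through.
lift, z, p_value = two_proportion_z_test(conv_a=1000, n_a=20_000, conv_b=1120, n_b=20_000)
print(f"lift={lift:.1%}, z={z:.2f}, p={p_value:.4f}")
```

With these made-up numbers the 12% relative lift comes out statistically significant at the usual 5% threshold. With a tenth of the traffic, the same lift would not, which is why large wins on small samples tend to trigger alerts and reruns.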
One of Google’s most famous experiments involved 41 shades of blue in the mid-to-late 2000s. Instead of letting executives choose a hyperlink color by gut feel, Google ran a multivariate experiment on dozens of hues.
They found a more purple-tinged blue that consistently earned more clicks, eventually translating to an additional $200 million in annual revenue. This solidified Google’s data-driven ethos and proved that even seemingly trivial UI changes can have an outsized impact.
Netflix’s personalization engine relies on thousands of tests. A high-profile example is its experimentation with artwork for movies and shows. Different images for the same title can affect viewing rates.
In one case, a test image for The Short Game boosted viewer engagement over the original poster. Rolling out winning variants significantly increases overall watch time. Netflix frequently cites these tests as proof that data-driven content presentation yields higher customer satisfaction and retention.
Booking.com attributes much of its success to its experimentation-driven culture. Simple A/B tests in the mid-2000s evolved into a “universal, complete, and fundamental” approach to product decisions. The company runs experiments on everything from button placement to hotel sort algorithms.
Gillian Tans, former CEO, once remarked, “We grew like this, without any marketing or PR, just testing what our customers liked.”
Incremental improvements piled up into a multibillion-dollar travel leader. Many of their user-interface details—like urgency messages (“Only 2 rooms left!”)—were proven via A/B tests.
As Facebook grew to billions of users, it moved from “move fast and break things” to “test fast and learn things.” Facebook uses feature flags to ramp changes to small slices of users first, measuring every aspect of engagement.
A notable example was the “Explore Feed” test in 2017. Rolled out in six smaller markets, it reduced engagement by 60–80%. Facebook scrapped the idea before a full global launch, averting a potentially catastrophic dip in usage.
Conversely, features that do prove beneficial (like Instagram Stories) are rolled out gradually, tested with smaller cohorts, and refined. Andrew Bosworth, a VP at Facebook, summed it up: “Technology co-evolves with people,” so continuous experimentation is the only way to know what users truly want.
Frontline tech workers—engineers, product managers, designers, data scientists—now see experimentation as a critical skill.
The “let’s test it” mentality once championed by Bezos and Zuckerberg has become part of everyday product work. Designing, running, and interpreting experiments is a key capability for modern tech roles.
Job postings reflect this shift. Many large firms now have entire teams focused on growth experimentation. Even engineers are hired for familiarity with A/B testing frameworks and feature flags, a stark contrast to a decade ago.
Practitioners describe experimentation as both exciting and career-boosting. One analytics professional noted, “A/B testing is considered an essential step in growth for major tech companies. It can really help boost your career.”
Mastery of experimentation shows you’re data-driven rather than guess-driven—highly valued in results-oriented cultures. Product managers emphasize it in interviews and may take courses to refine their skills.
Product School, for instance, includes A/B testing as a core module in its PM curriculum. Designers use it to validate usability hypotheses, while data scientists have always seen controlled experiments as the gold standard of causal inference.
Many top companies now induct new hires into the experimentation mindset from day one. Rather than view it as red tape, frontline workers consider it empowering—they can propose bold ideas and “let the data decide.” Teams learn continuously since every experiment, success or failure, yields insights.
Firms like Microsoft and Booking.com share results internally via newsletters and training sessions, building an experimentation knowledge base. This solidifies the idea that experimentation is integral to tech craftsmanship, akin to version control or agile processes.
Being fluent in experimentation is also a competitive edge for individual careers. Skills gained at one company (e.g., Uber) transfer to startups or other large organizations. Recruiters recognize this as proof a candidate can drive growth systematically.
Thought leaders see an “experimenter’s mindset” as key to tech leadership. Experimentation and data literacy go hand in hand; teams adept at both make better decisions and avoid expensive mistakes. In short, the tech workforce doesn’t just accept experimentation—they expect it, and many are eager to learn if they haven’t already. As one expert said, “It’s about how you can solve your customers’ problems, not about how fast you can roll out new features.”
Over the past decade, experimentation has grown from a curiosity into a cornerstone of the software industry. From 2010 to today, there has been an explosive rise in both the practice and the business of experimentation.
Leaders like Jeff Bezos championed its virtues, startups built tools to scale it, and countless product decisions became data-driven. The total addressable market ballooned from negligible to billions, establishing “experimentation platform” as a must-have category.
Most importantly, a new generation of tech workers has embraced the philosophy that constant experimentation fuels continuous innovation. Successful companies often run the most experiments, and staying competitive means adopting a culture that measures, tests, and learns on repeat.
Looking ahead, experimentation will likely expand even further—into AI, offline domains, and every corner of business—cementing its status as a dominant strategy in software.
What began as a scrappy tactic for optimizing landing pages has become the default operating model for modern product teams. Experimentation isn’t just how tech companies build software—it’s how they grow. If you’re not testing, you’re guessing.