Last week, I attended the SF Global Meetup in San Francisco. Jean Aurambault from Pinterest provided some insights into how they reach their global audience, and how they refine their content through Copy Optimization.
Optimizing the copy at key moments in their user experiences increases user engagement on the platform. They have developed tooling both to run the experiments that drive better engagement and to measure the impact that different copy has on users, tying those measurements to their core business metrics.
In my mind, attributing value to content is a key enabler for both consumers and content creators. Measurement is a key component in setting up a virtuous circle between your users and their experiences. I was excited to see the presentation.
After framing the discussion, Jean provided some clear examples of how this works. The team worked to identify the primary levers for enhancing several key user experiences; one example was the sign-up screen.
Sign-up user engagement levers
Assuming that performance, layout, and the colors used were solid, these static elements could be considered independently from changes to the content. A simple experiment was devised to change the Call to Action (CTA) on the button – changing only the label text for different segments of users, to home in on the best CTA message.
What CTA label works best?
- Sign up
In their process, the content that performs best in English gets translated. However, the winning variant in English may not be the best one in another language, so initial experiments in English are followed by testing whether the best English message is also the best message in each target language. The English variants are kept for this purpose. If this non-English testing is skipped, you may miss the best CTA label for a given language.
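To make the experiment mechanics concrete, here is a minimal sketch of deterministic variant assignment, the usual way A/B and n-way tests bucket users without storing per-user state. The function and experiment names are hypothetical, not Pinterest's actual system; the variant list mixes labels mentioned in the talk ("Sign up", "Continue") with an invented third option.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically bucket a user into one variant.

    Hashing (experiment, user_id) keeps the assignment stable across
    sessions and devices without a per-user lookup table.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    index = int(digest, 16) % len(variants)
    return variants[index]

# English CTA variants are kept so they can be re-tested per language.
# "Join now" is an invented example, not a variant from the talk.
cta_variants = ["Sign up", "Continue", "Join now"]
label = assign_variant("user-123", "signup-cta-en", cta_variants)
```

Because the hash is stable, the same user always sees the same label for the lifetime of an experiment, which keeps the measured engagement clean.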
Other key areas included:
- Push Notifications
- Email subject lines (offline system, no performance issues)
- High-volume, high-impact elements
High-performing variants are identified and translated. The team then runs experiments to find the best CTA and propagates the winner to all valid targets, serving up multiple winners at run time. In this way, "Sign Up" and "Continue" may both be used in different locales, where the target is not a direct translation of the source. This is quite complex from a build and maintenance perspective, so they built tooling to eliminate the problem. They run 3–4 variants at a time in the web app, and up to 10 in notifications. Platform goals included:
- Support more surfaces
- Support for all languages
- Improved targeting based on user meta data
- Self-service platform with no code requirements
- Allow for multiple test iterations
- Optimal performance – can’t impact product performance with these changes and tests
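The "multiple winners at run time" idea above can be sketched as a lookup keyed by experiment and locale, so that each language serves its own winning copy. The table and function names here are illustrative assumptions, not Pinterest's actual API.

```python
# Hypothetical winners table: experiments may crown different messages
# per locale, so lookup is keyed by (experiment, locale).
WINNERS = {
    ("signup-cta", "en-US"): "Sign up",
    ("signup-cta", "fr-FR"): "Continuer",  # "Continue" won in French
}

def resolve_cta(experiment: str, locale: str, fallback: str) -> str:
    """Serve the winning copy for this locale, falling back to the
    source-language default when no winner has been promoted yet."""
    return WINNERS.get((experiment, locale), fallback)
```

With this shape, promoting a winner is a data change rather than a code change, which is what makes a self-service, no-code platform possible.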
Key Takeaways from Early Experiments
Initial experiments quickly brought results and challenges. Some key takeaways were:
- A large population is needed to run A/B tests, and n-way tests require even more users.
- The best practice was to copy-test pages earlier in the flow first, then propagate results through the later experiences.
- Iterate on small sets that focus on a few variants at a time, so that the differences reach statistical significance.
- Optimize also by gender and age (their content is demographic centric).
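The significance question behind these takeaways is a standard one: given a control and a variant, is the difference in conversion rate real or noise? The talk did not specify Pinterest's statistical method; the sketch below assumes a plain two-proportion z-test, using only the standard library.

```python
from math import sqrt, erf

def z_test_two_proportions(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test comparing conversion rates of A (control) and B.

    Returns the z-score and two-sided p-value under the pooled-proportion
    normal approximation.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf; two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# 5.0% vs 5.5% conversion at 20k users per arm: significant at p < 0.05.
z, p = z_test_two_proportions(1000, 20_000, 1100, 20_000)
```

The example numbers illustrate why "a large population is needed": even a half-point lift only clears significance with tens of thousands of users per arm, and splitting that traffic n ways stretches it further.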
Their main challenge was a very familiar one to me: UI strings come from many sources.
String reuse creates problems when shared strings are copy-tested. To solve this, Pinterest built a Google Chrome extension that identifies the source of each string on a page. This allows them to see which strings are shared and which can be safely copy-tested. The extension also lets team members kick off a copy-testing experiment, which produces content variants in the same manner as translation.
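Why shared strings are dangerous to copy-test can be shown in a few lines: a string used on several surfaces would change everywhere if an experiment edits it. This sketch assumes hypothetical string IDs and surface names; the talk did not describe the extension's internals.

```python
from collections import defaultdict

# Hypothetical usage records a tool like the Chrome extension might
# collect: (string_id, surface) pairs observed while browsing the app.
usages = [
    ("cta.signup", "signup_page"),
    ("cta.signup", "paywall_modal"),  # reused: risky to copy-test
    ("cta.continue", "onboarding"),
]

def shared_strings(usages: list[tuple[str, str]]) -> set[str]:
    """Return string ids used on more than one surface; experimenting on
    these would change copy everywhere they appear, not just in the test."""
    surfaces = defaultdict(set)
    for string_id, surface in usages:
        surfaces[string_id].add(surface)
    return {sid for sid, s in surfaces.items() if len(s) > 1}
```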
A dashboard was also created for viewing and managing live experiments, showing gains and losses per language and per experiment, with simple color highlighting to show where results have statistical significance. Localization and product teams are directly involved; the platform is used by both organizations. Remaining challenges included:
- Data validation, and determining the significance of each individual outcome.
- Usability of the testing platform and understandability of the results.
- Performance was a significant architecture challenge.
- More variants in the target languages mapping back to limited forms in English.
They also face the challenge of managing all of these strings. For this they use Mojito (http://mojito.global), which plugs into their workflow: Git > Jenkins > Mojito. In addition, they have built a command-line interface for developers to push strings from Git to Mojito; Jenkins calls this CLI to push and pull strings for localization.
All components are connected via API; no copy and paste is needed.
They have also developed a workbench for string lookup, as well as metadata management for their strings, to further mitigate some of these challenges, though translation itself is done in a TMS based on traditional models.
They also reduced usability overhead and the potential for errors while increasing velocity. Experiments run for 2–3 weeks until statistically significant results are achieved; outcomes are compared against the control metrics and represent a solid win – or not.
Provable Bottom-line Results
Using multi-variant testing has had demonstrable, scalable, and positive results for Pinterest. They have run 150+ experiments and have scaled out to 8 teams spanning 80 locales, resulting in a lift of 1.5M weekly active users. Having proved the concept and the platform, they envision further enhancements around alerts, the Chrome extension, high availability, and quick identification of string owners.
It’s rare to see this done well, at scale, and all without a lot of manual overhead.
Many thanks for sharing, and good luck to Pinterest and Jean. This was a great session!