Simultaneous Triangulation: Mixing User Research & Data Science Methods

Colette Kolenda and Kristie Savage discuss how a three-step process for mixing qualitative and quantitative methods can avoid data discrepancies and fuel product decisions.

Mixed Methods Research at Spotify.

At Spotify, our approach to insights is grounded in our belief in applying multiple complementary research methodologies. Our research team, Product Insights, is made up of both User Researchers and Data Scientists.

As we can see from visualizing our methodologies in this “What-Why Framework,” User Researchers and Data Scientists are natural partners. Data Scientists look at large-scale, overarching trends in user behavior through methods such as A/B tests and statistical modeling. User Researchers apply methods such as interviews and surveys to explore the self-reported listener experience and understand listeners’ mental models and perceptions of Spotify.

The “What-Why Framework,” highlighting some methodologies in each quadrant. We adapted this framework from the Nielsen Norman Group article by Christian Rohrer.

Together, User Research and Data Science provide complementary perspectives that mutually enhance each other. By combining these disciplines, we can gain a holistic understanding of multiple forms of data and mitigate the blindspots of a single research method alone.

When your data says one thing but your users say another.

What happens when the different methods you employ yield different, even contradictory results? Well, this happened to our team, when we were researching a test of skippable ads in Australia. In this test, Spotify Free listeners could skip the audio and video ads that come between their songs.

As a Data Scientist, Kristie ran an A/B test where the experimental group received the skippable ads feature and the control group had a normal Spotify ad experience. She identified three groups of users with different levels of engagement with this new feature: power skippers, medium skippers and those who never skipped a single ad. To better understand why each group used the feature as they did, User Research followed up with interviewing listeners from each group.
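A segmentation like this can be sketched as a simple threshold rule over each user's skip counts. This is a hypothetical illustration: the `power_threshold` value, the labels, and the bucketing rule are assumptions for the sketch, not Spotify's actual definitions.

```python
from collections import Counter

def segment_skippers(skips_per_user, power_threshold=20):
    """Bucket each user as a 'never', 'medium', or 'power' skipper.

    skips_per_user: dict of user_id -> total ads skipped during the test.
    The threshold and labels are illustrative, not Spotify's actual cutoffs.
    """
    segments = {}
    for user, total_skips in skips_per_user.items():
        if total_skips == 0:
            segments[user] = "never"    # never skipped a single ad
        elif total_skips >= power_threshold:
            segments[user] = "power"    # heavy use of the skip feature
        else:
            segments[user] = "medium"
    return segments

segments = segment_skippers({"u1": 0, "u2": 7, "u3": 35})
print(Counter(segments.values()))       # one user lands in each bucket
```

Each bucket then becomes a recruiting pool for follow-up interviews, which is how the qualitative round in the next paragraph was seeded.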

We were surprised to learn that the A/B test data said one thing, but our users in the interviews said another. Some listeners we labeled as “power skippers” from our A/B test actually had some confusion around which ads they could skip. These confused listeners likely would not have labeled themselves as power skippers, as our data did.

Mental models are something we could only learn from User Research’s qualitative data. What we thought was the truth from an A/B test was actually an incomplete story. This difference between what our data showed and what our listeners said was hard to reconcile.

Discrepancies between A/B test data and user data were irreconcilable.

This discrepancy was discovered because we were passing learnings between teammates: from Data Science with the A/B test to User Research with the user interviews. This is the typical way of working together, but in reality, the two disciplines were not truly collaborating. Although we used complementary methods in this project, they yielded contradictory insights.

We needed to take collaboration to the next level: simultaneous triangulation of User Research and Data Science, pointing different methodologies at the same group of listeners at the same time.

3 steps for mixing methods effectively: simultaneous triangulation.

We defined a process to mix methods more effectively. We call it “simultaneous triangulation,” and it generates comprehensive insights without confusing discrepancies:

Step 1: Hone your research questions.

Clearly defining the research objectives makes it easier to identify opportunities to collaborate.

Step 2: Mix methods in different quadrants of the “What-Why Framework.”

Find complementary methods in different quadrants to counterbalance the strengths and weaknesses of each.

Step 3: Implement methods simultaneously to yield comprehensive insights.

Rather than deploying each method separately, design your study to have all methods point at the same group of users at the same time. This mitigates the risk of unexplainable discrepancies.

What this looks like in practice.

Step 1: Honing our research questions

Despite some discrepancies in the first round of research, we still got useful learnings about issues that were preventing listeners from fully adopting the feature.

With the second round of this research, we wanted to uncover all of the drivers and blockers for feature usage to understand the ‘why’ behind the ad skip behavior. Building off of the previous learnings, we crafted specific research questions for round two that would yield a deeper understanding of those focus areas: awareness, understanding, and usability of skippable ads.

Step 2: Mixing methods in different quadrants of the “What-Why Framework”

To holistically understand these research questions, we combined a diary study with data tracking. Diary studies sit in the qualitative-attitudinal quadrant of the “What-Why Framework,” while data tracking sits in the quantitative-behavioral quadrant.

We employed two methodologies simultaneously: data tracking and a diary study.

These two methods provide complementary perspectives. In the diary study, we recruited participants to take part in a three-week study, where we asked them to tell us about their daily listening experiences and their reactions to the ads. For the data tracking, we asked for each participant’s consent to look at their behavioral log data that was pertinent to the listening sessions. We were able to understand their overall experience from the behavioral data and their perceived experience through the diary study.

Step 3: Implementing methods simultaneously to yield comprehensive insights

The rich understanding of our users’ experiences came from pointing both methods at the same group of listeners at the same time. While participants filled out their diary study entries, we could also check the dashboard to see their behavioral data. For each diary entry, we not only got a sense of their self-reported experience (where and how they listened, and how they reacted to the ads), but could also see the behavioral side: how long they listened, how many ads they received, and which ads they skipped.
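Pairing each diary entry with the same participant's logs for the same day amounts to a simple keyed join. The field names below (`participant`, `date`, `ads_skipped`, and so on) are hypothetical stand-ins for whatever the diary tool and logging pipeline actually produce.

```python
def merge_diary_with_logs(diary_entries, behavioral_logs):
    """Attach a participant's behavioral log to their diary entry for that day.

    diary_entries: list of dicts with 'participant', 'date', and
        self-reported fields. behavioral_logs: dict keyed by
        (participant, date) with logged metrics.
    All field names are hypothetical; real pipelines will differ.
    """
    merged = []
    for entry in diary_entries:
        logs = behavioral_logs.get((entry["participant"], entry["date"]), {})
        merged.append({
            **entry,
            "minutes_listened": logs.get("minutes_listened"),
            "ads_received": logs.get("ads_received"),
            "ads_skipped": logs.get("ads_skipped"),
        })
    return merged
```

Reading the self-reported reaction next to the logged skip counts for the same session is what lets a mismatch surface as a follow-up question rather than a dead end.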

One method alone would have fallen short. Without the data dashboard, we would have been blind to their behavioral experience and without the diary study, we would not have had insight into their perceived experience.

Comparing dashboard and diary study data together.

This time, discrepancies were interesting follow ups, not dead ends.

In this new study design, discrepancies between the behavioral data and the diary study were actually really useful insights! If we saw an interesting trend in listeners’ data, we could follow up with them in the diary study to learn more about the ‘why’ behind their behavior.

For example, Kristie noticed that one participant only ever skipped a maximum of six ads in a day. No matter how many ads he received, he would never skip more than six. This was interesting because there was no limit to the number of ads a user could skip, yet his behavioral data suggested he was hitting some sort of cap. He never mentioned this in his diary study entries. Once again, we faced a data discrepancy!

The data said one thing: he was experiencing a limit. The user said another: the feature was working fine for him. Through simultaneous triangulation, we were able to follow up with him to understand his mental model and, with it, the reason behind this discrepancy. He responded, “I can only skip six ads, because I only get six song skips. I guessed ad skips must follow the same rule.” On Spotify Free, a user can only skip six songs per hour. He knew about this rule and assumed it applied to ads as well, forming his own mental model of an ad skip limit.

One participant’s dashboard and corresponding diary study response. We observed a trend in their skip behavior and followed up to learn more in the diary study.

This time, discrepancies weren’t a confusing dead end, but rather, interesting new opportunities to dig deeper. Through simultaneously combining the behavioral data dashboard and the attitudinal diary study, we learned nuanced drivers and blockers for ad skipping.

Verified insights to fuel product decisions.

As a result, User Research identified different mental models of awareness, understanding, and usability of this new feature. However, a diary study is not equipped to quantify how many listeners beyond the study participants would be affected by these issues.

So, we collaborated to define proxies in behavioral data that would size the impact of each insight. For example, we grouped those who had confusion around an “ad skip limit” by identifying users who always plateaued at a number of ad skips per day.
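A proxy like the “always plateaus at a cap” rule can be sketched as a check over a participant's daily counts. The cap value, the minimum number of evidence days, and the rule itself are illustrative assumptions; the article does not spell out the exact proxy definition.

```python
def hits_skip_plateau(daily_skips, daily_ads, cap=6, min_days=3):
    """Flag a user who always tops out at exactly `cap` skips per day.

    Only days where the user received more ads than `cap` count as
    evidence, so the plateau can't be explained by a short ad supply.
    The cap and min_days values are illustrative assumptions.
    """
    evidence_days = [skips for skips, ads in zip(daily_skips, daily_ads)
                     if ads > cap]
    return len(evidence_days) >= min_days and all(s == cap for s in evidence_days)
```

Running a rule like this over all Free listeners' logs would yield an estimate of how widespread the “ad skip limit” mental model is, which is the sizing step described above.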

With product and design stakeholders, we brainstormed solutions to address each issue uncovered in the research. We decided to send educational messages to groups of listeners who displayed those proxy metrics for awareness, understanding, and usability issues. For listeners who were confused about an “ad skip limit,” we could send them a message informing them that there was not actually a limit on the number of ad skips.

An example of an educational message we sent to listeners who might be confused about an ad skip limit.

These targeted educational messages reduced user confusion and, as a result, we saw our feature success metrics double. By mixing methods, we can have greater confidence in our insights and ensure product decisions are evidence-based.

Wrapping it all up.

Simultaneous triangulation is an incredibly powerful tool to generate comprehensive and verified findings. If you only use one method, you could end up with blindspots. If you employ methods sequentially rather than simultaneously, you could run into unexplainable contradictions, like we did at first.

The solution is simultaneous triangulation. Next time you have a complex research question, consider using this three-step process to mitigate blindspots and turn discrepancies into learning opportunities. Choose methods from different quadrants of the “What-Why Framework” to provide complementary perspectives. Apply these methods at the same time, to the same group of users.

You can apply this strategy to your insights practice: just follow the steps with any of the research methods you have access to. You don’t have to be on the Product Insights team at Spotify, and you don’t even need both User Researchers and Data Scientists. You can do simultaneous triangulation with the methods in your toolkit.

With many research questions ahead, we are excited to continue utilizing simultaneous triangulation to influence complex product, design, and tech strategy decisions. We got here by experimenting with mixing different methods and we encourage you to do the same. Tell us how this process works for you, as you apply this in your research! You can reach out to us at colettek@spotify.com and ksavage@spotify.com.

Thank you to George Murphy who led this research project and Peter Gilks for presenting this case study with us at the Qualtrics Conference earlier this year.
