Measuring the Wrong Thing: Data-Driven Design Pitfalls

Yesterday I got the chance to give a brand-new talk at the Midwest UX conference, here in Indianapolis this year. It covers some of my experiences working with data in the design and project process: trying to get it, struggling to interpret it, and using it to make decisions.

Slides from my talk are now up on Slideshare, but I’ve posted my complete transcript of speaking notes below. Thanks to everyone who was able to be there, especially the folks with smart questions and comments on Twitter and in-person.


Transcript:

[Slide 1 – Title]

[Slide 2]
Hi, my name is Jen Matson and I work as a Senior User Experience Designer at Amazon. For most of the two and a half years I’ve been there, I’ve focused on creating solutions for third-party sellers – everyone from big businesses to mom-and-pop shops to individuals looking to unload some used books or CDs.

I’ve been designing and building web sites since the dawn of the web, for both product companies and agencies. So I’ve seen our understanding of web site and user data evolve over the past 20 years. From the number of “hits” a web site gets and what other sites might be sending us traffic via URLs in our referrer logs, to sophisticated tracking of specific users across pages and their microinteractions within a given page.

And yes, I am a data junkie. In that picture of me, I’m sitting in front of part of my record collection, which I not only listen to, I mine for data. Here’s a better picture of that.

[Slides 3-5]
Starting at age 15, I created my first digital database of my collection – here’s a sample of the flat file containing my 7″ singles. And of course it’s not just about the data itself, but the story it tells. So I’ve used data visualization software to give me insights about my collection. I clearly like a lot of music released in the late 80s to early 90s.

[Slide 6]
So, I do that for fun. But I apply the same curiosity about data in my job. Because it’s key to not only understanding the effectiveness of my work as a designer, but in helping to set the direction of a project from the outset.

But many people still either don’t fully understand the data they’re collecting, or misuse it, intentionally or not, to draw conclusions about what users want or need. I’m here to share some of those missteps and show how to do better: how to use data to inform your designs and create work that better serves your business and your users.

[Slide 7]
Now, Amazon is a company that highly values data. In fact, fellow speaker Jared Spool has spoken in the past about some of the ways in which Amazon gathers data and uses it to improve both the customer experience and the company’s bottom line. Most product companies today not only actively collect data, but seek to use it to improve their products in an ongoing fashion. Project methodologies like Lean and Agile emphasize customer value and validation of hypotheses, and we need data to understand both.

[Slide 8]
What about agencies, or client services companies? I’ve worked at my share of agencies, and traditionally they haven’t been able to get the same access to information that can be used to shape and improve design. Either the data is considered company confidential (not shareable with the agency), or once the agency delivers the end product, their engagement with the client ends, with no further insight into how the product performed with users.

[Slide 9]
So in some companies data is the driving force in product design, whereas others are more focused on the feature idea. Cennydd Bowles, a Design Manager at Twitter, created this chart illustrating these concepts as part of a blog post on the topic. (Though I don’t necessarily agree with the distinction between product vs. web.)

Using data to drive design would seem to be the safer route to product success. But when data is used as a starting point, we can fall into the trap of molding our ideas to fit what can be easily gathered.

And beginning with an idea, we have a clear vision for what we want to deliver. But without customer data to validate that we’re building the right thing, we may just end up creating something we (not our users) happen to think is cool.

And of course, the worst situation of all is to fail to gather the data we need to make smart decisions about what to design and build BEFORE we release something to our customers.

[Slide 10]
So I’d like to walk you through a few different projects where I’ve encountered these issues. These were not shining moments. I’ve anonymized these examples, as while I’m happy to share my own mistakes, these companies and clients might not be so willing.

Let’s start with a close call where we almost didn’t get the data we needed before launch…

[Slide 11]
Case study 1 – The Meaning of A Click (or Tap)

[Slide 12]
Company: A movie listings web site
Project: Create a mobile-optimized version of the movie detail page.

The project seemed straightforward enough: take an existing “desktop” page meant to show information about a specific movie – plot synopsis, cast information, showtimes, trailers and reviews – and make a mobile-friendly version optimized for smaller screen sizes and touch interaction. The goal was simple: have the same great experience, with the same content and features, on the mobile version of the site.

As part of the process, a usability study was planned prior to launch, where we’d see how the design prototype performed with customers seeking movie information on their phones.

[Slide 12]
As the day progressed, an interesting pattern emerged: customers would tap on the reviews widget, then immediately navigate back to the previous page without seeming to read anything.

[Slide 13]
When asked by the researcher to talk about what it was they were doing and looking for, customers said that they hadn’t meant to tap on reviews; they were trying to view movie showtimes.

This was odd – the button to view movie showtimes was further down the page, offscreen. But it turns out that users thought the “Movie Showtimes” heading was a link, as that heading text was purple, matching one of the company’s brand colors. And it was the first element they saw that matched their task (find movie showtimes). Since the reviews widget was positioned so close to the Movie Showtimes heading, customers were accidentally tapping on the widget link, taking them to content they didn’t want.

[Slide 14]
So what seemed to be a tap for one thing ended up being a thwarted tap for another. If we’d relied upon quantitative data alone, which would show us link taps and navigation behavior, we’d have drawn the wrong conclusions. And yes, those elements probably needed to be spaced a bit farther apart as well, to avoid accidental taps.

Now, thankfully this issue was caught prior to launching the mobile design. But it definitely happens – maybe frequently in your organization – where the product owner says, “No time for usability testing: ship it and we’ll look at the data post-launch.” And they may honestly believe that will give them the necessary insights about what works and what doesn’t.

However, if we’d launched without testing, a couple of things would have happened:

[Slide 15]
– First, we’d have shipped an experience that would frustrate a good chunk of our users. (And I am definitely not a fan of using your customers – and their good will – as guinea pigs for half-baked products.)

– But more importantly, the data would never have alerted us to the issue. In fact, the big takeaway would have been “our mobile customers LOVE reviews!” And maybe we’d have built out even more features to cater to that accidental behavior.

[Slide 17]
How were we able to fix this?

– The team understood the importance of gathering both qualitative AND quantitative data. In addition to measuring objective things like task completion time, number of taps, and what was tapped, we asked contextual questions of users as they performed their tasks in order to round out the picture.

– The schedule included time for usability testing up-front. We had the support of the entire team not just for testing, but for the additional design and development time needed to make changes based on what we learned.

[Slide 18]
Case study 2 – Throwing Stuff Against the Wall

[Slide 19]
Company: A mobile service provider
Project: Redesign the help portal to offer personalized help content.

Customers of this mobile service provider would arrive at the site and log in to manage their account: check their plan usage, pay their bills. So we should know exactly who they are when they go to access help content.

However, that entire section of the site was largely ignorant of the specifics of the individual visiting it. While we could display content based on whether the user had a business or personal account, any personalization beyond that was lacking.

[Slide 20]
I had started work on an approach that would tailor content based on known user attributes, such as plan type and account age. However, mid-way through the project I was told we needed to incorporate a widget built by another team in our group that provided personalized help in Q&A format.

[Slide 21]
I tested the suggestions given by this widget using my own mobile account, only to find that it was providing advice that didn’t apply to me. The questions I saw were: “How do I reactivate my phone?” (my phone was working just fine, thanks), and “How do I add more data to my plan?” (I already had the largest data plan they offered).

[Slide 22]
This help widget wasn’t helping me.

[Slide 23]
I was getting a false positive result, because the team responsible for the widget was doing two things that didn’t quite make sense:

1. Surfacing questions to users based on “other users like them” and not their specific account situation, and

2. Treating a click on a question, as I’d done, as a “win”. The assumption was that a click equaled interest in a topic, and interest therefore equaled relevance.

But there is only a loose connection between a click and interest. After all, I clicked just because I was curious. And since the two teams weren’t aware of what the other was doing, we never had a chance to have that conversation until it was too late.

[Slide 24]
The impact on users would be significant. While we couldn’t measure it without auditing accounts whose details we already knew well, a brief check of a few other coworkers’ accounts showed the same issues. We could expect users to be confused, and perhaps less likely to trust our ability to help.

And the cycle would continue until we came up with a better way to measure success.

[Slide 25]
Because to fix this, we’d need to come up with a way to audit real customer accounts, noting specific lifecycle events they had experienced (service disconnection, data plan overage), then validating whether relevant questions were surfaced for matching accounts, or suppressed for non-matching ones. And we’d need to do this BEFORE launching new questions.

Even better, let’s use that lifecycle model as the sole way to determine relevance, not the black box of “other users like you”.
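
To make that concrete, here’s a rough sketch (in TypeScript, with invented account and question shapes, not anything from the actual system) of what a lifecycle-based audit like that could look like:

```typescript
// Hypothetical sketch: flag surfaced help questions that don't match any of
// an account's recent lifecycle events. All names here are illustrative.

type LifecycleEvent = "service_disconnected" | "data_overage" | "new_device_activated";

interface Account {
  id: string;
  recentEvents: LifecycleEvent[];
}

interface HelpQuestion {
  text: string;
  relevantTo: LifecycleEvent[]; // lifecycle events that make this question relevant
}

// Questions surfaced to an account that match none of its recent lifecycle
// events are the likely false positives.
function auditSurfacedQuestions(account: Account, surfaced: HelpQuestion[]): HelpQuestion[] {
  return surfaced.filter(
    (q) => !q.relevantTo.some((event) => account.recentEvents.includes(event))
  );
}

// Example: an account with a recent data overage shouldn't see a reactivation question.
const account: Account = { id: "acct-123", recentEvents: ["data_overage"] };
const surfaced: HelpQuestion[] = [
  { text: "How do I reactivate my phone?", relevantTo: ["service_disconnected"] },
  { text: "How do I add more data to my plan?", relevantTo: ["data_overage"] },
];

console.log(auditSurfacedQuestions(account, surfaced).map((q) => q.text));
// -> [ "How do I reactivate my phone?" ]
```

The code itself isn’t the point; the point is that relevance gets checked against events we actually know happened to the account, before any new question ships.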

And ultimately, we’d need to have a unified team to solve personalized help, to avoid clashing approaches.

[Slide 26]
Case study 3 – Unclear Cause and Effect

[Slide 27]
Company: A TV manufacturer
Project: Update the site support search engine to make content easier to find.

We’re responsible for the web site of a product company that makes high-end flat-screen TVs. But some of our customers have trouble with setup and installation and come to our web site looking for help. And a large percentage of our customers choose to contact customer support not long after they arrive on our site, even though many of them use our search engine to look for support content.

So the primary business goal was to reduce contacts. How do we align that with user goals? Make it easier for them to solve their problems on our site by finding the info they need.

[Slide 28]
I discussed this with the product manager, sketching out some of the steps in the user’s problem-resolution experience once they landed on our web site:

Find article via browse or search of support section -> Read article/forum answer or watch video -> Use solution information or self-service tool to solve problem

[Slide 29]
And in order to enable a successful task flow, our content at various stages needed to be:

Findable -> Consumable -> Actionable

So to best achieve our contact reduction goal, we’d ideally want to reduce the number of people abandoning at each stage. Move people through that funnel.
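
As a rough illustration (the numbers here are made up, not real measurements), once each stage is instrumented, per-stage drop-off through that funnel is simple to compute:

```typescript
// Hypothetical per-stage counts for the Findable -> Consumable -> Actionable
// funnel; the figures are invented for illustration.
const funnel: { stage: string; users: number }[] = [
  { stage: "Found content (search or browse)", users: 10000 },
  { stage: "Read article / watched video", users: 6500 },
  { stage: "Used solution info or self-service tool", users: 2600 },
];

// Drop-off between each pair of adjacent stages shows where to focus.
for (let i = 1; i < funnel.length; i++) {
  const dropOff = 1 - funnel[i].users / funnel[i - 1].users;
  console.log(
    `${funnel[i - 1].stage} -> ${funnel[i].stage}: ${(dropOff * 100).toFixed(1)}% drop-off`
  );
}
```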

[Slides 30-31]
As I mentioned, the product manager was focused on tackling the search experience first: findability. And the simplest way to measure success would be to track how many users click on the “Contact Us” link from a search results page now, then see if we can reduce that percentage as we roll out the redesigned search to our users. A user clicking Contact Us instead of engaging with content that should be relevant and visible is our signal of failure.

[Slides 32-33]
Problem is, the project team had already committed to measuring contacts from users who tried the new search within a 20-minute session. That’s a much weaker connection between cause and effect. In 20 minutes, a user could read an article, try a tool… and fail at those tasks. And we weren’t yet going to address problems in those experiences. And there were problems: from videos that included outdated information, to articles that failed to provide links to self-service tools. But by measuring contacts per session, we put ourselves on the hook for things we didn’t have the time, the resources, or even a plan to fix.
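
To illustrate the difference, here’s a sketch (with an invented event shape, not the site’s real analytics) of how differently the two metrics attribute failure to search:

```typescript
// Invented event shape for illustration only; real analytics data would differ.
interface PageEvent {
  sessionId: string;
  page: "search_results" | "article" | "tool" | "other";
  clickedContactUs: boolean; // did the user click Contact Us on this page view?
  timestamp: number;         // ms since epoch
}

// Tight attribution: of all search results page views, how many ended with a
// Contact Us click from that page? Failure here is clearly tied to search.
function contactRateFromSearchResults(events: PageEvent[]): number {
  const searchViews = events.filter((e) => e.page === "search_results");
  if (searchViews.length === 0) return 0;
  return searchViews.filter((e) => e.clickedContactUs).length / searchViews.length;
}

// Loose attribution: of all sessions that used search, how many had any
// contact within 20 minutes? A bad article or a broken tool moves this
// number too, even though we weren't fixing those.
function contactRateWithin20Minutes(events: PageEvent[]): number {
  const windowMs = 20 * 60 * 1000;
  const sessions = new Map<string, PageEvent[]>();
  for (const e of events) {
    const list = sessions.get(e.sessionId) ?? [];
    list.push(e);
    sessions.set(e.sessionId, list);
  }

  let searched = 0;
  let contacted = 0;
  for (const sessionEvents of sessions.values()) {
    const search = sessionEvents.find((e) => e.page === "search_results");
    if (!search) continue;
    searched++;
    const hadContact = sessionEvents.some(
      (e) =>
        e.clickedContactUs &&
        e.timestamp >= search.timestamp &&
        e.timestamp - search.timestamp <= windowMs
    );
    if (hadContact) contacted++;
  }
  return searched === 0 ? 0 : contacted / searched;
}
```

With the first metric, a change in the number is plausibly about search; with the second, it reflects everything the user touched in those 20 minutes.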

[Slide 34]
As a result, we couldn’t accurately measure the impact of the search redesign relative to other factors in the support experience.

So why did the product manager pick the 20-minute session measurement? Because that was the thing that was easiest to measure. We didn’t have the ability to track a user’s path through a funnel. Nor could we see exactly which UI element was clicked on.

And we ended up working on search, when perhaps the real problem was with content. But we didn’t have the data to show what was the biggest problem for our users.

[Slide 35]
We also couldn’t get that data after launch; that was an unfortunate by-product of how we shipped. Without the right instrumentation on our pages to capture the funnel data and see where the drop-off was, or without at least interviewing some users, we were leading with an idea without the data to validate it. And as much as I may have felt some of the content was poor, I couldn’t help make the business case to prioritize that work without objective data.

[Slide 36]
So, how could we fix this?

Mainly by working with the product manager as early as possible in the process, to ensure that what we decided to build was grounded in evidence tied to the contact-reduction goal. And for me to create task flows and use those to help illustrate our users’ likely behavior, and how that should connect to the type of data we wanted to collect.

[Slide 37]
What else?

So you can see how, even on teams that have an awareness of the importance of data, different factors can cause things to go wrong. In addition to some of the project lessons I’ve shared, you can do a few more things to help yourself as a data-informed designer.

[Slide 38]
Understand your company culture
– Before you do anything, get a sense of whether data is valued, and what kind. Even if you think interviewing your users might be the best way to get the data you need, in some cultures that might be considered “anecdotal”. Or perhaps someone has some marketing research they think is a great starting point. Or they want to set up a focus group to ask users what they want. In these environments, it might be challenging to steer the data ship in the right direction.

[Slide 39-40]
Learn more about what you can measure, and how
You’ll want to familiarize yourself with the full capabilities of what can be measured online. It’s a lot more than you may think:
+ Referring page. The page the user visited immediately before landing on the current one.
+ Site path. What paths are users following through your site, either page-to-page or within a page?
+ Clicks. What UI elements the user clicked on, whether or not that resulted in a new page loading.
+ Hovers and Mouse Path. Where is the user mousing onscreen? Over what elements, following what paths?
+ Scroll depth. How far down the page is the user scrolling? Combine with data about screen size (width/height), and you can get a clear idea of exactly what the user viewed.
+ Dwell or user focus time. How long did the user spend viewing a certain segment of the screen? Combined with scroll metrics, you can see where they paused to stop and read and interact.
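
Here’s a minimal browser-side sketch of capturing two of these signals, clicks and scroll depth, using standard DOM APIs. The sendMetric function is just a placeholder for whatever analytics pipeline you actually use:

```typescript
// Placeholder for your analytics pipeline; in practice this might be
// navigator.sendBeacon() to a collection endpoint.
function sendMetric(name: string, data: Record<string, unknown>): void {
  console.log(name, data);
}

// Clicks: record which UI element was clicked, even when no page load follows.
document.addEventListener("click", (event) => {
  const target = event.target as HTMLElement;
  sendMetric("click", {
    tag: target.tagName,
    id: target.id || undefined,
    text: target.textContent?.slice(0, 40),
  });
});

// Scroll depth: track the deepest point the user has scrolled to on the page.
let maxDepth = 0;
window.addEventListener("scroll", () => {
  const scrolled = window.scrollY + window.innerHeight;
  maxDepth = Math.max(maxDepth, scrolled / document.documentElement.scrollHeight);
});

// Report the final depth (plus viewport size) when the user leaves the page.
window.addEventListener("pagehide", () => {
  sendMetric("scroll_depth", {
    maxDepthPercent: Math.round(maxDepth * 100),
    viewport: { width: window.innerWidth, height: window.innerHeight },
  });
});
```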

[Slide 41]
Use what you learn to improve your designs and increase your influence
– The best way for you to get better as a designer is to see tangible evidence of the impact of your work on your users. And good data provides that. You should be able to put your best effort out there, learn what works and what doesn’t, then try again, and continue to iterate. And with data, you gain an objective tool you can use to better demonstrate your value to your business. Which means greater likelihood of getting that strategic seat at the table earlier.

[Slide 42-43]
Because we’re all very fluent in asking who our users are, what their needs are, and what problems we’re trying to solve for them. These questions are a standard part of every UX designer’s playbook. All I ask is that we add: “What data do we have to support this?” or “How will we get data to validate this?” Learn from my mistakes, and use data to inform your designs to better delight your users.

[Slide 44]
Thank you.



Reply to me on Twitter: @nstop