Most qualitative usability testing studies that I do involve the creation of a screener early on in the research process. The screener is the product of a necessary effort to determine not only who the expected users of a product are, but also how best to allow a recruiter to find appropriate participants who represent those expected users.

Ultimately, a key goal for typical usability testing is to determine whether expected users can complete key tasks with (to quote ISO 9241-11) “effectiveness, efficiency and satisfaction.” In other words, can they successfully complete the tasks that they need to complete and feel good about the process?

Topic of the day

When constructing the recruiting criteria, what matters most for research accuracy is what I like to think of as the topic of the day. The topic of the day is always different and usually stands out pretty clearly as that thing that we need to recruit for. If we are doing a study of a mobile online banking resource, then most likely we need those who do mobile online banking. If we are doing a research study for an innovative new social media platform, we probably want those who use existing social media resources.

Other demographics

Beyond the topic of the day, there are typically other demographics that matter. For these demographics, having a range of backgrounds can be useful. For example, adding filters for age, technological sophistication, educational attainment, reading ability, income, or employment could help make sure that the appropriate representative users are on hand to participate in a study.

Race and gender

Inevitably, however, I run into a situation where a stakeholder says that they understand their market and it’s critical that the recruitment mirror the demographics of their user base. They sometimes follow this request by specifically noting, among other things, that the screener include an appropriate mix by race and by gender.

While it’s always possible that there is a case when race and gender do matter as categories and could even be the topic of the day (for example, if there is an app that is focused on dealing with racism, sexism or intolerance), this is not typically the case.

To put it bluntly, I tell them that in all my years of doing usability testing, I have never seen either race or gender really matter in determining whether participants can successfully complete tasks.

Why not?

  • This isn’t a quantitative study nor is it a market research study – you’re not understanding attitudes and behaviors of a target market, stratified by any kind of statistically significant demographic. So let’s not get stuck conflating usability testing and market research.
  • Taken to a logical conclusion, could you imagine noting that people of race X and Y have more trouble with task completion than people of race Z? Could you see yourself concluding that the site navigation just isn’t comprehensible by women? What would happen if you said that in a report anyway?
  • If the point of race in a usability study is that it represents a defined cultural group, this also doesn’t quite make sense as a recruiting strategy when thinking of the wide range of cultural groups within ethnic, religious, and geographic categories, among others. Let’s not use race to mean culture when thinking of product usage, and let’s only consider culture itself when it really does matter.
  • If the point of gender is that men and women are expected to understand an interface differently, is there a valid hypothesis as to why this would be the case? And in a more culturally sensitive world, would gender definitely remain binary anyhow? Taken to a logical conclusion, do we introduce a 7-point scale of manliness or femininity when understanding product usage? Sounds pretty strange, huh?

But the stakeholder wants it: Is there really any reason to say no?

The biggest challenge in recruiting for usability studies is that the recruiter or recruiting service doesn’t just magically find all the right people immediately. The more constraints are added, the more difficult it becomes to find the right people. By forcing a specific race or gender balance into screeners, it’s possible that some participants who would otherwise be ideally representative (using those demographics that matter most) are eliminated because of these additional constraints. In their place could be participants who aren’t quite as useful. In the worst case, it’s also possible that not all the session time slots will be filled because the recruit ends up taking too much time or becomes too complex.

What do I do?

Race: Personally, if stakeholders ask for a race category as a screening criterion, I will usually suggest that they leave it out and explain why.

Gender: If they request a gender category, I’ll tell them that instructions for a recruiter will be “recruit a mix of gender to whatever extent possible,” which basically tells the recruiter that gender should only be considered when everything else is accounted for. It also tells the recruiter that if we don’t have gender balance, no big deal.
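
For anyone who tracks a candidate pool in a script or spreadsheet export, the "to whatever extent possible" instruction can be sketched roughly as follows. This is only an illustration under assumed names: the Candidate fields, the mobile banking criterion, the age range, and the recruit function are all hypothetical, not part of any real recruiting tool. The point it tries to show is that only the criteria that matter can disqualify someone, while gender merely influences the order in which qualified people are chosen.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All fields are hypothetical examples of screener answers.
    name: str
    uses_mobile_banking: bool  # assumed "topic of the day" criterion
    age: int
    gender: str


def recruit(candidates, slots):
    """Fill up to `slots` sessions: hard criteria first, gender mix second."""
    # Hard criteria: the only checks that can disqualify a candidate.
    remaining = [
        c for c in candidates
        if c.uses_mobile_banking and 18 <= c.age <= 75
    ]

    selected = []
    while remaining and len(selected) < slots:
        # Count how many people of each gender are already selected.
        counts = {}
        for s in selected:
            counts[s.gender] = counts.get(s.gender, 0) + 1
        # Soft preference: pick a qualified candidate whose gender is least
        # represented so far. Ties keep the original order of the pool.
        pick = min(remaining, key=lambda c: counts.get(c.gender, 0))
        selected.append(pick)
        remaining.remove(pick)
    return selected
```

Because gender is never used as a filter here, a pool with no gender balance still fills every available slot with qualified participants, which is the "no big deal" outcome described above.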

One general caveat beyond task-based usability testing: If the intent of a research study is to get attitudes, beliefs and opinions about an interface, for example in a focus group setting, then I have occasionally seen a race- or gender-based perspective matter. For example, in one focus group I heard a complaint that the chosen stock photography on an investment website included only white men. And in a focus group for a home improvement website, a woman said that the site didn’t do enough to convey to women that the site was for their home improvement needs too.

What should you do?

For any given task-based study, there are likely going to be demographics that matter and those that don’t. While race and gender stand out to me as the demographics that are most frequently included but don’t usually make sense, there are apt to be others. When you do encounter these demographic requests, consider whether they increase the complexity of the recruit. If they do, either try explaining to the stakeholders that they don’t matter, or at most, include them with a “recruit to whatever extent possible” criterion.

Always remember that a good recruit lays the foundation for the quality of any usability study, so do your best to navigate the politics of preexisting stakeholder beliefs about the study that could unnecessarily raise the complexity of what you want to do.

Image: Kamaga /