UsabilityNews.com - for all the latest in usability and human-computer interaction
The British HCI Group
 
 

Caroline's Corner: Designing Comparative Evaluations


Source: UN, 31 August 2004
Submitted by Caroline Jarrett

It was one of those calls that is simultaneously good news and bad news: 'We’d like you to do an evaluation for us. We have two designs here and we want to know which one is better'.

The good news: well, I’m a consultant so phone calls offering work are always good, right?

The bad news: comparative evaluations. Ugh. So I thought I’d at least make use of the pain by writing a few notes on them here.

IS A BETTER THAN B?
The first challenge of a comparative evaluation is that the client wants a nice clear answer: A is better than B. Or perhaps: B is better than A. The problem is that the actual answer is usually more complicated. Parts of A are better than parts of B. Parts of B are better than parts of A. Some bits of A are execrable. Some bits of B, usually but not always different ones, are also execrable. There’s probably an approach C that is better than A or B, and the final answer is probably D: a bit of C, plus some of the good points from A and from B. It’s not exactly a nice clean story, is it?

'BETWEEN SUBJECTS’ OR 'WITHIN SUBJECTS’
I don’t like to use the term 'subject’ for the participant in a test, because my view is that the system is the subject, not the person. But here we need to turn to the design of psychological experiments, where the subject of the experiment is the person. If you have two designs to test, are you going to get the same participants to test both designs ('within subjects’) or are you going to do two rounds of testing: one group of participants gets A, and another gets B ('between subjects’)?

The problem with 'within subjects’ design is that nearly all systems have some learning effects. If you ask the participants to try the same or similar tasks with both systems then they learn about the task with the first system and can’t unlearn that knowledge before they try the second system. If they try different tasks with each system then are they really comparing like with like? I’ve known participants who had a hard time with the task on A so they were adamant that they preferred B even though it was downright horrible to do the task with B. And we also get into much larger sample sizes because we have to vary the order of presentation of systems so that one group of participants gets A then B and an equal-sized group gets B then A.
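
The order-swapping described above (often called counterbalancing) can be sketched in a few lines of Python. This is only an illustration: the function name `counterbalance`, the group labels and the participant IDs are made up for the example, and it assumes you already have an even number of recruited participants.

```python
import random


def counterbalance(participants, seed=None):
    """Split participants at random into two equal-sized order groups.

    Hypothetical helper for illustration: one group tries A then B,
    the other tries B then A.
    """
    if len(participants) % 2 != 0:
        raise ValueError("need an even number of participants for equal groups")
    rng = random.Random(seed)      # a seed makes the allocation repeatable
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "A then B": shuffled[:half],
        "B then A": shuffled[half:],
    }


# Example with eight invented participant IDs.
groups = counterbalance([f"P{i}" for i in range(1, 9)], seed=1)
for order, people in groups.items():
    print(order, people)
```

Shuffling before splitting keeps the allocation from following recruitment order, so that, say, the keenest early volunteers do not all end up seeing A first.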

The problem with 'between subjects’ design is that you can’t ask the participants which they preferred. And surely that is one of the main reasons why we’re doing an evaluation anyway, to establish preference? So we end up in the murky world of inferential statistics: trying to figure out what the population as a whole might prefer on the basis of the two samples from that population who tried these two interfaces. And now we’re into the issues of random sampling and statistical tests that require much larger sample sizes than we normally use in usability testing.
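
To see why the sample sizes matter, here is a minimal sketch of one possible test for a between-subjects comparison: Fisher's exact test on task completion counts, written with the Python standard library only. The choice of test, the function name and all the numbers are assumptions made for illustration, not results from a real study.

```python
from math import comb


def fisher_exact_two_sided(success_a, n_a, success_b, n_b):
    """Two-sided Fisher's exact test for two independent groups.

    Returns the probability, if A and B really performed the same, of a
    split of successes at least as unlikely as the one observed.
    """
    total_success = success_a + success_b
    denom = comb(n_a + n_b, total_success)

    def weight(k):
        # Number of ways group A could have exactly k of the successes.
        return comb(n_a, k) * comb(n_b, total_success - k)

    lowest = max(0, total_success - n_b)
    highest = min(n_a, total_success)
    observed = weight(success_a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / denom


# Invented example: 5 participants per design, 4 complete the task on A, 2 on B.
print(round(fisher_exact_two_sided(4, 5, 2, 5), 3))  # 0.524

# Same proportions with 50 participants per design (also invented).
print(fisher_exact_two_sided(40, 50, 20, 50) < 0.001)  # True
```

With five people per design, an apparently large gap (4 out of 5 against 2 out of 5) is entirely compatible with chance. The same proportions only become convincing at group sizes well beyond a typical usability test.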

MINIMAL OR RADICAL DIFFERENCES?
My third recurring problem with comparative evaluation is the 'identical twins’ problem. The client knows these babies and sees all the subtle and, to them, important differences that they want to explore. The participants see them as identical twins: both products look pretty much the same. For example, we were looking at three different versions of a form that is much hated by the general public. The client could see all sorts of really, really major differences between them. The participants just saw the form they loathed.

SOME TIPS
If you do have to undertake a comparative evaluation, maybe these tips will help:

1. Prepare your client for a complicated answer that picks elements from the different approaches.

2. Be prepared to undertake far more tests. You’ll probably need at least three times the number of participants you usually work with rather than just twice the number.

3. Dust off your statistics books. You really do need to think about what assertions are supported by your sample size.

4. Try to make sure that the differences you are exploring really do seem like differences to your participants.
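
For tip 3, one quick way to check what a sample size supports is to put a confidence interval around an observed completion rate. The sketch below uses the Wilson score interval, which is one common choice rather than the only one; the function name and the figures are invented for illustration.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    scale = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / scale
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / scale
    return centre - half_width, centre + half_width


# Invented example: 4 of 5 participants completed the task.
low, high = wilson_interval(4, 5)
print(f"{low:.2f} to {high:.2f}")  # 0.38 to 0.96
```

An interval that runs from under 40% to over 95% shows how little a five-person sample can assert about the completion rate in the wider population.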

If you have any comments or suggestions about this article then please contact Caroline at:

Caroline.Jarrett@Effortmark.co.uk

Caroline Jarrett
www.effortmark.co.uk

 



UsabilityNews.com (version 1.4), along with its associated web site and content,
are all strictly © Copyright of the British HCI Group 2001-2008. All rights reserved.

Joanna Bawa (editor), Dave Clarke (founder, designer and developer). Ian Parry (graphics).