01. Introduction to Human-Computer Interaction
Andy Cockburn: Room 313, working Thursdays and Fridays
Tutors: team368@cosc.canterbury.ac.nz
Course breakdown:
- Labs: 9%, 1% per lab
- Usability analysis and storyboard
- 25%, 5pm 22 September
- Design specification and rationale
- 15%, 5pm 20 October
- Exam: 51%
Goals:
- Understand key human factors influencing HCI
- Know and apply guidelines, models, methods that aid in interface design
HCI: discipline concerned with the design, evaluation and implementation of interactive computing systems for human use.
There should be a cycle of designing, implementing and evaluating.
Usability
Three key pillars:
- Learnability: rapid attainment of some level of performance
- Can be modelled as the inverse of time spent on the interface
- Efficiency: can get a lot of work done per unit time
- Subjective satisfaction: how much you enjoy using it
Two minor pillars:
- Errors: should be few errors in an efficient interface.
- Memorability: should be memorable if the interface is learnable.
Trade-offs: efficiency and learnability (inverse of time spent) are often at odds with each other. The performance/efficiency ceiling is often lower for more learnable interfaces.
Preliminary Factors
- Safety considerations
- Need for throughput (efficiency)
- Frequency of use
- Physical space, lighting, noise, pollution
- Social context
- Cognitive factors: age, fatigue, stress, focus
Usability is like oxygen: you only notice it when it’s absent. See: doors with handles that you need to push.
Managing Complexity
The job of HCI is to manage complexity: designing an object to be simple and clear; the relentless pursuit of simplicity.
Interface
Complexity
^
| ____
| ____/
| Poorly designed ____/
| UIs; complexity ____/
| amplified ____/
| ____/ Well designed UIs
| ____/
| ____/
| ____/
| ____/
|/
+--------------------------------------------------> Domain
Door Word CAD Nuclear Complexity
Processor power plant
Models
Models are simplifications of reality that (should) help with the understanding of a complex artifact.
Don Norman’s Model of Interaction
From ‘The Psychology/Design of Everyday Things’, 1988.
This helps understand the designer’s role in creating a system that is used by a thinking person.
constructs
Designer/ -------------> System/system image
designer model ^
Provides | Provides input based on
feedback/ | their prediction of how
output | to achieve their goal
v
User/
user model
The designer tries to construct a system that they have not fully defined. The designer’s model is their conception of interaction; often incomplete, fuzzy or compromised in the actual implementation.
System image: how the system appears to be used (by the user); this does not necessarily reflect the truth of the system.
The user’s model begins very weak, coming from familiarity with the real world or other similar systems. They use this experience to interact with the system, building their model from the system’s feedback.
Ideally, there should be conformance between the designer and user’s model.
There is no direct communication between the designer and user; the designer can only communicate with the user through the system.
Execute-Evaluate Cycle
Execute:
- Goal -> Intention -> Actions -> Execution
- The user has a goal and knows the outcome they want
- They form an intention to complete the goal with the system and translate this to the language of the user interface; one or more actions
- They then execute the actions
- ‘Gulf of Execution’: problems executing intentions/actions
Evaluate:
- Perceive -> Interpret -> Evaluate
- Perceive the response/feedback by the system to their actions
- Evaluate; determine the effect of their action. Did it meet their goal?
- ‘Gulf of Evaluation’: problems assessing state, determining effect etc.
The Interaction Framework
Abowd and Beale, 1991.
User, System, Input and Output.
Emphasizes translation during interaction:
- Articulation: user translates task from task language to input language
- Performance: system acts on the user input (callbacks etc.); translates input language into core language and modifies the system state
- Presentation: show the new state to the user; translate the core (system) state into output language
- Observation: user interprets the new system output
User has some low level task (e.g. saving a file); they need to translate their intention to an input language; this is one of the most difficult parts of user interface design.
--> Output ---
Presentation / \ Observation
/ \
/ v
System User
(Core) (Task)
^ /
Performance \ / Articulation
\--- Input <---/
Mappings
Good mappings (the relationship between controls and their effects) increase usability.
Affordances
Objects afford particular actions to users; there is a strong correlation between how it looks like it should be used and how it is used:
- Door handles afford pulling
- Dials afford turning
- Buttons afford pushing
- Bus shelters:
- Glass affords smashing
- Plywood affords graffiti
Poor affordances encourage incorrect actions, but strong affordances may stifle efficiency.
Over-/Under-determined Dialogues
- Well-determined: natural translation from task to input language
- Under-determined: user knows what they want to do but not how to do it
- e.g. command line
- Over-determined: user forced through unnecessary or unnatural steps
- e.g. ‘Click OK to proceed’, lengthy wizards
- User turns into a robot; no freedom in what to do
Beginner user interface designers tend to think about the interface in terms of system requirements: the system needs x, y, z information, so let’s ask the user about these things up-front. These over-determined dialogues lead to horrible design.
Direct Manipulation
- Visibility of objects
- Direct, rapid, incremental and reversible actions:
- Reversibility allows users no-risk exploration of the user interface
- Rapid feedback
- Syntactic correctness
- Disable illegal actions (e.g. greying buttons out when action not available)
- Tooltips can help with the problem of not knowing why the action is not available
- Replace language with action
- Language needs to be learned and remembered (e.g. command lines)
- Actions; see and point
Advantages:
- Easy to learn
- Low memory requirements
- Easy to undo
- Immediate feedback to user actions
- Users can use spatial cues
Disadvantages:
- Consumes more screen real estate
- High graphical system requirements
- May trap users in ‘beginner mode’
The Human
- Input: vision, hearing, haptics
- Output: pointing, steering, speech, typing etc.
- Processing: visual search (slow), decision times (fast), learning
- Memory
- Phenomena and collaboration
- Error (predictably irrational behavior)
Fun Example
A trivial task that many humans will get wrong.
Count the number of occurrences of the letter ‘f’ given a set of words:
Finished files are the results of years of scientific study combined with the experience of many years
Three phonetic Fs: ‘finished’, ‘files’, ‘scientific’, are easily found.
But three non-phonetic Fs in ‘of’ are often forgotten.
Click
Even a blank graphic has affordances on where people usually click: on or near the center, or along the diagonals or corners.
Human Factors
Psychological and physiological abilities have implications for design:
- Perception: how we perceive things
- Cognitive: how we process information
- Motor: how we perform actions
- Social: how we interact with others
The Human Information Processor
Card, Moran, Newell 1983.
Eyes/Ears
│
▼
┌──── Perceptual Processor ────┐
│ │
▼ ▼
Visual Image ──────┬─────── Auditory Image
Storage │ Storage
│
▼
┌─────Working Memory ◄─────────┐
▼ ▲ ▼
Motor │ Long-Term
Processor │ Memory
│ │ ▲
| | |
▼ │ ▼
Movement └──────► Cognitive
Response Processor
Human Input
Vision
Cells:
- Rods: low light, monochrome, 100 million rods across the retina
- Cones: color, 6 million cones concentrated in the fovea
- S/M/L for short/medium/long approx blue/green/reddish-yellow sensitivity
Areas:
- Retina: ~120 degree range, sensitive to movement
- ~210 degrees with both eyes
- Notifications popping up in corners etc. will distract user
- Fovea: detailed vision, area of ~2 degrees
1 degree = 60 arcminutes, 1 arcminute = 60 arcseconds
Visual Acuity:
- Point acuity: minimum angle of separation at which two dots are seen as distinct: ~1 arcminute
- Grating acuity: minimum angle between alternating bars before they become indistinct: 1-2 arcminutes
- Letter acuity: ~5 arcminutes
- Vernier acuity: minimum detectable offset between two collinear line segments (e.g. ---___) before they are perceived as one continuous line: ~10 arcseconds
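These thresholds relate to physical sizes through simple geometry. A minimal R sketch (my own illustrative helper, not from the lecture) computing the visual angle subtended by an object of size s at viewing distance d:
# Visual angle (in arcminutes) subtended by an object of size s
# viewed at distance d (both in the same units)
visual_angle <- function(s, d) 2 * atan(s / (2 * d)) * (180 / pi) * 60
visual_angle(s = 0.25, d = 600)  # a 0.25 mm pixel at 60 cm subtends ~1.4 arcmin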
Eye movement:
- Fixations: visual processing occurs only when the eye is stationary
- Saccades: rapid eye movements; about 900 degrees per second
- Blind while saccades are in progress
- Eye movement as input; difficult as people don’t have much control over where they are looking (e.g. accidentally looking at ‘delete all my files’ button)
- Smooth-pursuit: ability to track moving objects (up to 100 degrees per second)
- Cannot be induced voluntarily - can’t imagine a moving dot and track it
- Relevant in scrolling
Size/depth cues:
- Familiarity
- Linear perspective; straight lines getting closer together
- Horizontal distance
- Size constancy: if object gets bigger/smaller, it’s probably the object moving closer/further away, not the object changing size
- Texture gradient: texture getting bigger/smaller
- Occlusion: occluded items further away
- Depth of focus: blurrier the further you go away
- Aerial perspective: blurrier and bluer from atmospheric haze
- Shadows/shading
- Stereoscopy (best within 1m, ineffective beyond 10m)
Müller-Lyer illusion:
<--->
>---<
The two lines are the same length, but the bottom one looks further away while subtending the same visual angle, so the brain perceives it as longer.
3D, depth-based UIs:
- The world is 3D so all interaction should be 3D, right?
- Occlusion, far-away things being smaller, navigation/orientation etc. impedes usability unless the domain is 3D (e.g. gaming, 3D modelling)
- Zooming is useful though
- Overview of the data first
- Zoom in to progressively add detail about what they are interested in and filter information they are not
- Allows UI to provide details on demand.
Color:
- 8% males, 0.4% females have some form of color-deficiency:
- Types:
- Protanomaly: red
- Deuteranomaly: green
- Tritanomaly: blue
- Least sensitive to blue
Reading:
- Saccades, fixations (94% of the time), regression
- Approx. 250 words/minute initially
- READING SPEED REDUCED BY ALL CAPS
Auditory
- Used dramatically less than vision
- About 20 Hz to 15-20 kHz
- Can adjust many parameters; amplitude, timbre, direction
- Filtering capabilities (e.g. cocktail party effect)
- Problems with signal interference and noise
Haptics
- Proprioception: sense of limb location
- Kinaesthesia: sense of limb movement, often more of a conscious decision
- Tactition: skin sensations
Haptic feedback: any feedback providing experience of touch
Human Output
Motor response time depends on stimuli:
- Visual: ~200 ms
- Audio: ~150 ms
- Haptics: 700 ms
- Faster for combined signals
Muscle actions:
- Isotonic: little resistance to movement (e.g. mouse)
- Isometric: force but little motion (e.g. keyboard, ThinkPad TrackPoint™)
- Better for velocity/rate control (e.g. self-centering joysticks)
Fitts’ Law
A very reliable model of rapid, aimed human movement.
- Predictive of tasks, descriptive of devices
- Derived from Shannon’s theory of the capacity of information channels
- Signal: amplitude $A$ of movement (distance to the middle of the target)
- Noise: width $W$ of the target
Index of difficulty (ID) measures the difficulty of rapid aimed movement, in ‘bits’:
$ID = \log_2(A/W + 1)$
Fitts’ law: movement time (MT) is linear with ID:
$MT = a + b \cdot ID$
- $a$ is typically 200-500 ms
- $b$ is typically 100-300 ms/bit
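A minimal R sketch of these formulas (the constants a and b below are illustrative mid-range values, not fitted data):
# Fitts' law (Shannon formulation)
fitts_mt <- function(A, W, a = 0.3, b = 0.2) {  # a in s, b in s/bit
  ID <- log2(A / W + 1)                         # index of difficulty (bits)
  a + b * ID                                    # predicted movement time (s)
}
fitts_mt(A = 512, W = 16)  # ID of ~5 bits -> ~1.3 s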
Typical velocity profile, validated for many types of aimed pointing:
Speed
^
| Open-loop,
|ballistic impulse
| /\
| / \ slow, closed-loop
| / \ corrections
| / \ /\
|/ \/ \/\___
+------------------------>
Time
Input Devices; Pointing & Scrolling
Human output is received as system input. Some translation hardware must sit between the two, and these devices have many properties:
- Direct vs indirect
- Touchscreens have perfect one-to-one correspondence
- Trackpads indirect: mouse movement does not directly map to cursor movement
- Absolute vs relative
- Touchscreens, pen-tablets
- Trackpads (mostly) relative; finger location on the trackpad does not matter when moving the cursor
- Control
- Position (zero-order) e.g. absolute pointing, dragging the scrollbar
- Rate (first-order) e.g. holding down on mouse wheel and dragging up/down on Windows
- Acceleration (second-order)
- Note: having lots of modes and hence complexity may decrease number of interactions while making the task take longer due to the overhead of making decisions
- Isotonic: force with movement
- Isometric: force without movement e.g. 3D touch to control object size
- Control-display gain/transfer functions
- Magic sauce of iOS inertial scrolling, Mac trackpads, etc.
The control-display transfer function:
- The input device (e.g. capacitive trackpad) sends device units
- The gain function scales the input in accordance with the user or environment settings
- Persistence is used to continue output even when there is no ongoing input, adding features such as inertia
Transfer Function
+-------------------------------------------------------------------------+
| e.g. scroll inertia |
| device --------------- display -------- --------------- |
Device -+--------> | Translation | ---------> | Gain | ------> | Persistence | --+---> Output
Input | units --------------- units -------- --------------- |
| ^ ^ ^ |
| ------- Environment/User Settings ------------ |
+-------------------------------------------------------------------------+
Scrolling transfer function for iOS:
- When the finger is on the screen there is direct mapping
- After the finger leaves, the speed slowly decays
- After four scroll gestures in quick succession, the scroll rate increases in an almost vertical line after the finger leaves
- For all subsequent scroll gestures, the maximum velocity (immediately after the finger leaves) increases until a maximum scroll velocity is reached
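A toy R sketch of the translation → gain → persistence pipeline above, assuming a constant gain and a simple exponential decay for inertia (real systems such as iOS use more elaborate, hand-tuned curves):
# Toy control-display transfer function: device units -> display units
gain_fn <- function(device_delta, gain = 2.5) gain * device_delta
# Persistence: output continues after input stops, decaying each frame
inertia <- function(v0, decay = 0.95, frames = 10) v0 * decay^(0:frames)
inertia(gain_fn(40))  # scroll velocity after the finger leaves the surface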
Input Devices: Text Input
- Alternative keyboards (e.g. Dvorak)
- Chord keys (e.g. stenographers)
- Constrained keyboards (e.g. T9 keyboards on old mobile phones)
- Reactive/predictive systems (autocomplete)
- Gestural input (e.g. swipe keyboards)
- Hand-writing recognition
Input expressibility: how well can you discriminate inputs? e.g. Google Glass had a tiny capacitive surface; doing text entry on that posed challenges.
Steering Law
Model of continuously controlled ‘steering’: moving an item across a given path, called a ‘tunnel’:
$MT = a + b \int_C \frac{ds}{W(s)}$
Where $W(s)$ is the tunnel width at position $s$ along the path $C$. For a straight tunnel of constant width $W$ and length $A$, this reduces to $MT = a + b(A/W)$.
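A minimal R sketch for the constant-width case (the constants are illustrative, not fitted):
# Accot-Zhai steering law for a straight tunnel of constant width
steering_mt <- function(A, W, a = 0.2, b = 0.1) a + b * (A / W)
steering_mt(A = 300, W = 20)  # narrower tunnels take disproportionately longer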
This is important in cascading context menus, where hovering over an item opens a submenu to the left or right. Done naïvely, while travelling to the newly-opened submenu, the cursor must always stay above the item or the submenu will disappear. macOS appears to take the angle of travel into account when deciding whether the submenu should be hidden.
Human Processing
Visual Search Time
If a person has to pick out a particular item out of $n$ candidates, search time grows roughly linearly with $n$.
This is slow, so the UI should aim to reduce the amount of searching the user must do. To achieve this, ensure there is spatial stability: items appear in the same place every time.
Hick/Hyman Law of Decision Time
Choice reaction time when optimally prepared: $T = a + b \cdot H$
Where $H$ is the information (entropy) of the decision.
For item $i$ with probability $p_i$: $H = \sum_i p_i \log_2(1/p_i + 1)$
For $n$ equally probable alternatives: $H = \log_2(n + 1)$
Implications:
- Decisions are fast: time grows only logarithmically with the number of alternatives
- Applies to name retrieval (commands) and location retrieval
- In GUIs, replace visual search (linear in $n$) with decision (logarithmic in $n$) through spatial stability (contrasted in the sketch below)
- Don’t order commands by most recently/commonly used - this forces the user to visually search
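A minimal R sketch contrasting the two growth rates (the constants are illustrative):
# Hick/Hyman decision time vs linear visual search
hick_t   <- function(n, a = 0.2, b = 0.15) a + b * log2(n + 1)  # stable layout
search_t <- function(n, t_item = 0.25) t_item * n               # unstable layout
hick_t(16); search_t(16)  # decision scales far better than search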
Spatially Consistent User Interfaces
Pie menu: items are sectors making up a circle centered around the cursor (possibly with multiple layers of items through nesting):
- Minimum of one pixel of cursor movement required for fast selection
- Allows for easy advancement from visual search to muscle memory
Ribbon: spatial stability within each tab, but requires visual search and mechanical interactions to find a new item. ‘Solution’: show all tabs at once.
Search: macOS menu bar search does not run the searched command; it only shows you where the item is located. Menu items also show the keyboard shortcut.
Torus pointing: wraps the cursor around the screen edges, giving multiple straight paths to an item. Giving users choice may help with Fitts’ law, but increases decision time (Hick’s law).
Power Law of Practice
Performance rapidly speeds up with practice: $T_n = T_1 \cdot n^{-a}$
Where:
- $T_n$ is the time taken for trial $n$
- $T_1$ is the time taken on the first trial
- $a$ is the learning rate
This applies both to simple and complex tasks.
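A minimal R sketch of the curve ($T_1$ and $a$ are illustrative):
# Power law of practice: rapid early gains, diminishing returns
practice_t <- function(n, T1 = 10, a = 0.4) T1 * n^(-a)
practice_t(c(1, 10, 100))  # trial times: 10 s, ~4 s, ~1.6 s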
Novice to Expert Transitions
People use the same tools for years/decades, but often continue to use inefficient strategies.
Shortcut vocabularies are small and are used infrequently. Factors:
- Satisficing; good enough
- Lack of mnemonics (for keyboard shortcuts)
- Lack of visibility
How do you support transitions to experts?
When switching between modes, there is a performance dip. Since people use software to do their jobs, not use software as their jobs, this causes a chasm that the user must take the time to cross.
^ Performance Modality
│ Switch
│ | xxxxx
│ xxxxxxxxx
│ | xxxxx
│ xxxx
│ | xxx
│ xx
│ | xx
│ Ultimate xxx
│ xxxxxxxxxxxx| xx ─┐
│ xxxxxx Performance x │ Performance
│ xxxx |x │ Dip
│ xx Extended x │
│ x Learnability |x ─┘
│ xx
│ x |
│x
│x Initial |
│Performance
│ |
│ First Modality Second Modality
└──────────────────────────-────────────────────────────>
Time
Domains of Interface Performance Improvement
- Intra-modal improvement
- Make the user an expert within the mode the user is comfortable working in
- e.g. guidance techniques where you show items the user is likely to use
- Inter-modal improvement
- Make the user aware of faster ways of doing the task (e.g. File > Print to Ctrl+P)
- e.g. skillometers
- e.g. AutoCAD shows the text command being used in the background when using UI buttons
- Vocabulary extension
- e.g. track and show community command use to let users learn the most useful commands
- Task strategy:
- Intelligent UI that picks up on the task the user is trying to do suggests more efficient sequences of commands to achieve this
Human Pattern of Behavior
Zipf’s Law: given a corpus of text, a word’s frequency is inversely proportional to its rank: $f(r) \propto 1/r$
where $r$ is the word’s rank in frequency order.
Pareto Principle/80-20 Rule: 80% of usage is made up of only 20% of items.
The UI should attempt to surface these 20% of items.
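A minimal R sketch of how Zipf-distributed usage produces this pattern (idealized 1/r frequencies, invented for illustration):
# Zipf frequencies for 100 ranked items
f <- 1 / (1:100)
sum(f[1:20]) / sum(f)  # top 20% of items -> ~70% of all usage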
Human Memory
maintenance
rehearsal
┌────┐
│ │
│ ▼ elaborative
Sensory Memory Short-term rehearsal Long-term
iconic, echoic,──────► memory ──────────────► memory
and haptic │ ◄──────────────
│ │ retrieval
│ │
│ │
│ │
│ │
▼ ▼
masking decay displacement or
interference decay
Sensory memory: stimulation decays over a brief period of time; loud noises, bright lights, pain persists for some time after the stimulation disappears.
Short-Term Memory
- Input from sensory or long term memory
- Capacity of 7 ± 2 ‘chunks’/abstractions
- Chunks aid storage and reconstruction
- Fast access: ~70 ms
- Rapid decay: ~200 ms
- Constant update and interference
- Maintenance rehearsal: e.g. repeating a number a few times in your mind
Long-Term memory
- Input through elaborative rehearsal and extensive repetition
- Elaborative rehearsal: restructuring information instead of just mindlessly repeating it
- Slow access: > 100 ms, sometimes days (tip of the tongue phenomenon)
- Decay?
- Good at recognition but bad at recall
- Supports spatial processing
Human Error
Mistakes
Errors of conscious decision-making: the user acts according to an incomplete or incorrect mental model.
Only detected with feedback.
Human Error: Slips
Errors of automatic and skilled behavior.
Capture error:
- Two action sequences with common starting point(s)
- Captured into the wrong (and usually more frequent) path
- Used to be common e.g. in dialogue boxes with generic button labels (‘Cancel’ and ‘Ok’)
Description error:
- More than one object allowing the same/similar action
- Execute the right action on the wrong object
- e.g. lighting panel with multiple switches
Data-driven error:
- External data interfering with short-term memory
- e.g. entering unrelated file name when saving a document
Loss-of-activation error:
- Goal is displaced/decayed from short-term memory before it is completed
- e.g. walking into room then forgetting why you entered
- Want to complete task that requires subtasks and sub-subtasks to be completed, overflowing short-term memory
Mode error:
- Right action in the wrong system state
- Modes are system partitions with:
- Different set of commands
- Different interpretation of the same commands/actions
- Different display methods
- Ensure modes are visible and noticeable
- Modal dialogues are an example of bad modes
Motor slip:
- Pointing/steering/keying error
Premature closure error:
- ‘Dangling’ UI actions required after perceived goal completion
- e.g. forgetting to save, or to add attachments to emails
Human Phenomena
Homeostasis
People maintain equilibrium:
- If a system makes something easier, people will use it to do more difficult things
- If a system makes something safer, people will use it to do more dangerous things
Satisficing
People are satisfied with what they can do now and don’t bother to optimize:
- People that ‘hunt-and-peck’ instead of learning to touch type
- People that don’t bother to learn keyboard shortcuts for tasks they do frequently
Hawthorne Effect
The act of measuring changes results (the Heisenberg uncertainty principle of HCI).
People like being involved in experiments and change their behavior during experiments, complicating results.
Explaining Away Errors
The user is often the easiest party to blame, but the mistake may well stem from a poorly designed interface.
Peak-End Effects
Peak effect: people’s memories of experiences are influenced by the peak/most intense moments of an experience (e.g. combos attacks in games, casino games).
End effect: people’s memories of experiences are predominantly influenced by the terminating moments (e.g. a good vacation ruined by a missed flight home, a survey with many questions on the last page).
Negativity Bias
The magnitude of sensation from a loss is greater than from an equal gain: bad is stronger than good.
e.g. single coin toss, win $110 on heads but lose $100 on tails, autocorrect ‘correcting’ a correct word feels much worse than how good it feels when it corrects a mis-spelt word.
Communication Convergence
Similarity with pace, gestures, phrases, etc. enhances communication. Could interfaces measuring (e.g. long press duration, mouse speed) and matching (e.g. animation speed, timeout, speech rate) help?
02. Interface Design
Design -> Implementation -> Evaluation -> Design -> …
Design Process
Saul Greenberg
Articulate
Articulate:
- Who the users are
- Their key tasks
Then design:
- Task-centred system design
- Participatory design
- User-centred design
This should lead to user and task descriptions.
Then, evaluate the tasks and repeat the process, refining goals.
Brainstorm Designs
When designing, consider:
- The psychology of everyday things
- User involvement
- Representation and metaphors
Create low-fidelity prototyping methods.
Then, create throw-away paper prototypes.
NB: ‘prototype’ has multiple meanings, one of which implies executability.
Evaluate the designs:
- With respect to the tasks identified
- Participant interaction: get users involved
- Task scenario walk-through: in order to do X, Mary will press this button …
Repeat steps if required, further brainstorming more designs.
A reviewer should be able to unambiguously understand how the interface operates.
Refined Designs
Create:
- Graphical screen design
- Interface guidelines
- Style guides
Then use high-fidelity prototype methods and create testable prototypes.
Use usability testing and heuristic evaluation to further refine design if required.
Completed designs
Create alpha/beta systems or complete specifications. Do field testing if necessary.
Iterative Design
Iteratively refine design based on evaluative feedback.
A common mistake is to get an idea and hill climb on that single idea. Leads to:
- Tunnel vision
- Premature commitment
- Local maxima
- Stops early bad decisions from being fixed
Elaborative/Reduction Tension
Elaboration: get the Right Design; explore the full space of possible designs.
Reduction: get the Design Right; polish the solution. This may be done on the best solutions simultaneously.
The Design Funnel
[Design funnel sketch: parallel streams of ideas from Design, Management/Marketing, Engineering and Sales feed into a funnel that progressively narrows toward the final shipped design.]
Supporting Rapid Iterations
Fudd’s first law of creativity: to get a good idea, get lots of ideas.
Lots of ideas take lots of time to build/test, so we need rapid creation, evaluation and prototyping.
Prototyping
After user/task identification, prototyping can occur.
Low-fidelity paper prototypes (elaboration):
- Brainstorm different representations
- Choose a presentation
- Rough out interface style
- Task-centred walk-through and redesign
Medium-fidelity prototypes (reduction):
- Fine-tune interface, screen design
- Heuristic evaluation and redesign
High-fidelity prototypes/restricted systems:
- Usability testing and redesign
Working systems:
- Limited field testing
- Alpha/beta testing
Low-Fidelity Prototypes: Sketches
Outward appearance and structure of intended design.
Necessarily crude and scruffy:
- Focus on high-level concepts
- Fast to develop
- Fast to change
- Low change resistance; you only put in a few minutes of effort
- Delays commitment
Use annotations/sequences to show UI progression.
Cross reference with other zoomed in/out sketches.
Sequential sketches: show state transitions; what interaction causes the state change?
Focus on the main transactions (Zipf’s law) - clearly convey how the user achieves the 20% of the most frequent interactions.
Medium-Fidelity Prototypes: Wizard of Oz
Have a person emulate the functionality.
IBM speech editor (1984): the user gave audible commands to edit a text document, which the ‘wizard’ would carry out. This gave IBM a good understanding of the user experience, letting them see whether the idea was any good without investing a large amount of effort into actually implementing it.
Walk-through evaluation:
- Facilitator gives the user tasks and prompts them for their thoughts
- User looks at current system state
- The ‘computer’ (a person playing the system) updates system state following some pre-determined algorithm
- All UI states/components must be sketched/printed out
- Observer takes notes
Refinement (e.g. PowerPoint):
- Facilitates motion paths
- Links between states etc.
- Many wireframing tools available (e.g. Moqups, Balsamiq, Axure)
Precise medium-fidelity prototypes:
- For very small but important portions of the UI
- e.g. slide to unlock animations etc.
Photo traces:
- If you suck at sketching
- Take a photo, trace it out; captures the essence of the interaction without the exact representation
Simulations and animations:
- Works well for second round evaluation
- Horizontal prototype: surface-layer/sketch prototype of entire range of functionality
- Vertical prototype: much of the functionality for a small set of features
- Scenario: intersection of horizontal and vertical prototypes
- Beware of:
- Inflated expectations - perception of it being ‘nearly completed’
- Reluctance to change - the more it looks finished, the less willing stakeholders may be to recommend changes
- Excessive focus on presentation rather than approach
Task-Centered System Design (TCSD)
TCSD is the HCI equivalent of requirements analysis/use cases.
It asks exactly and specifically who are the users and what will they use the system for? There is a critical difference between:
- The User - a pretend person who will adapt to the system and go on a two week training session to live with the designer’s pet system
- A real, busy person doing their job
TCSD acts as a reality-based sanity check for designers.
Good book on system design: Task-Centered User Interface Design by Clayton Lewis and John Rieman.
How NOT to approach design:
- Focus on system and designer needs
- Ask what can we easily build
- Ask what is possible/easy with the tools we know/have?
- Ask the programmers what they find interesting
UC SMS
UC’s student management system (from the mid 2000s) was a multi-million dollar, unusable disaster.
Example task: Andy is teaching COSC225 next semester; he wants to know how many students are enrolled to see how many printouts he needs. To achieve this:
- Click on ‘Navigate’ button in the toolbar; opens ‘System Navigator’ window
- Expand ‘Searches’ menu (hierarchical menu system)
- Click on ‘Course Occurrence Search’; opens new window
- Enter course code, hit return
- Select the right occurrence
- A window with a huge mess of text fields (mostly disabled) and 13 tabs opens
- …
The company that delivered it had a system that was similar to what UC needed; they did what was easy, not what the end user needed.
TCSD Phase 1: User Identification
Identify categories of end-users with specific exemplars - typical and extremes.
Talk to them!
- If they won’t give you the time to talk, they probably won’t use your system either
- If they really don’t exist (no existing system):
- Worry
- Describe your assumed users and tasks
- Learn about people in the task chain: who do inputs come from, where do outputs go?
- Why does the user need to do this? What do they do with the information?
TCSD Phase 2: Task Identification
- Record what the user wants to do, minimizing the description of how they do it
- No interface assumptions; tasks are independent of the interface they will use to complete it
- Can be used to compare alternative designs
- Don’t write ‘display ${something} to the user’, write ‘do ${something}’: the user wants to get information about something; the system displaying it is just a way they can do it
- Record the complete task: input source, output identification
- Identify users
- Design success depends on what users know
- Test against specific individuals; name names
- Uniquely enumerate tasks for identification
- Giving tasks a unique identifier helps with communicating problematic tasks with the team
- Identified tasks can be circulated for validations
- Interview the users with the tasks you identified; they can help spot omissions, corrections, clarifications and unrealistic tasks
- Identify broad coverage of users and tasks
- Create matrix with the axes of unimportant/important and infrequent/frequent tasks/users
Example: John Smith arrives at student services after trying to enrol in a course online, but being refused as he lacked a pre-requisite course. He has a letter from the HoD allowing him to enrol. He has forgotten his ID card and cannot remember his student ID or user code (<- this is an interface assumption; does the system have IDs or user codes?).
TCSD Phase 1/2 Outcomes
A report should state:
- User categories (and their priorities)
- Specific personas exemplifying each category
- Task categories and priorities
- Concrete representative task scenarios (with name of the owner)
- Enumerated with unique identifiers for use in UI validation
- Explicit identification of groups/tasks that will not be supported and reasons for this
TCSD Phase 3: Design
Use task categories/scenarios to generate and evaluate designs.
Strive to make the workflow natural to the user. For each design and task scenario ask how the user would complete the task.
TCSD Phase 4: Walk-through Evaluation
Interface design debugging: select a task scenario and for each step:
- Ask what the user would do given what they know
- Ask if the task is believable
- If not, it is an interface bug. Record it and assume it is fixed when going through the next steps
Cautions on TCSD
- It is hard to record and identify task scenarios that are independent of the interface
- The more the interface and task are interlinked, the more difficult it is to identify alternative/better ways of achieving the task
- It can be hard to find people ‘responsible’ for new tasks in a system: who do you interview, how do you validate the interface?
User-Centred System Design
Know the user: design should be based around user needs, abilities, context, tasks etc., and users should be involved in all stages of design: requirements, analysis, storyboards, prototypes etc.
UCSD/Participative Design: Involving the User
Talk to users:
- Interview them about culture, requirements, expectations
- Contextual inquiry: observe them doing their job; a few hours of observations can give a lot of insight
- Explain designs: get input at all stages, show visual prototypes and demos
- Walk-throughs: the user knows what they will do the best
UCSD: Participatory Design
Problem:
- Designers’ intuitions can be wrong
- Interviews lack precision/context and can mislead
- Designers cannot know user needs well enough to answer all questions that are likely to arise during design
Solution:
- Designers having access to a pool of representative end users: not management; real users
- These users are full members of the design process
The users:
- Are excellent at responding to suggested designs (they must be concrete and visible)
- Bring in important knowledge of work context that only someone that has lived in the role can learn
- Will often have greater buy-in into the system
However:
- It is difficult (and expensive) to get a good pool of representative end users - you are taking people out of their regular jobs
- They are not expert designers - they probably won’t be able to come up with design ideas from scratch with an understanding of the constraints of the technology, budget, time etc.
- The user is not always right - they may not know what they want
Erskine: members of the Math/COSC departments became part of the design and judging team, giving suggestions to the architects (e.g. less glass - too much glare). When finished, the staff had buy-in: it was their building, not one built by management.
Usability Heuristics
AKA User-Interface Guidelines, Style Guides.
Usability heuristics:
- Encapsulates best practices and ‘rules of thumb’
- Identify common pitfalls
- Define simple ‘thinking hats’ - specific areas (e.g. memory load) to evaluate the interface
Formative heuristics guide design decisions while summative heuristics evaluate existing systems.
Advantages:
- Minimalist: easily remembered and applied, with just a few guidelines covering most problems
- Cost: cheap and fast, and can be done by novices (e.g. end users)
Disadvantages:
- Heuristics can be broad, redundant and obvious
- Some subtleties in their application
Nielsen’s Ten
The original set, defined in Jakob Nielsen’s Usability Engineering:
01. Simple and Natural Dialogue
Manage complexity: make it as simple as possible, but no simpler (match the complexity of the domain).
Organization of the interface: make the presentation (appearance of each state) and navigation (between states) simple and natural.
Graphic design: organize, economize, communicate.
Use windows frugally - fewer windows are almost invariably better.
See: Google vs Yahoo search page, iPhone vs feature phones
02. Speak the User’s Language
Affordances, mappings and metaphors.
Terminology (words, colors, graphics, animations etc.) should be based on the user’s task language (and not based on system internals).
e.g. error messages should be useful to the user, not just the programmer/designer.
‘Language’ is textual and iconic (e.g. ‘Save’ (natural language) can be Ctrl-S , floppy disk icon).
03. Minimize The User’s Memory Load
Recall is slow and fragile; use recognition wherever possible:
- In font menus, show the font name using that font
- Show input formats and provide defaults
- e.g. date inputs with defaults tell you the format the date should be in
- Support reuse and re-visitation
- e.g. browsers show commonly visited pages in omni-bar
- Support exchange of units - don’t force the user to do unit conversion themselves
- Support generalization techniques:
- The same command should be able to be applied to all objects (e.g. cut/copy/paste on characters, text boxes)
- The same method/modifier being generalized (e.g. circles are constrained ellipses, squares constrained rectangles)
04. Consistency
Consistency everywhere:
- Graphic design
- Command structure (e.g. always select object then command to act on it)
- Internally (within the application)
- Externally (within the platform)
- Beyond computing (e.g. red for stop, green for go)
05. Feedback
Continually inform the user about what the system is doing and the system’s interpretation of their input.
e.g. in PS, cursor icon matches selected tool
The feedback should:
- Be specific (e.g. name of file being opened/saved)
- Consider the context of the action - only disrupt the user when necessary
- e.g. save progress bar at the bottom of the window
- Consider feed-forward: show the effect of the action before they commit to it
- e.g. in Word, on hover over font, update selected text with that font (although this particular case was distracting)
- Offer choices based on partial task completion
- e.g. autocomplete
- This should be relatively stable and predictable, allowing the user to act on muscle memory rather than reading
Response times:
- < 0.1s: perceived as instantaneous
- < 1s: delay noticed, flow of thought uninterrupted
- ~10s: limit for keeping attention on the dialogue
- 1-5s: show a busy indicator (e.g. spinning cursor)
- > 5s: show a percent-done progress bar
- If just guessing progress, prefer a speed-up near the end rather than a slow-down
- ‘Working’ dialogues for unknown delays (e.g. throbbers)
- > 10s: user will want to perform other tasks and may have lost their train of thought
Consider feedback persistence: how heavy/disruptive and enduring should it be?
06. Clearly Marked Exits
Avoid trapping the user; offer a way out whenever possible:
- Cancel button
- Universal undo (return to previous state)
- Interrupt (mostly for longer operations)
- Higher precedence for more recent actions - if user does one action then another action that overrides the previous one, fulfil the latter action
- Quit
- Sensible defaults (counter-example: losing all form data after submitting with one bad field)
e.g. ‘Do you want to save the changes made to ${}’: Don’t Save, Cancel, Save (don’t just use ‘yes’/‘no’/‘cancel’)
Windows 10 volume control: the area around the volume bar is untouchable for a few seconds. It is also placed in the top-left corner, where many important UI elements live.
07. Shortcuts
Enable high performance for experienced users:
- Keyboard accelerators
- Command completion
- Function keys
- Double clicking (shortcut for some menu item)
- Type-ahead (offer most likely prediction)
- Gestures
- History (repeat actions done by the user previously)
- Customizable toolbars
08. Prevent Errors and Avoid Modes
People will make errors:
- Mistakes: conscious deliberation leading to incorrect action (bad mental model)
- Slips: unconscious behavior that gets misdirected (or mis-click/typo)
General rules:
- Prevent slips before they occur (e.g. syntactic correctness, disable items that can’t currently be used)
- Feedback: allow slips to be detected when they occur
- Support easy correction (e.g. universal undo)
- Commensurate effort: difficult states (e.g. a document with unsaved work) should be hard to irreversibly leave (e.g. warning dialog box)
Forcing functions (syntactic correctness):
- Prevent continuation of a wrong action
Warnings:
- Can be irritating when overused
- Can be ‘heavy’ (e.g. alert box)
- Make them subtle unless there is a really good reason for it to be heavy
Ignore illegal actions:
- Not great as user must infer what happened
- e.g. typing alphabetical character in number input
Mode errors:
- Have as few modes as possible
- Distinct states of the system where the commands available to the user are different or where the commands produce different results
- Allow user to easily determine current mode
- Spring-loaded modes: ongoing action maintains mode
- e.g. user must hold down control key to stay in a mode
- Good solution to people forgetting they are not in the default mode
Bad behavior:
- UC SMS web student search has radio buttons for searching by username or student number - two different modes, even though the program could easily determine whether the input is an 8-digit student number or a username beginning with alphabetical characters.
Good example:
- When the world was used to the ‘unnatural’ scrolling direction, the iPhone’s rubber-banding acted as feedback when the user accidentally scrolled down from the top of a list
- User can swipe between photos and also drag when zoomed into a photo - what should the behavior be when swiping to the edge of a photo when zoomed in?
Possible solutions:
- Self-correct/auto-correct
- Requires trust in the system
- Negativity bias - incorrectly correcting correct input is far worse than not correcting incorrect input
- Auto-suggest
- Dialog that allows the user to fix an issue
- e.g. Squiggly line under mis-spelt text
- User instructs system
- System asks if the input was intended
- e.g. add to dictionary
- System instructs user
- System guesses user intentions and instructs user on the proper way to achieve it
- e.g. Clippy! - condescending, wrong, tedious, boring
09. Deal with Errors in a Positive and Helpful Manner
Error messages should:
- Use clear language, not codes
- Be precise - rephrase user input (e.g. cannot open ${document name} because ${it is not a supported file})
- Be constructive - suggest and offer solutions where possible
10. Help and Documentation
Documentation and manuals:
- Documentation is no excuse for interface hacks
- Write the manual before the system
- Task-centred manuals (especially for beginners)
- Quick reference cards as a reference to aid novice to expert transition
Tutorials:
- Short introductory guides and overviews
- Video walk-throughs
- Simple task walk-throughs
Reminders:
- Tooltips
- Short reference cards
Wizards:
- Walk user through typical tasks
- Don’t overuse - system in control
- Dangerous if the user gets stuck
Māori Issues and User Interface Design
Te Taka Keegan - University of Waikato
Usability principles
Shneiderman and Nielsen have a few relevant usability principles:
- Strive for universal usability
- Match between system and the real world
- Recognition, not recall
Know your audience:
- Ethnicity: group people who identify with each other by genealogy, language/dialect, history, society, culture etc.
- How is the Māori world view different from the Pākehā perspective? A few important values:
- Manaakitanga: showing respect, care for others
- Whanaungatanga: building up relationships/kinship/closeness (e.g. introductions include mountain/rivers as a point of similarity and a way to bond with each other)
- Tiakitanga: looking after the world and each other
- Rangatiratanga: acknowledging/respecting chieftainship
- Aroha: love
- Language:
- ‘Soft’ - every syllable ends with vowel
- 10 vowels: a/e/i/o/u/ā/ē/ī/ō/ū (short and long/accented with macron)
- ‘au’ sounds more like an ‘o’
- 10 consonants: h/k/m/n/ng/p/r/t/w/wh (wh sound differs with dialect)
- Vowel length important for pronunciation and meaning
Something is usable if a person of average ability and experience can accomplish the task without more trouble than it’s worth.
- Default languages are important
- Interface language affects software usage patterns
- Lack of vocabulary is barrier to using Māori interfaces
- Māori has long words which can cause UI issues
- When using Māori imagery, get feedback and ensure it is appropriate
Inspection Methods
Systematic inspection of a user interface. It:
- Attempts to find usability problems
- Works at any stage in the design process
- Most commonly heuristic evaluation, where 3-5 evaluators inspect the system
Heuristic Evaluation
Each inspector initially works alone. They traverse the interface several times with a specific scenario/task in mind and:
- Inspect UI components and workflow
- Flow between UI states
- Compare them with heuristics
- Find non-compliance/problems
- Add notes, magnitude of problems, frequency
It often uses a two pass approach, focusing on specific UI elements/states in the first pass while the second focuses on integration and flows between states.
Results Synthesis
After each inspector does their individual evaluation, the inspectors come together and assess the overlap in problems they found.
Severity rankings can be reviewed and compared, and problems ranked in order of importance.
Severity: (small impact on those encountering it, few users) = low severity. (large impact, many users) = high severity.
Inspectors
Different perspectives will catch different problems so the inspector team should be diverse.
Example:
- Developer
- Designer
- Beware of vested interest in their design
- Usability expert
- Domain expert
- User
All inspectors should be trained in Nielsen’s heuristics.
Nielsen claims 3 inspectors should be able to find ~60% of problems, and 5 around ~70%.
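These figures are broadly consistent with the Nielsen-Landauer model of problem discovery; a sketch in R, assuming each inspector independently finds a given problem with probability lambda (the commonly cited value is around 0.31, though it varies by study):
# Nielsen-Landauer: proportion of problems found by i independent inspectors
found <- function(i, lambda = 0.31) 1 - (1 - lambda)^i
found(c(3, 5))  # ~0.67 and ~0.84 with lambda = 0.31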
Graphical Screen Design
Gestalt Laws of Perceptual Organization
How humans detect visual groups/relationships/patterns:
- Proximity
- Similarity
- In shape, color etc.
- Continuity
- Dots placed along a curve: the brain sees the curve as an object
- Symmetry
- Objects seen as closed when placed in symmetric boundaries
- Closure
- Brains automatically attempt to ‘close’ objects e.g. semi-circle
Smooth continuity (e.g. smooth curves vs straight lines with right angles) easier to perceive, but less ‘neat’.
PARC Principles
From The Non-Designer’s Design Book by Robin Williams.
PARC:
- Proximity
- Group related elements
- Separate unrelated elements
- Instead of putting ugly borders around two groups, separate them!
- Alignment
- Visually connect elements to create a visual flow
- This is why grids are useful
- And mis-align unconnected elements (use with caution)
- Visually connect elements to create a visual flow
- Repetition
- Repeat design aspects (e.g. font, color, shape) throughout the interface for unity/consistency
- Contrast
- Different things should look different
- Bring out dominant elements, mute lesser ones
Misc
Grids:
- Use horizontal and vertical alignment to group related components
- Make minimal use of explicit structure (i.e. borders and boxes)
Navigational Cues:
- Provide an initial focus (top left for western cultures)
- Group related items
- Visual flow should follow logical flow
Economy of visual elements:
- Minimize the number of controls
- Include only those necessary and relegate others
- Minimize clutter
- Experiment with whitespace
- e.g. headings/labels above or to the left
03. User Interface Evaluation
Designers have complete and comprehensive knowledge of their interface and hence are uniquely unqualified to assess usability.
This makes them blind to the mismatch between the user and designer models. In order to find these, it is important to record realistic interactions; simple observation is insufficient.
Designers must mistrust their interfaces; what is difficult for a user may be obvious to them.
“Think Aloud” Evaluation
Prompt subjects to verbalize their thoughts as they work through the system:
- What they are trying to do
- Why they did the action they did
- How they interpret feedback from the systems
It is hard to talk and concentrate on the task at the same time - you may get a lot of incomprehensible mumbling so the facilitator must ensure they give good and continual prompts to the user.
Apart from the prompts, it should be one-way communication from the subject - otherwise, you will pollute the user’s model.
It is also likely to be very uncomfortable, unpleasant and difficult for the subjects - do your best to make them comfortable.
Cooperative Evaluation
A variation of “think aloud”. In “think aloud”, it feels as if the user is being studied while with cooperative evaluation, two subjects study the system together (with natural two-way communication).
Sometimes, one of the subjects is a confederate - someone involved with the system.
The two subjects work together to solve the problem. It is more comfortable to the subjects and comments about failures of the system emerge much more naturally.
Interviews
The more obvious the technique appears, the less preparation designers intuitively think it needs: designing good interviews (and questionnaires) is difficult, and they are expensive in terms of time for both designers and users.
Interviews are:
- Good for probing particular issues
- Can lead to constructive suggestions
- Prone to post-hoc rationalization
Plan a central set of questions for consistency between interviewees and to focus the interview, but still be willing to explore interesting leads.
Questionnaires
Expensive to prepare but cheap to administer - evaluator not required.
NB: ~20% response rate.
Questionnaires can give quantitative results (e.g. 30% of users did xyz) and qualitative results (e.g. why did you like x?). Question types:
- Open-ended comments give important insights
- Closed questions restrict responses and give quantitative data - make sure there is no ambiguity in the options
- Likert items: level of agreement with a statement
- Ranked choice questions are good for forcing comparisons
- e.g. ‘Was A better than B?’ is preferred over ‘How much did you like A?’ and ‘How much did you like B?’ asked together; comparing the latter pair often contains a lot of noise
Questionnaires are over-determined user interfaces - a badly-designed question may ‘box in’ the user. Hence, when designing questions:
- What purpose does the question serve? What information are you hoping to get?
- Know how you will analyze the results
- For each quantitative question, consider adding a qualitative one asking why they picked the result
- Iterate
- Know the dissemination method
Continuous Evaluation
Monitoring actual system use:
- Field studies
- Design team goes to users and see if they use the system as you expected
- Diary studies
- Users write out a few lines describing their experience with the system over the last few hours
- Logging and ‘Customer Experience Programs’
- LOG EVERYTHING!
- Exploratory questions: hope something interesting shows up
- Difficult to analyze
- Aside: in controlled experiments, log everything (until the point at which it slows down the UI)
- Targeted data collection
- How often are specific features used?
- Characterize their activities
- LOG EVERYTHING!
- User feedback and gripe lines
Crowd-Sourced Experiments
Mechanical turk et al.:
- Workers complete ‘Human intelligence tasks’
- They have a HIT approval rating that can be used for filtering
- Problems with noisy data and criteria for exclusion
- Include ‘attention check’ questions
- A significant proportion of ‘workers’ are bots
- Great with COVID - can’t do face-to-face studies
Formal Empirical Evaluation
When you want to see how a small number of competing solutions perform.
This requires strict, statistically testable hypotheses: better/worse or no evidence/difference.
Measure the participants’ response to manipulation of experimental conditions.
The results should be repeatable - the experimental methods must be defined rigorously, but are also time-consuming and expensive.
Ethics
Testing can be distressing.
As an experimenter you care about overall, not individual, results; but if a subject makes a mistake, it can make them feel embarrassed and inadequate, especially if other subjects can see what they are doing.
Treat subjects with respect; at the very least, ensure the experience is not negative.
Before the test:
- Don’t waste their time; use pilots to debug experiments/questionnaires and ensure everything is ready when they arrive
- Make them comfortable
- Emphasize that the system, not the user, is being tested
- Let them know they can stop at any time
- Privacy: let them know individual test results will be confidential
- Inform: explain what is being monitored and answer their questions
- Only use volunteers: informed consent form required
During the test:
- Make them comfortable
- Relaxed atmosphere
- Never indicate displeasure with the subject’s performance
- Avoid disruptions
- Stop the test if it becomes too unpleasant
- Privacy: do not allow management to observe the test
Controlled Experiments
Characteristics:
- Lucid and testable hypothesis
- Know exactly why you are conducting it and what data you are hoping to get out of it to expose the success/failure of the hypothesis
- Quantitative measurements
- Measure of confidence in results (statistics)
- Is A > B, A < B, or is there no discernible difference?
- Does the experiment successfully discriminate between outcomes?
- Replicability
- Control of variables and conditions
- Removal of experimenter bias; ensure it is objective
Research Questions
Congratulations! You have invented ABC. Now you need a research question/hypothesis:
- ‘Let’s do a user study of ABC because it’s required for my PhD’
- Is ABC any good?
- Does ABC beat the competition?
- Is ABC faster than the competition?
- Is ABC faster than XYZ after 10 minutes of use?
- Is ABC faster and less error prone than XYZ after 10 minutes of use?
Most research questions are comparative:
- Is it faster, more accurate, preferred etc. (in relation to the baseline(s))
- Is there a difference when compared to the baseline?
- How big is the difference (and is it a practical difference)?
- How likely is it that the results were due to chance
Null Hypothesis Significance Testing (NHST):
- Widely used set of techniques for dichotomous testing
- The hypothesis should be expressed as a negative (e.g. XYZ is not faster than the baseline, ABC)
- $H_0: \mu_1 = \mu_2$ (average performance of 1 and 2 are the same)
- $H_1: \mu_1 \neq \mu_2$
- Reject the null hypothesis, $H_0$, when $p < \alpha$
- Given that the null hypothesis is true, the probability of observing data as extreme as what we saw should be very low ($p < \alpha$)
- $\alpha$ is usually 0.05
- Failure to reject does not mean that ‘they are the same’
- It could be that they are the same or that the experiment was not sensitive enough (e.g. too few participants)
- Reject or fail to reject; never accept the null hypothesis
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s / \sqrt{n}}$
Where:
- $\bar{x}_1 - \bar{x}_2$ is the signal: the magnitude of difference between the means
- $s$ is the standard deviation (the noise)
- $n$ is the number of data points
We want to increase the signal-to-noise ratio, so we need to reduce the denominator:
- Reduce $s$ (the noise):
- Better training:
- There will be a large amount of variance in the first few trials (power law of learning)
- If you only care about performance of proficient users, the first few trials are just noise
- Hence, more training will get the participants past this region, reducing noise
- Outlier removal
- Log transformation
- Increase $n$:
- Diminishing returns due to square root
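A quick R sketch of these levers, using the signal-to-noise form above (the numbers are illustrative):
# t grows as noise (s) shrinks or sample size (n) grows
t_ratio <- function(signal, s, n) signal / (s / sqrt(n))
t_ratio(signal = 0.5, s = 2, n = c(10, 40, 160))  # 4x the n -> only 2x the t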
Aside - the ‘file drawer’ effect:
- ‘Unsuccessful’ experiments - those that fail to reject the null hypothesis, are ‘uninteresting’ and tend to go unpublished
- Survivorship bias: 19 studies correctly failing to reject the null hypothesis go unpublished, while one study that (by chance) claims a significant effect gets published
- Do enough experiments, some will get lucky
- https://xkcd.com/882
- https://xkcd.com/1478
Internal vs external validity:
- External validity:
- Broad truth of the interface: if people used ABC, would the world be better?
- Findings are broad/real (e.g. is ABC any good?)
- Makes the world better
- Internal validity:
- Precise and replicable, but gets away from the fundamental truth we are trying to get to
- Findings are valid under specific circumstances that may not reflect real world usage by the general population
- e.g. valid for undergraduate psychology students at UC
Using multiple experiments, some with high internal validity and others with high external validity, can be used to overcome the shortcomings of both.
Be careful in generalizing conclusions:
- ABC was better than XYZ; ensure you identify the right cause for the improvement
- When generalizing, identify the human factor underlying the difference and rephrase the research question around that human factor
- e.g. list of bookmarks vs 3D, spatial layout where all the items were shown at once
- Can’t conclude that 3D is better than 2D; would need to compare against a 2D, spatial layout rather than a list
Point analysis versus depth/theory/model:
- Identify and include salient secondary factors: is the result generally true or only true under the tested conditions?
Experimental Terminology
Independent variables:
- Controlled conditions
- Manipulated independent of behavior
- May arise from participant classification
- e.g. male/female
- Discrete values: independent variable levels
- Called ‘Factors’ in ANOVA
Dependent variables:
- Measured variables
- Dependent on participant’s response to manipulation of IVs
Within vs. between subjects:
- How IVs are administered between/within subjects
- Within subjects: each participant tested on all levels
- Use this whenever you can
- Participants act as a control for their own variability (some people are just fast, some just slow)
- Can measure relative performance for each subject
- Fewer participants required
- But need to account for learning/fatigue effects
- Every participant must be tested on every single level; otherwise their data must be thrown out
- Between subjects: each participant tested on a single level
- Sometimes necessary if using participant classification (e.g. male/female)
- Unmoderated variability between participants
- Don’t mix within and between subject treatments within a single factor
Counterbalancing:
- When using within-subjects, need to control order of exposure to control for learning/fatigue effects
- Participants divided into groups; different order for each group
- Group becomes a between subjects factor
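A sketch of how the groups might be formed in R, assuming three conditions and twelve participants (names and sizes hypothetical):
conditions <- c('A', 'B', 'C')
orders <- list(c('A', 'B', 'C'),  # a Latin square: each condition appears
               c('B', 'C', 'A'),  # in each position exactly once
               c('C', 'A', 'B'))
participants <- 1:12
group <- ((participants - 1) %% length(orders)) + 1  # round-robin assignment
orders[[group[1]]]  # presentation order for participant 1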
              Tiny + noise
Population ----------------> Sample
    ^                           |
    |  Inference                |
    |  about the                |
    |  population               |
    |                           v
    +-------- Statistics <------+
Data Analysis
T-Test
Determines whether two samples are likely to be from different populations.
Paired T-Test (within subjects): each participant is tested under both conditions.
Unpaired T-Test (between subjects): independent samples; each participant is only tested under one condition.
data <- read.table('filename', header=TRUE)
# paired=TRUE for within-subjects designs, paired=FALSE for between-subjects
t.test(data$conditionA, data$conditionB, paired=TRUE)
# If paired=TRUE, values on each row must belong to the same participant
# t-ratio: signal to noise. The bigger (the absolute value), the better
# p-value: can reject null hypothesis if p is less than $\alpha = 0.05$
Lots of additional information available through pairing, dramatically increasing sensitivity: t-ratio will usually be much larger and p-value smaller.
Correlation: Relating Datasets
Determining the strength of the relationship between variables (e.g. is typing and pointing speed correlated?).
Many different models available (e.g. linear, power, exponential), but always look at the graph to see if the model fits.
Common models:
- Pearson’s $r$ (for linear correlation)
- Correlation coefficient between -1 and 1
- Cohen’s rule of thumb:
- 0.1 - 0.3 is ‘small’
- 0.3 - 0.5 is ‘medium’
- 0.5 - 1.0 is ‘large’
- Spearman’s $\rho$ (for ranked data)
Remember that correlation does not mean causation.
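A minimal R sketch for the typing/pointing example above (column names hypothetical):
# One row per participant: typing speed and pointing speed
cor.test(data$typingSpeed, data$pointingSpeed, method='pearson')
cor.test(data$typingSpeed, data$pointingSpeed, method='spearman')  # ranked data
plot(data$typingSpeed, data$pointingSpeed)  # always look at the graph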
Regression: Relating Datasets
Predicting one value from another.
Line of best fit:
- Linear: $y = mx + c$
- $R^2$: coefficient of determination
- (same as Pearson’s $r$, but upper case for some reason)
- Between 0 and 1
- Proportion of variability explained by the model
- A value of 0.8 or larger is good for human performance
- Fitts’ law experiments usually give values around 0.95
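A minimal R sketch of fitting a line of best fit and reading off $R^2$ (column names hypothetical):
model <- lm(time ~ difficulty, data=data)  # linear model: y = mx + c
summary(model)$r.squared                   # proportion of variability explained
plot(data$difficulty, data$time)           # check that the model actually fits
abline(model)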
Analysis of Variance (ANOVA)
T-tests allow us to compare between two samples with different values for an independent variable. But what about if the independent variable (factor) can take on more than two values?
We could simply exhaustively compare all pairs, but if the IV can take on $k$ levels there are $k(k-1)/2$ pairwise tests, and each additional test at $\alpha = 0.05$ increases the chance of at least one false positive.
ANOVA supports factors with more than two levels and handles multiple factors, while reducing the risk of incorrectly rejecting the null hypothesis by asking whether all conditions are from the same population.
If there is only one factor (independent variable), it is called one way ANOVA. Factors can be either within or between subjects (although you cannot do both within a factor).
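A sketch of a one-way, within-subjects ANOVA in R (long-format data assumed, with hypothetical columns participant, technique and time):
d$participant <- factor(d$participant)
d$technique <- factor(d$technique)
model <- aov(time ~ technique + Error(participant/technique), data=d)
summary(model)  # reports the F-ratio and p-value for the technique factor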
COSC368 Exam Notes
Pillars of usability:
- Learnability: users rapidly attain some level of performance
- Efficiency: users can get a lot of work done per unit time
- Often, more learnable = lower efficiency ceiling
- Subjective satisfaction: users enjoy using the software
HCI should aim for simplicity: the UI’s complexity should match, not amplify, the complexity of the domain.
Don Norman’s Model of Interaction:
                 constructs
Designer/    --------------->  System/system image
designer model                   |          ^
                       Provides  |          |  Provides input based on
                       feedback/ |          |  their prediction of how
                       output    |          |  to achieve their goal
                                 v          |
                                User/
                              user model
Designer model:
- Conception of how the interface works
- May be fuzzy and not fully defined
- May be compromised in the actual system
User model:
- User’s conception of how the system works
- Initially based on previous experiences with similar systems
- Grows with use and feedback from the system
System Image:
- How the system appears to be used, from the user’s perspective
- The system itself is the actual hardware and software
Execute-Evaluate Cycle:
- Execute:
- User has goal and forms an intention to complete the goal
- Intention translated to multiple actions in the language of the user interface
- The user executes the actions
- Gulf of Execution: problems executing intention/action
- Evaluate:
- User perceives response from the system
- User interprets the response
- User evaluates the response with respect to their goal and expectations
- Gulf of Evaluation: problems assessing the system’s state, determining its effect
UISO (User, Input, System, Output):
- Task is a low-level task (e.g. save file as PDF)
- Articulation: user task language -> system input language
- Performance: system acting on the user input
- Presentation: system updates its state (and visible state)
- Observation: user views and interprets the new visible state
Mappings:
- Affordances:
- How it looks and how it works are similar
- e.g. a door handle affords pulling, a plate affords pushing
- Over/Under-determined dialogues:
- Under-determined: gulf of execution (e.g. CLI)
- Over-determined: forced through lengthy, unnatural or unnecessary steps (e.g. wizards)
- Direct Manipulation
- Rapid, incremental, reversible actions
- Encourage exploration
- Syntactic correctness: disable illegal actions
- Fast to learn but not always the most efficient
- Requires more screen space and system resources
Human Input:
- Eyes:
- Sensitive to movement
- Fixations: when the eye is stationary
- Saccades: rapid eye movements; blind
- Smooth-pursuit: tracking moving objects
- Reading speed reduced by all caps
- Auditory:
- 20 Hz to 15-20 kHz
- Filtering (e.g. cocktail party effect)
- Haptics:
- Proprioception: sense of limb location (mostly unconscious)
- Kinaesthesia: sense of limb movement
- Tactition: skin sensations
Human output:
- Response time: ~200 ms for visual, ~150 ms for auditory, ~700 ms for haptics
- Isotonic: input through movement (e.g. moving a mouse)
- Isometric: input through force (e.g. keyboard)
Fitts’ Law:
- $A$ is amplitude/distance of movement
- $W$ is width of the target
- Index of Difficulty: $ID = \log_2(A/W + 1)$
- Movement Time: $MT = a + b \cdot ID$
- $1/b$ is called throughput or bandwidth
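A sketch of fitting Fitts’ law in R (hypothetical data frame fitts with columns A, W and MT):
fitts$ID <- log2(fitts$A / fitts$W + 1)  # index of difficulty per trial
model <- lm(MT ~ ID, data=fitts)         # MT = a + b*ID
coef(model)                              # intercept a and slope b
1 / coef(model)['ID']                    # throughput = 1/b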
Steering Law:
- Steering a mouse cursor along a path/tunnel of width $W$ and length $A$
- $MT = a + b \cdot \frac{A}{W}$
Hick/Hyman Law of Decision Time:
- Visual search time is usually linear in the number of items $n$; that is, $T = a + b \cdot n$
- Hick/Hyman models reaction time when optimally prepared (i.e. expert with a spatially stable UI)
- $T = a + b \cdot H$, where $H$ is the information content of the decision (in bits)
- For $n$ equally probable items, $H = \log_2(n + 1)$
- To pick item $i$ with probability $p_i$: $h_i = \log_2(1/p_i + 1)$
- Average time: $H = \sum_i p_i \cdot \log_2(1/p_i + 1)$
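A quick R sketch of the average decision time computation (probabilities made up):
p <- c(0.5, 0.25, 0.125, 0.125)  # made-up selection probabilities (sum to 1)
sum(p * log2(1/p + 1))           # average information per decision (bits)
log2(length(p) + 1)              # compare: same number of equally probable items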
Power Law of Practice:
- $T_n = T_1 \cdot n^{-a}$: time taken on the $n$th trial of a task
- $a$ is the learning curve
- Applies to both simple and complex tasks
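A sketch in R of estimating the learning curve by fitting a line in log-log space (data synthesized for illustration):
# Since Tn = T1 * n^(-a), log(Tn) is linear in log(n)
practice <- data.frame(trial = 1:50)
practice$time <- 10 * practice$trial^(-0.4) * exp(rnorm(50, sd=0.05))
fit <- lm(log(time) ~ log(trial), data=practice)
-coef(fit)['log(trial)']  # recovers the learning-curve exponent a (~0.4)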
Novice to Expert:
- Stagnation at some point:
- Satisficing: good enough
- A performance dip when switching to a new mode
- Lack of mnemonics
- Lack of visibility
- Supporting transitions:
- Intra-modal: guidance to help user move towards the ceiling of performance within a mode
- Inter-modal: make user aware of existence of different, faster modes
- Vocab expansion: make user aware of most common commands
- Task strategy: intelligent UIs that figure out what the user is trying to do and suggests more efficient strategies to achieve it
Human Memory:
- Short-term:
- $7 \pm 2$ ‘chunks’
- Fast access: ~70 ms
- Rapid decay: ~200 ms
- Maintenance rehearsal: repeat chunk a few times to prevent decay
- Displacement/interference decay
- Long-term:
- Short-term -> long-term through elaborative rehearsal + extensive repetition
- Slow access: > 100 ms
- Good at recognition but not recall
- Spatial processing
Slips:
- Mistake is a conscious decision; bad user model
- Slip is automatic behavior:
- Capture error:
- Two action sequences, user captured into wrong (more frequent) sequence
- Description error:
- Multiple objects allowing same/similar action
- Right action, wrong object
- Data-driven error:
- Correct value kicked out of short-term memory by external data
- Incorrect value entered
- Loss-of-activation error:
- Forget what you are doing mid-flow
- Mode error:
- Right action, wrong state
- Make states highly visible and noticeable
- Reduce states where possible
- Motor slip:
- Problem between brain and input device
- Premature closure error:
- ‘Dangling’ UI action after user’s perceived goal completion
Human phenomena:
- Homeostasis; equilibrium
- Make a task easier; people will attempt harder tasks with the system
- Satisficing
- Making do; why improve?
- e.g. hunt-and-peck typing, not bothering to learn keyboard shortcuts
- Hawthorne effect:
- People like being involved in experiments; behavior here is not reflective of behavior in the real world
- Peak-end effects
- Most intense or terminating moments of an experience have an excessive influence over people’s memories of the experience
- Negativity bias:
- Bad is stronger than good
- Communication convergence
- Similarity in pace, gestures, phrases etc. enhances communication
Top-level design process:
- Articulate
- Who are the users and what are the key tasks?
- Task-centered, participatory and/or user-centered design
- Generate user and task descriptions, then evaluate
- Brainstorm
- User involvement, representations/metaphors, the psychology of everyday things
- Low-fidelity sketches:
- Focus on high-level concepts
- Fast to develop and change; little resistance to change
- Delays commitment
- Sequential sketches: shows state transitions and actions that trigger state change
- Zipf’s law (or Pareto principle): focus on 20% of most frequent interactions; they account for 80% of usage
- The $n$th most frequent item appears with probability $P(n) = c/n$, where $c$ is a normalizing constant (~0.1 for English words)
- Medium-fidelity, paper prototypes:
- Fine-tune interface, screen design
- Do heuristic evaluation and redesign
- Walk-through evaluation:
- User tasked to do some task
- Is the story believable?
- If so, ask how they will do it
- Further evaluation:
- Participatory interaction
- Task scenario walk-through: to do X, A will press this button then…
- Refinement
- Graphical screen design, interface guidelines and style guides
- Generate high-fidelity, testable prototypes, then:
- Usability testing
- Heuristic evaluation
- Completion
- Generate alpha/beta systems or a complete specification
- Then do field testing
Iterative design: don’t settle on a single idea and improve only on that; this leads to premature commitment, local maxima, and tunnel vision
Elaboration/reduction: first explore the full design space (elaboration), then refine the design(s) (reduction)
Task-Centered System Design (TCSD):
- User identification:
- Talk to users
- Difficult if the system/task is new
- Learn about the task chain; what are the inputs, where do the outputs go?
- What purpose does the task achieve?
- Task identification
- What the user wants to do
- Not a description of how they (will) do it
- Identify users
- Name individuals
- Give each task a unique ID
- Validate tasks: talk to relevant users to help spot issues
- Determine what tasks and users will be covered; rank based on importance and task frequency
- Design:
- Iterative design, walk-through evaluations
User-Centered System Design (UCSD):
- Users know their own needs better than anyone else
- Involve representative end-users as full members of the design process
- Great at:
- Responding to suggested designs
- Bringing in invaluable knowledge of work context
- Leading to greater user buy-in
- Not so great at coming up with new designs
- The user is not always right - they may not know what they want
Nielson’s Ten Heuristics:
- Simple and natural dialogue
- Make it as simple as possible but no simpler
- Presentation + navigation should be natural and consistent
- Design: organize, economize, communicate
- Speak the user’s language
- Affordances (it is used the way it looks like it should be used)
- Mappings
- Metaphors
- Base terminology on user’s task language, not implementation
- Minimize memory load
- Recall slow; use recognition where possible
- Show input formats, provide defaults (e.g. date fields - what format is it supposed to be entered in, can a sensible default be provided?)
- Support reuse/re-visitation (e.g. show a few of the most commonly or recently used)
- Support unit exchange
- Support generalization: universal commands, modifiers
- Consistency
- In graphic design
- In command structure (e.g. pick command then select object or select object and run command)
- Internally
- Externally (within the platform)
- Beyond computing
- Feedback
- Continuous feedback about the system state and system’s interpretation of user input
- Feedback should be:
- Specific
- Consider feed-forward: show effect of action before it is committed
- Autocomplete
- Must be stable and predictable - muscle memory, not reading
- Consider persistence: how disruptive and enduring should the feedback be?
- Clearly-marked exits; don’t trap the user
- Cancel buttons, universal undo, interrupt long-running operations etc.
- More recent actions should override older ones
- Quit
- ‘Do you want to save changes to ${filename}?’: ‘Don’t Save’, ‘Cancel’, ‘Save’; should be specific
- Shortcuts
- Keyboard accelerators
- Command completion, type-ahead
- Function keys
- Double clicking
- Gestures
- History
- Customizable toolbars
- Prevent errors, avoid modes
- Syntactic correctness - disable items that aren’t valid
- Feedback reduces chance of slips
- Easy correction - universal undo
- Commensurate effort: states difficult to get to should be difficult to irreversibly leave
- Forcing functions: prevent behavior until problem corrected
- Interlocks: force right order of operations (e.g. remove card before ATM dispenses cash)
- Lock-ins: force user to remain in space (e.g. would you like to save changes dialog on close)
- Lock-outs: force the user out of a space or prevent an event from occurring
- Don’t just ignore illegal actions - otherwise the user must infer what is wrong
- Mode errors:
- Have as few modes as possible
- Make current mode easily apparent
- Spring-loaded modes: ongoing action required to stay in mode
- Deal with errors positively and helpfully
- Clear language, not codes
- Precise
- Constructive - offer solutions
- Help and documentation
- Documentation is not permission to design a crappy UI
- Write the manual before the system
- Reminders: tooltips
- Wizards: put the system, not the user, in control; don’t overuse
- Tutorials
Heuristic evaluation:
- Inspectors: developers, usability experts, domain experts, users, designers
- Warning for designers: vested interest in their own designs
- With a specific scenario in mind:
- Inspect UI components, workflow, state transition
- Compare against heuristics
- Two-pass approach: focus on specific UI elements on first pass, then integration and state transitions
- Result synthesis: inspectors come together and assess overlap
Gestalt Laws of Perceptual Organization:
- Proximity
- Similarity (color, shape, etc.)
- Continuity: the brain sees dots etc. as forming a larger shape
- Symmetry: objects seen as being ‘closed’ when placed in symmetric boundaries
- Closure: brain automatically ‘closes’ objects
PARC Principles:
- Proximity
- Group related elements, separate unrelated
- Use whitespace over borders
- Alignment
- Grids, tables etc. visually connect elements
- Mis-align unconnected elements
- Repetition
- For consistency
- Contrast
- Different things should look different
Misc:
- Visual flow should follow logical flow
- Controls: minimize, include only what is necessary
- Smooth continuity (e.g. smooth curves vs right-angled lines) is less ‘neat’ but easier to parse.
UI Evaluation:
- Designers uniquely unqualified to assess usability; can’t fathom what a typical user’s model is like
- ‘Think Aloud’ evaluation:
- Subjects prompted to verbalize thoughts while using a system
- What they are trying to do
- What the action did
- How they interpret feedback from the system
- Cooperative evaluation:
- Feels less like the subject is being studied; the subject and evaluator study the system together
- Interview:
- Prepare: have a central set of questions for consistency between interviews
- Be willing to explore interesting leads
- Good for probing particular issues
- Prone to post-hoc rationalization