Andy Cockburn: Room 313, working Thursdays and Fridays
Tutors: team368@cosc.canterbury.ac.nz
Course breakdown:
- Labs: 9%, 1% per lab
- Usability analysis and storyboard
- 25%, 5pm 22 September
- Design specification and rationale
- 15%, 5pm 20 October
- Exam: 51%
Goals:
- Understand key human factors influencing HCI
- Know and apply guidelines, models, methods that aid in interface design
HCI: discipline concerned with the design, evaluation and implementation of interactive computing systems for human use.
There should be a cycle of designing, implementing, and evaluating.
Usability
Three key pillars:
- Learnability: rapid attainment of some level of performance
- Can be modelled as the inverse of time spent on the interface
- Efficiency: can get a lot of work done per unit time
- Subjective satisfaction: how much you enjoy using it
Two minor pillars:
- Errors: should be few errors in an efficient interface.
- Memorability: should be memorable if the interface is learnable.
Trade-offs: efficiency and learnability (inverse of time spent) are often at odds with each other. The performance/efficiency ceiling is often lower for more learnable interfaces.
Preliminary Factors
- Safety considerations
- Need for throughput (efficiency)
- Frequency of use
- Physical space, lighting, noise, pollution
- Social context
- Cognitive factors: age, fatigue, stress, focus
Usability is like oxygen: you only notice it when it’s absent. See: doors with handles that you need to push.
Managing Complexity
The job of HCI is to manage complexity: designing an object to be simple and clear; the relentless pursuit of simplicity.
Interface
complexity
^
|          _/ Poorly designed UIs:
|        _/   complexity amplified
|       /
|      /                              ____
|     /                          ____/
|    /                      ____/  Well designed UIs
|   /                  ____/
|  /              ____/
| /          ____/
|/      ____/
|  ____/
+--------------------------------------------------> Domain
   Door    Word       CAD      Nuclear               complexity
           processor           power plant
Models
Models are simplifications of reality that (should) help with the understanding of a complex artifact.
Don Norman’s Model of Interaction
From ‘The Psychology/Design of Everyday Things’, 1988.
This helps understand the designer’s role in creating a system that is used by a thinking person.
                 constructs
Designer/      -------------> System/
designer model                system image
                                |      ^
                      Provides  |      | Provides input based on
                      feedback/ |      | their prediction of how
                      output    |      | to achieve their goal
                                v      |
                               User/
                               user model
The designer constructs a system they have not fully defined: the designer's model is their conception of the interaction, and it is often incomplete, fuzzy, or compromised in the actual implementation.
System image: how the system appears to be used (by the user); this does not necessarily reflect the truth of the system.
The user's model begins very weak, coming from familiarity with the real world or other similar systems. They use this experience to try to interact with the system, building their model based on feedback from the system.
Ideally, there should be conformance between the designer and user’s model.
There is no direct communication between the designer and user; the designer can only communicate with the user through the system.
Execute-Evaluate Cycle
Execute:
- Goal -> Intention -> Actions -> Execution
- The user has a goal and knows the outcome they want
- They form an intention to complete the goal with the system and translate this to the language of the user interface; one or more actions
- They then execute the actions
- ‘Gulf of Execution’: problems executing intentions/actions
Evaluate:
- Perceive -> Interpret -> Evaluate
- Perceive the response/feedback by the system to their actions
- Evaluate; determine the effect of their action. Did it meet their goal?
- ‘Gulf of Evaluation’: problems assessing state, determining effect etc.
UISO Interaction Framework
Abowd and Beale, 1991.
User, System, Input and Output.
Emphasizes translation during interaction:
- Articulation: user translates task from task language to input language
- Performance: system acts on the user input (callbacks etc.); translates input language into core language and modifies the system state
- Presentation: show the new state to the user; translate the core (system) state into output language
- Observation: user interprets the new system output
User has some low level task (e.g. saving a file); they need to translate their intention to an input language; this is one of the most difficult parts of user interface design.
                ---> Output ---
   Presentation /              \ Observation
               /                \
              /                  v
          System                User
          (Core)                (Task)
              ^                  /
  Performance \                 / Articulation
               \--- Input <----/
Mappings
Good mappings, i.e. natural relationships between controls and their effects, increase usability.
Affordances
Objects afford particular actions to users; there is a strong correlation between how it looks like it should be used and how it is used:
- Door handles afford pulling
- Dials afford turning
- Buttons afford pushing
- Bus shelters:
- Glass affords smashing
- Plywood affords graffiti
Poor affordances encourage incorrect actions, but strong affordances may stifle efficiency.
Over-/Under-determined Dialogues
- Well-determined: natural translation from task to input language
- Under-determined: user knows what they want to do but not how to do it
- e.g. command line
- Over-determined: user forced through unnecessary or unnatural steps
- e.g. ‘Click OK to proceed’, lengthy wizards
- User turns into a robot; no freedom in what to do
Beginner user interface designers tend to think about the interface in terms of system requirements: the system needs x, y, z information, so let's ask the user about these things up-front. These over-determined dialogues lead to horrible design.
Direct Manipulation
- Visibility of objects
- Direct, rapid, incremental and reversible actions:
- Reversibility allows users no-risk exploration of the user interface
- Rapid feedback
- Syntactic correctness
- Disable illegal actions (e.g. greying buttons out when action not available)
- Tooltips can help with the problem of not knowing why the action is not available
- Replace language with action
- Language needs to be learned and remembered (e.g. command lines)
- Actions; see and point
Advantages:
- Easy to learn
- Low memory requirements
- Easy to undo
- Immediate feedback to user actions
- Users can use spatial cues
Disadvantages:
- Consumes more screen real estate
- High graphical system requirements
- May trap users in ‘beginner mode’
The Human
- Input: vision, hearing, haptics
- Output: pointing, steering, speech, typing etc.
- Processing: visual search (slow), decision times (fast), learning
- Memory
- Phenomena and collaboration
- Error (predictably irrational behavior)
Fun Example
A trivial task that many humans will get wrong.
Count the number of occurrences of the letter ‘f’ given a set of words:
Finished files are the results of years of scientific study combined with the experience of many years
The three phonetic Fs, in 'finished', 'files', and 'scientific', are easily found.
But the three non-phonetic Fs in 'of' are often missed.
Click
Even a blank graphic has affordances on where people usually click: on or near the center, or along the diagonals or corners.
Human Factors
Psychological and physiological abilities have implications for design:
- Perception: how we perceive things
- Cognitive: how we process information
- Motor: how we perform actions
- Social: how we interact with others
The Human Information Processor
Card, Moran, Newell 1983.
            Eyes/Ears
                │
                ▼
     ┌── Perceptual Processor ──┐
     │                          │
     ▼                          ▼
Visual Image              Auditory Image
  Storage                    Storage
     │                          │
     └────────────┬─────────────┘
                  ▼
           Working Memory ◄──────────┐
             │         ▲             ▼
             │         └─────► Cognitive ◄────► Long-Term
             ▼                 Processor         Memory
       Motor Processor
             │
             ▼
      Movement Response
Human Input
Vision
Cells:
- Rods: low light, monochrome, 100 million rods across the retina
- Cones: color, 6 million cones, concentrated in the fovea
- S/M/L cones for short/medium/long wavelengths: approx. blue/green/reddish-yellow sensitivity
Areas:
- Retina: ~120 degree range, sensitive to movement
- ~210 degrees with both eyes
- Notifications popping up in corners etc. will distract user
- Fovea: detailed vision, area of ~2 degrees
1 degree = 60 arcminutes, 1 arcminute = 60 arcseconds
Visual Acuity:
- Point acuity: minimum angle of separation at which two dots can be distinguished: ~1 arcminute
- Grating acuity: minimum angle between alternating bars at which they remain distinct: ~1-2 arcminutes
- Letter acuity: ~5 arcminutes
- Vernier acuity: minimum offset, normal to their axis, at which two nearly-collinear line segments (e.g. ---___) are seen as misaligned rather than as one continuous line: ~10 arcseconds
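To make these thresholds concrete, here is a small sketch converting visual angles to physical sizes; the 600 mm viewing distance is an assumed value for illustration:

```python
import math

def angle_to_size_mm(arcminutes: float, distance_mm: float) -> float:
    """Physical size subtending a given visual angle at a viewing distance."""
    return 2 * distance_mm * math.tan(math.radians(arcminutes / 60) / 2)

# Assumed viewing distance of 600 mm (a typical desktop monitor distance):
for label, arcmin in [("point acuity", 1.0), ("letter acuity", 5.0),
                      ("vernier acuity", 10 / 60)]:
    print(f"{label}: {angle_to_size_mm(arcmin, 600):.3f} mm")
# point ≈ 0.17 mm, letter ≈ 0.87 mm, vernier ≈ 0.03 mm
```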
Eye movement:
- Fixations: visual processing occurs only when the eye is stationary
- Saccades: rapid eye movements; about 900 degrees per second
- Blind while saccades are in progress
- Eye movement as input; difficult as people don’t have much control over where they are looking (e.g. accidentally looking at ‘delete all my files’ button)
- Smooth-pursuit: ability to track moving objects (up to 100 degrees per second)
- Cannot be induced voluntarily - can’t imagine a moving dot and track it
- Relevant in scrolling
Size/depth cues:
- Familiarity
- Linear perspective; straight lines getting closer together
- Horizontal distance
- Size constancy: if object gets bigger/smaller, it’s probably the object moving closer/further away, not the object changing size
- Texture gradient: texture getting bigger/smaller
- Occlusion: occluded items further away
- Depth of focus: blurrier the further you go away
- Aerial perspective: blurrier and bluer from atmospheric haze
- Shadows/shading
- Stereoscopy (best within 1m, ineffective beyond 10m)
Muller-Lyer illusion:
<--->
>---<
Both lines subtend the same visual angle, but the bottom one resembles a receding (inside) corner and so looks further away; the brain therefore perceives it as bigger.
3D, depth-based UIs:
- The world is 3D so all interaction should be 3D, right?
- Occlusion, far-away things being smaller, navigation/orientation etc. impede usability unless the domain is 3D (e.g. gaming, 3D modelling)
- Zooming is useful though
- Overview of the data first
- Zoom in to progressively add detail about what they are interested in and filter information they are not
- Allows UI to provide details on demand.
Color:
- 8% of males and 0.4% of females have some form of color deficiency
- Types:
- Protanomaly: red
- Deuteranomaly: green
- Tritanomaly: blue
- Least sensitive to blue
Reading:
- Saccades, fixations (94% of the time), regression
- Approx. 250 words/minute initially
- READING SPEED REDUCED BY ALL CAPS
Auditory
- Used dramatically less than vision
- About 20 Hz to 15-20 kHz
- Can adjust many parameters; amplitude, timbre, direction
- Filtering capabilities (e.g. cocktail party effect)
- Problems with signal interference and noise
Haptics
- Proprioception: sense of limb location
- Kinaesthesia: sense of limb movement, often more of a conscious decision
- Tactition: skin sensations
Haptic feedback: any feedback providing experience of touch
Human Output
Motor response time depends on stimuli:
- Visual: ~200 ms
- Audio: ~150 ms
- Haptics: ~700 ms
- Faster for combined signals
Muscle actions:
- Isotonic: little resistance to movement (e.g. mouse)
- Isometric: force but little motion (e.g. keyboard, ThinkPad TrackPoint™)
- Better for velocity/rate control (e.g. self-centering joysticks)
Fitts’ Law
A very reliable model of rapid, aimed human movement.
- Predictive of tasks, descriptive of devices
- Derived from Shannon's theory of the capacity of information channels
- Signal: amplitude A of the movement (the distance to the middle of the target)
- Noise: width W of the target
Index of difficulty (ID) measures the difficulty of a rapid aimed movement, in 'bits':
ID = log2(A/W + 1)
Fitts' law: movement time (MT) is linear in ID:
MT = a + b·ID
- a is typically 200-500 ms
- b is typically 100-300 ms/bit
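As a worked example, a small sketch of the Shannon formulation; the constants a and b are illustrative values picked from the typical ranges above:

```python
import math

def fitts_mt(a_ms: float, b_ms_per_bit: float, amplitude: float, width: float) -> float:
    """Predicted movement time (ms) under Fitts' law: MT = a + b * log2(A/W + 1)."""
    index_of_difficulty = math.log2(amplitude / width + 1)  # ID, in bits
    return a_ms + b_ms_per_bit * index_of_difficulty

# A 512-pixel movement to a 16-pixel target: ID = log2(33) ≈ 5.04 bits
print(fitts_mt(a_ms=300, b_ms_per_bit=200, amplitude=512, width=16))  # ≈ 1309 ms
```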
Typical velocity profile, validated for many types of aimed pointing:
Speed
^
| Open-loop,
|ballistic impulse
| /\
| / \ slow, closed-loop
| / \ corrections
| / \ /\
|/ \/ \/\___
+------------------------>
Time
Input Devices: Pointing & Scrolling
Human output is received as system input. Some form of translation hardware is needed to achieve this; input devices have many properties:
- Direct vs indirect
- Touchscreens are direct: perfect one-to-one correspondence between input and display space
- Trackpads are indirect: finger movement on the pad does not directly map to cursor position
- Absolute vs relative
- Touchscreens and pen-tablets are absolute
- Trackpads (mostly) relative; finger location on the trackpad does not matter when moving the cursor
- Control (see the sketch after this list)
- Position (zero-order) e.g. absolute pointing, dragging the scrollbar
- Rate (first-order) e.g. holding down on mouse wheel and dragging up/down on Windows
- Acceleration (second-order)
- Note: having lots of modes, and hence complexity, may decrease the number of interactions required while making the task take longer due to the overhead of making decisions
- Isotonic: force with movement
- Isometric: force without movement e.g. 3D touch to control object size
- Control-display gain/transfer functions
- Magic sauce of iOS inertial scrolling, Mac trackpads, etc.
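Before moving on to transfer functions: a minimal sketch contrasting the first two orders of control listed above. The gain value and frame time are illustrative assumptions:

```python
GAIN = 10.0  # assumed display units per device unit

def zero_order(positions):
    """Position (zero-order) control: device position maps directly to cursor position."""
    return [p * GAIN for p in positions]

def first_order(deflections, dt=0.016):
    """Rate (first-order) control: device deflection sets cursor velocity, so the
    cursor keeps moving for as long as the device is held off-centre."""
    x, path = 0.0, []
    for d in deflections:
        x += d * GAIN * dt  # integrate velocity into position each frame
        path.append(x)
    return path

print(zero_order([0.0, 1.0, 2.0]))   # cursor jumps with the device
print(first_order([1.0, 1.0, 1.0]))  # cursor drifts while the stick is deflected
```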
The control-display transfer function:
- The input device (e.g. capacitive trackpad) sends device units
- The gain function scales the input in accordance with the user or environment settings
- Persistence is used to continue output even when there is no ongoing input, adding features such as inertia
Transfer Function
+-------------------------------------------------------------------------+
| e.g. scroll inertia |
| device --------------- display -------- --------------- |
Device -+--------> | Translation | ---------> | Gain | ------> | Persistence | --+---> Output
Input | units --------------- units -------- --------------- |
| ^ ^ ^ |
| ------- Environment/User Settings ------------ |
+-------------------------------------------------------------------------+
Scrolling transfer function for iOS:
- When the finger is on the screen there is direct mapping
- After the finger leaves, the speed slowly decays
- After four scroll gestures in quick succession, the scroll rate increases almost vertically after the finger leaves
- For all subsequent scroll gestures, the maximum velocity (immediately after the finger leaves) increases until a maximum scroll velocity is reached
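A toy sketch of such a transfer function, with gain plus exponential inertia decay; the structure and constants are illustrative assumptions, not Apple's actual function:

```python
def scroll_pipeline(device_deltas, gain=2.5, decay=0.95, min_velocity=0.01):
    """Translate device units to display units via gain, then persist output
    as decaying inertia once the finger lifts. Yields per-frame scroll offsets."""
    velocity = 0.0
    for delta in device_deltas:          # finger down: direct, gained mapping
        velocity = delta * gain
        yield velocity
    while abs(velocity) > min_velocity:  # finger up: velocity decays each frame
        velocity *= decay
        yield velocity

offsets = list(scroll_pipeline([4.0, 6.0, 5.0]))
print([round(v, 2) for v in offsets[:8]])  # direct phase, then coasting
```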
Input Devices: Text Input
- Alternative keyboards (e.g. Dvorak)
- Chord keys (e.g. stenographers)
- Constrained keyboards (e.g. T9 keyboards on old mobile phones)
- Reactive/predictive systems (autocomplete)
- Gestural input (e.g. swipe keyboards)
- Handwriting recognition
Input expressibility: how well can you discriminate inputs? e.g. Google Glass had a tiny capacitive surface; doing text entry on that posed challenges.
Steering Law
Model of continuously controlled 'steering': moving a pointer along a given path, called a 'tunnel':
MT = a + b·(A/W)
for a straight tunnel of length A and constant width W. More generally, MT = a + b·∫ ds/W(s), integrating the inverse of the tunnel width along the path.
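A small sketch of the straight-tunnel case; the constants a and b here are illustrative, not from the notes:

```python
def steering_mt(a_ms: float, b_ms: float, path_length: float, width: float) -> float:
    """Steering law for a straight tunnel of constant width: MT = a + b * (A/W)."""
    return a_ms + b_ms * path_length / width

# Halving the tunnel width doubles the difficulty term:
print(steering_mt(200, 100, path_length=300, width=30))  # 1200 ms
print(steering_mt(200, 100, path_length=300, width=15))  # 2200 ms
```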
This is important in cascading context menus, where hovering over an item opens a submenu to the left or right. Done naïvely, while travelling to the newly-opened submenu, the cursor must always stay over the item or the submenu will disappear. macOS appears to take the angle of travel into account when deciding whether to hide the submenu.
Human Processing
Visual Search Time
If a person has to pick out a particular item out of n candidates, search time grows approximately linearly with n: T = a + b·n.
This is slow, so the UI should aim to reduce the amount of searching the user must do. To achieve this, ensure there is spatial stability; items appear in the same place every time.
Hick/Hyman Law of Decision Time
Choice reaction time when optimally prepared:
T = a + b·H
Where H is the information content (entropy) of the decision, in bits.
For n equally probable items: H = log2(n + 1)
For items with probability p_i: H = Σ p_i · log2(1/p_i + 1)
Implications:
- Decisions are fast: T grows logarithmically with n, whereas visual search grows linearly
- Applies to name retrieval (commands) and location retrieval
- In GUIs, replace visual search (linear in n) with decision (logarithmic in n) through spatial stability
- Don't order commands by most recently/commonly used - this destroys spatial stability and forces the user to visually search
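To see why spatial stability pays off, a quick comparison under assumed constants; the per-item and per-bit costs below are illustrative, not measured values:

```python
import math

def visual_search_ms(n: int, a: float = 0, b: float = 240) -> float:
    """Linear visual search over n items: T = a + b*n."""
    return a + b * n

def hick_decision_ms(n: int, a: float = 200, b: float = 150) -> float:
    """Hick/Hyman decision among n equally likely items: T = a + b*log2(n + 1)."""
    return a + b * math.log2(n + 1)

for n in (4, 16, 64):
    print(n, visual_search_ms(n), round(hick_decision_ms(n)))
# Search time grows linearly; decision time only logarithmically.
```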
Spatially Consistent User Interfaces
Pie menu: items are sectors making up a circle centered around the cursor (possibly with multiple layers of items through nesting):
- Minimum of one pixel of cursor movement required for fast selection
- Allows for easy advancement from visual search to muscle memory
Ribbon: spatial stability within each tab, but requires visual search and mechanical interactions to find a new item. ‘Solution’: show all tabs at once.
Search: macOS menu bar search does not run the searched command; it only shows you where the item is located. Menu items also show their keyboard shortcuts.
Torus pointing: wraps the cursor around the screen edges, giving multiple straight paths to an item. Giving users choice may help with Fitts' law, but increases decision time.
Power Law of Practice
Performance rapidly speeds up with practice:
T_n = T_1 · n^(-α)
Where:
- T_n is the time taken for trial n
- T_1 is the time taken on the first trial
- α is the learning rate
This applies both to simple and complex tasks.
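A tiny sketch of the curve; the first-trial time and learning rate are assumed values:

```python
def trial_time(t1: float, n: int, alpha: float) -> float:
    """Power law of practice: T_n = T_1 * n**(-alpha)."""
    return t1 * n ** -alpha

# With an assumed 10 s first trial and learning rate alpha = 0.4:
for n in (1, 2, 5, 10, 100):
    print(n, round(trial_time(10.0, n, 0.4), 2))  # 10.0, 7.58, 5.25, 3.98, 1.58
```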
Novice to Expert Transitions
People use the same tools for years/decades, but often continue to use inefficient strategies.
Shortcut vocabularies are small and are used infrequently. Factors:
- Satisficing; good enough
- Lack of mnemonics (for keyboard shortcuts)
- Lack of visibility
How do you support transitions to experts?
When switching between modes, there is a performance dip. Since people use software to do their jobs, not use software as their jobs, this causes a chasm that the user must take the time to cross.
^ Performance Modality
│ Switch
│ | xxxxx
│ xxxxxxxxx
│ | xxxxx
│ xxxx
│ | xxx
│ xx
│ | xx
│ Ultimate xxx
│ xxxxxxxxxxxx| xx ─┐
│ xxxxxx Performance x │ Performance
│ xxxx |x │ Dip
│ xx Extended x │
│ x Learnability |x ─┘
│ xx
│ x |
│x
│x Initial |
│Performance
│ |
│ First Modality Second Modality
└──────────────────────────-────────────────────────────>
Time
Domains of Interface Performance Improvement
- Intra-modal improvement
- Make the user an expert within the mode the user is comfortable working in
- e.g. guidance techniques where you show items the user is likely to use
- Inter-modal improvement
- Make the user aware of faster ways of doing the task (e.g. File → Print to Ctrl+P)
- e.g. skillometers
- e.g. AutoCAD shows the text command being used in the background when using UI buttons
- Vocabulary extension
- e.g. track and show community command use to let users learn the most useful commands
- Task strategy:
- An intelligent UI that picks up on the task the user is trying to do and suggests more efficient sequences of commands to achieve it
Human Pattern of Behavior
Zipf's Law: given a corpus of text, a word's frequency is inversely proportional to its rank:
f(r) ∝ 1/r
Pareto Principle/80-20 Rule: 80% of usage is made up of only 20% of items.
The UI should attempt to surface these 20% of items.
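A quick sketch showing how a Zipf distribution concentrates usage in the top-ranked items; the exponent of 1 and the 100-item vocabulary are assumptions for illustration:

```python
def zipf_frequencies(n_items: int, exponent: float = 1.0) -> list[float]:
    """Normalised relative frequency of each rank under Zipf's law: f(r) ∝ 1/r**s."""
    weights = [1 / r ** exponent for r in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]

freqs = zipf_frequencies(100)
print(f"Top 20% of items account for {sum(freqs[:20]):.0%} of usage")  # ≈ 69%
```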
Human Memory
                maintenance
                rehearsal
                  ┌────┐
                  │    │
                  │    ▼       elaborative
Sensory memory:       Short-term    rehearsal     Long-term
iconic, echoic ─────► memory      ────────────►   memory
and haptic              │         ◄────────────
      │                 │           retrieval
      ▼                 ▼
   masking,        displacement or
   decay           interference, decay
Sensory memory: stimulation decays over a brief period of time; loud noises, bright lights, and pain persist for some time after the stimulus disappears.
Short-Term Memory
- Input from sensory or long term memory
- Capacity of 7 ± 2 ‘chunks’/abstractions
- Chunks aid storage and reconstruction
- Fast access: ~70 ms
- Rapid decay: ~200 ms
- Constant update and interference
- Maintenance rehearsal: e.g. repeating a number a few times in your mind
Long-Term memory
- Input through elaborative rehearsal and extensive repetition
- Elaborative rehearsal: restructuring information instead of just mindlessly repeating it
- Slow access: > 100 ms, sometimes days (tip of the tongue phenomenon)
- Decay?
- Good at recognition but bad at recall
- Supports spatial processing
Human Error
Mistakes
Errors of conscious decision-making: the user acts according to an incomplete or incorrect model.
Only detected with feedback.
Human Error: Slips
Errors of automatic and skilled behavior.
Capture error:
- Two action sequences with common starting point(s)
- Captured into the wrong (and usually more frequent) path
- Used to be common e.g. in dialogue boxes with generic button labels (‘Cancel’ and ‘Ok’)
Description error:
- More than one object allowing the same/similar action
- Execute the right action on the wrong object
- e.g. lighting panel with multiple switches
Data-driven error:
- External data interfering with short-term memory
- e.g. entering unrelated file name when saving a document
Loss-of-activation error:
- Goal is displaced/decayed from short-term memory before it is completed
- e.g. walking into room then forgetting why you entered
- Want to complete task that requires subtasks and sub-subtasks to be completed, overflowing short-term memory
Mode error:
- Right action in the wrong system state
- Modes are system partitions with:
- Different set of commands
- Different interpretation of the same commands/actions
- Different display methods
- Ensure modes are visible and noticeable
- Modal dialogues are an example of bad modes
Motor slip:
- Pointing/steering/keying error
Premature closure error:
- ‘Dangling’ UI actions required after perceived goal completion
- e.g. forgetting to save, or forgetting to attach files to emails
Human Phenomena
Homeostasis
People maintain equilibrium:
- If a system makes something easier, people will use it to do more difficult things
- If a system makes something safer, people will use it to do more dangerous things
Satisficing
People are satisfied with what they can do now and don’t bother to optimize:
- People that ‘hunt-and-peck’ instead of learning to touch type
- People that don’t bother to learn keyboard shortcuts for tasks they do frequently
Hawthorne Effect
The act of measuring changes results (the Heisenberg uncertainty principle of HCI).
People like being involved in experiments and change their behavior during experiments, complicating results.
Explaining Away Errors
The user is often the easiest party to blame, but they may have made the mistake because the interface is poorly designed.
Peak-End Effects
Peak effect: people's memories of experiences are influenced by the peak/most intense moments of an experience (e.g. combo attacks in games, casino games).
End effect: people's memories of experiences are predominantly influenced by the terminating moments (e.g. a good vacation ruined by a missed flight home, or a survey with many questions on the last page).
Negativity Bias
The magnitude of sensation from a loss is greater than from a gain of the same amount: bad is stronger than good.
e.g. a single coin toss where you win $110 on heads but lose $100 on tails still feels unattractive; autocorrect 'correcting' an already-correct word feels much worse than a good correction of a mis-spelt word feels good.
Communication Convergence
Similarity in pace, gestures, phrases, etc. enhances communication. Could interfaces that measure the user (e.g. long-press duration, mouse speed) and match them (e.g. animation speed, timeouts, speech rate) help?