01. Introduction to Human-Computer Interaction

Andy Cockburn: Room 313, working Thursdays and Fridays

Tutors: team368@cosc.canterbury.ac.nz

Course breakdown:

Goals:

HCI: discipline concerned with the design, evaluation and implementation of interactive computing systems for human use.

There should be a cycle of evaluation, design and implementation.

Usability

Three key pillars:

Two minor pillars:

Trade-offs: efficiency and learnability (inverse of time spent) are often at odds with each other. The performance/efficiency ceiling is often lower for more learnable interfaces.

Preliminary Factors

Usability is like oxygen: you only notice it when it’s absent. See: doors with handles that you need to push.

Managing Complexity

The job of HCI is to manage complexity: designing an object to be simple and clear; the relentless pursuit of simplicity.

Interface
Complexity
    ^
    |                                              ____
    |                                         ____/
    | Poorly designed                    ____/
    | UIs; complexity               ____/
    | amplified                ____/
    |                     ____/     Well designed UIs
    |                ____/
    |           ____/
    |      ____/
    | ____/
    |/
    +--------------------------------------------------> Domain
     Door      Word        CAD          Nuclear          Complexity
             Processor                power plant

Models

Models are simplifications of reality that (should) help with the understanding of a complex artifact.

Don Norman’s Model of Interaction

From ‘The Psychology/Design of Everyday Things’, 1988.

This helps understand the designer’s role in creating a system that is used by a thinking person.

               constructs
   Designer/ -------------> System/system image
designer model                ^
                    Provides  | Provides input based on
                    feedback/ | their prediction of how
                     output   |  to achieve their goal 
                              v
                             User/
                          user model

The designer tries to construct a system that they have not fully defined. The designer’s model is their conception of interaction; often incomplete, fuzzy or compromised in the actual implementation.

System image: how the system appears to be used (by the user); this does not necessarily reflect the truth of the system.

The user’s model begins very weak, coming from familiarity with the real world or other similar systems. They will use this experience to try to interact with the system, building their model based on feedback from the system.

Ideally, there should be conformance between the designer’s and the user’s models.

There is no direct communication between the designer and user; the designer can only communicate with the user through the system.

Execute-Evaluate Cycle

Execute:

Evaluate:

UISO Interaction Framework

Abowd and Beale, 1991.

User, System, Input and Output.

Emphasizes translation during interaction:

The user has some low-level task (e.g. saving a file); they need to translate their intention into an input language. This is one of the most difficult parts of user interface design.

              --> Output ---
Presentation /              \ Observation
            /                \
           /                  v
    System                      User
    (Core)                      (Task)
           ^                  /
Performance \                / Articulation
             \--- Input <---/

Mappings

Good mappings (the relationships between controls and their effects) increase usability.

Affordances

Objects afford particular actions to users; there is a strong correlation between how it looks like it should be used and how it is used:

Poor affordances encourage incorrect actions, but strong affordances may stifle efficiency.

Over-/Under-determined Dialogues

Beginner user interface designers tend to think about the interface in terms of system requirements: the system needs x, y, z information, so let’s ask the user about these things up-front. These over-determined dialogues lead to horrible design.

Direct Manipulation

Advantages:

Disadvantages:

The Human

Fun Example

A trivial task that many humans will get wrong.

Count the number of occurrences of the letter ‘f’ given a set of words:

Finished files are the results of years of scientific study combined with the experience of many years

Three phonetic Fs: ‘finished’, ‘files’, ‘scientific’, are easily found.

But three non-phonetic Fs in ‘of’ are often forgotten.
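The count can be checked mechanically. A minimal Python sketch (the variable names are illustrative only):

```python
# The sentence from the example above.
sentence = (
    "Finished files are the results of years of scientific "
    "study combined with the experience of many years"
)

# Count every 'f', ignoring case so the capital 'F' in 'Finished' is included.
total_fs = sentence.lower().count("f")

# Count occurrences of the word 'of', whose 'f' is the one readers tend to miss.
of_words = sentence.lower().split().count("of")

print(total_fs, of_words)  # prints: 6 3
```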

Click

Even a blank graphic has affordances on where people usually click: on or near the center, or along the diagonals or corners.

Human Factors

Psychological and physiological abilities have implications for design:

The Human Information Processor

Card, Moran, Newell 1983.

               Eyes/Ears
                   │
                   ▼
    ┌──── Perceptual Processor ────┐
    │                              │
    ▼                              ▼
Visual Image ──────┬─────── Auditory Image
 Storage           │           Storage
                   │
                   ▼
      ┌─────Working Memory ◄─────────┐
      ▼                  ▲           ▼
    Motor                │        Long-Term
  Processor              │         Memory
      │                  │           ▲
      |                  |           |
      ▼                  │           ▼
  Movement               └──────► Cognitive
  Response                        Processor 

Human Input

Vision

Cells:

Areas:

1 degree = 60 arcminutes, 1 arcminute = 60 arcseconds
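These units are used to express the angle an object subtends at the eye. A minimal sketch, assuming the standard geometric relation angle = 2 * atan(size / (2 * distance)); the function names and the example sizes are illustrative, not taken from the course:

```python
import math


def visual_angle_degrees(size: float, distance: float) -> float:
    """Angle subtended by an object of the given size at the given viewing
    distance (both in the same length unit), in degrees."""
    if size < 0 or distance <= 0:
        raise ValueError("size must be >= 0 and distance must be > 0")
    return math.degrees(2 * math.atan(size / (2 * distance)))


def degrees_to_arcminutes(degrees: float) -> float:
    return degrees * 60  # 1 degree = 60 arcminutes


def degrees_to_arcseconds(degrees: float) -> float:
    return degrees * 60 * 60  # 1 arcminute = 60 arcseconds


# Hypothetical example: a 0.25 mm detail viewed from 500 mm.
angle = visual_angle_degrees(0.25, 500)
print(degrees_to_arcminutes(angle))  # roughly 1.7 arcminutes
```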

Visual Acuity:

Eye movement:

Size/depth cues:

Müller-Lyer illusion:

 <->
>---<

The bottom one looks further away but subtends the same angle, so the brain perceives it as bigger.

3D, depth-based UIs:

Color:

Reading:

Auditory

Haptics

Haptic feedback: any feedback providing experience of touch

Human Output

Motor response time depends on stimuli:

Muscle actions:

Fitts’ Law

A very reliable model of rapid, aimed human movement.

Index of difficulty (ID) measures difficulty of rapid aimed movement:

$$ \mathrm{ID} = \log_2\left(\frac{A}{W} + 1\right) $$

where $A$ is the amplitude (distance) of the movement and $W$ is the width of the target.

Measured in ‘bits’.

Fitts’ law: movement time (MT) is linear with ID:

$$
\begin{aligned}
\mathrm{MT} &= a + b \cdot \mathrm{ID} \\
            &= a + b \cdot \log_2\left(\frac{A}{W} + 1\right)
\end{aligned}
$$

$1/b$, the reciprocal of the slope, is called throughput, or the bandwidth of the device in bits/second.
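The relationship between ID, MT and throughput can be sketched in a few lines of Python. The coefficients below are hypothetical placeholders chosen for easy arithmetic, not measured values for any device:

```python
import math


def index_of_difficulty(amplitude: float, width: float) -> float:
    """Index of difficulty in bits: ID = log2(A / W + 1)."""
    if amplitude < 0 or width <= 0:
        raise ValueError("amplitude must be >= 0 and width must be > 0")
    return math.log2(amplitude / width + 1)


def movement_time(amplitude: float, width: float, a: float, b: float) -> float:
    """Fitts' law: MT = a + b * ID, with a in seconds and b in seconds/bit."""
    return a + b * index_of_difficulty(amplitude, width)


# Hypothetical coefficients (assumptions for illustration only).
A_INTERCEPT = 0.2  # seconds
B_SLOPE = 0.1      # seconds per bit

# A target 100 px wide, 700 px away: ID = log2(700 / 100 + 1) = log2(8) = 3 bits.
id_bits = index_of_difficulty(700, 100)
mt_seconds = movement_time(700, 100, A_INTERCEPT, B_SLOPE)  # about 0.5 s
throughput = 1 / B_SLOPE  # 10 bits/second

print(id_bits, mt_seconds, throughput)
```

Doubling the distance while also doubling the target width leaves ID, and therefore the predicted MT, unchanged.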

$a$ and $b$ are empirically determined. For a mouse:

Typical velocity profile, validated for many types of aimed pointing:

Speed 
  ^
  |   Open-loop,
  |ballistic impulse
  |    /\
  |   /  \   slow, closed-loop
  |  /    \    corrections
  | /      \  /\
  |/        \/  \/\___
  +------------------------>
           Time

Input Devices; Pointing & Scrolling

Human output is received as system input. There must be some sort of translation hardware to achieve this, which has many properties:

The control-display transfer function:

                                 Transfer Function
        +-------------------------------------------------------------------------+
        |                                                     e.g. scroll inertia |
        | device   ---------------  display   --------          ---------------   |
Device -+--------> | Translation | ---------> | Gain |  ------> | Persistence | --+---> Output
Input   | units    ---------------   units    --------          ---------------   |
        |                ^                       ^                     ^          |
        |                ------- Environment/User Settings ------------           |
        +-------------------------------------------------------------------------+
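The three stages in the diagram can be sketched as a small pipeline. Everything in this sketch is hypothetical: the function names, the linear velocity-dependent gain and the geometric friction model are assumptions for illustration, and real devices and operating systems use their own tuned curves.

```python
def translate(device_units: float, display_units_per_device_unit: float) -> float:
    """Translation: convert raw device units (e.g. sensor counts) to display units."""
    return device_units * display_units_per_device_unit


def apply_gain(display_units: float, speed: float,
               base_gain: float = 1.0, acceleration: float = 0.5) -> float:
    """Gain: amplify the movement more when the input is moving faster.
    The linear form base_gain + acceleration * speed is an assumption."""
    return display_units * (base_gain + acceleration * speed)


def persistence(initial_velocity: float, friction: float, frames: int) -> list[float]:
    """Persistence: after the input ends, keep moving with a velocity that
    decays by a constant factor each frame (a simple model of scroll inertia)."""
    velocities = []
    velocity = initial_velocity
    for _ in range(frames):
        velocities.append(velocity)
        velocity *= friction
    return velocities


moved = translate(10, 2.0)            # 20.0 display units
slow = apply_gain(moved, speed=0.0)   # 20.0: no amplification when slow
fast = apply_gain(moved, speed=2.0)   # 40.0: amplified when fast
glide = persistence(8.0, 0.5, 4)      # [8.0, 4.0, 2.0, 1.0]
```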

Scrolling transfer function for iOS:

Input Devices: Text Input

Input expressibility: how well can you discriminate inputs? e.g. Google Glass had a tiny capacitive surface; doing text entry on that posed challenges.

Steering Law

Model of continuously controlled ‘steering’: moving an item across a given path, called a ‘tunnel’:

$$ \mathrm{MT} = a + b \cdot \frac{A}{W} $$

Where $A$ is the tunnel length and $W$ is the width. If the width varies along the path, integrate the inverse of the path width along the tunnel: $\mathrm{MT} = a + b \int_0^A \frac{ds}{W(s)}$.
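A minimal sketch of both cases, using a midpoint-rule approximation for the integral over a varying width. The function names, the coefficients and the tunnel dimensions are hypothetical:

```python
from typing import Callable


def steering_time(length: float, width: float, a: float, b: float) -> float:
    """Steering law for a constant-width tunnel: MT = a + b * A / W."""
    if width <= 0:
        raise ValueError("width must be > 0")
    return a + b * length / width


def steering_time_varying(width_at: Callable[[float], float], length: float,
                          a: float, b: float, steps: int = 10_000) -> float:
    """Steering law for a tunnel whose width depends on the position s along it:
    MT = a + b * integral from 0 to A of ds / W(s), via the midpoint rule."""
    ds = length / steps
    integral = 0.0
    for i in range(steps):
        width = width_at((i + 0.5) * ds)
        if width <= 0:
            raise ValueError("width must be > 0 along the whole tunnel")
        integral += ds / width
    return a + b * integral


# Hypothetical coefficients and a 200 px long tunnel.
a, b = 0.1, 0.05
constant = steering_time(200, 20, a, b)  # 0.1 + 0.05 * 10, about 0.6 s
# The same tunnel narrowing linearly from 20 px to 10 px takes longer.
narrowing = steering_time_varying(lambda s: 20 - 0.05 * s, 200, a, b)
```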

This is important in cascading context menus, where hovering over an item opens a submenu to the left or right. Done naïvely, the cursor must stay over the item for the whole journey to the newly-opened submenu or the submenu will disappear, so the item's row becomes a narrow tunnel to steer through. macOS appears to take into account the angle of travel to determine if the submenu should be hidden or not.

Human Processing

Visual Search Time

If a person has to pick out a particular item out of $n$ randomly ordered items, the average time $T$ taken to find the item increases linearly with $n$: $T = a + b \frac{n + 1}{2}$. Pop-out effects, where one item is visually distinct, reduce this to $O(1)$, but this requires the interface to predict what the user wants to select.

This is slow, so the UI should aim to reduce the amount of searching the user must do. To achieve this, ensure there is spatial stability; items appear in the same place every time.
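The $(n + 1)/2$ term is the average number of items inspected when the target is equally likely to be at any of the $n$ positions. A minimal sketch, with hypothetical coefficients:

```python
def expected_items_inspected(n: int) -> float:
    """Average position of the target when each of the n positions is equally
    likely: (1 + 2 + ... + n) / n, which equals (n + 1) / 2."""
    return sum(range(1, n + 1)) / n


def visual_search_time(n: int, a: float, b: float) -> float:
    """Linear visual search: T = a + b * (n + 1) / 2."""
    return a + b * (n + 1) / 2


# Hypothetical coefficients: a in seconds, b in seconds per item inspected.
a, b = 0.3, 0.1
for n in (5, 10, 20):
    print(n, expected_items_inspected(n), visual_search_time(n, a, b))
```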

Hick-Hyman Law of Decision Time

Choice reaction time when optimally prepared:

$$ T = a + b \cdot H $$

Where $H$ is the ‘information entropy’, $H = \sum_{i=1}^{n} p_i H_i$.

For item $i$ with probability $p_i$ of being selected:

$$ H_i = \log_2\left(\frac{1}{p_i}\right) $$

For $n$ equally probable items, $H = \log_2(n)$.
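A minimal sketch of the entropy calculation and the resulting decision time. The coefficients and the example probabilities are hypothetical:

```python
import math


def entropy_bits(probabilities: list[float]) -> float:
    """Information entropy H = sum of p_i * log2(1 / p_i), in bits."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Items with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


def decision_time(probabilities: list[float], a: float, b: float) -> float:
    """Decision time: T = a + b * H."""
    return a + b * entropy_bits(probabilities)


# Four equally likely items: H = log2(4) = 2 bits.
uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])

# Four items where one dominates: lower entropy, so a faster predicted decision.
skewed = entropy_bits([0.7, 0.1, 0.1, 0.1])

# Hypothetical coefficients (seconds, seconds per bit).
a, b = 0.2, 0.15
print(decision_time([0.25] * 4, a, b), decision_time([0.7, 0.1, 0.1, 0.1], a, b))
```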

Implications:

Spatially Consistent User Interfaces

Pie menu: items are sectors making up a circle centered around the cursor (possibly with multiple layers of items through nesting):

Ribbon: spatial stability within each tab, but requires visual search and mechanical interactions to find a new item. ‘Solution’: show all tabs at once.

Search: macOS menu bar search does not run the searched command; it only shows you where the item is located. Menu items also show the keyboard shortcut.

Torus pointing: wraps the cursor around the screen, giving multiple straight paths to an item. Giving users a choice may help with Fitts’ law, but increases decision time.

Power Law of Practice

Performance rapidly speeds up with practice:

$$ T_n = C n^{-\alpha} $$

Where:

This applies both to simple and complex tasks.
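A minimal sketch of the curve, reading $T_n$ as the time taken on the $n$th trial, so that $C$ is the time on the first trial (since $T_1 = C$). The values of `c` and `alpha` below are hypothetical:

```python
def trial_time(n: int, c: float, alpha: float) -> float:
    """Power law of practice: T_n = C * n ** (-alpha)."""
    if n < 1:
        raise ValueError("trial number n must be >= 1")
    return c * n ** (-alpha)


# Hypothetical values: 10 seconds on the first trial, learning exponent 0.4.
c, alpha = 10.0, 0.4

for n in (1, 2, 10, 100, 101):
    print(n, round(trial_time(n, c, alpha), 2))

# Most of the improvement comes early: the gain from trial 1 to trial 2 is far
# larger than the gain from trial 100 to trial 101.
early_gain = trial_time(1, c, alpha) - trial_time(2, c, alpha)
late_gain = trial_time(100, c, alpha) - trial_time(101, c, alpha)
```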

Novice to Expert Transitions

People use the same tools for years/decades, but often continue to use inefficient strategies.

Shortcut vocabularies are small and are used infrequently. Factors:

How do you support transitions to experts?

When switching between modes, there is a performance dip. Since people use software to do their jobs, not use software as their jobs, this causes a chasm that the user must take the time to cross.

^  Performance          Modality
│                        Switch
│                          |                       xxxxx
│                                          xxxxxxxxx
│                          |           xxxxx
│                                   xxxx
│                          |      xxx
│                                xx
│                          |    xx
│                Ultimate     xxx
│              xxxxxxxxxxxx| xx  ─┐
│        xxxxxx Performance  x    │ Performance
│    xxxx                  |x     │    Dip
│   xx  Extended            x     │
│  x  Learnability         |x    ─┘
│ xx
│ x                        |
│x
│x Initial                 |
│Performance
│                          |
│     First Modality             Second Modality
└──────────────────────────-────────────────────────────>
                            Time

Domains of Interface Performance Improvement

Human Pattern of Behavior

Zipf’s Law: given a corpus of text, the $n$th most frequently occurring word appears with a probability of:

$$ P_n \approx n^{-\alpha} $$

where $\alpha \approx 1$.

Pareto Principle/80-20 Rule: 80% of usage is made up of only 20% of items.

The UI should attempt to surface these 20% of items.
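The two observations are related: a Zipf-like distribution concentrates most usage in a few items. A minimal sketch, assuming the weights $n^{-\alpha}$ are normalized so the probabilities sum to 1; the item count of 100 is a hypothetical choice:

```python
def zipf_probabilities(n_items: int, alpha: float = 1.0) -> list[float]:
    """Probabilities proportional to rank ** (-alpha), normalized to sum to 1."""
    weights = [rank ** -alpha for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def share_of_top(probabilities: list[float], fraction: float) -> float:
    """Share of total usage covered by the most frequent `fraction` of items."""
    k = max(1, round(len(probabilities) * fraction))
    return sum(sorted(probabilities, reverse=True)[:k])


probs = zipf_probabilities(100, alpha=1.0)
top_20_share = share_of_top(probs, 0.20)
print(round(top_20_share, 2))  # prints: 0.69
```

With these particular assumptions the top 20% of items cover about 69% of usage rather than exactly 80%; the exact share depends on $\alpha$ and on the number of items.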

Human Memory

                         maintenance
                          rehearsal
                           ┌────┐
                           │    │
                           │    ▼       elaborative
  Sensory Memory          Short-term     rehearsal    Long-term
  iconic, echoic,──────►    memory    ──────────────►  memory
  and haptic                  │       ◄──────────────
       │                      │          retrieval
       │                      │
       │                      │
       │                      │
       │                      │
       ▼                      ▼
   masking decay       displacement or
                      interference decay

Sensory memory: stimulation decays over a brief period of time; loud noises, bright lights and pain persist for some time after the stimulation disappears.

Short-Term Memory

Long-Term memory

Human Error

Mistakes

Errors of conscious decisions; when the user acts according to an incomplete or incorrect model.

Only detected with feedback.

Human Error: Slips

Errors of automatic and skilled behavior.

Capture error:

Description error:

Data-driven error:

Loss-of-activation error:

Mode error:

Motor slip:

Premature closure error:

Human Phenomena

Homeostasis

People maintain equilibrium:

Satisficing

People are satisfied with what they can do now and don’t bother to optimize:

Hawthorne Effect

The act of measuring changes results (Heisenberg uncertainty principle of HCI).

People like being involved in experiments and change their behavior during experiments, complicating results.

Explaining Away Errors

The user is often the easiest party to blame, but the user may have made the mistake because the interface is poorly designed.

Peak-End Effects

Peak effect: people’s memories of experiences are influenced by the peak/most intense moments of an experience (e.g. combo attacks in games, casino games).

End effect: people’s memories of experiences are predominantly influenced by the terminating moments (e.g. good vacation ruined by missed flight home, survey with many questions on the last page).

Negativity Bias

The magnitude of sensation from a loss is greater than that from a gain of the same amount: bad is stronger than good.

e.g. a single coin toss where you win $110 on heads but lose $100 on tails; autocorrect ‘correcting’ a correct word feels much worse than it feels good when it corrects a misspelt word.
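As a worked check on the coin-toss example, assuming a fair coin, the expected payoff of the gamble is positive:

$$ \mathbb{E}[\text{payoff}] = 0.5 \times 110 + 0.5 \times (-100) = 55 - 50 = +5 $$

So the bet is favorable on average, yet the possible loss of 100 tends to weigh more heavily than the possible gain of 110.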

Communication Convergence

Similarity in pace, gestures, phrases, etc. enhances communication. Could interfaces that measure the user (e.g. long press duration, mouse speed) and match them (e.g. animation speed, timeout, speech rate) help?