01. Introduction
Course Administration
Course objectives:
- MR theory
- MR technology (mainly VR + AR)
- Designing and building MR experiences
- Learn about the theories and technologies used to create MR experiences
Topics:
- Introduction to Mixed Reality
- AR development tools, and designing effective AR experiences
- VR development tools, and designing effective VR experiences
- Tracking, calibration and registration for AR
- With Richard Green
- Mixed reality displays
- Interaction in VR
- Interaction in AR
- Collaboration in mixed reality
- Creating multiple-sensory VR experiences
- Haptics etc.
- Human Perception and Presence in mixed reality
- Data Visualization in Mixed Reality
- Multi-dimensional data sets
- Evaluating immersive experiences
Staff:
- Stephan Lukosch:
- Course coordinator, main lecturer, currently acting director for HIT lab
- Adrian Clark
- Rory Clifford
- Richard Green
- Rob Lindeman (HIT lab director, on sabbatical)
- Tham Piumsomboon
- Lecturer in product design
- Yuanjie Wu
Labs:
- HIT lab 2nd floor (John Britten building)
- Limited lab assignments: focus on project work
- 9 workstations (27 students, 3 people per group)
- Admin rights
- Must sign HIT lab equipment-use policy
- TAs will provide technical support for setting up the development environment (Unity 3D)
- Extremely full-featured: focus on the features that are important
- Stephan will give general feedback on research project
Assessment:
- Research project:
- 30% of course grades
- Max. 3 students
- Commented, documented source code
- Demonstration/video of project
- Research paper:
- 30% of course grades
- 6-page conference-style paper (< 4000 words)
- No contribution sheet for groups: assume equal effort
- Exam: 2 hour open-book exam
Research project:
- Teams of 3
- Hybrid tabletop game: physical game elements augmented with virtual information (e.g. 3D objects, animations)
- Requirements:
- Visualize several digital game elements anchored in the real world
- Support some form of interaction with the digital game elements (touch interaction, interaction between multiple targets, based on distance between device and marker)
- Players can see and interact with the digital game elements via their smartphone
- Not enough HMDs for everyone
- Unity runs on macOS, Linux, Windows
- Vuforia runs on Android and iOS, but iOS deployment requires a Mac
- Unity integrates with Plastic SCM: free for <= 3 people
- Can borrow a webcam if required
- Can try to use HMDs, but probably difficult
Mixed Reality
A continuum:
- Real environment
- AR: augmented reality
- Add digital information to the real world
- AV: augmented virtuality
- In a virtual world, apart from a few things
- e.g. VR car simulator, but user can see the real steering wheel
- VR: virtual environment
- Everything is virtual
In terms of interactions:
- Reality: ubiquitous computers which you interact with alongside the real world
- AR: augmented using input from both the user and the environment
- VR: completely cut off from the real world: only interaction with the computer
Virtual Reality
VR:
- Replicates an environment, real or imagined
- Simulates a user’s physical presence and environment to allow for user interaction
Defining characteristics of VR:
- Environment simulation
- Presence
- Interaction
AIP Cube (Zeltzer, 1992)
Three axes:
- Autonomy:
- User can react to events and stimuli
- Head tracking, body input
- User can change their viewpoint
- Interaction
- User can interact with objects in the environment
- User input devices, HCI
- Presence
- User feels immersed through sensory input and output channels
VR is at extreme end of all three axes of the AIP cube.
Very hyped in 1980s/1990s:
- Lagging technology
- Lack of understanding, usability
- No ‘killer app’
- Except some specific scenarios
- Surgical simulation
- Military training
- Phobia therapy
Keys to success:
- High fidelity/realism: graphics, audio, haptics, behaviors
- Low latency: tracking, collision detection, rendering, networking
- Ease of use: for programmers and users
- Compelling content
- Responsive expressiveness (natural behaviors)
Current state of senses:
- Visual: good
- Hard to match eye’s FoV though
- Aural: good spatialized audio
- Olfactory (Smell): too many types of receptors; very hard
- Haptics: application-specific and cumbersome
- Gustatory (taste): base tastes are known, but very hard
Simulator sickness:
- General discomfort
- Fatigue
- Headache
- Eye strain
- Difficulty focusing
- Increased salivation
- Sweating
- Nausea
- Difficulty concentrating
- ‘Fullness of the head’
- Blurred vision
- Dizziness with eyes open
- Dizziness with eyes closed
- Vertigo
- Stomach awareness
- Burping
Factors negatively influencing VR:
- Latency
- Mis-calibration of tracking
- Low tracking accuracy
- Low tracking precision
- Limited FoV
- Low refresh rate
- Low resolution
- Flicker/stutter
- Real-world stimuli
- Lack of depth cues
- Device weight
- Heat
- Fogging of screens
Delay/latency is one of the main contributing factors to simulator sickness. The system must complete several tasks in series, which can lead to noticeably high latency:
- Tracking delay
- Application delay
- Rendering delay
- Display delay
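As an illustrative sum (all numbers hypothetical): 5 ms tracking + 10 ms application + 11 ms rendering + 11 ms display scan-out ≈ 37 ms motion-to-photon latency, well above the ~20 ms often quoted as a comfortable upper bound.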
VR Output
Sound:
- Display techniques:
- Multi-speaker output
- Headphones
- Bone conduction
- Spatialization vs localization
- Spatialization: processing of sound signals to make them seem to emanate from a specific point in space
- Localization: our ability to identify the source position of a sound
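A minimal Unity sketch of spatialization, assuming an AudioSource with a clip is already attached to the emitting GameObject (class name and values are illustrative):

using UnityEngine;

// Make the attached AudioSource fully 3D so the sound appears to emanate
// from this GameObject's position in the scene.
public class SpatializedSound : MonoBehaviour {
    void Start() {
        AudioSource source = GetComponent<AudioSource>();
        source.spatialBlend = 1.0f;                        // 0 = 2D, 1 = fully spatialized
        source.rolloffMode = AudioRolloffMode.Logarithmic; // distance-based attenuation
        source.Play();
    }
}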
Smell:
- Two main problems
- Scent generation
- The nose has tens of thousands of receptor types
- Delivery
- How to deliver the scent to the user (and hopefully only to them) and remove it quickly
Touch:
- Haptic feedback comes in many different senses:
- Force/pressure
- Slipperiness
- Vibration
- Wind
- Temperature
- Pain
- Proprioception
- Balance (?)
- Most densely populated areas: fingertips, lips, tongue
- Two-point discrimination: how far away do two points need to be in order to sense them as two separate touches rather than one single touch?
- 2-3 mm on the fingertip
- 6 mm on the cheek
- 39 mm on the back
- Cyberglove:
- ~100K
- Tracks hand motion
- Contains motors to block finger movement: creates impression of actually grabbing something
- Force-feedback arms:
- Stylus attached to robot arm
- Can be used for sculpting: resistance varies with the material
VR Interaction
Interaction with VR:
- Keyboard/mouse not very attractive:
- Cannot see them
- Don’t want to be anchored to a desk: want to move around
- No good 3D mappings
Basic VR interaction tasks:
- Object selection and manipulation
- Problems:
- Ambiguity
- Judging distance
- Selection approaches
- Direct/enhanced grabbing
- Ray-casting techniques
- Image-plane techniques
- Manipulation approaches:
- Direct position/orientation control
- Worlds in miniature (God mode)
- Skewers
- Surrogates
- Navigation
- Wayfinding: how do I know where I am, how do I get there?
- People get lost/disoriented easily: need maps
- Limited physical space; possibly infinite virtual space
- Not a 1:1 mapping between their physical and virtual position, making it easy to get disoriented
- Different types of travel
- Walking/running
- Turning
- Side-stepping
- Back-stepping
- Crawling
- Quick start/stop
- Driving
- Flying
- Teleporting
- Need to do other things while traveling
- Impossible spaces
- Change blindness redirection: change the geometry of the space behind them
- Suma, E. A.; Lipps, Z.; Finkelstein, S. L.; Krum, D. M. & Bolas, M. T., Impossible Spaces: Maximizing Natural Walking in Virtual Environments with Self-Overlapping Architecture, IEEE Trans. Vis. Comput. Graph., 2012, 18, 555-564
- Humans trust their visual sense more than their memory
- Changing rotation angle?
- System control:
- Changing settings
- Manipulating widgets:
- Lighting effects
- Object representation
- Data filtering
- Approaches:
- Floating windows
- Hand-held windows
- Gestures
- Menus on fingers
- Symbolic input: typing/inputting text/numbers
- Avatar control
- Body sensors to accurately map avatar to real user?
- Or approximate with head and hand position?
The “optimal” interface depends on:
- The capabilities of the user
- Dexterity
- Level of expertise
- The nature of the task being performed
- Granularity
- Complexity
- The constraints of the environment
- Stationary, moving, noisy, etc.
Augmented Reality
Azuma (1997):
- Fundamental article on AR
- Defined AR as:
- Combining real and virtual images
- Interactive, real-time
- Registered in 3D: positioned at a real position in the world
AR feedback loop:
- User:
- Observes AR display
- Controls the viewpoint
- Interacts with the content
- System:
- Tracks the user’s viewpoint
- Registers the pose in the real world with the virtual environment
- Presents situated visualization
Requirements:
- Display: must combine real and virtual images
- Interactive in real-time
- Registered in 3D: viewpoint tracking
History:
- 1968: Sutherland HMD system
- System would hang from the ceiling
- 1970-80s: US Air Force Super Cockpit program
- 1990s: Boeing wire harness assembly
- Now:
- Magic books: virtual content shown over pages
- Magic mirror: ‘mirror’ overlays X-ray image over color image
- Remote support
Display types:
- Head-attached
- Head-mounted display/projector
- Two types:
- Occluded/video: essentially a VR headset with a camera feed streamed through it
- e.g. Varjo XR-1:
- Low-latency (~20 ms) cameras
- 1080p resolution per eye
- Tethered
- 87 degree FoV
- Optical see-through: transparent display that overlays content onto the real world
- No/lower distortion
- Safer: user will always be able to see the real world
- No latency for real content
- Images will be a bit transparent as well
- More connected with the real world
- e.g. Hololens, Magic Leap
- Hololens has ~30 degree FoV: very limited
- Body-attached
- Hand-held display/projector (smartphones)
- Spatial
- Spatially-aligned projector/monitor
- Project images onto a real object
- e.g. pool table
- Everyone can see the content: not as awkward
Tracking:
- Continually locating the user’s viewpoint when moving
- Position (XYZ) and orientation (RPY)
Registration:
- Positioning virtual objects in relation to the real world
- Anchoring a virtual object to a real object when a view is fixed
Tracking requirements:
- Augmented reality information display
- World-stabilized: hardest, must track position + rotation
- Body stabilized: fixed distance from your body: must track rotation
- Head stabilized: easiest
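A minimal Unity sketch of the head-stabilized case (class and field names are illustrative); world-stabilized content would instead keep a fixed world-space or marker-anchored transform, and body-stabilized content would follow a tracked torso transform:

using UnityEngine;

// Keep the object at a fixed offset in front of the tracked camera so it
// moves with the user's head.
public class HeadStabilized : MonoBehaviour {
    public Transform head;                              // assign the AR/VR camera transform
    public Vector3 offset = new Vector3(0f, 0f, 1.5f);  // 1.5 m in front of the head

    void LateUpdate() {
        transform.position = head.TransformPoint(offset);
        transform.rotation = head.rotation;
    }
}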
Tracking technologies:
- Active:
- Mechanical, magnetic, ultrasonic
- GPS, Wi-Fi, cellular
- Passive
- Inertial sensors (IMU: compass, accelerometer, gyro)
- Computer vision:
- Marker-based tracking
- ARToolkit
- Research project will use marker-based tracking for reliability
- Natural feature tracking
- Vuforia texture tracking
- Can handle partially-occluded markers
- Hybrid tracking
- Combined sensors
- e.g. MonoSLAM
Evolution of AR interfaces (more expressive/intuitive going down):
- Browsing:
- Simple input:
- Very limited modification of virtual content
- e.g. placing furniture in room: can control position and rotation
- Viewpoint control
- Handheld AR displays
- Information registered to real-world context:
- e.g. AR map UI
- 3D AR:
- 3D UI
- Often use HMDs, 6DoF head-tracking
- Dedicated controllers (6DoF)
- 3D interaction: manipulation, selection, etc.
- Tangible UI
- Augmented surfaces
- Object interaction
- Familiar controllers
- Indirect interaction
- Based on Tangible Bits vision (Ishii and Ullmer, 1997):
- Give physical form to digital information
- Make bits directly manipulable and perceptible
- Seamless coupling between physical objects and virtual data
- Tangible AR
- Tangible input principles applied to AR
- AR overlay
- Direct interaction
- Physical controllers for moving virtual content
- Support for spatial 3D interaction techniques
- Time and space multiplexed interaction
- Multi-hand interactions possible
- Natural AR
- Interacting with AR content in the same way as real world objects
- Natural user input: body motion, gesture, gaze, speech
- e.g. overhead depth sensing camera
- Create real-time hand model, point-cloud
- Overlay graphics (spider)
- Gesture interaction
- Demo: spider on desk: occluded by hand, can crawl over hand
02. Developing Augmented Reality Experiences
Adrian Clark, senior lecturer, School of Product Design.
Introduction to Unity
Unity: ‘real time development platform’. Not just for games.
Unity is so big, no one knows the full extent of what it can do.
Resources:
- Unity Learn
- AR stuff is now under the XR section
- Community
- User manual
- Pretty decent docs
- Asset Store
- Also third-party stores like Turbo Squid
- Prefer fbx files
- To download assets purchased from the store: Window -> Package manager -> My assets
Unity:
- Use 2020.3 LTS release
- Use 3D template
Editor:
- Many windows
- Scene
- See and position all GameObjects in the scene
- Top left, switch between translation, rotation and scale tools
- Add Rigidbody to a GameObject to add physics
- Game
- What the camera sees
- Any changes made during play mode are not saved
- Project
- Assets folder: contents update automatically when FS changes
- Hierarchy
- All game objects in the scene in a hierarchy
- Parent nodes affect child nodes
- Inspector
- Modification of GameObject properties
- Components: behaviors or extensions to the game objects
- Game objects are essentially containers for components
- UI objects have their own event systems: cannot attach click listeners etc. to 3D objects
- Examples:
- Renderer
- MeshFilter
- Camera
- Light
- Directional by default:
- Infinitely far away: rays are parallel
- Node position does not matter: only rotation
- Spotlight: cone of influence
- Point: sphere of influence
- Collider
- Console
- Warnings, errors
Shader rendering mode:
- Opaque: alpha ignored
- Cutout: binary transparency; on or off. Use if the object has holes
- Fade: change transparency of all aspects of the material based on alpha
- Transparent: for realistic transparent materials (e.g. glass); reflections and specular highlights remain visible even where the surface is transparent
Scripting:
using UnityEngine;

// Behavior script is another component that attaches to a GameObject
// https://docs.unity3d.com/Manual/ExecutionOrder.html
// A massive number of lifecycle callbacks
public class NodeBehaviorScript : MonoBehaviour {
    // Called before first frame update
    void Start() {
        Debug.Log("Instantiated");
    }

    // Called every frame
    void Update() {
        if (Input.GetKey(KeyCode.UpArrow)) {
            // transform: transform of the object the script is attached to
            // localPosition: position relative to parent
            transform.localPosition += new Vector3(0, 0, 0.1f);
        }
    }

    void OnMouseDown() {
        // Requires a collider on the game object,
        // e.g. box collider: invisible box (hopefully) around the object
    }

    void OnCollisionEnter(Collision collision) {
        // Use a collider that is larger than the object: when two objects
        // come close together you can add custom behavior (e.g. 'picking up'
        // the object in AR)
    }
}
Unity Remote:
- App installed on your phone
- Project Settings -> Editor -> Unity Remote
- Set game resolution to match device screen
- Game view is streamed to your phone screen
- But not the camera
- Touch events etc. on the phone are sent to the host computer
AR
Many different SDKs available. Some deciding factors:
- Price (free, paid per scan/month/app/licence period)
- Supported hardware platforms (iOS, Android, desktop, HMDs, web)
- Tracking (Fiducial, 2D natural feature, SLAM, 3D object, face, GPS/IMU)
- Performance
Unity also has AR foundation: a common interface to platform-specific AR frameworks. No way of running it in the editor, which makes development very frustrating (although there are some rumblings of a Unity Remote-like app which does on-device processing).
This course will use Vuforia:
- Initially developed by Qualcomm (and optimized for their chips), bought in 2015 by PTC and slowly becoming monetized
- Tracking works on black-and-white (grayscale) images
- Available as a UnityPackage
- Each target is its own separate game object
- Add game objects as children of the target to anchor them to the target
- Except for ground/mid-air planes: ‘finder’ objects to find the plane, and ‘stage’ objects containing content
- Target types:
- 2D Image
- Single image: import image into Unity, drag image into texture field
- These have an Image Target Preview component which displays a preview of the image
- Can create databases in the cloud, then download
- Can have cloud image target which does processing online:
- Useful for databases with large (hundreds) numbers of images
- Requires paid license
- Cylinder
- Multi
- Box: six images, one for each face
- Add occlusion object: create 6 planes with depth mask material which is rendered before the game objects
- Add target representation: renders the box on top of (where Vuforia thinks) the actual box (is)
- 3D models:
- CAD models (model targets)
- Being deprecated
- Scanned 3D objects (object targets)
- Supposedly getting better
- Scanned 3D environments (area targets)
- Doesn’t work that well
- Ground planes (ground or mid-air)
- VuMarks
- Vuforia’s custom fiducial markers
- Can have multiple VuMarks with similar visual content but with different data
- Created as SVG files: Illustrator template available
- Ground plane targets
- Anchoring virtual content to ‘ground planes’ - horizontal planes in the environment
- Uses SLAM: requires IMUs etc., so not supported on all devices
- Can emulate in editor with a PDF print-out of a texture
- Project -> Packages -> Vuforia Engine AR -> Vuforia -> Database -> ForPrint -> Emulator -> Emulator Ground Plane pdf file
- Requires ‘track device pose’ to be enabled in Vuforia engine configuration
- Ground plane finder:
- Places a reticle on ground planes (think crosshair)
- Interactive hit test: on tap, instantiates a prefab on the ground plane
- Can also use automatic hit tests
- Create ground plane stage:
- Link to the ground plane finder (content positioning behavior, anchor stage)
- Can enable duplication to have multiple copies of the ground stage content
- Size is in real-world units (1m x 1m)
- Can save the ground plane as a prefab (works somewhat)
- Mid-air positioner:
- Fixed distance from the ground
- Position is tracked relative to the ground plane
- No automatic hit-tracking
- Ground plane stage: change anchor behavior to MID_AIR
- Adding custom targets:
- Requires license
- In AR camera, go to Vuforia configuration
- Enter license
- Can also change settings such as scale
Vuforia AR camera:
- GameObject -> Vuforia Engine -> Camera
- Replaces default camera
- Vuforia engine configuration
- Can also just edit configuration as a text file
- Asset/Resources/VuforiaConfiguration.asset
- Global settings: apply to all scenes
- World center mode:
- First target: first target detected is the world origin
- Device: camera always at origin
- Need to turn off track device pose
- Origin impacts things such as physics simulations
Targets:
- Default observer event handler:
- Responsible for turning on/off AR content when target is visible
- Can run custom scripts or change properties of GameObjects when a target is found or lost (see the sketch after this list)
- Can choose definition of Visible for each target:
- Tracked: visible to the camera
- Extended Tracked: the area immediately surrounding the target is visible
- Limited: vague idea of position using IMU or something
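A minimal sketch for the 'custom scripts when a target is found or lost' point above, assuming the Default Observer Event Handler exposes On Target Found / On Target Lost events in the inspector that these public methods can be wired to (class, field and method names are my own):

using UnityEngine;

// Wire OnTargetFound/OnTargetLost to the events exposed by the target's
// Default Observer Event Handler component in the inspector.
public class TargetVisibilityHandler : MonoBehaviour {
    public GameObject content;      // the augmentation to show/hide
    public AudioSource foundSound;  // optional feedback; may be left unassigned

    public void OnTargetFound() {
        content.SetActive(true);
        if (foundSound != null) foundSound.Play();
        Debug.Log("Target found");
    }

    public void OnTargetLost() {
        content.SetActive(false);
        Debug.Log("Target lost");
    }
}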
UI:
- Can position element on screen anchored to a corner, center etc.
- Elements not scaled for different pixel densities: may appear tiny on phones
- Game view, set display resolution to some high portrait resolution to preview
- UI elements automatically have a ‘Canvas Scaler’ component
- Change UI scale mode from ‘constant pixel size’ to ‘scale with screen size’
- Set a reference resolution and an axis which it scales along
- Can also use ‘constant physical size’
- Button click events:
- Can call any public method from any (instance of a) script assigned to a GameObject
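A minimal sketch (names are illustrative): the public method below can be selected in the Button's OnClick() list in the inspector, with the GameObject holding this script dragged into the event slot.

using UnityEngine;

// Example of a public method that a UI Button's OnClick() event can call.
public class ResetButtonHandler : MonoBehaviour {
    public Transform target;

    public void ResetTargetPosition() {
        target.localPosition = Vector3.zero;
    }
}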
Prefabs:
- Saved chunk of a scene (like a snapshot)
- Drag the game object into the project window to create a new prefab
- Saves components, children etc.
- Creating elements from script:
- GameObject -> new empty object
- Add component -> new script

using UnityEngine;

public class SomeCreator : MonoBehaviour {
    // In the inspector, the script component can be assigned any GameObject
    // (including prefabs) for this property
    public GameObject SomeGameObject;
    // Works with a lot more types too (e.g. int, Rect)

    void Start() {
        // Create a new copy of the `SomeGameObject` object
        GameObject someNewCopy = GameObject.Instantiate(SomeGameObject);
        // Can optionally pass in position, rotation (as a quaternion)

        // Primitives can also be created programmatically
        GameObject cube = GameObject.CreatePrimitive(PrimitiveType.Cube);

        // Can set the material properties for the cube's default MeshRenderer.
        // GameObjects start off sharing the same default material;
        // setting the color creates a new instance of the material.
        cube.GetComponent<Renderer>().material.color = Random.ColorHSV();
        // `Color` components have a [0, 1] floating point range,
        // whereas `Color32` has a [0, 255] integer range

        // Use `sharedMaterial` to modify the material instead, (potentially)
        // affecting multiple objects. Don't use it if the object still uses
        // the default material:
        // cube.GetComponent<Renderer>().sharedMaterial.color = Random.ColorHSV();

        // For complex models (or prefabs) composed of multiple GameObjects,
        // we need to access the child objects/components.
        // Find the first child object in the tree with the given name
        // (`baseObject` being some parent GameObject):
        // baseObject.transform.Find("ChildName");
    }
}
-
Building for Mobile:
- File -> Build Settings
- Set to iOS/Android
- Click switch platform (and wait a while)
- iOS:
- Builds: recommend new folder for each project
- Set signing team
- Project settings -> player
- Can set company, product name, icons etc.
- Device orientation: typically just portrait
- Other settings:
- Can set build number
- Configuration:
- Camera usage description: camera privacy description text
- Scripting backend: IL2CPP (compiles the C# intermediate language ahead of time rather than interpreting it)
- For Vuforia:
- iOS:
- Target iOS version >= 11
- Architecture ARM64
- Product name cannot be ‘vuforia’ (there is a library called ‘vuforia’ which it gets mixed up with)
- Android:
- ARCore: if available, Vuforia will use it
- Minimum API level 24, remove Vulkan (check if this is still the case)
- Android logs:
adb logcat -s "Unity:*" (the -s filter silences all logs except those matching the given tag)
Adrian:
- Co-founded AR company
- AR overused and often misunderstood
- Before developing AR applications, ask if there is benefit to doing it in AR:
- Could it be done in VR?
- As a desktop/mobile app?
- As a webpage?
- AR useful for visualizing spatial data, especially if it has an intrinsic link to the real-world environment
- If it fails the latter, it should be done in VR instead (or even just on a flat screen)
- Data should have at least 3 spatial dimensions
- You move around in 3 dimensions and hence, the data should have at least that many dimensions
- Awkward interactions:
- Holding your phone up with one hand while tapping the screen
- Requires powerful phones and drains battery
- Trying to tap targets in mid air while wearing a heavy HMD
- Hard to find a balance for visualization realism:
- Shouldn’t perfectly match the environment: people need to know they can interact with it
- Shouldn’t be so out-of-place to be jarring
- AR visualization considerations:
- Real/virtual object occlusion
- Never pixel perfect
- Lighting which matches the real world (brightness, color, reflection)
- Can’t get high-quality reflections: don’t know what’s behind the camera
- Shadowing to give the perception of distance
- Clutter/contrast between real/virtual objects
- Interactions
- Still in its infancy
- With touch:
- How do we ensure people can accurately touch?
- Fat fingers, and phone is held in one arm outstretched
- How do we choose where they touch in 3D space?
- With gestures:
- What gestures are intuitive? How does it vary by culture
- How do we combat fatigue?
- How do we stop it from looking embarrassing?
- How do we deal with a lack of haptic response?
- Is my finger past the target? In front of it? On it?
- Without good occlusion, haptics is important
- Forget about the WIMP (windows, icons, menus and pointers) metaphor
- Why bring an intrinsically 2D metaphor into an intrinsically 3D experience?
- Think about new interactions and visual affordances
- Tangible user interfaces
- AR should be seamless: we should forget it even exists
03. Developing Virtual Reality Experiences
Dr Tham Piumsomboon, School of Product Design.
Current VR Development Tools
2016: rise of consumer HMDs. Oculus, HTC Vive.
XR Fragmentation: different vendors all had their own proprietary APIs (e.g. Steam VR, Hololens, Oculus, HTC Vive, Magic Leap).
Khronos Group (which created OpenGL, Vulkan, etc.) developed the OpenXR standard: a cross-platform API supported by many hardware vendors.
Toolkits:
- VRTK
- Open source
- MRTK
- Open source, by Microsoft
- XRTK
- Fork of MRTK
- [Oculus Interaction SDK]
- May be deprecated
Game Engines:
- HITLab using Unity engine
- A few others available: Unreal, CRYENGINE, GameMaker Studio, Amazon Lumberyard
- Social platforms (e.g. Breakroom)
Developing VR Experiences
Immersion:
- Feel like you are physically and mentally in the virtual world
- How to simulate a large world in a limited physical space?
- In non-VR games:
- Suspension of disbelief: enough realism in the experience that you can ignore the issues
Models of immersion:
- Three types
- Sensory immersion:
- Disassociation with the real world
- Challenge-based immersion:
- Control:
- Ask what made great games from the past (e.g. Mario) great?
- Input mechanisms (e.g. game controllers, hand gestures)
- Challenge:
- Challenges must be achievable: too difficult -> frustrated; too easy -> bored
- Hence, the difficulty must be balanced to make the experience rewarding
- Cognitive involvement:
- What you get out of the experience:
- Fun
- Social aspects
- Learning/training
- Work
- Imaginative immersion:
- Emotional involvement
Unity:
- Unity Settings -> XR Plug-in Management: use OpenXR (e.g. to support Windows MR headset)
- Windows MR app needs to be running in the background
- Add interaction profile
- Package manager -> XR Interaction Toolkit (com.unity.xr.interaction.toolkit)
- Add XR Origin GameObject: virtual camera maps to headset position and orientation
- Add input action component: XR default input action
- Also adds controller game objects
Analysis -> Profiler
- Visualize frame rate of application and components (e.g. rendering, scripting, physics, display sync) that make up the processing time
- High frame rate important to prevent motion sickness
- Project Settings -> Quality to increase/reduce quality and decrease/increase frame rate
Game engine components:
- Core functionality:
- Rendering engine
- Physics engine
- And collision detection
- Sound, scripting, animation, networking, streaming, memory management etc.
Game loop:
- Read HID (human input device) state
- Update scene state
- Physics engine
- User input
- Multiplayer networking
- Collisions
- Animations
- NPC state
- Audio
- Render the scene
The subsystems will often update at different rates (e.g. NPC behavior may update at ~1 FPS, physics engine at 120 FPS, renderer at 60 FPS).
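An engine-agnostic sketch of this idea (not Unity's actual loop; rates and names are illustrative): physics stepped at a fixed rate, a slow subsystem on its own timer, rendering once per frame.

// Game loop sketch with subsystems updating at different rates.
class GameLoop {
    const float PhysicsStep = 1f / 120f;   // physics at 120 Hz
    const float AiStep = 1f;               // NPC behaviour at ~1 Hz
    float physicsAccumulator = 0f;
    float aiAccumulator = 0f;

    public void Frame(float deltaTime) {
        ReadInput();                        // HID state

        physicsAccumulator += deltaTime;    // fixed-timestep physics
        while (physicsAccumulator >= PhysicsStep) {
            StepPhysics(PhysicsStep);
            physicsAccumulator -= PhysicsStep;
        }

        aiAccumulator += deltaTime;         // slow subsystem on its own timer
        if (aiAccumulator >= AiStep) {
            UpdateNpcBehaviour();
            aiAccumulator -= AiStep;
        }

        UpdateAnimationAndAudio(deltaTime);
        Render();                           // once per frame (e.g. 60-90 Hz)
    }

    void ReadInput() { }
    void StepPhysics(float dt) { }
    void UpdateNpcBehaviour() { }
    void UpdateAnimationAndAudio(float dt) { }
    void Render() { }
}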
Camera placements and control mappings:
- Relatively easy to convert 2.5D to 3D
- FPS games: camera mapping can be done easily by adding XR Origin
- Controllers need to be mapped to existing controls
- XROrigin:
- Copy tracking origin, put it in player
- Put controllers under player
- Controllers: Add default SOMETHING
- Player: add locomotion, input system
- XR origin set
- Add reference to XRI default input actions
- Player: add continuous move provider, left hand XRI LeftHand LocomotionMove
- Player: add snap turn provider, left hand XRI SnapMove
- XR Grab Interactable: add to a game object to allow a user to grab it from afar
- XR Socket Interactor: allows interactables to be ‘docked’ to the game object
- Box collider: disable Is Trigger (TODO)
- Scripts:
using UnityEngine.InputSystem; create an InputActionReference property (or a private property with the [SerializeField] attribute)
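A minimal sketch of such a script, assuming the new Input System package; the action assigned in the inspector (e.g. an XRI move action) and the class/field names are illustrative:

using UnityEngine;
using UnityEngine.InputSystem;

// Read a Vector2 input action assigned in the inspector every frame.
public class MoveInputLogger : MonoBehaviour {
    [SerializeField] private InputActionReference moveAction;

    void OnEnable()  { moveAction.action.Enable(); }
    void OnDisable() { moveAction.action.Disable(); }

    void Update() {
        Vector2 value = moveAction.action.ReadValue<Vector2>();
        if (value != Vector2.zero)
            Debug.Log($"Move input: {value}");
    }
}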
ProBuilder package: allows you to create new primitive shapes.
VR Interaction Design
Even if the graphics are good, you need to be able to interact with the environment.
Seven principles of fundamental design (Norman):
- Discoverable: users can discover what actions are possible
- Feedback: full and continuous feedback about the results of your actions and the current state of the world
- Conceptual model: design informs the user of the system’s conceptual model, making it seem intuitive
- Affordances:
- The perceived affordance should match the actual affordance
- Clues as to how something works:
- Signifiers: something to indicate the existence of affordances (e.g. arrows, sound)
- Mappings: relationship between the controls and actions is understood
- Constraints: physical, logical, semantic, cultural constraints which guide actions
VR: mappings are from games, not reality (e.g. using scissors: click a button to use them rather than picking them up with your fingers). Chairs: you cannot sit on a VR chair in real life.
VR affordances:
- Use visual cues to show possible affordances
- Perceived affordances should match actual affordances
- Good cognitive model: map object behavior to expected behavior
- May vary by culture
- Controllers have different controls:
- Some have joysticks, some have touchpads, some have buttons
- Trigger buttons
- Examples:
- Buttons that can be pushed
- Objects that can be picked up
- Doors that can be opened and walked through
- Mutual human actuation
User groups:
- Age:
- Children require different interface designs
- Older people have different needs
- Prior experiences with HMDs
- Different physical characteristics: left/right-handed, height, arm reach
- Perceptual/cognitive/motor abilities
- e.g. color perception
- Cognitive/motor disabilities
Whole user needs:
- Social: don’t make them look stupid
- Cultural: follow local cultural norms
- Physical: can they physically use the interface
- Cognitive: can they understand the interface
- Emotional: make the user feel good and in control
Summary:
- High-fidelity graphics in VR is possible if we can afford it computationally
- But not sufficient for immersion
- SCI immersive model: engage user through sensory, challenge-based, and imaginative immersion
- General design principles can be applied to VR design
- Plenty of opportunities for richer VR interaction beyond what general design principles cover
04. AR Tracking, Calibration and Registration
Optical tracking:
- Specialized
- e.g. IR lights for VR controllers
- Marker-based
- TODO
- Markerless:
- Edge-based
- Template-based
- Interest point
Trackable managers:
- AR Foundation: a Trackable is anything that can be detected and tracked in the real world
- Planes, point clouds, anchors, images, environment probes, faces, 3D objects
- We are interested in planes, point clouds, anchors, and images
- Each Trackable has a Trackable Manager, which is on the same GameObject as the AR Session Origin
- Each Trackable Manager keeps a list of its Trackables
Computer vision: detecting objects and tracking their movement in 6 degrees of freedom
Vision is inferential: context, prior knowledge, etc. are required to come up with a reasonable interpretation of the scene; an infinite number of 3D scenes can produce the same image.
3D information recovery:
- Motion
- Stereo vision
- Works up to ~3m (see the depth sketch after this list)
- Texture
- Shading
- Contour
- Can understand depth from a line drawing
- Time-of-flight sensors
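For the stereo case above, depth follows from triangulation (pinhole model, rectified cameras); a small sketch with illustrative numbers. Depth error grows roughly with the square of the distance, which is why passive stereo is only reliable for the first few metres:

// Depth from disparity: Z = f * B / d
// f = focal length in pixels, B = baseline in metres, d = disparity in pixels.
static class StereoDepth {
    public static float Depth(float focalPx, float baselineM, float disparityPx) {
        return focalPx * baselineM / disparityPx;
    }
    // e.g. Depth(700f, 0.12f, 28f) ≈ 3.0 m
}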
TODO
The human visual system is really good and is often taken for granted; replicating this with a computer is very difficult.
Cognitive processing of color is dependent on context: neighboring colors, not absolute values. Hence, using a color mask for filtering is likely to fail unless you have control over lighting.
Low-level image processing:
- Image compression
- Noise reduction
- Edge extraction
- Contrast enhancement
- Good for humans but for computers, it is just throwing away information
- Segmentation
- Thresholding
- Morphology
- Image restoration
- e.g. if camera velocity known, can correct for motion blur
TODO
Recognition:
- Shading
TODO:
- Sports:
- American football: touchdown line
- Swimming: flags
Perfect 3D point cloud -> 3D model is very difficult
Modelling the natural world: extremely difficult as there is variation. Manufacturing produces many copies of a single product, but nature does not.
Vision systems:
- Active:
- Laser scanner
- Structured light
- Project lots of dots; use dot size to determine distance
- Time of flight:
- Use time it takes for light to return to camera to determine distance: gives distance value for every single pixel
- Passive
- Stereo
- Cheap, works well in good lighting
- Structure from motion/3D reconstruction:
- Deep learning with moving camera to reconstruct 3D scene
Color:
- Visible spectrum is a tiny part of the electromagnetic spectrum
- Sun: greatest energy output at visible wavelengths
TODO
- Natural Feature Tracking:
- Keypoint detection
- SIFT, SURF, GLOH, BRIEF, FREAK etc.
- Descriptor creation and matching
- Outlier removal (e.g. RANSAC; see the sketch after this list)
- Pose estimation and refinement
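A deliberately simplified sketch of the RANSAC-style outlier removal mentioned above: here the fitted 'model' is just a 2D translation between matched keypoints, whereas real natural feature tracking fits a homography or full 6-DoF pose (names and thresholds are illustrative):

using System.Collections.Generic;
using UnityEngine;

// RANSAC sketch: repeatedly fit a model to a minimal random sample of matches
// and keep the hypothesis with the most inliers.
static class RansacTranslation {
    public static Vector2 Estimate(List<(Vector2 a, Vector2 b)> matches,
                                   int iterations = 100, float inlierThreshold = 3f) {
        var rng = new System.Random();
        Vector2 best = Vector2.zero;
        int bestInliers = -1;

        for (int i = 0; i < iterations; i++) {
            // Minimal sample: one correspondence fully determines a translation.
            var sample = matches[rng.Next(matches.Count)];
            Vector2 candidate = sample.b - sample.a;

            // Count matches consistent with this hypothesis.
            int inliers = 0;
            foreach (var (a, b) in matches)
                if ((b - a - candidate).magnitude < inlierThreshold) inliers++;

            if (inliers > bestInliers) { bestInliers = inliers; best = candidate; }
        }
        return best;  // a real implementation would refine on the inlier set
    }
}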
Fiducial markers:
- No databases required
- Intrusive in the environment
- Must be fully in-view
05. Mixed Reality Displays
Rob Lindeman: Professor & Director of HIT Lab.
Displays
Definitions
Virtual Reality
Rob first defined VR as:
Fooling the senses into believing they are experiencing something they are not actually experiencing
Lindeman, 1999 (PhD)
Today, he has a new definition:
Fooling the brain into believing it is experiencing something it is not actually experiencing
Mixed Reality
Mixing (not overlaying) of real-world (RW) and computer-generated (CG) stimuli.
This requires matching attributes such as:
- Visual: Lighting, shadows, occlusion, level of fidelity
- Aural: Sound occlusion, reflection
- Other senses?
Milgram’s Reality-Virtuality continuum: different displays influence the quality of the experience.
General Display Types
NB: humans are animals and, as such, evolutionary pressures have guided the development of our senses. Displays that leverage the different strengths and weaknesses of those senses are more likely to be effective.
Senses:
- Visual
- Very good visuals: high framerate, good lighting simulation
- Auditory
- Very good spatialized audio
- Haptic
- Application-specific, cumbersome
- Catch-all for many different senses:
- Force/pressure
- Slipperiness
- Vibration
- Wind
- Temperature
- Pain
- Proprioception
- Sensitivity varies greatly
- Haptics is bidirectional:
- Tight coupling between sensing and acting on the environment
- e.g. picking up a cup: use haptics to tread the line between slipping and crushing the cup
- Tactile/force devices:
- Pin arrays for the fingers: individually actuated pins
- Force-feedback ‘arms’
- ‘Pager’ motors
- Particle brakes: stopping motion
- Passive haptics
- Most successful haptics are very application-specific (e.g. surgical devices)
- Virtual contact
- What should we do when contact has been made with a virtual object?
- Should the virtual hand continue to mirror the pose of physical hand, or be blocked by the wall?
- The output of collision detection is the input to virtual contact
- Cues for understanding the nature of contact with objects are typically over-simplified (e.g. sound)
- Vibrotactile displays:
- Use of vibration motors as a display
- US Navy TSAS project: communicate which direction is ‘down’ to pilots during maneuvers
- Haptic vest: communicate collision direction, strength to users
- Wind feedback: head tracking + fans
- Olfactory
- Very hard - too many types of receptors
- Almost all human-perceivable colors can be produced from just three sub-pixel types
- Nose has ~15,000 types of receptors
- Gustatory
- Know the base tastes, but no way of producing or delivering them
- Meta cookie: AR display, air pumps with different smells, (tasteless?) cookie with marker burned into it
Display anchoring:
- World-fixed
- View-fixed
- Body-worn
- Hand-held
Visual display types:
- World-fixed displays
- Fishtank/desktop VR
- Projection AR
- Body-worn displays:
- Opaque HMDs (VR)
- Transparent HMDs (AR)
- Hand-held displays:
- Tablet/phone VR/AR
- Boom-mounted screens (not too common today)
Mixing Reality
Visual
NB: we don’t need to simulate reality, just need to make it good enough to make the brain believe it is physically correct.
Direct:
Real-world signal ----> Environment ----> Human sensory subsystem ----> Nerves ----> Brain
Possible mixing points: a display (environment), the retina (sensory subsystem), the optic nerve (nerves), direct cranial stimulation (brain)
Captured/mediated
Real-world ----> Environment ----> Capture device ----> Post-processing ----> Captured signal
Audio
Real-world ----> Environment ----> Outer ear ----> Middle ear ---> Inner ear ----> Nerves ----> Brain
- Typical AR/VR systems use speakers (environment) or headphones (outer ear)
- Mixing could also be performed at the middle/inner ear using bone conduction
Mic-through AR:
- Microphone glued to earbuds
- PC mixes audio for virtual user
Hear-through AR:
- Acoustic-hear-through AR: multiple speakers placed around the room
- Bone-conduction: ears are not covered so can continue to hear
- Mixing at the sensory subsystem
- Own voice: combination of sound reaching the ears through the air, plus vibration reaching the cochlea through bone conduction
Visual Mixing
Projection:
- Project virtual content on top of the physical world
- Examples:
- Microsoft IllumiRoom (2013):
- Use projector to ‘extend’ TV content
- Can also distort and re-project room texture
Optical-see-through AR:
- HMD with transparent display
- e.g. Microsoft Hololens, Magic Leap
Optical-see-through Projective AR:
- Projection onto retro-reflective surfaces: only visible to the user wearing the projector
Video-see-through AR:
- Camera on headset: camera feed mixed with virtual content and displayed in headset display
- Benefit: easy to remove things from reality: hard/impossible in optical-see-through systems
- e.g. Varjo XR-1
Visual Cues
Do we need stereo, which is one of the major things added by VR compared to traditional displays?
Monoscopic cues:
- Overlap (interposition)
- Shading/shadows
- Size
- Linear perspective
- Texture gradient
- Height
- Atmospheric effects
- Brightness
Stereoscopic cues:
- Parallax between two images
- Only good for within a few meters of the cameras
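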
Motion depth:
- Changing relative position of head and objects
- User (e.g. head movement) and/or object movement
- Proprioception can disambiguate between these two cases
Physiological cues:
- The eye changes during viewing
- Accommodation: muscular changes of the eye
- Convergence: movements to bring images to the same location on both retinas
Masking/Occlusion
Making a physical object block a virtual one.
- CAVE (CAVE Automatic Virtual Environment)
- Projection of VR content onto room surface
- Need to create mask to prevent projection on physical objects
- HMD: masking not necessary; mixing is done virtually
- Fishtank VR: display edge/bezels can break effect
Real-world Problems with Immersion
- Feeling sick after using VR for a prolonged period
- Popcorn problem: can’t interact with physical objects (or eat) without taking the headset on and off
- Communication: very difficult to talk with someone using a VR headset
Dynamic immersion:
- Open VR headsets which allow the user to see the real world in their peripheral vision
- Lindeman: replaced part of Google Cardboard's frame with LCD panels (no backlight) that could be turned on and off
- Later version added eye tracker + tiny LCDs with eyes on the outside
Visuals & Sound
Visuals and sound are non-intrusive senses; touch etc. requires something on or in your body.
Final Thoughts
- Real world stimuli: high fidelity/low control
- CG stimuli: lower fidelity but complete control
- Far, far future: a 3D printer and robot that quickly creates objects and puts them in the environment
- Later mixing point = more ‘personal’ stimuli (closer to the brain)
- Multi-sensory approaches compensate for weaknesses in one sense with another
06. Interaction in VR
Rob Lindeman, Director of HITLab NZ.
User interaction:
- How can we do things in VR environments?
- What controllers/inputs do they support?
- As a user, what kinds of things would I even want to do?
- Is the task fatiguing? Will it make me feel sick?
The state of VR:
- 1980s/1990s:
- Much hype
- No inroads into everyday life:
- Lagging technology
- But the technology we have today seems close
- Lack of understanding of usability issues
- Lack of ‘killer apps’ (games?)
- Some use in specific scenarios:
- Surgical simulation
- Military training
- Phobia therapy
- Oil/gas visualization
- Automotive design
- 2000:
- Growth of video games led to:
- Immense increase in hardware power
- Reduction in hardware costs
- Immense growth in number of users/gamers
- Many ‘new’ interface devices
- Better understanding of 3D-UI
- Gaming emerged as the killer app (?) (at least for now)
- 2010s:
- Lots of new hardware:
- WiiMote, WiiMotion Plus
- Kinect
- Powerful smartphones (economies of scale)
- ~12 VR/AR headsets announced
- Lots of controllers
- 2015:
- A lot of VR headsets
- Very few AR headsets
- Many phone-integrated headsets (e.g. Google Cardboard, Samsung Gear VR)
- Some locomotion controllers (e.g. treadmills), but no consumer hardware
- A lot of companies go out of business: no use case for the products
Motivation for studying VR interaction:
- Mouse + Keyboard great for general desktop UI tasks:
- Text entry, selection, drag/drop, scrolling, rubber banding etc.
- Fixed computing environment
- 2D windows, 2D mouse
- But how do we design effective techniques for 3D?
- With a 2D device?
- With multiple n-D devices?
- New devices?
- 2D interface widgets?
- With a new language; new interaction techniques that can support the new environment
- Gaming:
- Tight coupling between action and reaction
- Requires precision
- VR gives real first-person experiences, not just views
- HMDs:
- Look behind by turning your head
- Selecting/manipulating objects:
- Reach out with your hand and grab it
- Travel:
- Walk
- Except that you are still in a confined physical space with objects and cats in the way
- Doing things that have no physical analog is more problematic
- How do you change font size?
Existing input methods:
- Joystick, trackballs, trackpoints, trackpads, tablets, gaming controllers
- General vs purpose-built controllers:
- General purpose:
- Single device used for many things
- Mouse, joystick, gamepad, game controllers, Vive/Touch controllers, WiiMote etc.
- Okay for many tasks, but not optimal
- Special purpose:
- Typically used for a specific task (e.g. driving, playing guitar)
- Very effective for a given task
- Current devices:
- PlayStation
- Vibration motors with different weights in each wing to allow varying vibration strength
- PS5: shoulder/trigger buttons have motors for force feedback (variable resistance)
- Xbox controllers
- WiiMote
- Leap Motion hand tracking
- Kinect body tracking
- Hand-held devices:
- Smartphones
- Tablets
- Nintendo DS
- Sony PSP
Classification Schemes
Relative vs absolute movement:
- Mice return a delta: relative motion
- Touch screens, pen tablets return absolute position
Integrated vs separable degrees of freedom:
- e.g. Etch-a-sketch has separate X and Y controls
- Motions that are easy with one are hard with the other
Analog vs digital:
- Continuous vs discrete input: prefer the former
Isometric vs isotonic:
- Isometric: infinite resistance (no motion), but force sensing
- e.g. ThinkPad TrackPoint
- Isotonic: zero resistance
- In reality, devices exist on a continuum of elasticity
- Mice are mostly isotonic, but they do have mass and hence inertia
- Some controls (e.g. joysticks) are self-centering
Rate control vs position control:
- Mice
- Usually position control
- Scrolling:
- Scroll wheel: position control
- Windows middle click and drag: rate-controlled scrolling
- Trackballs: usually position control
- Joysticks: usually position (cross-hair) or velocity (e.g. aircraft)
- Rate control eliminates need for clutching/ratcheting
- Isotonic-rate and isometric-position control usually poor
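A small Unity sketch contrasting the two mappings for a single joystick axis (uses the legacy 'Horizontal' input axis; names and constants are illustrative). Rate control removes the need for clutching because holding the stick keeps moving the cursor:

using UnityEngine;

public class ControlMappings : MonoBehaviour {
    public Transform cursor;
    public float range = 0.5f;   // position control: metres per full deflection
    public float speed = 1.0f;   // rate control: metres per second at full deflection

    void Update() {
        float axis = Input.GetAxis("Horizontal");   // -1 .. 1

        // Position control: deflection maps directly to a position.
        // cursor.localPosition = new Vector3(axis * range, 0f, 0f);

        // Rate control: deflection maps to a velocity.
        cursor.localPosition += new Vector3(axis * speed * Time.deltaTime, 0f, 0f);
    }
}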
Special-purpose vs general-purpose:
- Game controllers must support many types of games
- Few ‘standard’ mappings: each game can do things differently (for the most part)
- Some special-purpose devices:
- Mostly based on things that already exist in the real world
- Examples:
- Guitar controllers
- Steering wheels
- RPG keyboard/joystick
- Drum kits, dance pads, bongos etc.
Direct vs indirect:
- Direct:
- Click and drag with mouse/stylus/finger
- Touch screen gestures: swipe, two-finger rotate
- Problems:
- Works well for things that have a physical analogue, but not for those without
- May have low precision
- Selection/de-selection may be messy
- Indirect
- Use some widget to indirectly change something
3D Input Devices:
- SpaceBall/SpaceMouse
- Isometric device which senses force
- CyberGlove II
- From the 1990s
- Strain sensors used to sense hand movement
- PHANTOM Omni Haptic Device
- Stylus attached to robot arm
- Force feedback allows user to feel surface
- Limited working volume, only one point of contact, very limited use case
- HMD with 3/6-DOF trackers (e.g. Oculus Quest)
3D Spatial Input Devices:
- Microsoft Kinect
- Leap Motion
- Oculus Touch
- HTC Vive
- Used touch surfaces which could be displaced (clicked) slightly, rather than a joystick
- Very different experience
- Valve Index
- Strapped to your hand: don’t need to actively hold the controller
- Had finger tracking
Motion-Capture/Tracking Systems:
- Probably won’t be that common in the future: people will just use inside-out tracking built into headsets
- Used in movies/TV and games
- Capture actual motion and re-use
- Can be done interactively or offline
- Can capture three or more DoF
- Position, orientation, limbs
- No good general purpose approaches to high-fidelity, full-body tracking without markers
- Attempts:
- Magnetic tracking
- Transmitter creates a magnetic field
- Wired receivers stuck to clothes
- Receivers tracked (relative to the transmitter) using changes in magnetic field
- Pros:
- Fairly lightweight
- Six DoF
- Cons:
- Very noisy near ferrous metal (e.g. rebar)
- Limited range
- Ultrasonic tracking
- Speakers emitting ultrasonic sound, captured by receivers containing microphones
- Used to compute distance
- Receivers had to point in the right direction (hemisphere)
- High resolution and accuracy
- Requires ‘line-of-sight’
- Inertial trackers
- Accelerometers, gyroscopes
- Pros:
- Lightweight
- Wireless
- Cons:
- Error accumulates
- Only moderate accuracy
- e.g. Wii MotionPlus
- Optical tracking
- Multiple fixed markers
- Known camera parameters
- Inside-out tracking: camera attached to user, lab-mounted fixed landmarks
- Outside-in tracking: cameras attached to lab, landmarks attached to user
- Active vs passive:
- Active: markers are lights
- Passive: reflective markers with external light
- PlayStation MOVE:
- Stick with illuminated ball of known size and color
- Camera tracker + internal tracker used for tracking
- Leap Motion:
- Three IR LEDs for illumination
- Stereo cameras used for depth
- Hybrid tracking
- Compensate negative characteristics of one approach with another
Other Input Devices:
- Speech input
- Gestures (e.g. pointing at an object)
- Device actions (e.g. buttons, joysticks)
- Head/gaze, eye blinks
- ‘Put that, there’: hybrid speech + gesture
Special Purpose Input Devices:
- Some applications are more ‘real’ with a device that matches the real action
- Examples:
- Light gun
- Flight simulator motion platform
- Snowboard/surfboard
- Pod racer
- Motorcycle
- Sensors are very cheap today: you may be able to simply attach some sensors to a passive object
Interaction in VR
Mapping Devices to Actions:
- For each (user, task, environment)
- For the four basic VR tasks
- For each device DOF
- Choose a mapping to an action
VR interaction:
- Must take advantage of people’s real-world experience
- And for those without real-world analogues, allow users to express their intent
- Without making people tired
- Without making people sick
- While making it easy to learn and use
Main interaction tasks (Bowman et al.):
- Object selection/manipulation:
- How does the user select the object they wish to manipulate?
- How do they actually manipulate it?
- Navigation:
- Wayfinding (mental): where am I now, and how do I get to where I am going?
- Locomotion (motor): how do I travel there?
- System control:
- Changing system parameters
- Manipulating widgets
- Lighting effects
- Object representation
- Data filtering
- Approaches:
- Floating windows
- Hand-held windows
- Gestures
- Menus on fingers
- Symbolic input:
- Text/number input
- Avatar control (Lindeman):
- How do you control you?
- And throwing things 😆
Objects:
- Issues:
- Ambiguity when there are multiple objects the user could be pointing to
- Distance
- Selecting multiple objects
- Releasing objects
- Selection approaches:
- Direct/enhanced grabbing (latter: items further away than arm's reach)
- Ray-casting
- Image-plane
- Manipulation approaches:
- World in miniature (WIM): miniature world representing the world you are in
- Can you pick yourself up?
- Skewers
- Surrogates
- Modifying objects:
- Choose among object properties
- Natural mappings of actions to changes
- Arbitrary mappings
Object selection in the real world:
- Touching/grabbing
- Pointing
- Finger: direct
- Pointer: extended
- Mouse: indirect
- Voice: ask someone
- Context
- Eye gaze
Selection-task decomposition:
- Indicate:
- Denote which object we intend to select
- On desktop: move mouse
- In VR:
- Avatar hand-movement
- Device movement
- Virtual ‘beam’ (ray casting from the hand/controller; see the sketch after this list)
- Confirm:
- On desktop: mouse click
- In VR:
- Click
- Dwell (timeout)
- Verbal cue
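A minimal Unity sketch of 'indicate + confirm' with a virtual beam (the pointer transform, max distance and confirm button are illustrative; scene objects need colliders to be hit):

using UnityEngine;

// Ray-cast from a tracked hand/controller transform to indicate an object,
// then confirm the selection with a button press.
public class RaycastSelector : MonoBehaviour {
    public Transform pointer;        // tracked controller/hand transform
    public float maxDistance = 10f;
    private Transform indicated;

    void Update() {
        indicated = null;
        if (Physics.Raycast(pointer.position, pointer.forward, out RaycastHit hit, maxDistance))
            indicated = hit.transform;                          // indicate

        if (indicated != null && Input.GetButtonDown("Fire1"))  // confirm
            Debug.Log($"Selected {indicated.name}");
    }
}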
Reaching objects:
- Indicating at a distance
- Go-go: greater than 1:1 mapping when the arm is extended beyond a certain distance (see the sketch after this list)
- Two-handed pointing
- World in miniature
- Flashlight
- Voodoo dolls
- Image plane technique: user pinches the object; determine the XY location on the image plane and then select the front-most object at that point
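A small sketch of the Go-Go mapping mentioned above (Poupyrev et al., 1996): 1:1 within distance D of the torso, then pushed out by k·(r − D)²; the D and k values here are illustrative:

using UnityEngine;

// Non-linear arm extension: the virtual hand matches the real hand up to D,
// then reaches further the more the real arm is extended.
public class GoGoHand : MonoBehaviour {
    public Transform torso;        // approximate chest position
    public Transform realHand;     // tracked hand/controller
    public Transform virtualHand;  // rendered hand
    public float D = 0.4f;         // threshold distance in metres
    public float k = 10f;          // gain

    void Update() {
        Vector3 offset = realHand.position - torso.position;
        float r = offset.magnitude;
        float rv = r < D ? r : r + k * (r - D) * (r - D);
        virtualHand.position = torso.position + offset.normalized * rv;
        virtualHand.rotation = realHand.rotation;
    }
}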
Manipulation:
- Typical tasks:
- Repositioning
- Rotation
- Property modification
- Approaches
- WIM
- 3D widgets:
- Virtual sphere for rotation
- Jack for scaling
- Non-isomorphic translation/rotation
- Skewers
- 2D widgets
Design Guidelines:
- Use existing techniques unless you really need to
- Match the interaction technique with the device
- Use task analysis (e.g. does the task need high precision?)
- Use techniques that can help reduce clutching
- When dragging with mice: if you reach the end of the mouse pad, you need to grip and pick up the mouse, then move it to the opposite end of the mouse pad
- Non-isomorphic techniques are more useful and intuitive
- Use pointing techniques for selection; virtual hand techniques for manipulation
- Use grasp-sensitive object selection
- Constrain degrees of freedom when possible
- Fewer mistakes, less annoyance
- There is no single best interaction technique: just test, test, test
Research papers:
- Object Impersonation (Wang, IEEE VR 2015)
- Open headset: user can simultaneously use tablet
- User becomes the object:
- e.g. moving a light: user’s head becomes the light, can turn head to position beam
- e.g. paving a road: the path you move in becomes the road
- Navigation: wayfinding
- Easy to get lost in VR
- Traditional tools:
- Maps (North vs forwards up)
- Landmarks
- Spoken directions
- Non-traditional
- Callouts
- Zooming
- Navigation: travel
- Limited physical space, possibly infinite virtual space
- Different travel types:
- Walking, running, turning, strafing, back stepping, crawling, quick start/stop, driving, flying, teleporting
- Also lying down, kneeling, ducking, jumping
- Travel isn’t the goal: usually doing other things while travelling
- Initial Exploration of a Multi-Sensory Design Space:
- Spinning chair: user leans forwards/backwards to move
- Fans opposing direction to travel to simulate feeling of movement
- Finger Walking (Yan et al., 2016)
- Short distance: finger tapping on a touchpad to ‘walk’
- Long distance: two fingers (like feet on a hoverboard), force determines speed and angle between fingers determines angle
- TriggerWalking (Sarupuri et al., 2016):
- Tap controller trigger to walk: controller orientation controls movement direction
- System Control using Hybrid Virtual Environments (Wang, 3DUI 2013):
- Open headset
- Tablet taped to arm used for selecting objects
- Drop items in the scene in VR
- Avatar Control
- Paul Yost
- IMU controllers attached to arms, feet for full body tracking
The ‘optimal’ interface depends on:
- The capability of the user:
- Dexterity
- Expected level of expertise
- The task:
- Task complexity
- Granularity: how precise does the input need to be?
- The environment:
- Stationary, moving, noisy environments
07. Interaction in AR
Stephan Lukosch
AR Interface Foundations:
- AR requirements:
- Combining real/virtual images: display technologies
- Interactive in real-time: input & interactive technologies
- Registered in 3D: viewpoint tracking technologies
- AR feedback loop:
- User input, camera movement
- Pose tracking
- Registration of virtual content
- Situated visualization
- Augmentation Placement:
- Relative to:
- Head
- Body
- Hand
- Environment: tables, walls, mid-air
- Displays:
- Head-mounted (glasses)
- Hand-held projector
- Hand-held display (smartphones)
Designing AR system = interface design which satisfies the user and allows real-time interaction
Interacting with AR content:
- Augmented reality content is spatially registered: how do you interact with it?
- By touch:
- Hololens clicker: one button controller
- By raycasting:
- Cast a ray passing through eye and controller
- By hand tracking:
- Hand is recognized and mapped to a hand model
- Gestures (e.g. pinch) allow interaction
- Body tracking
- Skeleton tracking provides whole-body input
- Requires some sort of tracking system (e.g. external cameras)
Evolution of AR interfaces
Expressiveness and intuitiveness has increased over time:
- Browsing:
- 2D elements registered to real-world content
- For visualizing; limited interaction with the content
- Mostly hand-held devices
- 3D AR:
- Allows manipulation of 3D objects anchored in the real world
- Dedicated controllers, head-mounted displays, 6DOF tracking
- One of the most important interaction classes within AR
- No tactile feedback: just visual
- Tangible UI:
- Rekimoto, Saitoh, 1999
- Virtual objects projected onto a surface
- Physical objects used as controls for virtual objects
- Supports collaboration
- Ishii and Ullmer, 1997
- Tangible bits
- [Augmented Groove, 2000]:
- Mapping physical actions to MIDI
- Limitations:
- Difficult to change object properties
- Limited display: projected onto a surface or screen
- Separation between object and display
- Advantages:
- Natural: user’s hands can be used to interact with both real and virtual objects
- No need for special purpose input device
- User intuitively knows how to use the interface
- Tangible AR
- Tangible interfaces have a tangible gap: interaction and presentation are on 2D surfaces.
However, there is no interaction gap: same input devices can be used for physical and virtual objects.
Tangible AR tries to close both gaps:
- Physical controllers for moving virtual content
- Support spatial 3D interaction
- Support multi-handed interaction
- Time and space multiplex interaction
- Space-multiplexed:
- Many devices with one function
- More intuitive, quicker to use
- e.g. (physical) toolbox
- Poupyrev et al., 2003
- Different functionality assigned to markers
- Opaque functionality: couldn’t tell what the marker would do by looking at it
- Time multiplexed
- One device with many functions
- Space efficient
- e.g. mouse
- VOMAR:
- Catalog book; tap a paddle against a page/section to choose the functionality of the paddle
- Natural AR:
- Use of natural user input: freehand gestures, body motion, gaze, speech
- Multimodal input: not all input methods are appropriate in all situations
- HITLabNZ spider demo:
- Overhead camera with depth captures real-time hand model
- Can get spider to crawl over your hand
- Presence: how believable the virtual content is to the user
- Hololens 2:
- Continuous 3D hand tracking
- Hololens sometimes overlays a blue hand over the user's hand to reassure them that tracking is working, rather than overlaying a virtual hand the whole time (which would be distracting): less is better
- Gesture-driven interface
- Speech input:
- Commands applied to the object you are currently looking at (gaze tracking)
- Good for quantitative input (numbers, text)
- Precise input difficult in AR
Designing AR Systems
Basic design guidelines:
- Provide a good conceptual model and metaphor
- True for any kind of user interface
- Make things visible:
- If an object has a function, then the user interface should show it
- Even if a function is obvious, the user may not realize that the system supports this
- Map interface controls to the customer’s model
- Not that of the system implementation
- Provide feedback: WYSIWYG
- Interface components:
- Physical objects
- Interaction metaphor
- Virtual objects
Affordances
Objects are purposely built: they include affordances and make them obvious.
Affordances: an attribute of an object that allows people to know how to use it
Physical affordances:
- Chairs are to sit
- Handles are to twist and pull
- Scissors are to cut
- Surface Dial: we expect circular objects to be spun
Interfaces:
- Virtual objects do not have ‘real’ affordances
- They are better conceptualized as ‘perceived’ affordances
- Based on people’s prior experiences
- Common/repeated metaphors become ingrained in users
Augmented reality:
- Physical: tangible controllers and objects
- Virtual: virtual graphics and audio
Case Studies
Navigating a spatial interface:
- Menu displayed over hand; other hand used as pointer
- Menu attached to marker held in one hand; other hand used as pointer
- Interaction with cylinder: rotate to select
- One hand interaction: gesture to select?
- Place marker on surface: another marker used for selection
- Menu fixed at one location: stable
Workspace awareness in collaborative AR:
- Local player solving puzzle
- Remote instructor gave advice in AR
- Knew solution, gave advice either visually or aurally
- Audio made users more aware of the instructor’s actions, but was also more distracting
- Visual allowed players to follow instructions better than audio
- HMD had very limited FoV: local player may not notice when instructor marked a piece
- Object selection with HMD
- Tried reducing brightness of background and blurring: neither worked out
- Lighting plays an important role in depth perception
3D AR lens:
- Magnifying glass with physical handle and a marker where the lens would be
- When in AR, acted like a real magnifying glass
Magic book:
- 3D model shown in book using marker/texture tracking
Interaction Design
The process of:
- Discovering requirements, designing to fulfil them, producing prototypes and evaluating them
- Often requirements will be conflicting: you must make trade-offs that will best suit your future users
- Focus on users and their goals
- Trade-offs to balance conflicting requirements
- Approaches:
- User-centered design: user knows the best and guides the designer; the designer translates user needs and goals
- Activity-centered design: focus on user behavior around tasks: their behavior determines the goals
- System design: system is in the focus and sets the goal
- Genius design/rapid expert design: design based on the experience and creativity of the designer
Solving the right problem:
- Engineers and business people are trained to solve problems
- Designers are trained to discover problems
- We should rather have no solution than a brilliant solution to a non-existent problem
- Designers should:
- Never start by trying to solve the problem
- Start by trying to understand what the real issues are
- Diverge before converging on a solution
- Study people, their needs and their goals
Double diamond of design:
- Four phases of design:
- Diamond 1:
- Discovery:
- Understand the problem. Never assume what the problem is
- Talk to the users
- Define: use insights from discovery phase to describe and define the problem
- Diamond 2:
- Develop:
- Explore alternative solutions to the problem
- Seek inspiration from elsewhere
- Deliver:
- Test out the solutions; give them to the user and gather feedback
- Reject under-performing solutions; improve promising ones
- Loop through each diamond as many times as required, and return to the start if required
- Principles:
- Put people first: understand the people using the service; their needs, strengths, aspirations
- Communicate visually and inclusively: help people gain a shared understanding of the problem and ideas
- Collaborate and co-create: work with others
- Iterate: spot errors early
Involving users:
- Expectation management:
- Must give them realistic expectations: no surprises, no disappointments
- Timely training
- Communication, but no hype
- Ownership:
- Make users active stakeholders
- More likely to forgive/accept problems
- Can make a big difference in the acceptance and success of the project
Interaction design:
- Discover requirements
- Design alternatives
- Which may lead to requirements being refined
- Prototype alternatives
- Evaluate the product and its user experience throughout
- If it sucks, this means that the requirements were probably incorrect
Practical issues:
- Who are the users?
- What are the users’ needs?
- How do you generate alternative designs?
- How do you choose among alternatives?
- Who are the stakeholders?
- They can influence the success/failure of the project, so involve them and keep them happy
What are the users’ needs?
- Users don’t know what is possible
- Instead:
- Explore the problem space
- Investigate user activities: see what can be improved
- Try out potential ideas
Alternative generation:
- Humans tend to stick with what works
- Considering alternatives (the design space) helps identify better designs
- Where do alternative designs come from?
- Flair and creativity: research and synthesis
- Cross-fertilization of ideas from different perspectives
- From users
- Product evolution based on changed use
- Inspiration from similar and different products/domains
- Balance constraints and trade-offs
- Morphological charts:
- List the functions: what does the product need to do?
- e.g. for a beverage container, it must contain the beverage, provide access to the contents, and display product information
- Then for each function, list its means:
- Methods of addressing the functions/user needs
- e.g. for the beverage container, access to the contents could be done through a pull tab, straw, or a cap
- Pick one means for each function (see the sketch after this list)
- Not every combination will be practical or possible
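A minimal sketch of a morphological chart using the beverage-container example above: enumerate every combination of means with one means per function, then filter out impractical combinations. The feasibility rule is an arbitrary placeholder.

```python
from itertools import product

# Functions (what the product must do) and candidate means for each.
chart = {
    "contain beverage": ["rigid can", "flexible pouch", "bottle"],
    "access contents": ["pull tab", "straw", "screw cap"],
    "display information": ["printed label", "embossing", "sleeve"],
}

def feasible(combo):
    """Placeholder rule: e.g. a pull tab does not fit a flexible pouch."""
    return not (combo["contain beverage"] == "flexible pouch"
                and combo["access contents"] == "pull tab")

functions = list(chart)
combinations = [dict(zip(functions, means)) for means in product(*chart.values())]
designs = [c for c in combinations if feasible(c)]
print(len(combinations), "combinations,", len(designs), "feasible")
for design in designs[:3]:
    print(design)
```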
Choosing between alternatives:
- Interaction design focuses on externally-visible and measurable behavior
- Technical feasibility
- Evaluation with users or peers
- Use prototypes, not static documentation: behavior is key
- A/B testing:
- Defining appropriate metrics is non-trivial
- Quality thresholds:
- Different stakeholder groups have different quality thresholds
- Use usability and user experience goals to define criteria
Prototyping:
- Allows the designer and their users to explore interactions and capture key interactions
- Focuses on user experience
- Communicates design ideas
- Learn through doing
- Avoids premature commitment
Typical development:
- Sketching
- Helps to express, develop and communicate design ideas
- Storyboards
- UI mockups
- Interaction flows
- Video prototypes
- Interaction prototypes
- Final native application
Low fidelity prototypes:
- Low development cost allows evaluation of multiple design concepts
- Limits the feedback you can get: error checking, navigational and flow limitations
High fidelity prototypes:
- Fully interactive
- Has look and feel of the final product
- Clearly defined navigational scheme
- Much higher development cost
- Sunk cost bias: more reluctant to make changes given the time/effort
08. Collaboration in Mixed Reality
Tuckman’s model of group formation:
- Forming
- Orientating themselves around the task at hand
- Become acquainted with each other
- Testing group behaviors
- Establishing common viewpoints, values
- Establishing initial ground rules
- Storming
- Marked by intense team conflicts
- Leadership and roles determined
- Project and tasks redefined
- Characteristics:
- Disagreements
- Resistance to task demands
- Venting of disagreements
- High level of uncertainty about the goals
- Norming
- Team roles cleared up
- Agreement on how the team can work with each other
- Clear expectations and consensus on group behaviors and norms
- Consensus on group goals, quality standards
- Forming the basis for behavior for the remainder of the project
- Performing
- Active work on a project
- Clearly understood roles, tasks, and well-defined norms
- Sufficient interest and energy from all team members
- Adjourning
- Dissolution of the team: team tasks are accomplished and the team disbands
- Possible feelings of regret
Drexler’s team performance model:
- Orientation: why am I here?
- Trust building: who are you?
- Goal clarification: what are we doing?
- Commitment: how are we doing it?
- Implementation: who does what, when, where?
- High performance
- Renewal
Collaboration
Definitions
- Wood and Gray, 1991: a process that occurs when a group of stakeholders engage in an interactive process using shared rules, norms and structures to act or decide on issues related to that domain
- Terveen, 1995: a process in which two or more agents work together to achieve a shared goal
- Knoll and Lukosch, 2013: an interactive process in which a group of individual group members use shared rules, norms and structures to create or share knowledge in order to perform a collaborative task
Designing collaboration
Collaboration is affected by internal and external factors:
- The group: size, proximity, experience
- Task: type and complexity
- Context: organizational culture and environment
- Process: interactive process, shared rules, norms
- Tools: technology and their limitations
Collaboration outcomes:
- Creative ideas for activities
- Shared understanding
- Commitment
- Consensus
- Sharing perspectives and visions
- More objective evaluation
- Acceptance
- Mutual learning
- Shared responsibility
e.g. using AR to help people understand impacts of climate change.
Collaboration Challenges
Piirainen et al., 2012 - group perspective:
- Shared understanding:
- Ensure the team has a shared understanding and mental models of:
- The problem
- The current state of the system
- The envisioned solution
- Satisfying quality requirements/constraints
- Balancing rigor and relevance
- The more formal the process, the slower you go but the more you can involve and understand stakeholders
- Organizing and ensuring effective, efficient interaction between actors
- Ensuring ownership
- Team members must pick up tasks and take ownership of them
Nunamaker et al. 1997 - process perspective:
- Free riding
- Especially in larger groups
- Dominance
- Both the amount of work done and of decision-making power
- Group think
- Hidden agenda
- Fixed design
- Process limits the design space the group can explore
- Lack of expert facilitators
Haake et al., 2010 and Olson and Olson, 2000 - tool perspective:
- Google Docs, email, video conferencing, etc.
- No regular use
- Variety
- Not intuitive
- Difficult to adapt to group needs
- Collaboration awareness
- Being aware of when other people have made changes
- Co- and spatial referencing
Collaboration Design from a Tool Perspective
Time-space matrix of Computer-Supported Cooperative Work (CSCW):
- Same place, same time (synchronous interaction): face-to-face interaction
- Same place, different time (asynchronous interaction): shared files, team rooms etc.
- Different place, same time (synchronous distributed): video calls, shared editors etc.
- Different place, different time (asynchronous distributed): email, newsgroups etc.
In AR:
- Synchronous, co-located: AR shared space
- Synchronous, remote: AR telepresence
- Asynchronous, co-located: AR annotations/browsing (in-situ)
- Asynchronous, remote: generic sharing
3C model:
- Communication: information exchange to facilitate a shared understanding
- Coordination: arranging task-oriented activities
- Collaboration: working together towards a shared goal
- Group awareness mediates the relationship between the 3Cs: none of them are possible without it
Human-Computer-Human Interaction Design
- Software design: software interacts with other software
- Human-computer interaction design: humans interacting with computers
- Human-computer-human interaction design: several humans in front of several computing devices working together towards a shared task
- Computers must interact with each other, and humans must interact with each other as well
Oregon Software Development Process (OSDP) (Lukosch, 2007):
- Oregon Experiment, Christopher Alexander:
- University campus did not put down any footpaths initially, but waited to see what trails the students would make
- Patterns for computer-mediated interaction:
- High-level patterns:
- Focus on issues and solutions targeted at end users
- Empower end users to shape their groupware application
- Low-level patterns:
- Describe issues/solutions targeted at software developers
- Focus on system implementation and includes technical details
- Example: remote field of vision
- Collaborative whiteboard/canvas: need to know where team members are and where they are looking
- Possible solution: multi-user scrollbar (multiple narrow scrollbars)
- Understand where their team members are both globally and relative to themselves
- Users can see roughly how much of their screen space intersects with another user's, and where (see the sketch after this list)
- Iterations follow design -> implementation -> test/usage -> planning cycle
- Conceptual iteration
- Talking to users, understanding the problem space, creating prototypes
- Use of patterns to discuss high-level ideas with users
- Developers use the low-level patterns to plan the implementation
- Development iteration:
- Requirements analysis
- Low-level patterns used to plan and design groupware
- Functional tests
- Tailoring iteration:
- Users have used the prototypes and have provided feedback
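A minimal sketch of the 'remote field of vision' pattern: compute how much of one user's viewport on a shared canvas overlaps another user's, which is roughly the information a multi-user scrollbar would convey. Rectangles and names are illustrative.

```python
def viewport_overlap(a, b):
    """Overlap of two viewports given as (x, y, width, height) on a shared canvas.

    Returns the fraction of viewport `a` that is also visible to `b`.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))   # horizontal overlap
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))   # vertical overlap
    return (ix * iy) / (aw * ah)

local  = (0, 0, 800, 600)
remote = (400, 300, 800, 600)
print(f"{viewport_overlap(local, remote):.0%} of my view is shared")  # 25%
```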
Workspace Awareness in Collaborative AR
Types:
- Informal awareness:
- General sense of who is around and what they are up to
- Not necessarily related to project work
- Social awareness:
- Understanding of the person:
- What they are interested in
- Their emotional state
- What they are paying attention to
- Group-structural awareness:
- Knowledge about the group structure:
- Roles/responsibilities/status
- Positions on issues
- Group processes
- Workspace awareness:
- Understanding of the task space
- Interaction of others with the space and its artifacts
Awareness categories and elements:
- Who:
- Presence: is anyone in the workspace?
- Identity: who is participating?
- Authorship: who is doing that?
- What:
- Action: what are they doing?
- Intention: what is their goal?
- They are doing x in order to achieve y
- Artifact: what object are they working on
- Where:
- Location: where are they working?
- Gaze: where are they looking?
- View: where can they see?
- Reach: where can they reach?
- Can children or short people access it?
Workspace awareness:
- Knowledge: who/what/where/when/how
- is used to determine what to look for next
- Exploration
- is used to gather perceptual information
- The environment
- aids in interpreting the perceptual information
- Knowledge
- is used to help with collaboration
- Coordination of activities
- Anticipation of events
- which impacts the environment
Case Studies
Workspace awareness in collaborative AR:
- Remote expert can see what the player can see and give hints on how to complete the puzzle
- Expert given a gray box which represents the size of the Hololens' display
- Expert can freeze the view:
- The view continually changes, which makes it difficult to focus
- Hence, they should be able to freeze the view (and annotate it), possibly in a separate window
- The remote person must be able to communicate to the local person that they have made changes or annotated something: workspace awareness
- They can, of course, talk, but the paper tried adding automatic notifications:
- Aural: TTS when the remote user adds/selects/deletes an object, or when they freeze/unfreeze the view
- Visual: small blinking icon
- Results:
- Audio is much more noticeable, but also more annoying
- Participants preferred visual notifications
- Game played in two environments (physical/augmented reality)
- Investigate how a remote person can try to help a local team
- Three players had to build a Lego tower following certain constraints:
- Each player has access to a subset of the constraints
- Each player could only move certain-colored blocks
- But also some blocks that everyone could move
- Two players co-located, one player remote
- Co-located users wearing HMDs, remote user viewing laptop
- The same group also played it physically (with randomized order)
- Asked AR presence questionnaire:
- Interaction/immersion
- Interference/distraction
- Audio/tactile experience
- Moving in environment
- Results:
- Mental demand not significantly different
- Physical demand in AR higher
- Finger has to hover in midair
- Slower in AR
- Presence:
- Interaction in AR much more difficult, and impacts concentration
- Difficult for remote player to understand and foresee the other people’s actions
- Co-located AR players reported tactile experience (even though it was completely virtual)
CSI The Hague:
- Collaboration with The Hague police, circa 2009
- Special skills required to secure evidence
- Need to capture evidence early on, but collector is likely not an expert
- Expert could remotely help the on-site person
- Video see-through HMD
- Two webcams used for SLAM:
- 3D pose estimation
- Dense 3D map
- Remote user could explore the space in VR
- Bare hand tracking for gesture-based interaction
- Evaluation:
- Lack of protocol for collaboration
- High mutual understanding
- Picture-oriented information exchange
- High consensus: both parties can see the same video stream
- Data integrity: how do you ensure the evidence has not been modified?
- Responsibility: if the crime scene gets messed up, who is responsible - the local person or the expert?
Burkhardt et al., 2009: seven dimensions of collaboration:
- Fluidity of collaboration: verbal turns (cues?)/actions
- Sustaining mutual understanding
- Information exchange for problem-solving
- Argumentation and reaching consensus
- Task/time management
- Cooperative/collaborative spirit in the team
- Awareness of their individual tasks and contribution
09. Creating Multiple-Sensory VR Experiences
Part 1: Yuanjie Wu
Yuanjie Wu, post-doc researcher at HIT Lab.
(currently in Auckland).
Senses
Creating a realistic experience must provide a multi-sensory experience and create a sense of presence.
A VR system can be modeled as a loop of:
- Input: data coming into the system from the user
- Application: physics simulation, user interaction
- Rendering: transforming data from a computer-friendly format into a human-friendly format - visual, aural, haptic, olfactory, gustatory
- Output: feedback perceived by the user
What is ‘input’ and ‘output’ depends on the point of view: the system or the human.
Subjective reality: the way an individual experiences and perceives the external world in their own mind.
Brains consciously and sub-consciously find patterns. The sub-conscious can be thought of as a filter that only allows information that does not conform to the patterns to pass through.
Perceptual illusions provide insight into some of the shortcuts the brain makes:
- Jastrow and Ponzo railroad illusion: brain can misinterpret size
- Moon illusion: moon appears larger when on the horizon (compared to high in the sky) as there are foreground items that can be used as a frame of reference.
- Ouchi illusion: rectangles appear to move
Mental models: NLP (neuro-linguistic programming)
- External stimuli (senses) pass through
- Filters, which delete, distort and generalize the information
- Based on meta programs, values, beliefs, attitudes, memories, decisions
- Which consciously and unconsciously impacts the person’s
- Internal state:
- Mental model
- Emotional state
- Physiology
VR research problems:
- Avatars
- Tracking
- Cybersickness
- Locomotion
- Navigation
- Perception/cognition
- Social dynamics
- Safety
- Ethics
- Sensory delivery:
- Tactile (e.g. force feedback, temperature, pressure)
- Olfactory/gustatory
- Evaluation metrics
- Interaction/manipulation
- Latency/FOV
- Fatigue
Multi-sensory VR systems
- Sub-systems:
- Stimulation of the senses
- Requires specific hardware, software and protocols
- Data processing
- Pre-processing:
- Filtering
- Serialization
- Transmission
- Integration: combining all data into one rendering system
- Data fusion
- Application
Subject wearing HMD in a cage:
- Enough space to walk around a little
- External cameras track position
- Fans mounted on cage used to direct wind
- Aroma diffusers using multiple scent bottles
- Speakers attached to the floor used for vibration
- e.g. simulating off-road driving
Avatar system:
- Control system
- Full body tracking with multiple Kinect cameras
- Needed to estimate orientation - Kinects could not determine if they were looking at the person’s front or back
- Leap motion attached to the headset for natural hand tracking
- Limited tracking range: users had to put their hands directly in front of them
- Fix: stick 5 Leap motion sensors onto the headset
- HTC Vive lighthouse used for HMD positioning?
Realism:
- Appearance realism
- Behavior realism
- Verbal behavior
- Non-verbal behavior
- Body movement, facial expressions etc.
Part 2: Rory Clifford
Dr. Rory Clifford, post-doc research fellow at HIT Lab.
Focus on training simulations, cultural preservation.
What creates a profound VR experience?
- Emotion
- Sound
- Movement
- Makes users feel present and localized within the space
In the first 30 seconds, you must:
- Grab the person’s attention
- e.g. flashing light to grab the user’s attention
- Provide affordances to navigate the environment
- They may be going the wrong way
- Although both diegetic and non-diegetic affordances can be used, diegetic cues keep the user more immersed
- Provide a natural and intuitive method of interaction
Sound:
- Induces mood
- Deepens the presence
- Adds believability
- 3D spatial sound especially deepens immersion
- Can also help with UI problems like navigation and discovery
- Don’t overdo it
Movement:
- Movement types:
- Teleportation
- 360 video:
- Quick and easy way to produce VR content
- Can only teleport to pre-defined positions
- Gaze-based
- Physical controls
- e.g. replica of steering wheel
Smell:
- Olfactory sensory system
- Direct connection to the brain through cranial nerves: most other sensory input passes through the thalamus - an additional step of processing
- Must limit amount of smell to prevent simulator sickness
- Can trigger memories
- Theory: help users remember VR training when in actual scenarios
Vibro-tactile feedback:
- e.g. jolts, earthquakes, engine vibration
- Low-frequency audio passing through subwoofers or audio transducers
- Can be external (e.g. floor or other hard surface) or fitted (e.g. vest)
- Vests: portable, but users are aware that the vest is there, reducing immersion
Haptics:
- Independent of the sound channel
- Assists with spatial awareness and helping anchor the user in virtual space
- More control over the vibration (supported in game engines)
Fire Emergency NZ (FENZ):
- Aerial firefighting training
- Can only train once a year before fire training
- Can’t exactly start fires for training
- Expensive: requires several aircraft
- Projector-based windows
- Headsets with multiple simulated audio channels mimicking real headsets
- Vibro-tactile feedback in chairs
Modeling the real world:
- Photogrammetry:
- Low accuracy but provides good textures
- Requires cleanup to reduce number of polygons
- LiDAR:
- High-accuracy, high-polygon count
- Camera used for texturing, but not great - should be combined with photogrammetry
10. Human Perception and Presence in MR
Rob Lindeman, Director HITLab NZ.
In popular media:
- UI about complementing the character: their personality, proficiency in technology, basic scene state (e.g. blaring red lights when bad information is coming).
- Impressions matter
- Flow is a good concept to study
- Popular media can give us good ideas
Terms:
- Presence: sense of ‘being there’
- Immersion: being surrounded
- Flow: heightened state of awareness/action
- Situation awareness: clear understanding of surroundings
- Natural interaction: interaction that recedes into the background
- Low cognitive load
‘Being there’
What does it mean to ‘be here’?
- Experience of going through some process to get to a place (e.g. walking through the door)
What does it mean to be together?
- Eye contact with others, talking, shaking hands
How can we re-create these using technology?
In a real environment, we can use:
- Hand-held mobile device
- Phones/tablets
- In-vehicle system
- Navigation/traffic
- Augmented reality
- There++: augmenting reality
For a remote physical environment:
- Phone
- Video conference
- Eye contact difficult: looking at the camera (for eye contact) means you can’t see others
- Teleoperated robots
- Allows movement and possibly even manipulating the environment
- Drones
In virtual environments:
- Video games: FPS, MMOs
- Can be present even without VR
- Multiplayer games mimic physical co-presence
- Immersive learning environments
- e.g. immersive chemistry
- Surgical simulations
- Allows more precision and manipulators
- Allows training on simulated data
In described environments:
- Movies
- Books
- As long as you have the essence, the brain is able to fill in the blanks through their imagination
- However, everyone imagines a different scene: this can lead to disappointment when a book is adapted into a movie
Game Design
What makes a good game?
A great game is a series of interesting and meaningful choices made by the player in pursuit of a clear and compelling goal
- Sid Meier
‘Natural Funativity’:
- Survival-skill training
- Needs to have the player develop a set of skills with increasing levels of difficulty
- Putting them to the test: missions, quests, levels etc.
- Prize at the end (or in the middle)
- e.g. unlocking items, badges, leaderboards
Game structure:
- Movies:
- (typically) have a linear structure
- Are fixed - controlled by the writer/director/cinematographer
- In comparison, games must provide ‘interesting and meaningful choices’
- User must be in control
- Not fun to die due to circumstances outside your control
- Choices must make sense in the context of the story
Flow
Mihály Csíkszentmihályi, Flow: The Psychology of Optimal Experience (1990):
- Heightened sense of perception
- Highly focused on the primary task
- In the ‘sweet spot’ between frustration and boredom
- Occurs in athletes, writers, video gamers, programmers
For game design:
- The ‘sweet spot’ for difficulty is relatively large
- Game difficulty must match the player skill (and increase over time)
- But if it matches exactly, the player will get bored. Hence, difficulty should oscillate slightly
Convexity of game play:
- Provide a choke point: all paths, regardless of what the player chose, should lead to a single result
- e.g. bosses at the end of game stages, story progression
- The number of choices available after every iteration should increase
- e.g. unlocked items, skills, regions to explore
- This addresses the narrative paradox: writers can create a complete story while providing players with (the perception of) choice
Flow and convexity can be combined:
- The choke point should be at the higher end of difficulty
- After the choke point, choices can be provided to the user and the difficulty slightly decreased (relative to the player’s skill)
- By Jenova Chen (Thatgamecompany)
- Adaptive difficulty: game tries to determine player skill and adaptively change the difficulty level to match (see the sketch below)
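A minimal sketch of adaptive difficulty in the spirit described above: difficulty tracks an estimate of player skill but oscillates slightly around it so the match is never exact. The update rule and constants are illustrative assumptions, not Chen's actual method.

```python
import math

def next_difficulty(difficulty, estimated_skill, t,
                    tracking_rate=0.1, wobble=0.15, period=60.0):
    """Move difficulty toward the player's estimated skill, plus a slow oscillation.

    t is elapsed play time in seconds; wobble is the oscillation amplitude.
    """
    target = estimated_skill + wobble * math.sin(2 * math.pi * t / period)
    return difficulty + tracking_rate * (target - difficulty)

d, skill = 1.0, 3.0
for t in range(0, 300, 30):
    d = next_difficulty(d, skill, t)
    print(f"t={t:3d}s  difficulty={d:.2f}")
```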
Characterizing Flow:
- A challenge activity that requires skills
- The merging of action and awareness
- Tight coupling between actions and responses
- Clear goals
- Direct feedback
- Concentration on the task at hand
- Sense of control
- Loss of self-consciousness
- Transformation of time
Immersion
Immersion:
- To completely surround/envelop the user
- e.g. swimming, intensive language course
- Affects all the senses
- Sound can be as important as the visuals
- Also need to consider touch and smell
- How can we immerse MR users?
Haptic ChairIO (Feng et al., 2016):
- Chair that looks like a joystick
- And acts like a joystick: it leans, and tilt sensors can be used as input
- HITLab added vibration floor, pan-tilt fan units:
- Combined with VR headset for audio/video
- Footstep vibrations and fans (wind from the ‘motion’) provide movement cues
- Non-fatiguing: sitting down, hands free to do other work
- Clear mapping of seat movement to camera movement
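A minimal sketch of a seat-movement-to-camera-movement mapping like the ChairIO's: chair tilt becomes a travel velocity, with a dead zone so sitting upright means standing still. Thresholds, units and names are assumptions for illustration.

```python
def chair_to_velocity(pitch_deg, roll_deg, dead_zone=2.0, max_tilt=15.0, max_speed=2.0):
    """Map chair tilt angles (degrees) to a (forward, sideways) velocity in m/s."""
    def axis(tilt):
        if abs(tilt) < dead_zone:                 # ignore tiny postural shifts
            return 0.0
        scale = min(abs(tilt), max_tilt) / max_tilt
        return max_speed * scale * (1 if tilt > 0 else -1)
    return axis(pitch_deg), axis(roll_deg)        # lean forward -> move forward, etc.

print(chair_to_velocity(8.0, -1.0))   # gentle forward lean: (~1.07, 0.0)
```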
Natural interaction:
- Recedes into the background:
- Low cognitive load for interaction techniques
- Stimuli/feedback can be easily digested
- Low cumber
- Multi-sensory feedback
- Multi-modal user input
- e.g. ‘put that over there’: combines pointing (gesture) and voice
- Hybrid ways of executing commands
- Interactions should evolve with the user
- Provide scaffolding to novices
- Provide fast and efficient interactions for experts
Personal experiences:
- We all filter our senses
- Variations in eyesight, hearing etc.
- Different childhood experiences
- Different moods
Presence
Types of presence:
- Presence: sense of ‘being there’
- How virtual characters react to you
- The depth of the interactions with the environment
- Can you turn on the tap? Open a cupboard? Pick up a cat?
- Every interaction has a cost, both in terms of development and performance
- The invisibility/naturalness of the interface
- The lack of distractions (e.g. cables)
- Co-presence: ‘being there together’
- Multiple people can be in the same shared space without feeling ‘together’
- Tele-presence: ‘being over there’
- Remotely present in a partially physical space
- Tele-co-presence: ‘being over there together’?
Measuring presence:
- How can we measure whether someone feels ‘present’ in a game or other virtual environment?
- How can we measure the depth of presence?
- Methods:
- Questionnaires
- Slater Usoh Steed
- Witmer & Singer
- Questions must be written carefully and validated
- Ensure they are unambiguous
- Measurement is done after the fact
- Behaviors
- Watch the user and see how they react
- If you throw something at them, do they duck?
- If they get hit, do they scream?
- Will they refuse to walk off a ledge?
- Hard to measure the depth of presence (but easy to see it)
- Issue: you may need to invent/incorporate events
- Watch the user and see how they react
- Physiological measures
- Possible metrics:
- Heart rate
- Sweat (galvanic skin response or skin conductance)
- Breathing rate/regularity
- Hard to fake
- Issues:
- Some measures take time to settle
- May need to calibrate to a baseline
- Need to wear sensors
The Real World
The real world is great:
- Fast update rate
- Multi-modal rendering
- Really good physics
- Nearly infinite fidelity
- Can handle massive numbers of objects and players
- Realistic crowd behavior
- Minimal lag
Hence, it is useful to use existing things from the real world: this makes AR easier than VR in terms of fidelity.
But beyond perceptual, there is:
- Anticipation
- Expectations
- Previous experiences
We can tap into experiences already anchored in the mind of the user: provide the essence and let the brain fill in the details, or plant new experiences: seeds that can grow and become scaffolding for future experiences.
To do this:
- Prime the user to expect what you are about to show
- A VR experience starts long before the physical experience:
- Advertising
- Word of mouth
- To plant the seed, give them some specific information: this reduces variability between users.
- e.g. while you wait in line at a Disney park, you are shown videos, newspaper clips describing the backstory etc. which immerse you and reduce perceived wait time
- Remove all distractions
- Non-interactable objects (e.g. cupboards that you can’t open)
- Lack of interaction precision
- Fatigue
- Bumping into cables
- Wearing a lot of gear
The myth of technical immersion:
- Technology is not necessary to achieve immersion
- Books are very low-tech but can still transport us to fantastic places
- Our ‘high-fidelity’ technology is still relatively low-fidelity:
- Leverage the mind to fill in the blank
- e.g. in Alien, you don’t see the alien until the end
- e.g. reading a ghost story at night with a window open: the environment and story are matched
- Tasks should be:
- Easy to learn
- Easy to carry out
- Not fatiguing
- Require appropriate precision
- e.g. movement/velocity control: need both very fine and large movements
- Support appropriate expressiveness
Impossible spaces:
- Use a non one-to-one rotation mapping to redirect walking and effectively increase the size of the virtual space (see the sketch after this list)
- Change blindness for redirected walking: modify/reconfigure the virtual space when they are looking the other way
- Redirection also works with reaching, touching
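A minimal sketch of a non one-to-one rotation mapping for redirected walking: real head rotation is scaled by a gain before being applied to the virtual camera, so a smaller physical rotation covers a larger virtual one. The gain value is illustrative.

```python
def redirect_rotation(virtual_yaw, real_yaw_delta, gain=1.2):
    """Apply a rotation gain: the virtual view turns `gain` times the real turn."""
    return virtual_yaw + gain * real_yaw_delta

# A user physically turning 300 degrees experiences a full 360-degree virtual turn.
yaw = 0.0
for _ in range(10):
    yaw = redirect_rotation(yaw, real_yaw_delta=30.0)
print(yaw)   # 360.0
```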
11. Data Visualization in Mixed Reality
Master in Human Interface Technology (MHIT)
HITLab NZ:
- Founded in 2002
- Research focuses: VR, AR, applied immersive gaming
- Philosophy:
We put people before technology, start with the person, look at all the tasks they are trying to perform, TODO
MHIT:
- Application, development and evaluation of HIT
- Learn:
- Interface design principles
- Describe/evaluate interface hardware/technology
- Research/development skills
- Engage with industry
- 3 months of course work:
- HITD602 design & evaluation
- Relationship between aesthetics, function, UX
- Evaluation of design/experience
- HITD603 prototyping and projects
- Requirement analysis, engaging with clients/problem owners
- 9 month thesis project
- Develop prototype
- Run user study
- Write thesis
- Requirements:
- BEHons
- Min. B+ grade
- Scholarships available: more or less certain that you could get a fees-only scholarship
- One student getting stipend from industry
- 22% of MHIT students remain in academia (enrolled in PhD program)
Data Visualization in Mixed Reality
Immersive analytics (Immersive Analytics, Springer, 2018):
- Coping with the ever-increasing amount and complexity of data around us that surpasses our ability to understand/utilize in decision-making:
- Business analysis
- Science
- Policy making
- General public (e.g. personalized health data)
- Removing barriers between people, their data and tools used for analysis
- Support data understanding and decision-making everywhere by everyone
- Allows both individual and collaborative works
- Engagement helps support data understanding and decision-making
- Builds upon:
- Data visualization
- Visual analytics
- VR/AR
- Computer graphics
- HCI
Very dependent on availability of immersive technologies:
- HMDs for AR/VR
- Large wall-mounted, hand-held or wearable displays
- ML to interpret user gestures/utterance
Immersive analytics allows engagement:
- With wider audience through tools/technologies that more fully engage the senses
- With a new generation whose primary input device is not the mouse/keyboard
- In situations where desktop computing is impossible
- In groups where all participants are equally empowered
Opportunities:
- Situated analytics
- User-controlled data analytics linked with objects in the physical world
- Energy consumption
- Construction progress
- Supermarket (e.g. nutritional value of foods, comparison)
- Instruments in a lab
- Embodied data exploration
- Touch/gesture/voice/TUI for more intuitive/engaging data exploration
- Computer becomes invisible to the user
- Collaboration: colocated or remote; synchronous or asynchronous
- Spatial immersion: 3D (or 2.5D) rather than 2D visualization
- Multi-sensory presentation
- Beyond visual/audio (e.g. haptics)
- Augmented cognition
- Engagement in data-informed decision-making
- Involve the general public/other stakeholders
- Allows immersive interactive narrative visualizations (e.g. climate change, carbon footprint)
Possible Values of 3D for Data Visualizations
Additional visual channel (3rd spatial dimension) for data visualization:
- Prone to occlusion, depth disparity, foreshortening
- Studies demonstrate some benefits to this channel
Immersive display technologies have advanced considerably: higher resolution, lower latency, wider range of interaction technologies
Immersive workspaces:
- Use the space around you as a workspace
- Place data visualizations where you want, anchored to the physical space (or relative to your position)
- Beyond task effectiveness:
- Focus not on accuracy/speed
- Does spatial immersion support deeper collaboration, greater engagement, or a more memorable experience?
Depth Cues and Display Technology
- Linear perspective:
- Consequence of the projective properties of the eye as a sensor:
- Occlusion: objects closer in space prevent us from seeing objects behind it
- Foreshortening
- Relative size: two objects of the same size at different distances from the observer project to different retinal sizes (see the worked sketch after this list)
- Relative density: spatial patterns of objects/visual features appear denser as the distance to the pattern increases
- Height in visual fields:
- Objects are bound to rest on the ground
- Bottom of objects can be used as a reference
- Aerial perspective: changes in color properties of objects at large distances
- Motion perspective: moving object/observers provide information about 3D structure
- Binocular disparity/stereopsis: small differences in the images received by the left/right eye
- Accommodation (depth of field):
- Effects of dynamic physiological changes in the shape of each eye
- Amount of blur of the background and other objects provides information about their relative distance
- Dependent on the lighting of the scene
- Depth cues:
- Shadows
- Cue for judging the height of an object above the plane
- Useful for floating objects
- Convergence
- Reflex of the visual system: change in rotation of the eyes that takes place to align the object/region of interest in the center of the eyes’ fovea
- Eye orientation/angle (and differences between the two eyes) can be used to infer short distances
- Controlled point of view
- Ability to manipulate the point of view in a virtual space (without physically moving)
- User knows positional changes, expects visual changes
- Relies on touch/proprioception
- Complementary to visual cues
- e.g. moving joystick to move your avatar/camera
- Subjective motion
- Actual physical motion in the space of the observer
- Information through the vestibular system (balance, movement detection)
- Complementary to visual cues
- Object manipulation
- Change position of objects with respect to the observer
- Trigger motion perspective, changes in other cues
- Does not trigger vestibular signals; uses touch (somatic), motor and proprioceptive signals
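A small worked sketch of two of the cues above, using standard viewing geometry: the visual angle an object subtends (relative size) and the vergence angle needed to fixate it both shrink with distance, the latter very quickly, which is why convergence mainly helps at short range. Object size and inter-pupillary distance are example values.

```python
import math

def angular_size_deg(object_size, distance):
    """Visual angle subtended by an object of the given size at the given distance."""
    return math.degrees(2 * math.atan(object_size / (2 * distance)))

def vergence_angle_deg(distance, ipd=0.065):
    """Angle between the two eyes' lines of sight when fixating at `distance` (m)."""
    return math.degrees(2 * math.atan(ipd / (2 * distance)))

for d in (0.5, 2.0, 10.0):
    print(f"{d:>5} m: object of 0.2 m subtends {angular_size_deg(0.2, d):5.2f} deg, "
          f"vergence {vergence_angle_deg(d):5.2f} deg")
```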
Limitations of depth perception:
- 30% of population may experience binocular deficiency
- Binocular acuity decreases with age
- Line-of-sight ambiguity: rays can only intersect once (occlusion)
- Text legibility
- Low resolution of HMDs
- Foreshortening, 3D orientation
Comparing 2D with 3D Representations - Potential Benefits of Immersive Visualization
Cone Trees:
- Indented lists/tree structures in 3D, where nodes are arranged in a cone that you can rotate
- Linear perspective provides a focus+context view of the tree
- 3D cues of perspective, lighting, shadows help with understanding
- More effective use of display space
- Interactive animation reduces cognitive load
- Study results:
- Poor representation for hierarchical data: occlusion, slow tree rotation
- May help in improving understanding of the underlying structure
Data mountains:
- Arrange documents on a virtual 3D desktop
- More objects fit on the desktop
- Linear perspective provides focus + context view
- Natural metaphor for grouping
- Leverages 3D spatial memory
- Study results:
- 2D data mountains outperformed 3D, although participants thought otherwise
- 2.5D data (2D + linear perspective) outperform 2D
- i.e. 3D < 2D < 2.5D
Aviation:
- Show position and predicted flight path in 3D
- Study results:
- Better for lateral/altitude flight path tracking
- Worse for accurate measurement of airspeed
- ATC found it worse for everything other than collision avoidance
3D shapes/landscapes:
- 3D better for:
- Understanding the overall shape
- Approximate navigation and relative positioning
- 2D better for precise manipulation
Network visualization:
- 3D better for judging if there is a path between highlighted nodes
- Motion cues beneficial for:
- Path following in 3D mazes
- Viewing graphs in AR
- Egocentric spherical layout of 3D graph with HMD outperforms 2D for:
- Finding common neighbors
- Finding paths
- Recalling node location
Multivariate data visualization:
- 3D scatter plots better for:
- Distance comparisons
- Outlier detection
- Cluster identification and shape identification
- Answering integrative questions
Spatial and spatio-temporal data visualization:
- 2D vs 3D representations in VR:
- Exocentric: globe in front of view
- Egocentric: standing inside globe
- Flat map
- Curved map around the user
- Exocentric globe more accurate for distance comparison and estimation
- More time required for task completion compared to maps
Overall:
- Clusters/other structures may be clearer in 3D
- Sufficient depth cues required for the viewer to see clusters
- 3D may benefit path following
- Binocular ‘pop-out’ may be beneficial for highlighting elements
- Using the 3rd dimension to show time is a successful idiom
Summary:
- 3D not generally better than 2D
- 3D may show overall structures in multi-dimensional spaces better
- 2D preferable for precise manipulation or accurate data value measurement
- Choice of technology and depth cues can make a significant difference to the effectiveness:
- Binocular presentation, head-tracking increased spatial judgment accuracy
- Binocular 3D beneficial for depth-related tasks: spatial understanding and manipulation
Data Visualization in AR - Situated Analytics
- Data visualizations integrated into the physical environment
- Needs to take into account the existence of the physical world
- Examples:
- Supermarket (e.g. viewing detailed product information, price comparison)
- Attendees at a conference (e.g. displaying name, affiliation)
- Machinery in a lab (e.g. showing progress)
- Objects at a building site
Conceptual model:
- The raw data and the visualization pipeline exist in a logical world
- Raw data is turned into a visual form fit for human consumption
- Data is brought into the physical world through a physical presentation
- A physical referent (real-world items) may be present
Physically vs perceptually-situated visualizations:
- Physical distance separating a physical presentation and its physical referent may not necessarily match the perceived distance (e.g. visualizing microchip vs mountain)
- Spatial situatedness needs to be refined:
- Physically situated in space: if its physical presentation is physically close to the data’s physical referent
- Perceptually situated in space: if its physical/virtual presentation appears to be close to the data’s physical referent (e.g. mountain and its data visualization)
Embedded vs non-embedded visualizations:
- Embedded visualizations are deeply integrated within their physical environment
- Different virtual sub-elements align with their related physical sub-elements
Interaction:
- By altering its pipeline (e.g. filtering data)
- By altering the physical presentation (e.g. moving around, re-arranging elements)
- Using insights to take immediate action
12. Evaluating Immersive Experiences
Can simply ask the player for their opinion, but these statements are qualitative.
Through validated instruments that use questionnaires, you can get quantitative data (e.g. on situational awareness, workload).
There are many methods to achieve this:
- Usability (System Usability Scale (SUS))
- Game Experience Questionnaire (GEQ)
- Situational Awareness (Overview and SART)
- NASA Task Load Index (TLX)
- Simulation Workload measure (SIM-TLX)
- Immersive Tendencies Questionnaire (ITQ)
- iGroup Presence Questionnaire (IPQ)
- User Experience Questionnaire (UEQ)
- Game Engagement Questionnaire (GEQ)
- Revised Game Engagement Model (R-GEM)
- Revised Personal Involvement Inventory (PII)
- Flow Short Scale
Engagement
What is engagement?
- Some disagreement between academics in what it is and how you quantify it
- Benyon et al. 2005:
- Must be accessible, usable and acceptable
- Should provide experiences that pull people in to create experiences that are:
- memorable,
- satisfying,
- enjoyable,
- rewarding
- IJsselsteijn et al. 2008:
- Sensory and imaginative immersion
- Tension
- Competence that is asked of the user
- Flow
- Negative/positive effect on the user
- Challenge
Elements of Flow (Csikszentmihalyi):
- Be feasible for the user to complete the task
- Allow the user to concentrate on the task
- Have clearly defined goals
- Provide feedback on the user’s actions
- Feel involved in the situation
- Give the user control over the situation and goals
- Allow for a loss of self-consciousness: stop being aware of themselves
- Transformation of time: forget about time passing by
- Autotelic experience: activities should be intrinsically rewarding
O’Brien & Toms:
- Point of engagement: user decides to use the system based on factors such as:
- Aesthetics
- Novelty
- Interest
- Personal motivations
- Specific/experimental goals
- Engagement:
- Aesthetics and sensory appeal
- Attention
- Awareness
- Control
- Interactivity
- Novelty
- Challenge
- Feedback
- Interest
- Positive
- Disengagement attributes which prevent users from re-engaging with the system:
- Usability
- Challenge
- Positive affect
- Negative affect
- Perceived time
- Interruptions
Situational Awareness
AR promises the ability to provide additional information to the environment you are in.
Many jobs require high situational awareness to make effective and timely decisions.
Situational awareness (Endsley, 1995):
- Level 1: the perception of elements in the environment
- Level 2: the comprehension of their meaning
- Level 3: the projection of their status in the near future
- People make decisions based on their situational awareness, and their actions change the state of the environment, creating a feedback loop
- Situational awareness, decision-making and the performance of their actions can be influenced by:
- System capability
- Interface design
- Stress/workload
- Complexity
- Automation
- As well as more individual factors:
- Abilities, experience, training
- Their ability to process information
- Situational awareness and decision-making can be affected by:
- The user’s goals/objectives
- Their preconceptions and expectations
Assessing situational awareness:
- Self-rating:
- Non-intrusive: ask questions post-trial
- Subjective
- Situation Awareness Rating Technique (SART):
- Most well-known self rating system
- 10 dimensions on a Likert scale from 1 to 7
- Applicable when:
- The task is dynamic, collaborative and changeable (e.g. long tasks that can’t be frozen)
- Task outcome is not known (e.g. real world task)
- U - (D - S) score (see the worked sketch after this list):
- U: summed understanding
- Information quantity/quality
- Familiarity with the situation
- D: summed attentional demand
- Instability, variability and complexity of the situation
- S: summed attentional supply
- Arousal (alert/ready for activity or low alertness level)
- Spare mental capacity
- Concentration of attention
- Division of attention
- Freeze probe:
- Task randomly frozen: questions asked about the current or recent state of the system
- May negatively affect performance
- Real-time probe:
- Experts ask questions during the experiment
- No task freeze
- Response time is an indicator of situational awareness
- Observer rating:
- Experts observe participants while they do the task and rate their situational awareness
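A minimal worked sketch of the SART score SA = U - (D - S), using the dimension groupings above; the ratings are made-up example values.

```python
# Ratings on a 1-7 scale for the ten SART dimensions (example values).
understanding = {"information quantity": 5, "information quality": 6, "familiarity": 4}
demand        = {"instability": 3, "variability": 4, "complexity": 5}
supply        = {"arousal": 6, "spare capacity": 4, "concentration": 5, "division of attention": 4}

U, D, S = sum(understanding.values()), sum(demand.values()), sum(supply.values())
sa = U - (D - S)
print(f"U={U}, D={D}, S={S}, SART SA = U - (D - S) = {sa}")   # 15 - (12 - 19) = 22
```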
Metrics
NASA Task Load Index (TLX):
- Subjective workload assessment tool designed for human-machine systems
- Users rate workload on six dimensions:
- Mental demands
- Physical demands
- Temporal demands (how hurried/rushed was the pacing?)
- Performance: success in achieving the task
- Effort: how much effort did they put into the task
- Frustration: how insecure/discouraged/irritated/stressed/annoyed were they
- Overall workload score takes a weighted average, with the weights defined by the experimenters based on their expert judgment (see the sketch below)
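A minimal sketch of an overall TLX-style score as a weighted average of the six dimension ratings. How the weights are obtained is outside the sketch (the notes say expert judgment; the original procedure uses pairwise comparisons), and all numbers are example values.

```python
ratings = {"mental": 70, "physical": 30, "temporal": 55,
           "performance": 40, "effort": 60, "frustration": 45}   # 0-100 scales
weights = {"mental": 5, "physical": 1, "temporal": 3,
           "performance": 2, "effort": 3, "frustration": 1}      # example weights

overall = sum(ratings[d] * weights[d] for d in ratings) / sum(weights.values())
print(f"Overall workload: {overall:.1f}")
```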
Simulation workload measure (SIM-TLX):
- Based on NASA-TLX
- Released 2020
- Considers degree of immersion, perceptual difficulties, novel methods of controlling the environment
- In addition to mental, physical, temporal demands, and frustration, asks:
- Task complexity: how complex was the task
- Situational stress: how stressed were they while performing the task
- Distraction: how distracting was the task environment
- Perceptual strain: how uncomfortable/irritating were the visual/auditory aspects
- Task control: how difficult was control/navigation
System Usability Scale (SUS):
- 10 questions on a Likert scale from 1 to 5:
- I think that I would like to use this system frequently
- I found the system unnecessarily complex
- I thought the system was easy to use
- I think that I would need the support of a technical person to be able to use this system
- I found the various functions in this system were well integrated
- I thought there was too much inconsistency in this system
- I would imagine that most people would learn to use this system very quickly
- I found the system very cumbersome to use
- I felt very confident using the system
- I needed to learn a lot of things before I could get going with this system
- Items 1, 3, 5, 7, 9: take sum of val - 1
- Items 2, 4, 6, 8, 10: take sum of 5 - val
- Multiply sums by 2.5
- ‘Good’ usability: average SUS value > 68
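A minimal sketch of the SUS scoring rule above (odd items contribute value - 1, even items contribute 5 - value, and the total is multiplied by 2.5):

```python
def sus_score(responses):
    """Compute the SUS score from ten 1-5 Likert responses (item 1 first)."""
    total = 0
    for i, value in enumerate(responses, start=1):
        total += (value - 1) if i % 2 == 1 else (5 - value)
    return total * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))   # 80.0 -> above the 68 'good' threshold
```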
Game Experience Questionnaire (GEQ (another also has the same acronym)):
- Modular structure with core, social presence, and post-game modules
- Likert scale from 0-4; each question assesses one of seven components:
- Competence
- Immersion
- Flow
- Tension
- Challenge
- Positive and negative affect
- Slightly controversial: heavily used and can provide some insights, but was never validated
- If GEQ score is low while the usability score is high, it likely means the game is bad
- Bad usability will usually lead to a bad game experience
Igroup Presence Questionnaire (IPQ):
- How much an individual believes they are really in the virtual environment (VE)
- Constructed with ~500 participants
- Three subscales:
- Spatial presence: sense of being physically present in the VE
- Involvement: attention devoted to the VE and the involvement experience
- Awareness of real world surroundings (e.g. sound, room temperature, other people etc.)
- Experienced realism: subjective experience of realism
- 14 questions, including a general question that does not belong to any of the subscales
- Answered on a Likert scale from 0 to 6 (-3 to 3)
- Answers for each sub-scale summed together (with a few questions inverted)
- Results in a 3D scale
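A minimal sketch of IPQ-style subscale scoring. The item-to-subscale mapping and the set of reversed items below are made up purely to show the scoring shape; the real questionnaire defines its own 14-item mapping.

```python
# Hypothetical layout: item index -> (subscale, reversed?). The real IPQ
# assignment and reversed items differ; this only illustrates the scoring shape.
ITEMS = {
    1: ("general", False),
    2: ("spatial", False), 3: ("spatial", False), 4: ("spatial", True),
    5: ("involvement", False), 6: ("involvement", True),
    7: ("realism", False), 8: ("realism", False),
}

def ipq_subscales(answers, scale_max=6):
    """answers: {item: value on 0..6}. Returns summed score per subscale."""
    scores = {}
    for item, value in answers.items():
        subscale, reverse = ITEMS[item]
        if reverse:
            value = scale_max - value          # invert reversed items
        scores[subscale] = scores.get(subscale, 0) + value
    return scores

print(ipq_subscales({1: 4, 2: 5, 3: 4, 4: 1, 5: 3, 6: 2, 7: 4, 8: 3}))
```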
Immersive Tendency Questionnaire (ITQ):
- Measuring the tendency of individuals to be involved/immersed: how much of the immersion comes from the experience you created versus the participant’s tendencies?
- Participants group may be biased - people taking part in VR studies likely to have more experience with VR compared to the general population
- 7 point scale per item
- Three subscales:
- INVOL: tendency to become involved in activities:
- Difficulties with people getting your attention/being aware of surroundings when watching tv/movie/reading book
- Identifying closely with characters
- Becoming scared/apprehensive/fearful after watching TV show/movie
- FOCUS: tendency to maintain focus on current activities:
- How physically fit/mentally alert they feel currently
- How well they can block out external distractions
- Losing track of time
- GAMES: tendency to play games
- Feeling like they are inside the game rather than controlling it through a controller
- How often do they play video games
User Experience Questionnaire (UEQ):
- Measure UX of interactive projects
- 7-step Likert scale from -3 to 3
- Fully validated (in multiple languages)
- 6 scales, 26 items:
- Attractiveness: overall impression of the product
- Perspicuity: how quickly/easily they can learn to use the product
- Efficiency: how fast/efficient the interaction (and feedback) is; amount of perceived ‘unnecessarily’ effort
- Dependability: how in-control they feel; can the user predict system behavior and feel ‘safe’ while using the product
- Stimulation: how exciting/fun is the product
- Novelty: how innovative/creative is the product
Game Engagement Questionnaire (GEQ):
- Uses engagement as an indicator of game involvement
- Attempts to quantify absorption, flow, presence and immersion
- Questions on a no/maybe/yes scale with each question having a unique mapping to a numeric scale
Revised Game Engagement Model (R-GEM):
- Evaluates subjective gameplay experience
- Extends the GEQ
- At some point users shift from low-level to high-level engagement
- Low level:
- Immersion: feeling of being enveloped by the game’s stimuli/experiences
- Involvement: motivation to play
- High level:
- Presence: feeling of being physically located within the game
- Flow: optimal experience of intrinsically-motivated enjoyment
- Questionnaire based on SUS, ITQ, PQ, Flow Short Scale (FSS), Personal Involvement Inventory (PII), Technology Acceptance Model (TAM)
Case Studies
AR game to assess upper extremity motor dysfunctions:
- Existing validated tools to assess motor performance for e.g. stroke, Alzheimer, Parkinson’s patients
- Instead of moving physical items, use AR and motion capture to assess health
- Goals:
- Evaluate usability/game experience
- Compare characteristics of movements in AR versus real world
- Used NASA-TLX, SUS, GEQ and Kinect motion capture to collect data
- Results:
- More engaging than standardized tests
- Motion capture was not accurate enough (at least not for initial assessment)
- Technology is probably good enough today
Human augmentation for distributed situational awareness:
- Collaboration with Dutch police and fire department
- Virtual co-location of local and remote experts
- Person at crime scene wears HMD (and backpack laptop), remote expert can annotate crime scene
- Remote expert also has audio connection
- Results:
- AR increased workload and situational awareness
- Remote colleague appreciated: acted as advisor
- Local user wanted avatar for the remote colleague for more presence, not just voice
Aerial wildfire firefighting training:
- Air attack supervisor (AAS):
- Coordinates fire crews fighting wildfires
- Communicates hazards and gives advice
- Stressful and dangerous
- Traditional AAS training is expensive and rarely done
- Conditions:
- Cylindrical projection display with 270 degree field of regard
- AAS and pilot can see and interact with each other
- HMD with 360 degree field of regard (but limited by headset’s FoV)
- AAS can see an avatar of the pilot
- Methods:
- SART for situational awareness
- Non-significant difference
- NASA TLX for workload
- HMD had slightly lower workload
- IPQ for presence
- HMD had slightly higher presence
Superhuman Sports
Designing MR games that motivate and engage users in physical activity.
You can:
- Augment the senses
- Extra-sensory perception:
- X-ray vision
- ‘Spider sense’
- Clairvoyance
- Sensory augmentation:
- Map an ‘invisible play world’ onto existing senses (substitution)
- Change properties of one sensory modality into stimuli for another
- Extra-sensory perception:
- Augment the body
- Laser tag: vest with PGM force feedback that constricted the wearer’s movements the more they got shot
- Mechanical tail: affected balance as the player moved around
- MetaArmS: remap feet to mechanical arms
- Augment the playing field:
- Adding virtual elements:
- New physics
- New equipment
- New opponents
- Train in a safe environment:
- Climbing treadmill:
- Circular wall which rotates as you move up, allowing users to climb endlessly
- Users always ~50 cm above the ground, but there is a button which causes a ‘platform’ to appear
- Physical exertion adds immersion to the experience?
- Technology is still not quite ready, so interactions should be kept simple