Difference Between VR and AR

This weekend, my sister Sarah challenged me to define the difference between Virtual Reality and Augmented Reality. And the more I talked about the differences between them, the more I realised that I don’t have a concrete definition, and I don’t think that anybody else does either.

A man wearing a VR headset while writing on a whiteboard.
AR: the man sees simulated reality and, probably, the whiteboard.
VR: the man sees simulated reality only, which may or may not include a whiteboard.
Either way: what the hell is he doing?

After all: from a technical perspective, any fully-immersive AR system – for example a hypothetical future version of the Microsoft Hololens that solves the current edition’s FOV problems – exists in a theoretical superset of any current-generation VR system. That AR augments the reality you can genuinely see, rather than replacing it entirely, becomes irrelevant if that AR system could superimpose a virtual environment covering your entire view. So the argument that compared to VR, AR only covers part of your vision is not a reliable definition of the difference.

Spectrum showing AR view of the Louvre on the left (with 5% of the view occluded by UI) and VR view of a 3D model of the Louvre on the right (with 100% of the view occluded). The middle of the spectrum remains undefined.
Defining the difference by how much of your view is occluded by machine-generated images fails to say where the boundary between AR and VR lies. 5%? 50%? 99%?

This isn’t a new conundrum. Way back in 1994, when the Sega VR-1 was our idea of cutting edge, Milgram et al. developed a series of metaphorical spectra to describe the relationship between different kinds of “mixed reality” systems. The core difference, they argue, is whether the computer-generated content represents a “world” in itself (VR) or is just an “overlay” (AR).

But that’s unsatisfying for the same reason as above. The HTC Vive headset can be configured to use its front-facing camera(s) to fade seamlessly from the game world to the real world as the player gets close to the boundaries of their play space. This is a safety feature, but it doesn’t have to be: there’s no reason that an HTC Vive couldn’t be adapted to function as what Milgram would describe as a “class 4” device, which is functionally the same as a headset-mounted AR device. So what’s the difference?

You might argue that the difference between AR and VR is content-based: that is, it’s the thing that you’re expected to focus on that dictates which is which. If you’re expected to look at the “real world”, it’s an augmentation, and if not then it’s a virtualisation. But that approach fails to describe Google’s tech demo of putting artefacts in your living room via augmented reality (which I’ve written about before), because your focus is expected to be on the artefact rather than the “real world” around it. The real world only exists to help with the interpretation of scale: it’s not what the experience is about and your countertop is as valid a real world target as the Louvre: Google doesn’t care.

Milgram et al. (1994)'s Table 1: Some major differences between classes of Mixed Reality (MR) displays, showing seven classes: 1. monitor-based video with CG overlays, 2. headset-based video with CG overlays, 3. headset-based see-through display with CG overlays, 4. headset-based see-through video with CG overlays, 5. monitor/CG world with video overlays, 6. headset-based CG world with video overlays, and 7. CG-based world with real object intervention.
Categories 3 and 4 would probably be used to describe most contemporary AR; categories 6 and 7 to describe contemporary VR.

Many researchers echo Milgram’s idea that what turns AR into VR is when the computer-generated content completely covers your vision.

But even if we accept this explanation, the definition gets muddied by the wider field of “extended reality” (XR). Originally an umbrella term to cover both AR and VR (and “MR”, if you believe that’s a separate and independent thing), XR gets used to describe interactive experiences that cover other senses, too. If I play a VR game with real-world “props” that I can pick up and move around, but that appear differently in my vision, am I not “augmenting” reality? Is my experience, therefore, more or less “VR” than if the interactive objects exist only on my screen? What about if – as in a recent VR escape room I attended – the experience is enhanced by fans to simulate the movement of air around you? What about smell? (You know already that somebody’s working on bridging virtual reality with Smell-O-Vision.)

Dan and Robin concluding a VR Escape Room.
Not sure if you’re real or if you’re dreaming? Don’t ask an XR researcher; they don’t know either.

Increasingly, then, I’m beginning to feel that XR itself is a spectrum, and a pretty woolly one. Just as it’s hard to specify in a concrete way where the boundary exists between being asleep and being awake, it’s hard to mark where “our” reality gives way to the virtual and vice-versa.

It’s based upon the addition of information to our senses, by a computer, and there can be more (as in fully-immersive VR) or less (as in the subtle application of AR) of it… but the edges are very fuzzy. I guess that the spectrum of the visual experience of XR might look a little like this:

Comic showing different levels of "XR-ness", from the most to the least: The Matrix has you now, fully immersive VR, headset AR, "Magic Window" AR, conventional videogaming, HUDs, using google maps while walking, imagining playing Tetris, reality.

Honestly, I don’t know any more. But I don’t think my sister does either.


An Unusual Workday

Some days, my day job doesn’t seem like a job that a real person would have at all. It seems like something out of a sitcom. Today, I have:

  • Worn a bear mask in the office (panda in my case; seen below alongside my head of department, in a grizzly mask).
    Bears in the office
  • Chatted about popular TV shows that happen to contain libraries, for inclusion in a future podcast series.
  • Experimented with Web-based augmented reality as a possible mechanism for digital exhibition content delivery. (Seen this thing from Google Arts & Culture? If you don’t have an AR-capable device to hand, here’s a video of what it’s like.)
    Virtual Reality at the Bodleian
  • Implemented a demonstrative XSS payload targeting a CMS (as a teaching tool, to demonstrate how a series of minor security vulnerabilities can cascade into one huge one).
  • Gotten my ‘flu jab.

Not every day is like this. But sometimes, just sometimes, one can be.

Idea: mobile app that uses camera and shifts colour-balances to make colours “visible”

This self-post was originally posted to /r/ColorBlind. See more things from Dan's Reddit account.

I’m not colourblind, and I’m not really a mobile developer, so maybe there’s something I’ve missed, but I’ve got an idea for an app and I thought I’d run it by you guys to see what you think.

Mobile processing power is getting better and better, and we’re probably getting close to the point where we can do live video image manipulation at acceptable framerates (even 10 frames/sec would be something). So why can’t we make an app that shifts colours as seen by the camera to a different part of the spectrum, chosen according to the user’s preferences?

For example, somebody with deuteranomaly (green-weak vision, with difficulty differentiating colours across the red/orange/yellow/green part of the spectrum) might configure the software to shift yellows and greens to instead be presented as purples and blues. The picture would be false, of course, but it would help distinguish between colours in order to make, for example, colour-coded maps readable.
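To make the idea a bit more concrete, here's a minimal sketch of that kind of hue remapping in Python. Everything specific in it is my own assumption rather than anything tested with colourblind users: the names `remap_pixel` and `remap_frame` are made up for illustration, and the two hue bands (roughly yellow-to-green in, roughly blue-to-purple out) are guesses. It works on plain lists of RGB tuples rather than a real camera feed, and pure Python like this would almost certainly be too slow for live video; a real app would presumably do the same arithmetic on the GPU.

```python
import colorsys

# Hypothetical hue bands, in degrees around the HSV colour wheel.
# These numbers are illustrative guesses, not calibrated values.
SOURCE_BAND = (45.0, 165.0)   # roughly yellow through green
TARGET_BAND = (210.0, 300.0)  # roughly blue through purple


def remap_pixel(rgb, source=SOURCE_BAND, target=TARGET_BAND):
    """Move hues inside `source` into `target`, keeping saturation and brightness.

    `rgb` is a tuple of three ints in 0..255. Greys and pixels whose hue
    falls outside `source` are returned unchanged.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    lo, hi = source
    if s == 0 or not (lo <= hue_deg <= hi):
        return tuple(rgb)
    # Position of this hue within the source band (0.0 to 1.0),
    # carried over to the same relative position in the target band.
    position = (hue_deg - lo) / (hi - lo)
    new_hue = target[0] + position * (target[1] - target[0])
    shifted = colorsys.hsv_to_rgb(new_hue / 360.0, s, v)
    return tuple(round(channel * 255) for channel in shifted)


def remap_frame(frame):
    """Apply remap_pixel to every pixel of a frame (a list of rows of RGB tuples)."""
    return [[remap_pixel(pixel) for pixel in row] for row in frame]
```

Because each hue keeps its relative position within the band, two shades that started out different (say, a yellow and a green on a colour-coded map) still come out different after the shift, just in a part of the spectrum the user can hopefully tell apart more easily. Reds, blues and greys pass through untouched.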

I was thinking about how video cameras can often “see” infra-red (try pointing a remote control at a video camera and pressing the button), and present it to the viewer as white or red, when I saw a documentary with some footage of “how bees see the world”. Bees have vision of a similar breadth of spectrum to humans, but shifted well into the infra-red range (and away from the blue end of the spectrum). In the documentary, they’d filmed some flowers using a highly infra-red sensitive camera, and then they’d “shifted” the colours around the spectrum in order to make it visible to normal humans: the high-infra-reds became yellows, the low-infra-reds became blues, and the reds they left as reds. Obviously this isn’t what bees actually experience, but it’s an approximation that allows us to appreciate the variety in their spectrum.

Can we make this conversion happen “live” on mobile technology? And why haven’t we done so?