What Goes Into a Metaverse?
Back in 2008 I did a presentation at the wonderful PICNIC conference in Amsterdam on “Pathways to the 3D Web” and “The Metaverse Roadmap”. In the Roadmap I had a diagram I quite liked showing some of the component parts of a virtual world/metaverse. Looking back at it now, it was quite selective in terms of what it included, but I thought it might be worth making it a bit more current and complete. In the context in which it was, and is, written it makes no difference whether you are talking about a “social virtual world” or a “metaverse”: the component elements are, I would argue, largely the same.
Here is the updated diagram:
That’s 15 elements, and I’m not including some of the back-end elements needed to make any application function (such as user management, access control, databases, security etc) or to control the generic functions of a multi-user application (reflection servers etc). I’m really just focussing on those which the user is likely to have some immediate awareness of. So let’s go...
- Physics — If the virtual world is going to act anything like a real world it needs physics — if we let go of something it can drop, if we throw something it can arc. Of course this is a virtual world, so we also want to be able to selectively switch physics off, or change values if we want to experience being on a moon or other planet (see the physics sketch after this list). I just love the Beyond episode in The Animatrix — we need to be able to create experiences like that.
- Rendering — Rendering is how we make things appear in the virtual world. I’m not obsessed with high-definition and ultra-real imagery; a well done “cartoon” world can be far better than a flaky nearly-real space. There’s definitely an uncanny valley effect going on. I think Second Life hits a good sweet-spot here, as does AltSpaceVR. Rendering also needs to include the audio elements, and rendering the world completely as audio should also be an aim, both to support users with visual impairment and to drive a move to semantically rather than geometrically defined worlds. And then of course there’s smell...
- Navigation and Land Model — The early virtual worlds such as Second Life, ActiveWorlds and There.com worked on a “single land-mass” model, no doubt (and in fact in some cases) inspired by the globe-girdling Street of Neal Stephenson’s seminal Snow Crash. In these you “buy” land to build on, and from your land you can see your neighbours’ land and walk (or fly) freely between parcels, permissions allowing, with the option to teleport (TP) longer distances. You get the feeling of being in a single, shared, persistent virtual world. Many of the modern VR spaces (Hubs, Rumii, Horizon Worlds, even AltSpaceVR and VRChat) work more on a “room” model — you have your room (which may be planet sized of course, but often isn’t) and you do what you want in that room, but you have no sight (or sense) of your neighbours, and can only TP between locations — that sense of a shared experience is only episodic, when you are in a room with other people; it’s not pervasive or persistent (the sketch after this list contrasts the two models). There are definitely blurs at the edges, and systems like OpenSim, which let you build your own “world” and potentially connect it to others, blur things further — and I’ll discuss that more below. But for me any true Metaverse needs to be built on that “one world” model — but with the ability to pinch off (or dock) your own private virtual worlds to it — just like the Internet and Intranets of course. The downside of the one-world model is that it encourages land speculation, and many of the blockchain-based virtual worlds (e.g. Somnium Space and Decentraland) are practically unusable for most people because of that.
- Building — For me, being able to build (and not just place) objects inside a virtual world is essential. By all means support the import of objects so that more powerful external build tools (like Blender etc) can be used, but if we want the metaverse to support instant creativity and flexibility then you must be able to build inside the world. And that building needs to be “persistent” — if you build it, it stays built rather than resetting each time you log in. Again people might see early SL builds as cartoonish, but when you look at some of the works of art created by people like Starax you can see just what prim-torture is capable of — and what it encourages and enables artists to do. AltSpaceVR/BRCvr’s Burning Man is wonderful, but think how much more inclusive it would be if anyone could readily build for it without having to know Unity or similar.
- Scripting — My bête noire. Whilst it’s wonderful that platforms like Hubs and FrameVR are bringing in-world placement and even limited building into their worlds, both currently lack any form of in-world scripting, as do most other VR collaboration and social platforms. I’m assuming it’s primarily due to security — a bad script can wreak havoc with a space (I know, I once had an artificial life script that got out of hand and crashed my entire SL sim). The “enclosures” feature in the first, pre-Microsoft, iteration of AltSpaceVR was a nice compromise: your A-Frame script ran and rezzed in a defined 3D space in the main world. But if we want people to be able to create interactivities and tools in the same way that they might create gadgets or apps in the real world then a virtual world must support scripting (see the sandboxing sketch after this list). And please base it on JavaScript, not Lua!
- Tools — Of course, not everyone wants to have to code all the things they want to use, and this is one area where the current crop of virtual worlds does pretty well, with drawing tools, screenshare, whiteboards, in-world web browsers, webcams, virtual iPads and virtual post-its existing in many of them. More of the same please, and let me script my own apps for that virtual iPad.
- Market/Economy — If the virtual world creator doesn’t provide what you need, then why not get it from another user? Again an area where Second Life really shines, with a vibrant in-world marketplace and people making real-world livings from actually making and selling things (rather than speculating over NFTs). Some form of virtual currency — or at least a structured barter system — is probably needed. Again I can’t say I’m a great fan of making it Bitcoin or Ethereum based, but I have nothing against blockchain approaches as long as they can be made more environmentally friendly, and less open to hacking and speculation.
- Rights and Permissions — If you’re going to want to sell things, or just to protect your IP and privacy, then you need some ability to restrict permissions. But also if you want to share things (particularly with a defined group) you need to be able to open those permissions up — so a deeply embedded model of rights and permission controls is a must (there’s a sketch of such a model after this list). Same comments on blockchain as above.
- Avatar Appearance — It’s probably all about the uncanny valley again. A good, slightly cartoony avatar looks far better to me than a weird geomorphed photo stuck on a head, or a Teletubby-style chest screen.
I’ve had enough meetings with Daleks, ghosts and talking parrots to know that someone’s avatar doesn’t need to look like them, and in fact I think the freedom of avatar choice helps with freedom of expression and being able to express your true personality, as well as being able to indulge in role-playing when you want to. I take “appearance” here to include animation, and it’s notable how many HMD-VR orientated worlds have gone for the head, shoulders and hands approach, I guess in order to avoid the “fails” you sometimes get with inverse kinematics when elbows and knees bend in weird ways — but for me whole-body avatars are pretty much a must — so let’s get the tech and code sorted.
- Avatar Interaction — In early DesktopVR worlds text-chat tended to dominate, but now (spatial) audio is the norm. However text-chat has its advantages — it masks your physical nature if your avatar is divergent from it, provides a useful back-channel during collaborative sessions, and creates a more equitable discussion space. Against this, voice is really the only option in HMD-VR at the moment — although decent virtual keyboards might address this. But human interaction isn’t just about voice — it’s about expressions and gesture and other forms of body language. Meta has talked about the new Project Cambria headset as being able to detect and reflect eye-gaze and expressions, and Philip Rosedale did a great natural body-movement demo in High Fidelity back in 2017. Haptics and force-feedback gloves would be nice, but probably not needed in most use-cases.
- Virtual Agents — In the last two sections I’ve been talking about human users’ avatars, but of course the same points apply to virtual agents embodied as non-player characters (NPCs), whether they are acting as proper NPC game characters, or as guides, receptionists, shop assistants, personal assistants, training partners, coaches, tutors, or even as embodied versions of physical world applications or as virtual versions of ourselves when we’re AFK (away from keyboard — or headset of course!). A key question is where the “brain” of that agent should lie. I’m adamant that it shouldn’t lie in the virtual world. It needs to be written in generic code, running on a generic server somewhere on the Internet, and then using an API to receive information from the virtual world, and to send commands to its avatar (or avatars) within it (the agent sketch after this list shows the shape of this). And ideally that avatar needs to be able to do everything that a human controlled avatar can do. Back in 2017 I helped run an experiment with such a virtual agent controlling an avatar in Second Life in a “covert Turing Test”, and 78% of the users who interacted with it thought it was human operated. Virtual worlds really level the playing field when it comes to the Turing Test. People’s working assumption is that an avatar is human driven, so the agent just has to make sure it doesn’t give away the fact that it’s a computer, rather than having to try to convince the user it’s human. In addition, virtual worlds act as great virtual laboratories for working with theories of embedded and grounded cognition, as they represent messy and complex worlds in a way that the traditional “block world” research space doesn’t, and it saves you all the hassle of mechatronics for building a robot to experiment with in the physical world. Lucy, the protagonist agent in Fable Studio’s Wolves in the Walls, is one of the best implementations I’ve seen so far of a virtual agent within a VR space.
- Portability — Everything so far is stuff that one development company could crack in their own world — but the real challenge is how I can take my avatar, my money, my scripts and even my possessions and move between different virtual worlds or metaverses (the so-called multi-verse). It’s probably unrealistic to think that there will only ever be one metaverse, so if the future is the multi-verse we need to address the issue of portability. We barely have it in the 2D internet (all those multiple sign-ons, Gravatar bolt-ons etc) so it’s no surprise that we haven’t got it yet in 3D. There are some positive signs: glTF and glb are the best I’ve yet experienced when it comes to 3D object portability (see the loader sketch after this list), and ReadyPlayerMe is trying to become the Gravatar of virtual reality, but we need more than that. Just imagine if you had to lose everything every time you moved between your office, your home, your local pub and your favourite holiday destination (OK, maybe forget the last one!)
- Interfacing — Again this is about how we relate the virtual world to other spaces — in this case the physical world (I always try and avoid calling it the “real world”, as what goes on in a virtual world is also real). To me Second Life became really interesting around 2006 when they added a web services interface to their scripting language. Suddenly I could write scripts that would go out to the Internet in the physical world, grab some data, and plot it in Second Life. Not physical enough? I also wrote a script so that a light switch in SL would switch on a light in the physical world, and created a physical world light switch that would switch on a light in SL (the bridge sketch after this list shows the pattern). We also had fun turning SL into a musical instrument, so that flying your avatar in a 3D space would drive a synthesiser and sound projection system (thank you, Martyn Ware) in the physical world. Trivial examples I know, but they show how the virtual and physical can (and should) be part of the same multiverse, and if we want as much agency in the virtual world as we have in the physical we’ll need access to the same information and systems. It also lets us build the guts of any big application out on the Internet using industry standard tools and just treat the virtual world as the user interface. One of the few genuinely innovative points for me in Mark Zuckerberg’s “Meta” launch was having a virtual mobile phone in a virtual world that carried the same information and functionality as the physical one. We need to build for the single multiverse, whether or not we have only one metaverse.
- User Interface — I’ve left “user interface” until almost last as to me it’s almost the least important. As I’ve written elsewhere (see our forthcoming Pedagogy for VR Guide), to me the choice between a DesktopVR experience and an HMD-VR one is a choice that the user should make on a per-session basis. There are so many use cases where HMD-VR is not feasible (train, cafe, sofa half-watching TV etc), and HMD-VR still doesn’t have anything near 100% penetration (more like 2.4% according to this otherwise bullish report), so a twin-track strategy has got to be the way to go.
- Users — Of course virtual worlds (and metaverses) are nothing without users (apart from virtual agents of course), and I think we still face a huge challenge in making the metaverse (and even the virtual world) concept understandable, let alone desirable, for the average person. Yes, there is a generation that has grown up on Animal Crossing and Habbo Hotel, but I can’t say I’m seeing them (yet) grow into a generation of avid VR and virtual world users (although some are taking virtual worlds like Minecraft <proudfather>into their adult life</proudfather>, and Roblox may be spawning a new generation of converts). This is going to be a long haul, and we need to make sure that what we are building is meeting real needs, not tech fantasies, and that we are communicating the benefits in all the best ways that we can. And when people do enter that virtual world or metaverse it must let them do whatever they want — that’s what makes it different from a meeting app or a game or an event app. That was one of the real strengths (and for many a weakness) of Second Life — once you were there you could do whatever you wanted: have meetings, play games, build relationships, look at data, teach, relax, make money, hang out etc etc. A true Metaverse must also give us that flexibility — it’s the physical world, just digital.
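A few of these elements are easier to show than to describe, so here are some quick sketches in TypeScript. First, the physics point: gravity as just another parameter you can change per region, or switch off entirely. This is a minimal, hypothetical world-loop of my own devising, not any real engine’s API.

```typescript
// A hypothetical Vec3 and fixed-timestep integrator, not any real engine's API.
type Vec3 = { x: number; y: number; z: number };

interface PhysicsProfile {
  gravity: Vec3;     // m/s^2: Earth is about -9.81 in y, the Moon about -1.62
  enabled: boolean;  // let builders switch physics off entirely
}

interface Body {
  position: Vec3;
  velocity: Vec3;
  physical: boolean; // per-object opt-out, like a phantom prim in SL
}

const EARTH: PhysicsProfile = { gravity: { x: 0, y: -9.81, z: 0 }, enabled: true };
const MOON: PhysicsProfile = { gravity: { x: 0, y: -1.62, z: 0 }, enabled: true };

// One integration step: dropped objects fall, thrown objects arc.
function step(body: Body, world: PhysicsProfile, dt: number): void {
  if (!world.enabled || !body.physical) return;
  body.velocity.x += world.gravity.x * dt;
  body.velocity.y += world.gravity.y * dt;
  body.velocity.z += world.gravity.z * dt;
  body.position.x += body.velocity.x * dt;
  body.position.y += body.velocity.y * dt;
  body.position.z += body.velocity.z * dt;
}

// The same throw arcs further and lazier under MOON than under EARTH.
const ball: Body = { position: { x: 0, y: 1, z: 0 }, velocity: { x: 3, y: 4, z: 0 }, physical: true };
for (let t = 0; t < 2; t += 1 / 60) step(ball, MOON, 1 / 60);
console.log(ball.position);
```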
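Next, the land-model contrast. These hypothetical data shapes (a sketch, not any platform’s real schema) show why the “one world” model gives you neighbours for free, while the “room” model only gives you teleport links.

```typescript
type Metres = number;

// "One world": parcels tile a single shared coordinate space, so adjacency
// (seeing and walking to your neighbour) falls out of the geometry.
interface Parcel {
  owner: string;
  origin: { x: Metres; y: Metres };  // position on the shared world grid
  size: { w: Metres; h: Metres };
}

interface OneWorld {
  parcels: Parcel[];                 // neighbours are implicit in the origins
}

// "Room" model: each space is an island; the only relation between spaces
// is an explicit teleport link, so shared experience is episodic.
interface Room {
  id: string;
  owner: string;
  teleportLinks: string[];           // ids of rooms you can TP to
}

// Neighbourhood is computable in the one-world model (a crude adjacency
// test on the grid)...
function areNeighbours(a: Parcel, b: Parcel): boolean {
  return Math.abs(a.origin.x - b.origin.x) <= a.size.w &&
         Math.abs(a.origin.y - b.origin.y) <= a.size.h;
}
// ...but in the room model there is nothing to compute: if there's no
// teleport link, the other space may as well not exist.
```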
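For scripting, the security worry and a possible fix both become clearer in code. A common sandboxing pattern (my assumption about how I’d approach it, not any platform’s actual design) is to hand each in-world script a narrow capability object rather than access to the host.

```typescript
// The world only ever exposes a narrow, world-mediated API to scripts; the
// interface and function names here are hypothetical.
interface WorldAPI {
  say(text: string): void;                 // chat to nearby avatars
  setColor(objectId: string, rgb: [number, number, number]): void;
  onTouch(handler: (avatarId: string) => void): void;
}

// Scripts receive a capability object, never globals, so a runaway script
// can at worst spam its own object, not crash the whole sim.
type InWorldScript = (world: WorldAPI) => void;

const doorbell: InWorldScript = (world) => {
  world.onTouch((avatarId) => {
    world.say(`Ding dong! ${avatarId} is at the door.`);
    world.setColor("door-light", [0, 255, 0]);
  });
};

// Host side: a stub WorldAPI wired to the engine; a real system would also
// impose per-script budgets on CPU time, memory and event rate.
function runScript(script: InWorldScript): void {
  const handlers: Array<(id: string) => void> = [];
  const api: WorldAPI = {
    say: (t) => console.log(`[chat] ${t}`),
    setColor: (id, rgb) => console.log(`[set] ${id} -> ${rgb}`),
    onTouch: (h) => handlers.push(h),
  };
  script(api);
  handlers.forEach((h) => h("visitor-avatar"));  // simulate a touch event
}

runScript(doorbell);
```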
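For rights and permissions, here is a sketch along the lines of Second Life’s familiar copy/modify/transfer flags; the types and function are mine, purely illustrative.

```typescript
interface Permissions {
  copy: boolean;       // may the holder duplicate the object?
  modify: boolean;     // may they edit it?
  transfer: boolean;   // may they pass or sell it on to someone else?
}

interface WorldObject {
  name: string;
  owner: string;
  creator: string;
  nextOwnerPerms: Permissions;  // what the next owner will be allowed to do
}

// Sell-without-copy: a creator keeps works scarce by granting transfer but
// not copy; group sharing is just granting wider perms to a defined group.
function canTransfer(obj: WorldObject, holder: string): boolean {
  return obj.owner === holder && obj.nextOwnerPerms.transfer;
}

const statue: WorldObject = {
  name: "statue",
  owner: "alice",
  creator: "alice",
  nextOwnerPerms: { copy: false, modify: false, transfer: true },
};
console.log(canTransfer(statue, "alice"));  // true: alice may sell it on
```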
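For virtual agents, this is roughly the shape of the “brain on a generic server” pattern described above, assuming a hypothetical WebSocket endpoint and message format exposed by the world (the `ws` package is the standard Node WebSocket client; everything else is made up for illustration).

```typescript
import WebSocket from "ws";  // the common "ws" npm package

// Hypothetical message shapes: what the world tells the agent, and what
// the agent can tell its avatar to do.
type Perception =
  | { type: "chat"; from: string; text: string }
  | { type: "avatarNearby"; avatarId: string };

type Command =
  | { type: "say"; text: string }
  | { type: "moveTo"; x: number; y: number; z: number };

const ws = new WebSocket("wss://example-world.net/agents/lucy");  // placeholder endpoint

function send(cmd: Command): void {
  ws.send(JSON.stringify(cmd));
}

// The "brain" is ordinary code on an ordinary server: it reacts to
// perceptions from the world by sending commands back to its avatar.
ws.on("message", (raw: Buffer) => {
  const p: Perception = JSON.parse(raw.toString());
  if (p.type === "avatarNearby") {
    send({ type: "moveTo", x: 10, y: 0, z: 5 });  // walk over to the visitor
    send({ type: "say", text: "Hello! Welcome to the sim." });
  } else if (p.type === "chat") {
    // In the 2017 experiment the reply came from a full conversational
    // engine; here a canned line stands in for that brain.
    send({ type: "say", text: `You said "${p.text}" - tell me more?` });
  }
});
```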
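For portability, glTF in practice: the same .glb file loads into any engine that speaks the format. A sketch using three.js (a real loader API, though the avatar URL is a placeholder).

```typescript
import * as THREE from "three";
import { GLTFLoader } from "three/examples/jsm/loaders/GLTFLoader.js";

const scene = new THREE.Scene();
const loader = new GLTFLoader();

// A ReadyPlayerMe-style avatar exported as .glb loads the same way into
// any glTF-capable engine; this is what object portability buys you.
loader.load(
  "https://example.com/my-avatar.glb",   // placeholder URL
  (gltf) => scene.add(gltf.scene),       // drop the avatar into this world
  undefined,
  (err) => console.error("avatar failed to load", err)
);
```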
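And for interfacing, the guts of the light-switch examples is just a tiny relay service: the in-world script makes an HTTP call out (via llHTTPRequest in SL’s case) and a server on the open Internet passes it to a physical device. A sketch assuming Node 18+ (for the global fetch) and a placeholder device address.

```typescript
import http from "node:http";

async function setPhysicalLight(on: boolean): Promise<void> {
  // Stand-in for a real smart-bulb call; the address is a placeholder.
  await fetch("http://192.168.1.50/light", {
    method: "POST",
    body: JSON.stringify({ on }),
  });
}

// The in-world switch script calls e.g. GET /switch?on=true when touched,
// and the relay flips the physical light.
http.createServer(async (req, res) => {
  const url = new URL(req.url ?? "/", "http://localhost");
  if (url.pathname === "/switch") {
    await setPhysicalLight(url.searchParams.get("on") === "true");
    res.end("ok");
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(8080);
```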
And here’s a summary list of those 15 elements — to make them easier to post and edit elsewhere :-) : Physics, Rendering, Navigation and Land Model, Building, Scripting, Tools, Market/Economy, Rights and Permissions, Avatar Appearance (and animation), Avatar Interaction, Virtual Agents, Portability, Interfacing, User Interface, Users.
Of course your mileage may vary in terms of how you see each of these, and (bias warning) my view is heavily influenced by all the time I spent in There.com, Active Worlds, Kaneva, Croquet, Wonderland and of course Second Life and Open Sim in the last iteration of the Metaverse a decade and a half ago. I’m also aware that I might get some brickbats for my stance on blockchain/NFT based virtual worlds, but whilst I can see some merits in blockchain, I think they are currently outweighed by its issues.
Seeing as I wrote last year about Evaluating 3D Immersive Environment and VR Platforms and gave 10 different areas to evaluate platforms on, I thought it worth checking that this list and that one weren’t too far adrift. Whilst the earlier list was focussed a bit more on the technical side, I think there’s a pretty good alignment.
And is there anything missing from this new list? As I wrote it, it struck me that perhaps there ought to be something about “Rules and Laws”. The counter-argument to that has always been that the virtual world should be subject to the same rules and laws as the physical world, and ideally those should be the only limitations on what you can and can’t do in a virtual world. Two issues with that though. Positively, some laws and rules are there because of scarcity or cost or fragility, and since those can all be “coded out” of a virtual world, do we still need those laws to apply? Negatively, physical world laws and rules vary by jurisdiction, organisation and even culture — so which should we choose for our virtual world or metaverse? Or do we just follow the physical model, with the equivalent of a UN Declaration of Human Rights (no matter how poorly followed in the physical world) to govern the common spaces, and then each group or “land owner” able to enforce their own rules as long as they don’t violate that basic law? And who, in the Metaverse, will have the authority to establish that basic law, and to enforce it? Hopefully not Facebook, probably not the UN, but we need to decide who, and what, and how pretty soon.
PS: The “Metaverse Roadmap” part of the presentation was apparently “an extensive 10-year technology forecast and 20-year visioning survey of virtual and 3D Web technologies, markets, and applications.” — I really ought to dig it out and see how well (or badly) I did 15 years into it!