March 15, 2010
The Good, the Bad, The Ugly and the Odd
Linden Lab released the first wide beta of the 2.0 viewer a few weeks ago. I’ve been using the beta as my primary viewer since then. I don’t think you can fairly make sense of a new user interface through a hit and run experience. What is my impression? Well, it’s a work in progress. There is much to like. There is some to dislike, one or two things which are really bad, and a range of things which make you scratch your head and go “How on earth did that end up in a user interface?”
As I stressed in my previous note about the client, this is BETA code. The reason one does a beta is to find problems when you think you’re getting close to what you want. With user interface code, this means actually getting it in the hands of real users and then seeing how they manage. So, one assumes this isn’t the fully baked version that will be 2.0. At the same time, it’s close enough to what Linden is likely to ship that it’s worth some serious critique about the parts which are broken.
This post represents my personal experience and my personal reaction. User interface is a matter of taste. Tastes vary, so you may not agree with some or possibly all of my specific issues. I welcome feedback and other people’s insights.
After a quick overall impression, I’m going to proceed to highlight a series of themed issues. I’m then going to list off a collection of specific smaller issues, and finally close with some suggestions on how to improve the user’s experiences with the client.
This entire post is intended to be constructive. User interface building is hard. Changing a large body of code is hard. There is a lot of potential in the 2.0 client. There are also a lot of rough edges. I hope this post highlights some of the rough edges and, more importantly, offers some suggestions to smooth them off.
I like the basic approach. While Second Life isn’t the web and has some very different user interface challenges, starting new users with an environment which looks like one they know fairly well is clever. In many ways, this is both a strength and a weakness of the approach. It’s strong when the metaphor works and, in particular, when the client acts in ways which make sense to people who are familiar with web browsers. Looking like a web browser works quite badly when the client looks like a web browser but doesn’t do what you expect of a web browser. It works even less well when you say to yourself “Well, heck, it could do it in the obvious way.”
The “web like” framing metaphor is a starting point. Unlike a web browser, the Second Life Client actually is a set of tools for a number of very different tasks. The core task of the client is to immerse the user in the virtual world. This doesn’t simply mean showing the world. It also means making it easy to interact with the world. This has several parts: first, making it easy to do the tasks you need to do; second, doing so in a minimally obtrusive fashion; third, not hiding information which aids immersion from the user.
A good user interface needs to present information in an attractive and consistent fashion. It should also do so in a fashion which accommodates people on a variety of screen sizes and users with a variety of visual challenges. The current user interface as presented in the 2.0 beta fails this test in significant ways. Good interfaces don’t waste space. The beta client often has acres of space which is unused while cramming elements into tiny spaces.
Good user interfaces try to avoid “modal” behavior, where things behave differently based on where you are in the interface. The 2.0 interface is modal in a number of frustrating ways. The Camera widget is painfully modal, requiring clicks to change the effect of clicking on the arrows within the widget. The Audio controls are obscure in similar ways.
The right hand dock
One of the major new elements in the 2.0 beta is the right hand dock. One stop shopping for a lot of information? Perhaps. But at the moment it’s also a highly modal tangle of user interface elements. The tab slides away to clean up the screen while holding context, which is nice. But the tab can’t be resized, nor can it be torn off. This means that all the uses of the tab are forced to share a vertical format. It also means that you get to see one, and only one, type of information in the client at a time. You can’t see your friends list while browsing a profile. You can’t see nearby people while looking at notices. You can’t use the inventory at the same time as seeing any of the above information. (You can, oddly, pop up the old 1.23 style inventory floater, if you know the obscure keystrokes, but all the other elements are pinned into the modal dock.)
Making the dock unsizable makes it painfully hard to use nested inventory folders, as you’re never able to see all of the names of nested folders without constant mouse motion. Making the dock modal means you end up constantly clicking between things in the dock and losing any context you had in a task. Making the dock handle so many types of information leads to odd little left and right arrows which hide and expose other information.
Notifications are a promising element, but they don’t quite work. For one, the things being shoved into the notification box are of several types. Some are group notices. These are rarely items you need to address in a hurry. Mixed in are events from scripts, teleport offers, texture offers and online/offline status updates. Online/offline updates are especially bizarre. They go away without a trace. If you are not at your screen, you will miss them. They do not go to the chat history, they do not go to the IM history; they are simply gone.
Mixing real time and notice style IMs in one place is awkward. It’s easy to mistake the one for the other. Having that space combined with the sometimes redundant, sometimes needed IM popups is also remarkably busy.
The notifications are small by default. On top of this, every single one begins “topic:”. Using up 7 characters in an already too small field is simply adding insult to injury. You can resize (and even tear off) the notification cluster, but it chews up space, and you soon get into scrolling even so. Even if you expand notifications to the point there is extra space, you still need to click through them to the popup to actually get the full content.
The “This group has traffic” chiclets rapidly end up with a hard to notice scroll bar and useful elements out of sight. Yes, you can resize and float these, but again, it tends to end up with scrolling and active window management.
Chat, IM context switching and all that jazz
The 2.0 beta separates chat from IM (both group and private). This leads to two totally disjoint text streams and two separate input areas. In my personal use, this is very frustrating. I am often (almost always when I’m in a public space) using both IMs and chat, often with people in the same space. Public chat for the social conversation in the space, IM for a quiet word with a friend to share a thought or insight. Group IM, because a relevant topic has come up. Having the two streams fully disjointed means I need to hop between two input areas. The ability to merge the streams in the 1.23 client is sorely missed.
For added frustration, there is a quick keyboard shortcut to the local chat area (enter will do if you’re not in an input area; escape followed by enter if you are). I haven’t found one to take you back to the conversation box. Flipping the mouse back and forth is tedious. For added difficulty, the very hard to notice update of IM tabs while you’re looking at text chat makes it very easy to totally miss someone speaking to you in IM.
To Link, or Not to Link
The beta is very inconsistent about what’s a “link.” This breaks the metaphor of a web client approach in a number of frustrating ways. In a lot of places where you could just have a link there is a TINY little (i) which you can click. Most of these lead to an odd intermediate visual element and clicks there take you to the information you want. If this is a web like client, the model for links is well understood. You make the element that comprises a link visually distinguished, and clicking on it takes you to the information. Web browsers rarely have little tiny icons you have to locate and then click.
The intermediate element approach shows up in a lot of places. See someone’s name in chat? You get this odd thing which is neither a profile nor a link. Touch one of the notification chiclets, and you get a new thing, where most likely you can finally click what you want. Hover over an item, and you get a very lightweight info box, which is far less informative than the 1.23 style hovertip. Again, you get to click that if you want the real information about the item. If the Lab is concerned that the current hovertips are too complex for beginning users, make level of detail a choice.
Windows/Floaters land all over the place
When you close a floater/window in the current 2.0 beta, its destination ranges from stacking up in very transparent boxes on the upper left, to docking with the bottom chat bar to just vanishing. How you restore a window you had up is equally variable. You can tear off both the notify stack and the conversation update stack, and they dock back to the bottom. This is a lot of places for windows to be going. If you do tear off the two notify/conversation update windows, they no longer re-arrange when you open and close the right side dock.
A ton of digital ink has been splashed on search. The current 2.0 beta search tool fails in almost every way imaginable. On head to head tests, the previous interface requires fewer clicks, finds far more material in a direct and accurate fashion and doesn’t require trying to speak directly to the underlying search code for simple searches. Event searches spanning multiple days don’t work. Avatar and group searches where the names include common terms require horrible contortions to avoid getting off topic results. The top search field is almost entirely useless. Almost every search entered there leads to slogging through the full search interface.
Just plain unattractive
Profiles, Profile Pictures and the whole use of the right hand dock to present these elements in the current beta is just a mess. The profile has historically been an important element in connecting users. Looking at people’s profiles in social settings is a routine behavior of many residents. In the 1.23 client, profile information is presented in a floater, with significantly more text in each element displayed. The way picks and classifieds are presented is especially well worked out. Each pick stands on its own with a picture, text and location. In the beta, the profile is mashed into a smaller space. The primary profile picture competes with the First Life picture for screen real estate, shrinking both to postage stamp size. The text for both is truncated, requiring clicking more to see it and then hiding other parts of the profile. Putting both first life and second life profile components side by side makes each seem less important. The phrase “real life” muddies the water significantly, especially as there is no actual validation of any of the real life information.
Groups are shoved into a list of links flowed into a very hard to read hodge podge. If you click on a group to see what it’s about, you don’t get a floater; instead the profile is overlaid by the group information. At this point, clicking back takes you to… Umm. The friends list. Exploring the groups in a person’s profile is remarkably painful.
Picks are similarly mangled. Tiny postage stamp photos, “more” buttons hiding most of people’s text, and painfully small fonts. Classifieds get the same treatment, with a small entry at the bottom of the screen to clue you into their existence. Picks and classifieds retain a link to a location. You might expect this to be a clickable field. It is not. You cannot click it. You cannot drag it to your landmarks or the favorites bar. This is a direct violation of the web metaphor. In most web browsers, you can click on a link, and drag it directly to anyplace which takes a link and saves it.
The admittedly quirky interests tab is gone. This is sort of puzzling, as Second Life is actively being promoted in Europe and other places where English isn’t the customary language, and yet the place in the profile letting people know which languages you speak has been removed.
Shoving profiles into the sidebar also eliminates the possibility of having more than one profile visible at a time. This is a real annoyance when talking to multiple people. It also eliminates looking at your friends list while looking at a profile. I routinely stack up several profiles of people I’ve talked to, or content creators whose content I’ve noticed, to one side in the client. Removing this ability requires me to pause and copy/paste or otherwise manage content, rather than letting the client do it for me. Again, being able to drag “links” would help here.
Ugly and Odd, point by point
This section enumerates bits of poor fit and finish. Some of these issues look like bugs/poor implementation of good ideas. Some of them look like more serious design choice issues. For each I’ll try to explain my concern and tie it to one of the themes from above.
Close, Minimize, Dock or whatever
When you have several windows up in the beta, you will be faced with a lovely selection of icons to dismiss a window. Some things have tiny little circular (x) tags. Some have more traditional boxed [x] elements. Some have little “_” elements which minimize them. Two almost identical items, the Conversation floater and the chat history, share the same element, but use totally different hiding mechanisms, and require a different way to restore them. Ctrl-p brings UP the preferences tab. Ctrl-h brings up (and minimizes) the local chat history. NOTHING on the keyboard seems to bring up the conversation floater. Some things are dismissed within the right hand dock by little top arrow clicks. The History chat adds to the mix by having a special little down arrow icon which redocks it onto the bottom text input area.
This hodge podge is inconsistent. Users can’t learn what to look for, because the answer is different in most of the dialogs. Users can’t develop a consistent expectation for keystrokes which bring windows up and down, because the answer is different for each type of dialog.
Cog or Plus or “advanced” or “right click” or…”More>>>” or “>>>”
The beta viewer has at least four ways and places to tweak things. A bunch of elements have little tiny “*” cogs on them. They show up at the bottom of the volume slider, the bottom of various bits of the right hand dock, and in the edit floaters at various times. They also can be found on the Inventory Floater. A number of elements have “+” icons which expose more function. Some elements respond to right (context) clicks. In the graphics bit of the preferences floater we come across the “Advanced” button, which exposes all the graphics tweakery which used to be revealed by clicking “custom”. This is dead next to the “hardware” button, which pops up a separate little floater.
Colors, Contrast and Size
Those with even slightly long memories will recall much unhappy howling when the Lab reskinned the client a while ago. One of the loudest complaints was that the Lab imposed a color scheme which was very low contrast, thus making it hard on older eyes, or those with visual challenges. The Lab seems to have fully forgotten this outcry. The beta client is full of low contrast elements. My personal dislikes include the dark green on gray, and the extremely tiny links to groups in profiles. In general, the client feels as if it was primarily developed and tested on people with 30 inch displays and 30 year old eyes. Running it on my 16 inch laptop with nearly 50 year old eyes, it becomes unreadable on a regular basis.
In the 1.23 client, an unfocused floater becomes transparent. Transparent floaters allow you to see more of the actual world. The 2.0 beta doesn’t include this feature. Because its floaters are more obtrusive when unfocused, they can’t be left around as long. The blocking effect of unfocused floaters leads to more clicking to minimize them, and then more clicking to return them when they are wanted again.
The Conversation box details
Tabs on the conversation box are just odd. No matter how much screen real estate is available, they don’t show full group names. Nor do those tabs act as link elements. They also blink in an incredibly soft and easy to miss fashion and highlight new content very poorly. (Having new content highlighted in an area far away, which I need to click to see the details of, is no help here at all.) The fact that if you’re *in* a tab you get one behavior and not *in* a tab another is just bewildering. For the record, if you leave focus on a conversation box tab, you will not get any notice of the new content except for the actual text showing. The talk count on the bottom doesn’t go up and the window tab doesn’t highlight. If you focus away from SL and come back, you aren’t given any cue you should look other than the actual new content (if you can remember what was last said).
The “People” tab
The 1.23 “Contacts” list is sucked into the right hand dock. As mentioned above this means it keeps getting lost as you use the dock for other features. The list has lost its direct access to tick boxes for “see me”, “map me” and “modify my objects.” This makes managing these properties much harder, as you can no longer sort by them, nor quickly see them. These status bits are hidden, two clicks away, inside people’s profile, on the “status and notes” tab.
The people list displays twice, first as “Online” then again as “All” which you can sort multiple ways, through the odd cog control at the bottom of the tab. This is redundant, and only serves as a tease, showing how nice it might be to put people in folders of the user’s choosing. Pretty much every IM client I use has folders I can manage. The lack of same in Second Life has been glaring for years. The added visual teaser here only highlights the absence.
The camera controls are wildly modal and deeply confusing. I almost never used them in the 1.23 client. I simply cannot use them in the 2.0 beta. Having to constantly click between pan, tilt and zoom is just baffling. It’s modal in the very worst sense of modal.
First, the audio controls are hidden. The tiny little play and speaker icons are subtle beyond redemption. Second, I think they include the densest mix of overlapping elements in the entire new user interface. If you hover over the little speaker icon you get a little drop down master slider. If you click it, it mutes/unmutes the master volume. If you hover in the area you get a dark gray tiny bar with a button, a cog/tiny down triangle button and a “More>>” button which hovertips “Advanced controls” and gives you another complex little box. Viewer 2.0 adds a lot of added media context. Well and good, but this set of controls is massively more complicated than the 1.23 audio controls, dumps you to the middle of the preferences floater for some tasks and is diagonally across the screen from the voice controls. It is significantly harder to fine tune the mix of multiple speakers, background stream and sounds. The full up preferences screen is huge and much less useful than the previous small adjustment panel.
Making it better
Sort out the right hand dock
Fix the right hand dock. Make it sizable. Make it tear off. Don’t try to cram stuff into the straitjacket of a single fixed size bar. Allow multiple torn off floaters of any item which is docked. Create a simple, consistent scheme by which they can be dismissed or minimized back to the dock. Make the dock a place where things start, but recognize that it has limits. You don’t see every sub-window in a browser inside a single sidebar, so don’t do that in the client. Think about providing a simple pattern for people to pull content back to the dock and then off the screen. One possibility would be a button which would redock floaters, another to pull the dock in. Think about making the metaphor as consistent as possible. Don’t minimize windows into multiple separate places without a very good reason.
Sort out notifications
Fix notifications. Stop letting some information vanish without a trace. Lower the click count needed to do routine tasks with notifications. Figure out how to highlight realtime interactive notifications. Scripts wanting user input are very different from a notice of an event four hours or days in the future.
Fix the chat/IM separation. People use chat and IM at the same time. Forcing people to mouse between them breaks immersion and complicates everyone’s life.
Pick a single visual element for each major task, and make sure you use them consistently. Pick consistent keystroke behavior to match and make sure you can raise and dismiss stuff in a consistent and easy fashion.
Make hovertips match your metaphor, and if you want very simple ones for beginning users, make the level of detail and the click through needed user controllable. Make context clicking available universally rather than in select places. Make the meaning of left click and right click consistent across all the client’s elements, and make that match common web client practice. The new context menus allow more information to be displayed. Take advantage of this to minimize the number of clicks needed to access information. If this feels like it may overwhelm new users, include an option to either consolidate or cascade advanced information.
Make the client visually friendly and accessible
Allow serious, easy skinning of color. Allow easy font adjustments. Package at least one high contrast scheme. Avoid tiny window controls. They are hard to see, they are hard to mouse over and they are a nightmare to describe to people. Follow the web metaphor and make “link” and “click” areas large and easy to spot. Allow hovertips on them to help cue people. Avoid effects which blink. Avoid things you cannot turn off which are likely to trigger migraines and worse. Run the client on laptops. Run it on small screens. Find some testers with older eyes. Find some testers with less than 20/20 vision. Make sure that it’s easy to spot key windows when new information is available. This may require customization so people can have as strong or light a cue as needed.
Avoid modal elements unless you have no choice
Modal elements are widely viewed as problematic in user interface design. The current viewer and the 2.0 beta are filled with them. As much as possible, ask whether they are making the user’s life better or worse. This is especially cogent in widgets like the camera controls. It is equally relevant in couplings between which tab is selected and what things do and don’t blink on other parts of the screen.
Revisit widgets and block elements
Look hard at the various controls such as camera and moving. Simple is good. Modal is bad. Getting down to as few elements as possible is good, but not if it requires constant mouse motion to switch between modes. Really look at the bulkier dialog elements and floaters. Look at how many different metaphors and modes are in use and ask if that is useful.
Listen to the community and your users about search
It doesn’t work. Look at search in 1.23 and make sure that what replaces it works as well or better. Search is an important part of the process of connecting users to users, and users to events. Broken search is really bad. It’s been said before. I’m saying it again. Find out how people are actually using search and figure out how to get them the same results or better. This is the base expectation. Taking away function is bad.
Make the web metaphor meaningful
Web browsers have a consistent pattern for links, hover tips and context menus. They have a consistent pattern for what you can drop/drag and where and why. The client ought to look at best practices in the major clients and follow them. Let people drag folders of bookmarks to the favorites bar. Make names and other elements real links, and make them behave that way. Two small examples. Dragging someone’s name out of chat to the people bar should generate a “friendship offer” and put it up for you to click send on. Dragging someone’s name out of chat or IM to a group ought to do the same in generating a group offer.
Worry about pretty
Make sure profiles, groups and the various things users see on a regular basis look pretty and polished. Pictures need to be big enough to see nicely. Text should be easy to read. Stuff which logically belongs together ought to stay together. Icons need to be big, pretty and consistent. Look at how other applications are doing these elements and emulate the pretty ones.
Second Life is a very visual place. When the client frames rich 3D content in idiosyncratic, clunky, visually obscuring ways, it breaks immersion and it creates a poor setting for users expecting a highly visual experience. Fit and finish matters a lot. Pretty matters a lot. Think about how to make elements unobtrusive when they are not in use. Transparency should be settable for all elements, including whether they fade at all, fade on loss of focus, and how far they fade.
There are things which seem to be simply impossible to do in the 2.0 client. You can’t search for who’s permitted to modify your objects or map you. With large friends lists, these become serious annoyances. The very rudimentary skinning available in 1.23 is missing. Hovertips don’t show up for a lot of items, and the ones which do take you through another link to the full information, which still seems less than was in 1.23. Anything which simply can’t be done at all in 2.0 but can in 1.23 ought to be sorted out before the code is considered done.
The people tab hints at nested folders for your contacts, but doesn’t deliver, instead showing an odd redundant pair of lists. Fix this and make it easy for people to manage contacts in ways which make sense to them.
There are a number of good ideas in the third party viewers which have not made it into the 2.0 beta. Quick access to Windlight presets is one of the biggest. There are a bunch of small but useful ideas to make it easier to tweak preferences without needing to visit the full preferences panel. The list goes on. The 2.0 team ought to be asking “Which of these belong in the mainline client?”
The 2.0 beta introduces a lot of new ideas. Some work, some don’t. The client is clearly, among other tasks, aimed at improving first hour and new user experiences. The web client look helps with that, but the current fit and finish does not. I’ve offered one set of thoughts about what’s not working. I’ve tried to make this constructive and thoughtful. I look forward to seeing future versions of the 2.0 code. I don’t expect it all to get better at once. But at the moment, the sharp edges are all over the place. The ugly is pretty directly in front of people. Take the time to get the fit and finish right. Take the time to think through where you’ve overused new elements and where you don’t use them enough.
August 1, 2008
After reading some comments and hearing some questions in person, I want to dive a little deeper into some of the topics I posted in my last entry. I think it’s important to be as clear as possible about how specific parts of this technology work, how interoperability and software work, how implementations of software interact, and exactly why this can’t be simply solved by writing software. I want to explore some of the technical limits of software, and then how we can augment software with legal and policy frameworks to enable what I think are desirable properties in an emerging virtual worlds ecosystem.
Since I’m going to be talking about some legal issues and some policy issues, the usual disclaimers. I am not a lawyer. I don’t represent Linden Lab™. I’m discussing the technical issues and how they intersect policy. I do work for IBM but I am not speaking for IBM, or setting IBM policy. This discussion is a reflection of my opinions on a work in progress. Most of these ideas have been discussed in Zero Linden’s office hours, or at AWGroupies meetings. Contents may have settled during shipping. Only the consumer should remove this tag.
Protocol, Software and Trust
There seems to be some confusion about how interoperability could or could not protect content. Some of this seems anchored in a misconception that somehow it’s just a matter of writing software one way or the other and it will all be solved. The core of doing secure interoperability is understanding what can and cannot be done with software. There are unavoidable constraints when managing security and trust between independently controlled software components. The essence of these is that we can never force another, independent component to be trustworthy via technical means. Given this constraint, we need to anchor inter-component trust not in pure software, but in a combination of legal and technical solutions.
Protocol, not software
The Architecture Working Group isn’t designing software; it is designing a suite of protocols. These protocols will be built into software. Hopefully lots of software. They will describe how software which wants to interoperate has to behave in order for the desired interoperation to occur. They don’t dictate the interior of the software, but rather the points at which the software interacts with other software.
I am writing software which implements the protocols, as are Linden Lab, the Python Open Grid Protocol (PyOGP) testing team, and people in the community. The protocol work is directly informed by what we learn from implementing the test cases, but the design happens at the protocol level.
This is not a casual point; this is the essence of the task. ANY software which follows the protocol should be capable of interoperation. The AWG is designing not one bit of software, or two or three, but a set of rules of engagement which tells software developers how they can write software which will work together. Note: following a protocol permits interoperation; it does not enforce or require it. A service built on software which follows the protocol should be able to work with other services following the protocol. They may or may not actually interoperate, based on whether they choose to accept requests from each other. The protocol says how they may work together, not that they must, or that they will accept requests from other services; that is a choice of the service providers. In fact, parts of the protocol are about managing those choices.
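To make the distinction between "can interoperate" and "chooses to interoperate" concrete, here is a minimal Python sketch. The `Service` class, its `accepts_from` set and the grid names are all hypothetical illustrations of the idea, not anything taken from the AWG drafts.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A hypothetical service that may speak the protocol and sets its own policy."""
    name: str
    speaks_protocol: bool = True
    # Peers this provider has chosen to accept requests from (a policy choice).
    accepts_from: set = field(default_factory=set)

    def handle_request(self, peer: "Service") -> str:
        # Following the protocol is what makes interoperation possible at all...
        if not (self.speaks_protocol and peer.speaks_protocol):
            return "cannot interoperate"
        # ...but whether to accept a given peer is up to the provider.
        if peer.name not in self.accepts_from:
            return "refused by policy"
        return "accepted"


grid_a = Service("grid-a", accepts_from={"grid-b"})
grid_b = Service("grid-b")                      # speaks the protocol, accepts nobody
legacy = Service("legacy", speaks_protocol=False)

print(grid_a.handle_request(grid_b))   # grid-a chose to accept grid-b
print(grid_b.handle_request(grid_a))   # same protocol, but grid-b declined
print(grid_a.handle_request(legacy))   # no shared protocol at all
```

The point of the sketch is only that the protocol check and the policy check are separate questions, answered by different parties.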
Trusting other software
A basic assumption of distributed computing is that there are concrete mathematical limits to how much you can trust remote software. From the outside you can’t tell what software is doing inside itself, nor can you force behavior on it. Understanding this limit and managing it is a key task of building a large distributed system. Take, for example, the desire to have a simple handshake which says “This bit of code enforces Linden Lab style protection bits.” On its own, that handshake is meaningless. Why? Because anyone can code the handshake, and we have no mechanism for knowing if the code actually does that. We can code a complete permissions management system and contribute it to OpenSim, and someone can change a few lines of code, recompile the software and run an OpenSim that claims to match that spec, but doesn’t.
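A small Python sketch may help show why a self-declared handshake proves nothing. Both classes below are hypothetical stand-ins for a region server; the `enforces_permissions` field and the `copy_allowed` flag are invented for the illustration and are not part of any real handshake.

```python
class HonestRegion:
    """Hypothetical region server that really enforces the 'no copy' bit."""

    def handshake(self) -> dict:
        # What the server *claims* about itself.
        return {"enforces_permissions": True}

    def copy_object(self, obj: dict) -> bool:
        # Only copy when the object's permissions allow it.
        return obj.get("copy_allowed", False)


class ModifiedRegion(HonestRegion):
    """Same handshake, but a few changed lines skip the check."""

    def copy_object(self, obj: dict) -> bool:
        return True  # ignores the permission bit entirely


protected = {"name": "hat", "copy_allowed": False}
honest, modified = HonestRegion(), ModifiedRegion()

# From the outside, the two handshakes are indistinguishable...
print(honest.handshake() == modified.handshake())
# ...even though only one server respects the permission.
print(honest.copy_object(protected), modified.copy_object(protected))
```

Nothing the remote side can inspect separates the two servers, which is the limit the paragraph above describes.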
Fortunately while we can’t trust other systems to implement what we want them to, there are some nice approaches which permit us to verify the identity of a remote system. We can use these techniques to know *who* is making requests. Better still, we can use such tools to issue partners ways of proving they are who they claim to be. Just as we have formalisms which show us why we can’t trust software to do what it says it does, we have a set of formalisms, which allow software to prove to us within careful limits, it is being run by a trusted partner. For those interested in the topic, a look at public key cryptography as well as digital signatures would be worthwhile. Including a digital signature in the handshake, gives us a basis for trusting that the remote software is being run by the party it claims to be run by.
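For readers who want to see the shape of a signed handshake, here is a toy Python sketch using textbook RSA over a SHA-256 digest. It is deliberately simplified: the key is built from two small, well-known Mersenne primes, there is no padding, and the handshake payload is made up. It is an assumption-laden illustration of the idea, not a proposal for the protocol, and real code should use a vetted cryptography library.

```python
import hashlib

# Toy key built from two well-known Mersenne primes. Far too small for real
# use; this only illustrates the sign/verify relationship.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


handshake = b"grid=example-grid;nonce=12345"   # hypothetical handshake payload
signature = sign(handshake)

print(verify(handshake, signature))                      # genuine payload
print(verify(b"grid=impostor;nonce=12345", signature))   # altered payload
```

Note what this does and does not establish: a valid signature tells us who holds the private key, not what their software does with our content.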
So, if we can’t ensure by technical means that a server we want to talk to will honor our desires, how can we trust it? In the end, I suspect, we will fall back to the very simple mechanism of having a legal contract which specifies the rules of engagement for both parties. A terms of service, as it were, for a service provider to access another service provider’s resources.
One such pattern might require a legal contract under which both parties would honor DMCA takedown requests, enforce an augmented Linden permissions system, and use the Lindex as a common currency. Another pattern might be a simple agreement to only share publicly released content. Another might be a formal “wild west” pattern which says, effectively, “This service doesn’t enforce any rules.” As a protocol, we want to enable a broad range of possible policies and allow the greater community to explore which policies prove attractive and useful and which prove less desirable.
Blending technical and legal techniques
Now let’s look at how we could blend together the technical with the legal. The technical is the set of protocols which allow grids to use public keys to establish, provably, that they are who they say they are. The technical is also being able to mark which policies they care to share. This can be as simple as using null keys and policies and allowing non-secure operation. It could be as complex as having a legal contract defining what policies they want to follow.
Key exchanges and the associated policies would provide a mechanism to tie the technical trust to the legal framework, so that each service can trust the other as much or as little as it chooses. As we are looking to enable a broad range of possible connections between services, the protocols need to admit a range of possible relationships. I expect we will see a small set of common ones and a much larger set of less common ones. The goal, in many ways, will be to enable exploration of a rich set of possible connections. In general, permitting a range of possible solutions and letting real world experience show us which ones prove valuable feels right to me.
To recap: in this model, trust is anchored in real world legal agreements. Based on those agreements, parties can issue each other digital keys which allow proof that they are who they say they are and that they are legally obligated by the contracts associated with the issuance of those keys. Those who don’t feel the need for such comprehensive measures can publish keys which permit access at a less trusted level. A whole range of possible relationships is the goal, not one single relationship.
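As a rough illustration of tying keys to policies, the sketch below maps a presented key to the relationship it was issued under, with a null key falling back to non-secure, least-trusted operation. Everything named here (`Policy`, `ISSUED_KEYS`, the key ids, the numeric trust levels, the contract label) is hypothetical and only meant to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    name: str
    trust_level: int          # higher means more trusted
    contract: Optional[str]   # legal agreement backing the key, if any


# Null key: no identity proven, no contract, lowest trust.
NULL_POLICY = Policy("open", 0, None)

# Hypothetical keys a service has issued, each tied to a relationship.
ISSUED_KEYS = {
    "key-public-001": Policy("public-content-only", 1, None),
    "key-partner-042": Policy("contract-backed", 2, "partner-agreement"),
}


def policy_for(key_id: Optional[str]) -> Policy:
    """Look up the relationship a key was issued under.

    Missing or unknown keys are treated as the null, least-trusted case.
    """
    if key_id is None:
        return NULL_POLICY
    return ISSUED_KEYS.get(key_id, NULL_POLICY)


def may_perform(key_id: Optional[str], required_level: int) -> bool:
    """Allow a request only if the caller's relationship is trusted enough."""
    return policy_for(key_id).trust_level >= required_level
```

The point of the table is that the same protocol carries every relationship, from anonymous access to a contract-backed partnership; only the entries differ.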
Related concepts and issues
I want to run through some related concepts which came up in reading people’s comments, and listening to various people’s concerns and questions.
Permissions, and Copyright
Some people in the Second Life community seem to believe that the current system of “Copy/No Copy, Mod/No Mod, Transfer/No Transfer” within the current Second Life implementation forms a digital rights scheme, or a way to manage copyright and licensing. This seems very odd to me. Nothing in the scheme speaks to license, copyright or use. Linden Lab uses the permissions bits as exactly that: a set of permissions that the software follows. While I’m not a lawyer, as I see it, permissions aren’t a license and don’t convey copyright. I can’t see how a content creator would either surrender rights or protect them because of the permission bits they set inside Second Life. The license grant, which covers use of material in Second Life, would seem to be pretty firmly in the Terms of Service, not in the permissions bits. I highly encourage anyone reading this entry who wants to understand where the copyright grant to use content happens at the moment to read the TOS.
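To see why the bits are permissions and nothing more, it may help to look at how little such a scheme actually contains. This is a toy sketch, not Linden Lab’s implementation: the `Perm` flags and their values are made up for illustration.

```python
from enum import Flag, auto


class Perm(Flag):
    """Illustrative permission bits; the values are not Linden Lab's."""
    COPY = auto()
    MODIFY = auto()
    TRANSFER = auto()


def can(perms: Perm, action: Perm) -> bool:
    """The software simply checks whether the bit for an action is set."""
    return (perms & action) == action


# A pair of boots that may be copied and modified, but not transferred.
boots = Perm.COPY | Perm.MODIFY
```

There is no field here for a license, an owner’s copyright, or terms of use. The bits tell the software what to allow, and only software that chooses to check them will honor them.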
Virtual worlds, and web content
A parallel is often raised to content in the web. This is roughly asserted as, “Why can’t I simply apply the same range of choices to protect my content in a virtual world as I can in the web?” The rough answer is: well, you could use any scheme you wanted to encode textures, but only people sharing the scheme would be able to see the textures. If you wanted to go deeper into content encoding, note that you would need a common scheme which was also supported by the region simulators, as without being able to decode the scheme, they couldn’t actually manage the physical simulation which is at the heart of a virtual world. Unlike a web page, the contents of a region are managed as much by the server as by a client. When you want your prim boots to be seen by all the other avatars in your virtual space, you need to share the description of those boots with the server and the other clients. When you want your prim tree to be part of the landscape, you need to actually hand its description to the region simulator to manage.
Deep Digital Rights Management (DRM)
Digital Rights Management covers a lot of space. When I say that deep DRM isn’t in scope for what we’re doing with the AWG, I specifically mean the form of DRM where you have a provably sealed, cryptographically managed path from the content source to the end of the consumption chain. This is the sort of thing which attempts to prevent a DVD from being copied on your computer, or you making copies of your music downloads. When you dig through all the technical details, this sort of DRM turns into an exercise in trying to give you a copy of the digital content that isn’t a copy of the digital content. That is the essence of the problem. “Here, consumer, you can have this song, so you can play it, but you can’t have this song so you can copy it.” In order to make this happen, you end up with very draconian software which controls every aspect of the process of using the music. This works in some settings. Apple manages it in the context of very good, very closed hardware, and even in the iTunes space there are programs available for cracking the copy protected music, and the option of burning and re-ripping disks. For a world of user created content, the bar is higher, as you need to provide users with a way to upload and put content under a secure key, which opens up a series of “known plaintext” attacks.
Adam Frisby has some useful comments in his blog here. The core issue is that deep, cryptographic DRM of the digital assets being used in a virtual world trips over the need to let all clients in the world see that content. This is nothing new and has been discussed at length. Doing a deep DRM solution would require a closed client, a closed operating system, secured device drivers, and indeed proprietary hardware. Computers are, by their nature, general purpose computing devices. Forcing them to behave as non-general devices is hard, and most such schemes end up cracked. You can raise the difficulty of cracking such schemes, generally at the direct cost of special purpose hardware and end user complexity.
Policy, Software, services and the creation of virtual spaces
One last point, for a long and complex set of topics. It is important to keep in mind that there are many layers and parts involved in this space. At the bottom, there is software, which creates a set of services; those services can be composed to simulate a space. Deploying these services in various ways allows people to create virtual space, with content, avatars, and interactions between them. Second Life today consists of one set of services and spaces arranged to produce one style of virtual world.
OpenSim is a platform which allows the creation of services which can be arranged to be very similar to Second Life. It is, however, a software platform, not a virtual world. A reasonable analogy would be the difference between one of those Lego kits which comes with a pile of normal Lego blocks and some special blocks, and the pretty picture of the cool car you can make.
OpenSim is a collection of Lego blocks. Some are very general purpose, some are quite specialized. Out of the box, you can trivially build a standalone server which hosts a small number of regions. With slightly more effort you can hook up your region to one of several existing grids, and with increasingly large amounts of added effort, build grids, run shared asset servers, and provide more and more features which approach a complete virtual world.
Plenty of people have created isolated virtual spaces using OpenSim. Some have built grids, such as OSGrid, CentralGrid, and many others, listed here.
The actual creation of the specific services, the composition of those services into virtual spaces, and the management and policies of such a virtual space are not set by OpenSim, but rather by the people building these spaces and grids using the OpenSim platform. Some of these virtual spaces will form full up virtual worlds of their own. Others will be used for very specific and narrow purposes: private pocket worlds, for the use of their creators. People will modify the code, extend it, add and delete features, and create variations as they build their grids and spaces.
Interoperation is, at its core, about creating an ecosystem in which a great many people will innovate. OpenSim, as a platform, enables people to explore different ways of creating and managing virtual space. Many of the virtual spaces people build on top of OpenSim will probably end up interoperating. Some will do so in very broad ways, creating large grids with broad swaths of virtual land. Others will do so in very small ways, allowing a select group of regions to work together. Some parts of the ecosystem will thrive, others will die off. Successful innovations will be picked up and shared, while others will fail to thrive. How parts of this ecosystem connect up, and how broadly and deeply regions share users, data and policies, will be determined by the people who host the services, the people who use the services, and the choices they make. My goal for interoperation is to make the set of possible choices as broad as possible, so that we can see as wide a set of ecosystems emerge as possible.
July 9, 2008
Identity, trust and policy
This was originally going to be a part of my post on the interop teleport work. But it got out of hand and demanded its own entry, so here it is.
The past few weeks, as I’ve been coding, have also been full of discussion and thought about other parts of the interoperability story. There have been a couple of very good chats at the AWGroupies meetings and Zero Linden’s office hours about the technical aspects of building out trust and policy between separately run grids. This tends to be a tedious area for discussion because it always happens at the intersection between the technology and how one can use the technology. Goldie Katsu has been very helpful in framing some good questions and thoughts, as has Latha Serevi, and many other people who have been in on the discussions.
Based on some of these discussions, here are some thoughts about both the technical approach, and some of the social/legal things that one might enable.
First, what we can, and should do
So, let’s start with a few thoughts on the possible and the desirable in this space. We are bounded by the limits of technology, so let’s start with a few bounds:
- We shouldn’t be thinking about doing deep Digital Rights Management. Full up DRM is out of scope for what we can do, and mostly doesn’t work. It’s just an ongoing battle of stopgaps, measures and countermeasures, which annoy users and don’t actually protect content.
- We shouldn’t be thinking of this as a single solution. There is a broad range of possible policies, and our goal should be to create the protocols and design points to permit people to deploy a range of solutions, and get a range of possible behaviors out of the system. Different uses of the technology will require different policies and behaviors. We should not merely accept this; we should embrace it. Some people will deploy environments with very strong content protection policies, others with very weak ones. Some regions will be willing to allow anyone to enter, others will be very restrictive.
- We should plan for trust to be short term, and short range. The longer we hold a trust token, the less reliable it is, and the more links in a trust relationship, the less reliable it is. (Hat tip to Morgaine Dinova, for reminding me of this)
- We should try to capture and make clear to all involved the intents of content creators, and have ways of marking content and regions for compatibility, so content creators can say “I only want my content in regions which follow rules like X” where we provide for at least a good range of Xs. At the same time, we should not break our picks trying to solve problems the semantic web community has not solved in a decade of effort.
- We should make it as easy as possible for creators of content to properly mark their content, and then, if that content is stolen, we should make it as easy as possible to show the theft and allow legal recourse.
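The “short term, short range” bound in the list above can be sketched as a token that carries both an expiry and a limit on how many times it may be handed on. This is only an illustration under my own assumptions: the `TrustToken` fields, the time-to-live, and the hop counter are hypothetical and not part of any AWG draft.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrustToken:
    issuer: str
    subject: str
    issued_at: float     # seconds since some agreed epoch
    ttl_seconds: float   # short term: how long the token stays good
    hops_left: int       # short range: how many more times it may be passed on


def is_valid(token: TrustToken, now: float) -> bool:
    """A token is good only while fresh and within its hop budget."""
    age = now - token.issued_at
    return 0 <= age <= token.ttl_seconds and token.hops_left >= 0


def delegate(token: TrustToken, new_subject: str) -> TrustToken:
    """Hand trust one link further along, spending one hop."""
    if token.hops_left <= 0:
        raise ValueError("token cannot be passed on any further")
    return replace(token, subject=new_subject, hops_left=token.hops_left - 1)
```

Each link in the chain gets a token that is no fresher and strictly shorter-ranged than the one before it, which matches the intuition that trust gets less reliable the longer it is held and the further it travels.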
Second, a road map of possible technical steps:
So, with those bounds in place, here is a thought on the technical underpinnings we’d want:
- Define and implement a mechanism for establishing identity between components, and in particular, between collections of regions and services (domains), such that we can securely prove that service S belongs to domain D. As much as possible, this should be built on top of existing, web based mechanisms.
- Define and implement a common mechanism for expressing avatar/user identities across domains/grids. Candidates for such mechanisms include OpenID.
- Create a policy language to allow us to express a range of behaviors between components. Again, this should be based on existing work as much as possible. Included in this work would be defining a set of policies which would meet several common use cases, including permissions similar to those found in Linden Lab’s Second Life grid, fully open permissions, and use cases developed by the community.
- Design and implement a set of services for storing and fetching assets which use the components built in the first three steps.
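To show how the pieces in this road map might fit together, here is a small sketch of an asset service that releases content only to a verified domain whose declared rules cover what the creator asked for. All names (`Asset`, `AssetStore`, the rule strings, the two preset policy sets) are hypothetical, and the identity check from the first step is reduced to a boolean supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet

# Hypothetical preset policy sets for two of the use cases named above.
LINDEN_STYLE: FrozenSet[str] = frozenset({"honor-permission-bits", "honor-dmca"})
FULLY_OPEN: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Asset:
    asset_id: str
    creator: str
    required_rules: FrozenSet[str]  # "only in regions which follow rules like X"


@dataclass
class AssetStore:
    assets: Dict[str, Asset] = field(default_factory=dict)

    def put(self, asset: Asset) -> None:
        """Store an asset together with the creator's marked intent."""
        self.assets[asset.asset_id] = asset

    def fetch(self, asset_id: str, domain_verified: bool,
              domain_rules: FrozenSet[str]) -> Asset:
        """Release an asset only to a verified domain that follows its rules."""
        if not domain_verified:
            raise PermissionError("requesting domain did not prove its identity")
        asset = self.assets[asset_id]
        missing = asset.required_rules - domain_rules
        if missing:
            raise PermissionError(f"domain does not follow: {sorted(missing)}")
        return asset
```

A store like this does not prevent a misbehaving domain from lying about its rules. It records what was promised and by whom, which is what the legal side of the story needs.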
Third, some comments on code, services and choices
The whole space involving moving digital assets between virtual worlds has stirred up a lot of concern from content creators, content users, and various residents. Some of these concerns are about content being stolen. Some are about people being forced to adopt standards, and others imply that because the OpenSimulator (OpenSim) is an Open Source project, any objects brought to an OpenSim region would instantly become Open Source themselves.
Open Source software refers to how the software is created and maintained. The OpenSimulator project is an Open Source project based on a very open license. Anyone is free to take the code and adapt it in any way they desire. Many people, myself included, view this as a collaborative process, and contribute back our work to the community. Various people contribute to the code, and they have a wide range of thoughts about how they will use the code. In the case of OpenSim, the code can be deployed to host regions, and many people are hosting private regions, or grids of regions. These deployments will be set up and run according to the taste and desires of the deployers.
Saying interoperability will mean X, or using OpenSim means Y, or all OpenSim developers have agenda Z, is nonsensical. The protocols, the code and how they will be used are separate things, and they are also separate from the personal beliefs of various developers. Certainly everyone in the process will bring their own interests to the table. But the actual deployed systems will have policies which reflect the people deploying them. Some will likely have policies which closely mirror those of Linden Lab; others may have policies based on the belief that copyrighting digital objects is a bad idea. The software and protocols won’t force anyone to use these grids, nor will they force people to put their content out for use by them.
Service providers, including, perhaps, Linden Lab, will, I assume, set policies about what they will permit, and what policies they will require of the grids that wish to connect to their grid. We will likely see a large range of policies, a large range of opinions, and a process of people choosing which policies work for them. In general, the protocols and the code which implements them will be quite disjoint from the people making the policies. The range of policies that the ecosystem supports will reflect the best guesses of the designers. Most of us, I suspect, know that we don’t know enough to simply get it right with one policy, or one set of policies. I certainly am striving to allow a broad range of choices, and I hope and expect that will be the overall approach taken by the community.
Next steps for interop testing
As you’ve most likely seen by now, IBM and Linden Lab (TM) have announced the completion of testing with a proof of concept version of the AWG protocol. I’m going to dive into some of the details of what’s been done, as well as some discussion of next steps.
The code supporting this proof of concept work, has been posted to the OpenSim mantis.
This is a series of additional steps, beyond the login via aditi work I discussed last month. In particular, it combines additional viewer support with fuller support for the rez/derez avatar capabilities in both the OpenSim code and the Linden Lab (TM) beta grid codebase. For the technically inclined, this means that you can log in to either an OpenSim region or a Linden Lab hosted region (currently on a very small number of test grid regions) and then teleport your avatar between sims in both environments.
This teleport preserves your identity, in the form of your avatar’s UUID, which is its unique identifier. Appearance, assets and inventory don’t move. Thus, the collection of generic Ruth-like avatars seen in the machinima Torley Linden did of some of our testing. In order to manage names and UUIDs sensibly, OpenSim suffixes your avatar’s name with a tag, to mark it as having been authenticated via an external agent domain. If you haven’t seen the rather nice machinima Torley pulled, take a look here.
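For the curious, the naming idea amounts to something like the following sketch. The tag format shown (`@` plus the agent domain) and the names `Avatar` and `tag_external_avatar` are my own placeholders, not what the OpenSim patch actually emits.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Avatar:
    agent_id: uuid.UUID  # identity that survives the teleport
    name: str


def tag_external_avatar(avatar: Avatar, agent_domain: str) -> Avatar:
    """Keep the UUID, but suffix the name to show where it was authenticated.

    The "@domain" suffix is a made-up format used only for illustration.
    """
    return Avatar(agent_id=avatar.agent_id, name=f"{avatar.name} @{agent_domain}")
```

The UUID is what carries identity across the teleport. The suffix only keeps a visiting avatar’s name from being confused with a local account of the same name.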
What does this mean?
So, what does all this mean? In the short term, it’s an exercise in working out technology. A proof of concept is just what it sounds like. Build enough to see how the design works, find problems, and refine the design. Getting the basic code working gives us an understanding of how to build it properly, and lets us discover problems before they get deeply baked into the implementation.
Does this mean people will be teleporting between OpenSim hosted regions and the main Linden grid with mad abandon any time soon? Probably not. The code to support this is deployed in a very specific testbed, and is under active development. Further, while the teleport/login protocol is an important part of an overall story about how the protocol space may evolve, it’s only part of the puzzle. There has been a very active discussion of how to manage assets, permissions and inventory. These discussions have technical, legal, and social implications, and are happening in a fairly large number of places, from blogs and forums to inworld meetings, to casual one on one chats.
As usual, a bunch of cleanup. Code working is not code the way you want it. So, some cleanup is planned, with updates to the community. Beyond that, some discussion on things we learned, and working with the community to do wider testing. For those interested, Whump Linden is leading up this work at Linden.
Beyond the proof of concept on teleport is a whole range of issues about managing trust, allowing creators to express the terms they wish to apply to the use of their objects, and then the design, protocol and code which actually enable assets to be moved in a fashion that respects the desires of the creators. This topic got so big I broke it into a separate entry.