August 1, 2008
After reading some comments and hearing some questions in person, I want to dive a little deeper into some of the topics I posted in my last entry. I think it's important to be as clear as possible about how specific parts of this technology work, how interoperability works, how implementations of software interact, and exactly why this can't simply be solved by writing software. I want to explore some of the technical limits of software, and then how we can augment software with legal and policy frameworks to enable what I think are desirable properties in an emerging virtual worlds ecosystem.
Since I'm going to be talking about some legal issues and some policy issues, the usual disclaimers apply. I am not a lawyer. I don't represent Linden Lab™. I'm discussing the technical issues and how they intersect policy. I do work for IBM, but I am not speaking for IBM or setting IBM policy. This discussion is a reflection of my opinions on a work in progress. Most of these ideas have been discussed at Zero Linden's office hours or at AWGroupies meetings. Contents may have settled during shipping. Only the consumer should remove this tag.
Protocol, Software and Trust
There seems to be some confusion about how interoperability could or could not protect content. Some of this seems anchored in a misconception that it's just a matter of writing software one way or the other and it will all be solved. The core of doing secure interoperability is understanding what can and cannot be done with software. There are unavoidable constraints when managing security and trust between independently controlled software components; the essence of them is that we can never force another, independent component to be trustworthy via technical means. Given this constraint, we need to anchor inter-component trust not in pure software, but in a combination of legal and technical solutions.
Protocol, not software
The Architecture Working Group isn't designing software; it is designing a suite of protocols. These protocols will be built into software, hopefully lots of software. They will describe how software which wants to interoperate has to behave in order for the desired interoperation to occur. They don't dictate the interior of the software, but rather the points at which the software interacts with other software.
I am writing software which implements the protocols, as are Linden Lab, the Python Open Grid Protocol (PYGOP) testing team, and people in the community. The protocol work is directly informed by what we learn from implementing the test cases, but the design happens at the protocol level.
This is not a casual point; this is the essence of the task. ANY software which follows the protocol should be capable of interoperation. The AWG is designing not one bit of software, or two or three, but a set of rules of engagement which tells software developers how to write software that will work together. Note: following a protocol permits interoperation; it does not enforce or require it. A service built on software which follows the protocol should be able to work with other services following the protocol. They may or may not actually interoperate, based on whether they choose to accept requests from each other. The protocol says how they may work together, not that they must, or that they will accept requests from other services; that is a choice of the service providers. In fact, parts of the protocol are about managing those choices.
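To picture the distinction between permitting and requiring interoperation, here is a minimal, entirely hypothetical sketch: the peer names, the `TRUSTED_PEERS` set, and the `handle_request` function are illustrative, not part of any actual protocol or implementation.

```python
# Hypothetical sketch: a service that speaks the protocol perfectly can
# still decline requests from peers it has no relationship with.
# Speaking the protocol enables interoperation; it does not require it.

TRUSTED_PEERS = {"grid-a.example.com", "grid-b.example.com"}

def handle_request(peer, request):
    """Accept a well-formed protocol request only from chosen partners."""
    if peer not in TRUSTED_PEERS:
        # The request may be perfectly valid protocol; we simply decline it.
        return {"status": 403, "reason": "no service agreement with this peer"}
    return {"status": 200, "result": "accepted: " + request}

print(handle_request("grid-a.example.com", "rez_avatar")["status"])      # 200
print(handle_request("unknown-grid.example.com", "rez_avatar")["status"])  # 403
```

The point of the sketch is that the accept/decline decision lives in the service provider's policy, not in the protocol itself.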
Trusting other software
A basic assumption of distributed computing is that there are concrete mathematical limits to how much you can trust remote software. From the outside you can't tell what software is doing inside itself, nor can you force behavior on it. Understanding this limit and managing it is a key task in building a large distributed system. Take, for example, the desire to have a simple handshake which says "This bit of code enforces Linden Lab style protection bits." On its own, that handshake is meaningless. Why? Because anyone can code the handshake, and we have no mechanism for knowing if the code actually does that. We can code a complete permissions management system and contribute it to OpenSim, and someone can change a few lines of code, recompile the software, and run an OpenSim that claims to match that spec but doesn't.
Fortunately, while we can't trust other systems to implement what we want them to, there are some nice approaches which permit us to verify the identity of a remote system. We can use these techniques to know *who* is making requests. Better still, we can use such tools to issue partners ways of proving they are who they claim to be. Just as we have formalisms which show us why we can't trust software to do what it says it does, we have a set of formalisms which allow software to prove to us, within careful limits, that it is being run by a trusted partner. For those interested in the topic, a look at public key cryptography as well as digital signatures would be worthwhile. Including a digital signature in the handshake gives us a basis for trusting that the remote software is being run by the party it claims to be run by.
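A sketch of the identity-proving idea: a real deployment would use public-key signatures, so the verifier holds only the partner's public key. To keep this runnable with the Python standard library alone, the sketch below substitutes an HMAC over a pre-shared secret (imagined as issued when the partnership was formed); the key material and function names are illustrative only.

```python
import hmac
import hashlib
import os

# Challenge-response identity proof, simplified. A fresh random challenge
# is sent to the remote partner; only a holder of the issued secret can
# compute the matching response. This proves *who* is talking to us; it
# proves nothing about what their software does internally.

shared_secret = os.urandom(32)  # issued to the partner out of band

def sign(secret, challenge):
    """Compute the response the key holder would return."""
    return hmac.new(secret, challenge, hashlib.sha256).hexdigest()

def verify(secret, challenge, response):
    """Check the response in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(secret, challenge), response)

challenge = os.urandom(16)                 # fresh nonce per handshake
response = sign(shared_secret, challenge)  # computed by the remote partner

print(verify(shared_secret, challenge, response))   # True: correct key
print(verify(os.urandom(32), challenge, response))  # False: wrong key
```

Swapping the HMAC for an RSA or DSA signature gives the same shape while letting the verifier hold only public material, which is why the post points at public key cryptography.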
So.. if we can't ensure by technical means that a server we want to talk to will honor our desires, how can we trust it? In the end, I suspect, we will fall back to the very simple mechanism of having a legal contract which specifies the rules of engagement for both parties. A terms of service, as it were, for one service provider to access another service provider's resources.
One such pattern might require a legal contract that both parties would honor DMCA takedown requests, an augmented Linden permissions system, and use the Lindex as a common currency. Another pattern might be a simple agreement to only share publicly released content. Another might be a formal “wild west” pattern which says, effectively, “This service doesn’t enforce any rules.” As a protocol, we want to enable a broad range of possible policies and allow the greater community to explore which policies prove attractive and useful and which prove less desirable.
Blending technical and legal techniques
Now let's look at how we could blend the technical with the legal. The technical is the set of protocols which allow grids to use public keys to establish, provably, that they are who they claim to be. The technical is also being able to mark which policies they care to share. This can be as simple as using null keys and policies and allowing non-secure operation, or as complex as having a legal contract defining what policies they want to follow.
Key exchanges and the associated policies would provide a mechanism to tie the technical trust to the legal framework, for each service to trust the other as much or as little as they choose. As we are looking to enable a broad range of possible connections between services, the protocols need to admit a range of possible relationships. I expect we will see a small set of common ones and a much larger set of less common ones. The goal, in many ways, will be to enable exploration of a rich set of possible connections. In general, permitting a range of possible solutions and letting real world experience show us which ones prove valuable feels right to me.
To recap: in this model, trust is anchored in real world legal agreements. Based on those agreements, parties can issue each other digital keys which allow proof that they are who they say they are and that they are legally obligated by the contracts associated with the issuance of those keys. Those who don't feel the need for such comprehensive measures can publish keys which permit access at a less trusted level. A whole range of possible relationships is the goal, not one single relationship.
Related concepts and issues
I want to run through some related concepts which came up in reading people’s comments, and listening to various people’s concerns and questions.
Permissions, and Copyright
Some people in the Second Life community seem to believe that the current system of "Copy/No Copy, Mod/No Mod, Transfer/No Transfer" within the current Second Life implementation forms a digital rights scheme, or a way to manage copyright and licensing. This seems very odd to me. Nothing in the scheme speaks to license, copyright, or use. Linden Lab uses the permission bits as exactly that: a set of permissions that the software follows. While I'm not a lawyer, as I see it, permissions aren't a license and don't convey copyright. I can't see how a content creator would either surrender rights or protect them because of the permission bits they set inside Second Life. The license grant which covers use of material in Second Life would seem to be pretty firmly in the Terms of Service, not in the permission bits. I highly encourage anyone reading this entry who wants to understand where the copyright grant to use content happens at the moment to read the TOS.
Virtual worlds, and web content
A parallel is often drawn to content on the web. This is roughly asserted as, "Why can't I simply apply the same range of choices to protect my content in a virtual world as I can on the web?" The rough answer is: well, you could use any scheme you wanted to encode textures, but only people sharing the scheme would be able to see the textures. If you wanted to go deeper into content encoding, note that you would need a common scheme which was also supported by the region simulators, since without being able to decode the scheme, you couldn't actually manage the physical simulation which is at the heart of a virtual world. Unlike a web page, the contents of a region are managed as much by the server as by a client. When you want your prim boots to be seen by all the other avatars in your virtual space, you need to share the description of those boots with the server and the other clients. When you want your prim tree to be part of the landscape, you need to actually hand its description to the region simulator to manage.
Deep Digital Rights Management (DRM)
Digital Rights Management covers a lot of space. When I say that deep DRM isn't in scope for what we're doing with the AWG, I specifically mean the form of DRM where you have a provably sealed, cryptographically managed path from the content source to the end of the consumption chain. This is the sort of thing which attempts to prevent a DVD from being copied on your computer, or to stop you from making copies of your music downloads. When you dig through all the technical details, this sort of DRM turns into an exercise in trying to give you a copy of the digital content that isn't a copy of the digital content. That is the essence of the problem. "Here, consumer, you can have this song, so you can play it, but you can't have this song, so you can copy it." In order to make this happen, you end up with very draconian software which controls every aspect of the process of using the music. This works in some settings. Apple manages it in the context of very good, very closed hardware, and even in the iTunes space there are programs available for cracking the copy-protected music, along with the option of burning and re-ripping disks. For a world of user created content, the bar is higher, as you need to provide users with a way to upload and put content under a secure key, which opens up a series of "known plaintext" attacks.
Adam Frisby has some useful comments in his blog here. The core issue is that deep, cryptographic DRM of the digital assets being used in a virtual world trips over the need to let all clients in the world see that content. This is nothing new and has been discussed at length. Doing a deep DRM solution would require a closed client, a closed operating system, secured device drivers, and indeed proprietary hardware. Computers are, by their nature, general purpose computing devices. Forcing them to behave as non-general devices is hard, and most such schemes end up cracked. You can raise the difficulty of cracking such schemes, but generally at the direct cost of special purpose hardware and end user complexity.
Policy, Software, services and the creation of virtual spaces
One last point, for a long and complex set of topics. It is important to keep in mind that there are many layers and parts involved in this space. At the bottom there is software, which creates a set of services; those services can be composed to simulate a space. Deploying these services, in various ways, allows people to create virtual spaces, with content, avatars, and interactions between them. Second Life today consists of one set of services and spaces arranged to produce one style of virtual world.
OpenSim is a platform which allows the creation of services which can be arranged to be very similar to Second Life. It is, however, a software platform, not a virtual world. A reasonable analogy would be the difference between one of those Lego kits, which comes with a pile of normal Lego blocks and some special blocks, and the pretty picture of the cool car you can make.
OpenSim is a collection of Lego blocks. Some are very general purpose, some are quite specialized. Out of the box, you can trivially build a standalone server which hosts a small number of regions. With slightly more effort you can hook your region up to one of several existing grids, and with increasingly large amounts of added effort, build grids, run shared asset servers, and provide more and more features which approach a complete virtual world.
Plenty of people have created isolated virtual spaces using OpenSim. Some have built grids, such as OSGrid, CentralGrid, and many others, listed here.
The actual creation of the specific services, the composition of those services into virtual spaces, and the management and policies of such a virtual space are not set by OpenSim, but rather by the people building these spaces and grids using the OpenSim platform. Some of these virtual spaces will form full up virtual worlds of their own. Others will be used for very specific and narrow purposes: private pocket worlds for the use of their creators. People will modify the code, extend it, add and delete features, and create variations as they build their grids and spaces.
Interoperation is, at its core, about creating an ecosystem in which a great many people will innovate. OpenSim, as a platform, enables people to explore different ways of creating and managing virtual space. Many of the virtual spaces people build on top of OpenSim will probably end up interoperating. Some will do so in very broad ways, creating large grids with broad swaths of virtual land. Others will do so in very small ways, allowing a select group of regions to work together. Some parts of the ecosystem will thrive; others will die off. Successful innovations will be picked up and shared, while others will fail to thrive. How parts of this ecosystem connect up, and how broadly and deeply regions share users, data, and policies, will be determined by the people who host the services, the people who use the services, and the choices they make. My goal for interoperation is to make the set of possible choices as broad as possible, so that we can see as wide a range of ecosystems emerge as possible.
July 9, 2008
Identity, trust and policy
This was originally going to be part of my post on the interop teleport work. But.. it got out of hand and demanded its own entry, so here it is.
The past few weeks, as I've been coding, have also been full of discussion and thought about other parts of the interoperability story. There have been a couple of very good chats at the AWGroupies meetings and Zero Linden's office hours about the technical aspects of building out trust and policy between separately run grids. This tends to be a tedious area for discussion because it always happens at the intersection between the technology and how one can use the technology. Goldie Katsu has been very helpful in framing some good questions and thoughts, as have Latha Serevi and many other people who have been in on the discussions.
Based on some of these discussions, here are some thoughts about both the technical approach, and some of the social/legal things that one might enable.
First, what we can, and should do
So, let's begin with a few thoughts on the possible and the desirable in this space. We are bounded by the limits of technology, so let's start with a few bounds:
- We shouldn't be thinking about doing deep Digital Rights Management. Full up DRM is out of scope for what we can do, and mostly doesn't work; it's just an ongoing battle of stopgap measures and countermeasures, which annoy users and don't actually protect content.
- We shouldn't be thinking of this as a single solution. There is a broad range of possible policies, and our goal should be to create the protocols and design points to permit people to deploy a range of solutions and get a range of possible behaviors out of the system. Different uses of the technology will require different policies and behaviors; we should not merely accept this, we should embrace it. Some people will deploy environments with very strong content protection policies, others with very weak ones. Some regions will be willing to allow anyone to enter; others will be very restrictive.
- We should plan for trust to be short term and short range. The longer we hold a trust token, the less reliable it is, and the more links in a trust relationship, the less reliable it is. (Hat tip to Morgaine Dinova for reminding me of this.)
- We should try to capture, and make clear to all involved, the intents of content creators, and have ways of marking content and regions for compatibility, so content creators can say "I only want my content in regions which follow rules like X," where we provide for at least a good range of Xs. At the same time, we should not set out to solve problems the semantic web community has not solved in a decade of effort.
- We should make it as easy as possible for creators of content to properly mark their content, and then, if that content is stolen, we should make it as easy as possible to show the theft and allow legal recourse.
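The content-marking bound above can be sketched concretely. This is a hypothetical illustration, not a proposed design: the policy tag names, the `REGION_POLICIES` set, and the `content_allowed` function are all invented for the example.

```python
# Hypothetical sketch: content carries a creator-declared requirement
# ("I only want my content in regions which follow rules like X"), and a
# region advertises which policy classes it actually enforces. The tag
# names below are illustrative only, not a real policy vocabulary.

REGION_POLICIES = {"ll-permissions", "dmca-takedown"}

def content_allowed(content, region_policies):
    """The creator's requirement must be among the policies the region enforces."""
    return content["required_policy"] in region_policies

boots = {"name": "prim boots", "required_policy": "ll-permissions"}
artwork = {"name": "open art", "required_policy": "public-domain-only"}

print(content_allowed(boots, REGION_POLICIES))    # True: region enforces it
print(content_allowed(artwork, REGION_POLICIES))  # False: no matching policy
```

Even this toy version shows why the range of Xs matters: the match only works if creators and regions draw their tags from a shared, well-understood set, which is exactly the part that should stay far simpler than full semantic-web machinery.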
Second, a road map of possible technical steps:
So, with those bounds in place, here is a thought on the technical underpinnings we’d want:
- Define and implement a mechanism for establishing identity between components, and in particular between collections of regions and services (domains), such that we can securely prove that service S belongs to domain D. As much as possible, this should be built on top of existing, web based mechanisms.
- Define and implement a common mechanism for expressing avatar/user identities across domains/grids. Candidates for such mechanisms include OpenID.
- Create a policy language to allow us to express a range of behaviors between components, again based as much as possible on existing work. Included in this work would be defining a set of policies which would meet several common use cases, including permissions similar to those found in Linden Lab's Second Life grid, fully open permissions, and use cases developed by the community.
- Design and implement a set of services for storing and fetching assets which uses the components built in the first three steps.
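The first step of the roadmap, proving that service S belongs to domain D, can be sketched as a signed membership assertion. A real design would lean on existing web mechanisms such as X.509 certificates; to stay runnable with the standard library alone, this sketch signs with an HMAC over a domain-held key. Every name here (the domain key, hostnames, function names) is an invented illustration.

```python
import hmac
import hashlib
import json

# Hypothetical sketch: a domain operator signs an assertion binding a
# service to the domain. Anyone holding the verification key can check
# that the claim is intact and was issued by the domain. With HMAC the
# issuer and verifier share one key; a public-key signature (as in web
# PKI) would let verifiers hold only public material.

domain_key = b"example-domain-signing-key"  # held by the domain operator

def issue_membership(domain, service):
    """Domain operator signs 'service belongs to domain'."""
    claim = json.dumps({"domain": domain, "service": service}).encode()
    sig = hmac.new(domain_key, claim, hashlib.sha256).hexdigest()
    return claim, sig

def verify_membership(claim, sig):
    """Check the assertion is intact and signed with the domain key."""
    expected = hmac.new(domain_key, claim, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

claim, sig = issue_membership("grid.example.com", "sim42.grid.example.com")
print(verify_membership(claim, sig))          # True: valid assertion
print(verify_membership(claim + b" ", sig))   # False: tampered claim fails
```

The asset services in the last roadmap step would then accept requests only when accompanied by assertions like this, tying each request back to a domain and, through the domain's keys, back to the legal agreements.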
Third some comments on code, services and choices
The whole space involving moving digital assets between virtual worlds has stirred up a lot of concern from content creators, content users, and various residents. Some of these concerns are about content being stolen. Some are about people being forced to adopt standards, and others imply that because OpenSimulator (OpenSim) is an Open Source project, any objects brought to an OpenSim region would instantly become Open Source themselves.
Open Source software refers to how the software is created and maintained. The OpenSimulator project is an Open Source project based on a very open license. Anyone is free to take the code and adapt it in any way they desire. Many people, myself included, view this as a collaborative process and contribute our work back to the community. Various people contribute to the code, and they have a wide range of thoughts about how they will use it. In the case of OpenSim, the code can be deployed to host regions, and many people are hosting private regions, or grids of regions. These deployments will be set up and run according to the tastes and desires of the deployers.
Saying interoperability will mean X, or using OpenSim means Y, or all OpenSim developers have agenda Z, is nonsensical. The protocols, the code, and how they will be used are separate things, and they are also separate from the personal beliefs of various developers. Certainly everyone in the process will bring their own interests to the table. But the actual deployed systems will have policies which reflect the people deploying them. Some will likely have policies which closely mirror those of Linden Lab; others may have policies based on the belief that copyrighting digital objects is a bad idea. The software and protocols won't force anyone to use these grids, nor will they force people to put their content out for use by them.
Service providers, including, perhaps, Linden Lab, will, I assume, set policies about what they will permit and what policies they will require of the grids that wish to connect to their grid. We will likely see a large range of policies, a large range of opinions, and a process of people choosing which policies work for them. In general, the protocols and the code which implements them will be quite disjoint from the people making the policies. The range of policies that the ecosystem supports will reflect the best guesses of the designers. Most of us, I suspect, know that we don't know enough to simply get it right with one policy, or one set of policies. I certainly am striving to allow a broad range of choices, and I hope and expect that will be the overall approach taken by the community.
Next steps for interop testing
As you've most likely seen by now, IBM and Linden Lab (TM) have announced the completion of testing with a proof of concept version of the AWG protocol. I'm going to dive into some of the details of what's been done, as well as some discussion of next steps.
The code supporting this proof of concept work, has been posted to the OpenSim mantis.
This is a series of additional steps beyond the login via aditi work I discussed last month. In particular, it combines additional viewer support with fuller support for the rez/derez avatar capabilities in both the OpenSim code and the Linden Lab (TM) beta grid codebase. For the technically inclined, this means that you can log in to either an OpenSim region or a Linden Lab hosted region (currently on a very small number of test grid regions) and then teleport your avatar between sims in both environments.
This teleport preserves your identity, in the form of your avatar's UUID, which is its unique identifier. Appearance, assets, and inventory don't move; thus the collection of generic Ruth-like avatars seen in the machinima Torley Linden made of some of our testing. In order to manage names and UUIDs sensibly, OpenSim suffixes your avatar's name with a tag, to mark it as having been authenticated via an external agent domain. If you haven't seen the rather nice machinima Torley pulled together, take a look here.
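What crosses the teleport boundary in the testbed can be sketched roughly as follows. This is a hypothetical illustration of the described behavior, not the actual OpenSim code; the `@home-domain` suffix format and the `arriving_avatar` function are invented for the example.

```python
import uuid

# Hypothetical sketch of a teleport arrival record: the avatar's UUID
# travels, while appearance, assets, and inventory do not (hence the
# generic "Ruth" look). Avatars authenticated via an external agent
# domain get a name suffix so names and UUIDs stay distinguishable.

def arriving_avatar(avatar_uuid, name, home_domain, local_domain):
    """Build the local record for an avatar teleporting in."""
    display = name
    if home_domain != local_domain:
        display = name + " @" + home_domain  # mark as externally authenticated
    return {
        "uuid": avatar_uuid,      # identity: preserved across the teleport
        "name": display,
        "appearance": None,       # not transferred: default Ruth shape
        "inventory": None,        # not transferred in this proof of concept
    }

visitor = arriving_avatar(str(uuid.uuid4()), "Test Avatar", "agni", "osgrid")
print(visitor["name"])   # Test Avatar @agni
```

The design choice worth noting is that only the smallest stable identifier, the UUID, is trusted across the boundary; everything else waits on the asset and inventory discussions mentioned below.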
What does this mean?
So, what does all this mean? In the short term, it's an exercise in working out technology. A proof of concept is just what it sounds like: build enough to see how the design works, find problems, and refine the design. Getting the basic code working gives us an understanding of how to build it properly, and lets us discover problems before they get deeply baked into the implementation.
Does this mean people will be teleporting between OpenSim hosted regions and the main Linden grid with mad abandon any time soon? Probably not. The code to support this is deployed in a very specific testbed and is under active development. Further, while the teleport/login protocol is an important part of an overall story about how the protocol space may evolve, it's only part of the puzzle. There has been a very active discussion of how to manage assets, permissions, and inventory. These discussions have technical, legal, and social implications, and are happening in a fairly large number of places, from blogs and forums to inworld meetings to casual one on one chats.
As usual, a bunch of cleanup. Working code is not code the way you want it, so some cleanup is planned, with updates to the community. Beyond that, some discussion of things we learned, and working with the community to do wider testing. For those interested, Whump Linden is leading this work at Linden.
Beyond the proof of concept on teleport is a whole range of issues about managing trust, allowing creators to express the terms they wish to apply to the use of their objects, and then the design, protocol, and code which actually enables assets to be moved in a fashion that respects the desires of the creators. This topic got so big that I broke it into a separate entry.