
Visual Search, Augmented Reality and a Social Commons for the Physical World Platform: Interview with Anselm Hook

Sun, Jan 17, 2010

anselmhook

Visual search is heating up, and with it a key stage of turning the physical world into a platform is underway as images become hyperlinks to the world in applications like Google Goggles, Point and Find, and SnapTell (see this post by Katie Boehret). And while there may be no truly game-changing augmented reality goggles for a while, make no mistake: key aspects of our augmented view, factors that will have a lot to do with what we will actually see when an augmented vision of the world is commonplace, are already in the works. And, as Anselm Hook (pic above from @caseorganic’s flickr) notes:

“There is a real risk of our augmented reality world being owned by interests which are not our own. There is a real question of when you hold up that AR goggle, what are you going to see?”

Cooperating services, e.g., Google Earth, Maps, Streetview, Google Goggles, and leaders in local search like Yelp (see here), would have an enormous ability to filter and control a mobile, social, context-aware view of the physical world, and Google themselves see an ethical quandary.

“A Google spokesperson says this app has the ability to use facial recognition with Goggles, but hasn’t launched this feature because it hasn’t been built into an app that would provide real value for users. The spokesperson also cites ‘some important transparency and consumer-choice issues we need to think through’” (quote from Wall Street Journal column by Katie Boehret).

Anselm Hook and Paige Saez, with great prescience, have been advocating a social commons for the placemarks and imagemarks to our physical world platform through a number of pioneering projects, including ImageWiki. I recently interviewed both Anselm and Paige in depth (Paige’s interview is upcoming). My talk with Anselm was nearly three hours long! So I am publishing the transcript in two parts.

What it means to have a social commons for our physical world platform, and for augmented reality, is a key question for all of us to think about, but it is especially important for those of us involved in the emerging industry of augmented reality.

Anselm notes:

“The placemarks and imagemarks in our reality are about to undergo that same politicization and ownership that already affects DNS and content. Creative Commons, Electronic Frontier Foundation and other organizations try to protect our social commons. When an image becomes a kind of hyperlink – there’s really a question of what it will resolve to. Will your heads up display of McDonalds show tasty treats at low prices or will it show alternative nearby places where you can get a local, organic, healthy meal quickly? Clearly there’s about to be a huge ownership battle for the emerging imageDNS”

The mobile internet is moving beyond the “internet in your pocket” phase of mobility, with mobile, social, proximity-based, context-aware networks like FourSquare, Gowalla, Brightkite and GraffitiGeo (see Smart Data Collective) likely to start taking precedence over other forms of social network soon.

Regardless of the timeline for true augmented reality (3D images and graphics tightly registered to the physical world), proximity-based social networking and real-time search are already taking us into a hyper-local mode and the realm of augmented reality, which is “inherently about who you are, where you are, what you are doing, and what is around you” (Robert Rice – see here). The ground is being prepared for augmented reality now.

If you have been reading Ugotrade, you will know I have been actively involved in developing an open, distributed AR platform/mobile social interaction utility for geolocated data based on the Wave Federation Protocol – AR Wave, a.k.a. Muku – “crest of a wave” (see my posts here, here and here for more on this project, and the AR Wave Wiki here). Federation is, I believe, one vital aspect of developing a social commons for augmented reality and the physical world platform.

Also, a bit of news: I am co-chairing the upcoming Augmented Reality Event (are2010) with Ori Inbar of Games Alfresco and Ogmento, and whurley. Sean Lowery, Prospera, is the event organizer, and are2010 has the support of the AR Consortium. The are2010 web site is live and there is an Open Call For Speakers. You can submit your proposals and demos for one of the three tracks (business, technology, or production) on the web site here.

are2010

Bruce Sterling, “prophet” of augmented reality and more, “will deliver the most anticipated Augmented Reality keynote of the year.”

bruces-brasspost

It didn’t surprise me when Anselm mentioned that Bruce Sterling was a key influence for his work on the geospatial web and augmented reality.  Anselm explained:

“I’d seen a talk by Bruce Sterling at an event called Planetwork [May, 2000]. And that event was, for me, a turning point where I decided to focus full time on exactly what I cared about instead of doing things that were kind of similar to what I cared about. So, his influence is a pretty significant one to me at that exact moment.”


For more see viridiandesign.org – seems it is time for a “Neo-Viridian” revival!

This post by Bruce Sterling on Pachube Feeds, and Thomas Wrobel’s prototype design for open distributed augmented reality on IRC, were key inspirations for me when I began thinking about the potential of the Google Wave Federation Protocol for augmented reality. I had been exploring Pachube and was deeply interested in the vision of Usman Haque, but I had a real aha moment when I read this:

“(((Extra credit for eager ubicomp hackers: combine this [pachube feeds] with Googlewave, then describe it in microsyntax. Hello, 2015!)))”

I think the AR Wave group will earn the extra credit and more very soon!  Davide Carnovale, need2revolt, and Thomas Wrobel have been leading the coding charge, and there will be a very early AR Wave demo soon, perhaps as soon as the Feb 16th ARNY Meetup. 

Open access to the creation of the view that will eventually find its way into AR goggles will depend on more than the power of an open distributed platform for collaboration like the AR Wave project. Our augmented reality view will be constructed through complex “hybrid tracking and sensor fusion techniques” (Jarrell Pair), cooperating cloud data services, powerful search and computer vision algorithms, and apps that learn by context accumulation. At the moment, these kinds of resources, at least at scale, are for the most part in private hands.

In the interview below, Anselm discusses how trust filters, being able to publicly permission your searches so that other people can respond and reach out to you, and the democratization of data in general are even more of a concern with augmented reality and hyper-local search. The task of understanding what it means to have a social commons for the outernet remains an open and pressing question.

Anselm explains (see full interview below):

“as we move towards a physical internet where there’s no clicking and there’s no interface and the computer’s just telling you what it thinks you’re looking at, translating, you know, an image of a billboard to the name of the rock star who’s on that billboard, or translating the list of ingredients on a can of soup to the source outlets where it thinks those ingredients came from. When you have that kind of automated mediation, the question of trust definitely arises.

And we haven’t seen the Clay Shirkys or the Larry Lessigs of the world start to talk about this yet.  Although I suspect that in the next four or five years that the zero click interface will become the primary interface, that we’ll have…we’ll come to assume that what we see with the extra enhanced data we get projected onto our view is the truth. Yet, at the same time, there is just no structure or mechanism even being considered for a democratic ownership of it.”

Augmented Reality will emerge through sensor fusion techniques & cooperating cloud services

In 2010, sensor fusion techniques and computer vision technology, in conjunction with GPS and compass data, will create data linking that can enable the kind of augmented reality that has been the stuff of imagination for nearly four decades (see Jarrell Pair’s post).

Putting stuff in the world in 3D is of course key to the original vision of augmented reality, and one of its biggest challenges.  Augmented reality is going to be implicated in a real time mapping of the world at an unprecedented scale and granularity.  We have barely an inkling of the implications of this now.

Anselm and Paige have been working in the heart of the social cartography movement for nearly a decade.  The vision and experience of this community is vital to understanding how augmented reality and the world as a physical platform can evolve into something that benefits people and allows them “to have a better understanding of the opportunities around them.”

We have been hacking maps for millennia – “from conceptual story mapping, to colloquial mapping in European development and the cartographic renaissance created by the global voyages and rediscovery of Ptolemy’s maps” (Andrew Turner). And, recently, initiatives on a public-provided GIS, like OpenGeo, have led the way toward more open, interoperable, geospatial data.

Mapping takes on a new and crucial role in augmented reality. Nokia’s ImageSpace is beginning to do what many thought Microsoft would do with Photosynth two years ago.

And, if we see these kinds of projects developed into “photo-based positioning systems” – “3d models of the environment to cover every possible angle, and then software that can work out in reverse based on a picture precisely where you are and where you’re facing” (Thomas Wrobel) – we would find augmented reality leaping forward overnight.

It is time to take very seriously the vast opportunities and potential pitfalls of an augmented world.

“when you are mediating the translation layer between the image and the data, then there is an opportunity for you to control it, and that opportunity is hard to resist.  It is hard to choose not to own that opportunity. It is an advertising opportunity. It is a revenue opportunity. It is a chance to send a message and a tone.

I know that Google and companies like that are keenly aware of the kinds of roles they don’t want to hold, but it is sometimes seductive to think about them. And I am afraid that we, as a community, need to assert an ownership, kind of a commons, over how computers will translate what they see to information that we perceive.”

There are some initiatives emerging. Tonchidot (who closed on $4 million of VC for augmented reality last December) has helped create the AR Commons in Japan. Ken Inoue, CFO of Tonchidot, explained in an interview with me in September 2009:

“We feel that public data, such as landmarks, government facilities, and public transport should be shared. We see an AR world where people can readily and easily access information by just seeing – quick, easy, and efficient. And because of this ease and intuitiveness, children, the elderly and handicapped will surely benefit. AR could help create a safer society. Warnings, alerts, and safety information could save lives and avoid disasters. These are what we, and AR Commons would like to tackle in the not so distant future.”

But the task of building a social commons for the physical world platform has only just begun.


Interview with Anselm Hook

anselm3

photo from Anselm’s Flickr stream here

Tish Shute: We first met last year at Wherecamp. The start of 2009 was, I think, the “OMG finally” moment for augmented reality, and in less than a year AR, at least in proto forms, is breaking into the mainstream! You are one of the founding visionaries/philosophers/hackers of the geo web and you have been thinking about the geo web and AR for a long time – all the way back to the legendary Head Map Manifesto, and before. Most recently you led the way in the very successful ARDevCamp in Mountain View. Could you start by telling me a little bit about the history of your pioneering work with geolocated data?

Anselm Hook: I am a long time Geo fanatic. I’m really interested in social cartography and what some people call public-provided GIS, that’s some language that people use. Anyway, my personal interest, when I talk to people who are non-technical (and it’s been a long term interest in the way I phrase it) is that I want to help people see through walls. So, the goal is very simple. I want people to have a better understanding of opportunities around them, the landscape around them. I always get frustrated when people make bad decisions because of a lack of information, especially when it’s related to their community and related to their environment. But, plainly put, I really just want “to help people see through walls”. It’s a very simple goal.

Tish Shute: I know you worked on Platial, which is really one of my favorite social mapping applications. It really broke new ground. What was the history of that? How did you get involved with Platial?

Anselm Hook: That’s an interesting question. It actually started at around 2000 when I saw Bruce Sterling talk. I had been writing video games for many years, and I was quite good at it, and I enjoyed it. But, the reasons I was doing it diverged from why the industry was doing it. I was making video games because I like to make shared spaces for my friends to play in and to share experience. I really enjoyed making shared environments. I worked on BBS’s and my friends and I were always making these collaborative shared environments.

Once the video game industry kind of started to take off, I started to do high performance, 3D interactive video games and making compelling shared spaces, and it was a lot of fun. But the frustration for me was that there was a huge industry growing around it and it became very commercial. Although it paid well, it started to diverge from my values, which were more centered around community environments and shared understanding.

Tish Shute: Yes, very rapidly the big games kind of devolved from the social aspects and became more and more single player, really, didn’t they?

Anselm Hook: It was the way, actually, because even though often you were in a many player world, you weren’t collaborating, everything else became just a target.  I liked the idea of deep collaboration that calls the kind of playful space you see in IRC, or in the real world, where people are solving real world problems.

And I grew up in the Rockies, and I always had a lot of access to the outside. So, I saw shared spaces and collaboration as a way to protect our environment. [ To step back ] I think people use different metrics for measuring their choices in the world, and many people have a value system centered around minimization of harm: making sure that people are not hurt. But my value system is different. I personally believe that protecting the planet is more important: to maximize biodiversity. I feel like protecting people around me comes from protecting the ecosystems they live in.

Tish Shute: That’s interesting, isn’t it, because the history of Keyhole was really that, wasn’t it.  Keyhole later became Google Earth, but I mean it began out of a project to look at what was going on in the ecosystem over Africa at that time, didn’t it?

Anselm Hook: Yes, in fact many people’s projects stem from an environmental concern. Mikel Maron’s work, for example – he’s doing Map Kibera, and he also worked on OpenStreetMap.

Tish Shute: Map Kibera – that is the new project?

Anselm Hook: Oh, yes, his project is called Map Kibera. He’s mapping a city in Africa.
[For more see Map Kibera’s YouTube Channel; photo below from ricajimarie]

dhj5mk2g_487qfcv76ft_b

Tish Shute: Right, great!

Anselm Hook: When I started to look at GIS and mapping I started to meet people who had a very similar background. What happened to me is I kind of stepped away from games around the year 2000. I’d seen a talk by Bruce Sterling at an event called Planetwork. And that event was, for me, a turning point where I decided to focus full time on exactly what I cared about instead of doing things that were kind of similar to what I cared about. So, his influence is a pretty significant one to me at that exact moment.


[For more see viridiandesign.org – seems that it is time for a “Neo-Viridian” revival.]

Tish Shute: It’s interesting because now your paths are crossing again with augmented reality. You are on the same wavelength again.

Anselm Hook: It’s funny, actually, I’ve had a couple of brief overlaps in that way.  Well, so in 2000 I went to see this talk and I did a small project called — well, I called it SpinnyGlobe. What I did is I mapped protests from a number of websites onto a globe to show the level of community opposition to the pending war in Iraq. It was the first time there had been a protest before a war. So, it was very interesting to me. [ See http://hook.org/headmap ]

Tish Shute: That’s really fascinating. Do you have any pictures of that you could send me?

dhj5mk2g_492ffct2df4_b

photo from anselm’s flickrstream

Tish Shute: Yes, I’ll definitely look SpinnyGlobe up. It sounds very interesting.  One of the aspects of your work on geo-located data projects like this and Platial is that you really started to develop this idea of a culture of place, about how people make place. This was the wake up call to me regarding the power of networks combined with geo-data.

We are hoping to extend this idea into augmented reality with an open distributed platform for AR so that we can collaboratively map our worlds from the perspective of who we are, where we are, and what we are doing. I know you’ve just done some work recently in augmented reality. I know you put the code up already.

By the way, I love the way you take your philosophy into the way you make code – the practice of making some code, trying some things out, making it all public and publishing your findings, you know, your comments on that experience.  Perhaps you could recap sort of how you picked up recently on the state of play with augmented reality and what aspects you looked at, and what came out of that experience?

Anselm Hook: So, it’s a very simple trajectory. Coming out of the work I had done on Platial, among other projects, I started to just look at the hyper-local, and I suddenly realized that even those services weren’t really speaking to living, and how to really see and solve local problems. What was missing was a sense of context.

The map doesn’t know how you’re feeling, it doesn’t know if you’re in a hurry, it doesn’t know what you want, it’s very static. Even the web maps are very static. And augmented reality, for me, I started to recognize as a combination of, well, it’s probably a collision of many forces, many forces that we’re all a part of. We also didn’t realize that the real-time web is really important; it’s part of what AR is about.

We have all started to realize that the context is important. You know, your personal disposition, your needs, if you want to be interrupted or not. That is the kind of thing that the ubiquitous computing crowd has talked about. We started to recognize that there are sensors everywhere, and the ambient sensing communities talked about that. So what is funny for me about augmented reality is I started realizing it is just a collision of many other trends into something bigger.

Everything else we thought was a separate thing is actually just part of this thing. Even things like Google Maps or mapping systems we think are so great are really just kind of almost an aspect of a hyper-local view. You actually don’t really care what is happening 10 blocks away or 100 blocks away. If you could satisfy those same interests and needs within a single block, one block away, you would probably be really happy. You really just want to satisfy needs and interests, find ways to contribute, or get yourself fed, or whatever it is you want. And AR seemed to be the playground to really explore the human condition.

Tish Shute: Anyway, I think one of the things that has been very amazing this year is that we now have good mediating devices that, for the first time, give us compasses, GPS, and accelerometers. But one of the missing pieces with AR at the moment is [tracking, mapping, and registration] – the kind of things colloquial mappings of the world could be of great help with.

We have seen mapping coming out of the Flickr data, e.g., the University of Washington put maps together from geo-tagged Flickr photos. Now if we could have that linked up with AR, then we have the kind of mapping we need to really hook the geo-data onto the world in a way that goes beyond…you know, what compass and GPS can really deliver is pretty minimal at the moment.

Anselm Hook: There is a real risk of our augmented reality world being owned by interests which are not our own. There is a real question of when you hold up that AR goggle, what are you going to see? Are you going to see corporate advertising? Are you going to see your friends’ comments or criticisms? Is it going to be an Iran or a democracy, right? It is unclear.

Right now there are some disturbing trends I have noticed. I am a big fan of Google Goggles. I think it is a great project. But when you are mediating the translation layer between the image and the data, then there is an opportunity for you to control it, and that opportunity is hard to resist. It is hard to choose not to own that opportunity. It is an advertising opportunity. It is a revenue opportunity. It is a chance to send a message and a tone.

I know that Google and companies like that are keenly aware of the kinds of roles they don’t want to hold, but it is sometimes seductive to think about them. And I am afraid that we, as a community, need to assert an ownership, kind of a commons, over how computers will translate what they see to information that we perceive.

Tish Shute: Yes. And this is how we met, again, recently [over the project to create an open, distributed platform for AR using the Wave Federation Protocol]…

This is something I feel really deeply: basically, we need the physical internet to be as open as the end-to-end internet has been. Or more so, actually, because the trend on the end-to-end internet has been toward walled gardens. Facebook became an enormous walled garden, and I think that, despite our predictions about them, [walled gardens] really are the social experience on the web. It’s very much in walled gardens still, and I really feel that with the physical internet we need to make great efforts for it not to be just a series of small pockets of privately funded walled gardens.

There needs to be some kind of communications infrastructure that keeps it open. That was when I got interested in looking at the Wave Federation Protocol, because it was an open, real-time protocol that could possibly be a basis for that. But I think the point you’ve spoken to just now, the mapping of the world and who has the “goggles”, i.e., the image data, the image databases that make the world meaningful, that’s still a BIG question [i.e. who controls the view?].

When I saw ImageWiki, [I realized] that is a piece that is vital for augmented reality. We need to have a huge social effort involved in this: linking in and creating the physical internet, and creating the image hyperlinks that will make that meaningful.


Anselm Hook: I think that’s a great point. The search interface, the kind of Internet that we’re used to, the way we talk to the network now, is fundamentally open end to end. Yes, you can have your oligarchies inside of it, as we see with Facebook, but you can always start your own venture up, and you can do a search on something and find that website and join it, or you can put up your own webpage and people can find it.

The translation layer, the idea of text search and the ability to discover, the power and the serendipity and the openness of that discovery, it’s pretty open right now. We do have some serious boundaries of language, which is one of the reasons I was working at Meedan.org [hybrid distributed, natural language translation] for a couple of years, trying to bridge that issue.

But here, as we move towards a physical internet where there’s no clicking and there’s no interface and the computer’s just telling you what it thinks you’re looking at, translating, you know, an image of a billboard to the name of the rock star who’s on that billboard, or translating the list of ingredients on a can of soup to the source outlets where it thinks those ingredients came from. When you have that kind of automated mediation, the question of trust definitely arises.

And we haven’t seen the Clay Shirkys or the Larry Lessigs of the world start to talk about this yet.  Although I suspect that in the next four or five years that the zero click interface will become the primary interface, that we’ll have…we’ll come to assume that what we see with the extra enhanced data we get projected onto our view is the truth. Yet, at the same time, there is just no structure or mechanism even being considered for a democratic ownership of it.

We have with DNS, for example, the idea that you can register the domain name and people can search for it, and find it, and go to it. There’s no such thing as an Image DNS, or an image translation to DNS right now. What does it mean when everything is just “magic”, when there’s no way for you to be a part of the conversation, where you’re just a consumer of what people tell you, or of what one company right now, tells you, is reality? That’s a real concern.

Tish Shute: This, to me, is the most important question at the moment. I mean, it’s the big one and it’s the place to put energy if you love the Internet [and what it can now become], right? You’ve got to put a lot of energy into this because this [a democratized view of the physical world as a platform] won’t just happen. There’s a lot of momentum already for it to be heavily privatized; one reason is that some of the computer vision algorithms that, say, make sense of things like the geotagged photographs are not open. I mean, for example, the beautiful maps that have been made at the University of Washington [from Flickr geotagged photo sets], those aren’t in the public domain.

Anselm Hook: Right, Tish, and in fact you’re referring [with the maps from the Flickr photos] to ordinary maps, and we’ve already seen that maps lie; we’ve already seen how much maps reflect a certain truth that becomes the normative truth. Google Maps reflects roads, because this is roads and cars, right? Only recently have they thought about buses and walking. So the normative view that people assume is the reality is showing off, you know, Starbucks, and roads, and cars; that becomes the default, and those prejudices are just assumed to be, you know, the truth. But they’re not the truth at all.

I was talking to a friend of mine in Montreal, [Renee Sieber], and she said that their Indian portage routes are a bridge across land and water; they don’t think of a piece of land and a piece of water as being different things, they think of them as one thing: a route. It’s already a different kind of language; we can’t even reflect it.

So not only is there this kind of formal, anthropological lie, in a sense, but there’s this way that we deceive ourselves because of our own prejudices.

Tish Shute: Yes, I agree, and that’s why I think some of the things you had written on ImageWiki point clearly to the need to create a social commons. We need a social commons for the real-time physical internet, and we need it for the image hyperlinks that make sense of that.

And it’s a complicated thing in a sense, though, because we don’t actually have a good distributed infrastructure for AR yet, and I found, exploring AR Wave, that at last we have the suggestion of an open, federated protocol for real-time communication – the Wave Federation Protocol. [Real time communications is a very important part of AR]. It isn’t an actuality yet where lots of people are able to use it and set up their own servers, and there’s not a standard all the way through [there is not a standard for how data is sent between the client and the server].

But the Wave Federation Protocol does make possible truly distributed social AR. I started thinking, when I saw ImageWiki, about bringing ImageWiki together with the social collaborative power of distributed AR. This really would be the basis of creating a social commons for augmented reality and the physical world as a platform – the start of a bottom-up approach, with deep social collaboration, to how we create augmented reality colloquial maps that can inform a hyper-local view of the world.

Anselm Hook: Yes. When Paige Saez, John Wiseman, and myself, and a few other folks… You know, Benjamin Foote, Marlin Pohlmann, and a couple other people started to play with this, we quickly found that… We started to realize, “Oh, this kind of thing will be at least as popular as IRC. There will be at least as many people doing this as chatting in little virtual spaces. There’ll be at least as many people decorating the world with augmented reality markup, and maybe using the real world as a kind of barcode for translating what you’re looking at into an artifact, a digital artifact.”

And that the size of that space was going to be huge, basically. Maybe not quite as commodifiable as Twitter, but certainly very energetic.

Many of the projects we did were just kind of looking at these kinds of issues sort of from an artistic, technical, and political point of view. We weren’t so much posing complete solutions, but simply using a praxis to explore the idea with an implementation, as a foundation for this discussion. So I think we sort of opened that can of worms for sure.

Tish Shute: Did you actually set up ImageWiki to be working as a location based app yet?

Anselm Hook: It is a location based app. It collects your longitude, latitude, and the image and stores it. And then it uses that as a way to translate that image to anything else. It could be a piece of text or a URL.
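[Editor’s sketch: the store-and-resolve flow Anselm describes here might look roughly like the Python below. This is not ImageWiki’s code. The names `ImageMark`, `ImageRegistry`, `register` and `resolve` are hypothetical, the images are assumed to be already reduced to 8x8 grayscale grids, and the simple “average hash” fingerprint is only a stand-in for whatever matching algorithm the team actually ran on the server. The thresholds are arbitrary illustrative values.]

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class ImageMark:
    fingerprint: int   # 64-bit average hash (stand-in for a real image matcher)
    latitude: float
    longitude: float
    target: str        # what the image resolves to: a piece of text or a URL


def average_hash(pixels):
    """Fingerprint a grayscale grid: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


class ImageRegistry:
    """Hypothetical in-memory store of image, location and target."""

    def __init__(self):
        self.marks = []

    def register(self, pixels, latitude, longitude, target):
        """Store the image fingerprint together with where it was taken and what it means."""
        mark = ImageMark(average_hash(pixels), latitude, longitude, target)
        self.marks.append(mark)
        return mark

    def resolve(self, pixels, latitude, longitude, max_bits=10, max_km=1.0):
        """Return the target of the closest-matching nearby image, or None."""
        query = average_hash(pixels)
        candidates = [
            m for m in self.marks
            if hamming(m.fingerprint, query) <= max_bits
            and distance_km(latitude, longitude, m.latitude, m.longitude) <= max_km
        ]
        if not candidates:
            return None
        return min(candidates, key=lambda m: hamming(m.fingerprint, query)).target
```

[Anyone can call `register` to say what an image means at a place, and `resolve` only answers with what someone in the community has registered nearby. That is the “wiki” part of the idea: who gets to write to this table is the ownership question discussed above.]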

Tish Shute: So there is a smartphone app, but you didn’t take it as far as an AR app yet?

Anselm Hook: No. We didn’t do a heads-up view. There are apps on the iPhone store that do that, but they don’t do the brute force image recognition that we were using. We used a third party off the shelf algorithm that we found on Wikipedia and downloaded the source code, and threw it on the server. And John Wiseman in LA wrote the scalable database backend so that we could scale the actual…

Tish Shute: So how did you set the iPhone app up to work?

Anselm Hook: The iPhone side was very simple. You take a picture of something and it tells you what it is. That is all it did. We would take the location, but the client side, the iPhone side, just rendered, returned to you…It said, “Someone said that this picture of a barking dog is an advertisement for a local band.”

Tish Shute: Right. So basically it was geo-tagged?

Anselm Hook: Yes. We are just collecting the geo information. Actually, there were a whole lot of technical challenges. The whole idea of ImageWiki is actually kind of beyond our technical ability for a small team like us. It really does take a team, a group like Google, to do this kind of thing in a scalable way.

Tish Shute: Why is that?

Anselm Hook: There are two sides. There is the curating of the images. I think that is the job of groups like us – open source groups who can curate images that are owned by the community. And then the searching side, the algorithm side, where you are actually matching the fingerprint of one image to images in your database, that takes a much more…that is much more industrial. We did both sides, but ours is not a scalable solution. It is mostly…proving that it could be done was important.

Tish Shute: In terms of hooking ImageWiki up to the collaborative possibilities of AR Wave, wouldn’t federation pose some interesting possibilities for scaling search algorithms and all that?

Anselm Hook: Yes. And what is funny also, incidentally, is that, nevertheless, we did look for some financial support for it, but we couldn’t…we just didn’t find the investors to scale it. Now, other companies like SnapTell took a shot at it. And they have an app in the iPhone store where you can point at a beer bottle and get back the name of the beer bottle.

The classic example everyone uses is a book. Amazon has all the image jackets of all their books. You can point SnapTell at almost any book and get back links to buy that at Amazon, the price of the book, and user comments on the book. So they are treating Amazon as the canonical voice of the book, for better or worse. That is the state of the art so far, up until Google Goggles came out a little while ago, which actually blows it out of the water. But, that is where we are now.

Tish Shute: Right. But the point you raise about how something like Amazon becomes the canonical voice of what a book is, right, this is the whole point, isn’t it?

Anselm Hook: Is Amazon truth? It’s not bad. Jeff Bezos seems like a nice guy, but, you know.

Tish Shute: And this is the point of having these open infrastructures for this.  And this should be obvious in a way, but it comes back to the thing about what made the Internet great was the fact that even though as you note, you get an oligarchy like Facebook, but people always could just go off and do something else, right? Because the fundamental infrastructure was basically open and designed to be available for everyone. And many people have championed that and fought for it hard [to maintain this openness] haven’t they? They have devoted their lives to keeping it that way, even if the oligarchies have done their thing.

Anselm Hook: Yes. There are really some things that are underneath all of this that haven’t been solved yet.

One is that the trust in social networks has not been built yet, so we can’t do peer based recommendations very well. We can’t filter noise by peers. Twitter kind of is moving there, but I don’t just want to listen to my Twitter friends. I want to listen to my friends of friends. If I am getting truth from somebody, I want to get that truth from people my friends say that they trust.
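
[Editor's sketch: one simple reading of "friends of friends" filtering is a bounded walk over a who-trusts-whom graph. The graph shape, the function names `trusted_circle` and `filter_by_trust`, and the two-hop limit are all assumptions made for illustration. Nothing here describes an existing service.]

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def trusted_circle(trust: Dict[str, Set[str]], me: str, max_hops: int = 2) -> Dict[str, int]:
    """Breadth-first walk: everyone reachable within max_hops, with hop distance."""
    hops = {me: 0}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the trust horizon
        for friend in trust.get(person, ()):
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    del hops[me]  # I am not part of my own circle
    return hops


def filter_by_trust(
    recommendations: Iterable[Tuple[str, str]], circle: Dict[str, int]
) -> List[Tuple[str, str]]:
    """Keep (author, text) pairs from the circle, closest authors first."""
    kept = [rec for rec in recommendations if rec[0] in circle]
    return sorted(kept, key=lambda rec: circle[rec[0]])
```

With `max_hops=1` this reduces to listening only to direct friends. Raising it to 2 adds the people those friends say they trust.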

Then the second problem is that there is a search business. My friend Ed Bice, who owns Meedan, always says that a search itself, a search request, is an opportunity to make…is a publishing moment. It is an opportunity to say what you think. In the real world, if you are just hanging out with humans and you look somewhere, other people might look at your gaze and they might look at what you are looking at. Your gaze itself is a public act.

Gaze is a soft act, but it is one that is visible. With Google, the gaze of four billion people is invisible. We don’t know what people are looking at, there is no opportunity to participate. Let me give you a real example. I have taken an image of something, the bust of a figure or a statue. Why can’t the museum in Cairo look at my request and tell me, oh yeah, that is Tutankhamen, or that is Nefertiti, right? Why can’t they have a chance to participate in the search and respond to me?

Right now the only person that responds is Google when I do a search. We need to invert the search pyramid and open up search, so that search is a democratic act, so that you can publicly permission your searches so that other people can respond and so that people can reach out to you, not just you having to do a dialogue.

The common example of this, and we see this everywhere: I am looking for a slice of pizza, right? Right now I am hungry, I want some pizza. I have to ask Google, look, find twelve websites, call twelve phone numbers, and talk to each of the twelve stores, and ask them: are they open late, is the food organic, is the food any good, do my friends like it.

Whereas what I should be able to do is just say it’s a search moment and I am interested in pizza. If those pizza places meet my criteria, like, you know, my friends like them and they are organic, they are open, then that pizza place can call me. I have the money, why should I do the search? So the whole business of search, the whole structure of search, is predicated around a revenue model, but it’s a really short-sighted revenue model, it’s not a brokerage.
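
[Editor's sketch: the inversion in the pizza example, where the searcher publishes an intent and the providers who match it answer, resembles a small publish/subscribe board. Everything below (`SearchIntent`, `IntentBoard`, `make_shop`, the attribute names) is hypothetical and only illustrates the shape of the idea.]

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class SearchIntent:
    """A publicly permissioned search: what I want and on what terms."""
    topic: str
    criteria: Dict[str, bool]


Responder = Callable[[SearchIntent], Optional[str]]


class IntentBoard:
    """Providers subscribe to topics; a published intent is offered to each of them."""

    def __init__(self) -> None:
        self._responders: Dict[str, List[Responder]] = {}

    def subscribe(self, topic: str, responder: Responder) -> None:
        self._responders.setdefault(topic, []).append(responder)

    def publish(self, intent: SearchIntent) -> List[str]:
        offers = []
        for responder in self._responders.get(intent.topic, []):
            offer = responder(intent)
            if offer is not None:  # the provider chose to reach out
                offers.append(offer)
        return offers


def make_shop(name: str, attributes: Dict[str, bool]) -> Responder:
    """A provider that answers only when it meets every stated criterion."""
    def respond(intent: SearchIntent) -> Optional[str]:
        if all(attributes.get(key) == wanted for key, wanted in intent.criteria.items()):
            return f"{name} can serve you"
        return None
    return respond
```

The searcher makes one call to `publish` instead of twelve phone calls, and the work of checking criteria moves to the providers.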

Search isn’t search, search is hand waving.  These should be moments for us to have a discourse. So the problem we are seeing in AR with communication of the right information is actually underneath AR, at the level of the whole infrastructure.

Search needs to be inverted, trust filters need to be built. We need to democratically own our data institutions.  We don’t right now.  That will be more of a concern, especially with AR.

Tish Shute: Yes, especially with AR, which is why I got all excited about federation.  Do you think federation has the potential, an opportunity to create [the new infrastructure you describe?]

Anselm Hook: Absolutely, it’s absolutely what we must do. It is much harder to do. It is absolutely critical.

Tish Shute: And why is it much harder to do? Could you explain that?

Anselm Hook: Well, it’s very easy for a bunch of hackers to build a service that you log into and fetch some data, it’s a single thing. They don’t have to talk to anybody, they can use their own protocols, they can hack it, it’s a big black box. Behind the scenes, there’s running back and forth in a giant Chinese room delivering manuscripts and scrolls to you. Whatever is behind the black box, you don’t care, it just works.  But when you federate, you need to actually publish and have standards, and then you talk about semantics, and everyone starts getting really excited and waving their hands. It becomes a disaster. It’s at least another order of magnitude more difficult than DIY, build it yourself.

Tish Shute: So, in terms of what Google Wave have done with their approach to federation, what do you think have been their achievements and what do you think are their obstacles? What do you think are the failings of Wave? Because it’s the first big, public, major-player-backed approach to something federated, isn’t it? In real time.

Anselm Hook: Yes. I think the most important non-federated service on the planet today is Twitter.  Identi.ca is not getting any traction with respect to Twitter. [ Even though ] Identi.ca is a federated version of Twitter and is very good. [ Identica is now Status.net ] . So, we see already there that small players aren’t being competitive. Then look at other services like IRC. IRC is the secret backbone of the Net. All the open source projects, all the teams, all the people that work on open source projects are all on IRC. It’s the only way they get anything done.

With Google Wave, and the protocols underneath Google Wave, we see an attempt to build a similar kind of real-time, but distributed, protocol. I think it’s the right direction. I think people should pick up the offering and make their own servers. I think that protocol is really great. I think the fact that it is compressed, it’s high performance, it is small, real-time blobs of data flying around, is all exactly the way it should be done. It is getting close to this kind of rewrite of the Internet that people keep talking about, because, you know, the net protocols are so bad. It is starting to treat the idea of intermittent exchanges as being more transitory, volatile, and not heavy.

….to be continued.  Part 2 coming soon!

categories: Ambient Devices, Ambient Displays, architecture of participation, Artificial general Intelligence, Artificial Intelligence, Augmented Reality, culture of participation, digital public space, Instrumenting the World, internet of things, mirror worlds, Mixed Reality, mobile augmented reality, mobile meets social, Mobile Reality, new urbanism, online privacy, Paticipatory Culture, privacy and online identity, social gaming, social media, sustainable living, sustainable mobility, ubiquitous computing, virtual communities, Web Meets World, websquared, World 2.0

3 Comments For This Post

  1. Darkflame Says:

    Wonderful interview, lots of very interesting stuff said and I’m sorry I don’t have time to reply to all of it specifically. Pretty much all of it along my own beliefs and lines of thinking, but also a heck of a lot of good points to let my brain digest.
    I especially found the Pizza example brilliant.

    I think overall I’m most scared of any emerging of a single “imageDNS” source.
    There shouldn’t be any “ownership” of linking. Not even a party-neutral registry.

    URLIP is a one to one relationship, which is something we want to avoid for ImageData links.

    There should be multiple image matching sources of which the user/client is free to choose which they want.
    ImageWiki and Google Goggles are both great things, but I hope they will become just one of many varying data sources that
    clients can look up from. (using some sort of standard protocol).

    Of course, the idea of a federated Wave based AR also ties into this, as Imagedata links can be created by and for
    relatively small specialist groups, as well as private and family use. However, Wave-based image links would have to be
    examined client-side for now, making them unsuitable for large-scale lookups. Still, smaller categorized image-based links
    still have their place.

    This all goes back to context being important; while the idea of “pointing at everything” to get information is
    attractive, we have to think carefully about how much of everything is in your field of view at any one time…chances are you
    only want to match a specific thing, and not be overloaded by information about the wallpaper, the floor, the type of
    lightbulb or whatever else catches the global image-search’s eye.
    Not to mention image-searching will be much more accurate and achievable if the system is only trying to match against a
    small subset of “everything”.

    I absolutely agree that clients should be able to automatically filter a lot of data by context, and work out the “most probable” things we want in any given situation. Users should also be able to set preferences and override, but in day to day activity the
    user shouldn’t need to prompt the client too much. I don’t think I believe in a truly zero-click interface, as free-will and the unpredictable nature of humans makes that impossible. But I do think we can minimise it to near-zero.
    Even this is such an incredibly hard task to achieve, and that’s another reason (imho) why we need an open federation for AR. There really needs to be good competition of AR Clients even when they are all looking at the same dataset. There is absolutely huge scope for research and experimentation into how best to choose and display the AR data that the user sees, but it needs to be open for this field to flourish.

    “One is that the trust in social networks has not been built yet, so we can’t do peer based recommendations very well. We can’t filter noise by peers. Twitter kind of is moving there, but I don’t just want to listen to my Twitter friends. I want to listen to my friends of friends. If I am getting truth from somebody, I want to get that truth from people my
    friends say that they trust.”

    This is fairly close to how Vark works (http://vark.com/ask).

    Ask a question and it will get “the truth” sourced from your friends and friends of friends.
    (if you let it, it will also search strangers’ views, based on their user-picked categories of knowledge, but it always biases towards those close to you).

    It strikes me there’s still a massive gap for opinion-based searches too. A lot of what we need isn’t just definitive answers, but rather “what’s the best for me”.

    What’s the best restaurant near here?
    What’s the best bookshop?
    What’s the best videogame to buy now?
    What movie should I see?

    A movie recommendation from a movie critic or from a friend won’t make much difference unless you know how closely their opinions correlate to your own. I think there’s a lot of scope for research here, and I’ve been making a website based on this concept, but it’s a little off-topic.

    Either way, as our clients get more advanced they will have to learn our preferences and tastes if they are to show us what we want to see.

  2. Darkflame Says:

    Incidentally, there’s some corruption in the post I made above. I foolishly used greater than/less than signs to represent a two-way link between Data and Image, which, of course, gets interpreted as markup.
    So;
    URLIP should read “URL to IP”
    and ImageData should read “Image to Data”
    etc.

    Sorry about that :)

  3. Jesper Says:

    The example of the pizza search was an excellent example of the deep dive required from each of Google’s search results. Anyway, I would any time prefer this ordering of results to the ancient Yahoo black/white/male/female/song/lyrics… categorizing of everything small and big. Google’s results match the randomness in picking clues from people’s brains, rather than retrieving the content according to the same index key as when it was stored.

    However, the pizza example has been solved: a Danish company is offering a mediating service for small restaurants to overcome exactly the problems identified in the interview. On http://www.just-eat.com/ the enterprise is explained for new markets, but try out e.g. http://www.just-eat.co.uk/, where you can search for the nearest restaurant according to your location and tastebuds.

    Can this method become integrated with the simplicity of google search – and even better, filtered onto your AR goggles?

14 Trackbacks For This Post

  1. Social Commons for the Physical World Platform « Games Alfresco Says:

    [...] considered a crime not to read them if you are an augmented reality fan or professional. Her last interview with Anselm Hook is especially interesting and [...]

  2. Social Commons for the Physical World Platform | Augmented Reality Says:

    [...] considered a crime not to read them if you are an augmented reality fan or professional. Her last interview with Anselm Hook is especially interesting and [...]

  3. Geospatial Technology « Steve Wilde Says:

    [...] http://www.ugotrade.com/2010/01/17/visual-search-augmented-reality-and-a-social-commons-for-the-phys… [...]

  4. Visual Search, Augmented Reality, and Physical Hyperlinks for Playfulness, Not just Purchases: Talking with Paige Saez about ImageWiki | UgoTrade Says:

    [...] most importantly, they have been actually developing applications (again see my interview with Anselm for more background on this), to allow people to play with, hack and explore and create with the [...]

  5. Are We Entering the Age of Augmented Trademark Infringement? | This Is An Awesome Web Site Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  6. Are We Entering the Age of Augmented Trademark Infringement? | iPhone 4 | iPhone | iPhone Review | iPad | iPod | iTunes Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  7. Are We Entering the Age of Augmented Trademark Infringement? Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the "imageDNS" – [...]

  8. If we in the era of Augmented trademark infringement? Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  9. the hive » Are We Entering the Age of Augmented Trademark Infringement? Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  10. Are We Entering the Age of Augmented Trademark Infringement? « Dave Saunders Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  11. www.readwriteweb — Are We Entering the Age of Augmented Trademark Infringement? | the leak in your home town Says:

    [...] an interview with UgoTrade author Tish Shute, AR developer Anselm Hook discussed what he calls the [...]

  12. Vision Based Augmented Reality (AR) in Smart Phones – Qualcomm’s AR SDK: Interview with Jay Wright | UgoTrade Says:

    [...] Shute: Anselm Hook is very interested in having some kind of open standard around this physical tagging of the world, [...]

  13. Augmented Reality Year in Review – 2010 « The Future Digital Life Says:

    [...] Quotes – This one comes from Tish Shute’s interview with Anselm Hook. When you are mediating the translation layer between the image and the data, then there is an [...]

  14. Interview with Vernor Vinge: Smart phones and the empowering aspects of social networks & Augmented Reality are still massively underhyped | UgoTrade Says:

    [...] Tish Shute: Powerful computer vision apps are emerging for smart phones and face recognition technologies are beginning to appear in consumer apps. Do you think we need a major shift in the way we handle data ownership? And, is “there is a real risk of our augmented reality world being owned by interests which are not our own?” (see my conversation with Anselm Hook last year. http://www.ugotrade.com/2010/01/17/visual-search-augmented-reality-and-a-social-commons-for-the-phys… [...]