We have some HTC Vive VR systems at the Department of Design at the Aachen University of Applied Science. This system is a game changer for consumer grade Virtual Reality applications. I have to admit that I am awed by the engineering feat that Alan Yates and Valve have pulled off with the HTC Vive. I was not aware of the details when I first used the system.
Minority Report science adviser and inventor John Underkoffler demos g-speak — the real-life version of the film’s eye-popping, tai chi-meets-cyberspace computer interface. Is this how tomorrow’s computers will be controlled?
G-Speak is a really interesting concept. Right now I do not feel it is where it should be to be adopted on a broader scale: You need a certain environment with at least 2-3 square meters of space in front of a quite large screen.
I wonder if Microsoft will some day offer an extension to its Project Natal sensor, so that voice commands, body language and hand gestures create an immersive UI.
The algorithms for facial recognition have improved a lot in recent years. Here is a company showing a working prototype of a mobile app that recognizes faces and attaches links to social network layers to them:
The prototype was shown last year, but there was a live demo at the Mobile World Congress in Barcelona last week. Obviously the company, which also created the polarrose.com service, wants to turn this into a real application.
The implications of this are shown in the video: when seen through the “eyes of the app”, people virtually carry logos, brands, name tags and messages around.
10/GUI (by Clayton Miller) is a novel approach to human-computer interaction. But it draws attention to the fine line designers will need to walk to effectively create physical human-computer interactions.
The video demonstrates the potential advantages of navigating within a desktop interface with up to ten fingers, rather than via a single cursor:
[More on fastcompany.com]
These research prototypes for future computer mice by Microsoft are very interesting approaches:
The new iPhone 3GS adds a compass to the set of sensors. Combined with the GPS, the motion detection sensor and some image change detection via the internal video camera, this enables a new breed of “augmented reality” applications.
NearestWiki, for example, displays Wikipedia entries about buildings and places in the vicinity.
NearestWiki is not the first augmented reality app for the iPhone, but it is the first that is not tied to a specific region or city (like Metro Paris).
Next versions of the iPhone may feature more precise sensors and a lower latency – giving a much better feeling (e.g. labels not jumping around in the scenery).
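To make the combination of GPS and compass a bit more concrete, here is a minimal Python sketch of how an app of this kind could decide where to draw a label. This is not how NearestWiki actually works; the function names, the 60 degree field of view and the 320 pixel screen width are assumptions for illustration, and the linear mapping from angle to pixel is a simplification.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north (0 <= result < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def label_x(heading_deg, target_bearing_deg, fov_deg=60.0, screen_width=320):
    """Horizontal pixel position for a label, or None if the target
    lies outside the (assumed) camera field of view."""
    # Signed angle between compass heading and target, in [-180, 180).
    offset = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2:
        return None
    # Simplification: map the angle linearly onto the screen width.
    return round((offset / fov_deg + 0.5) * screen_width)
```

The GPS fix gives the two coordinate pairs, the compass gives `heading_deg`. Any noise in either value moves the label, which is why more precise sensors would stop labels from jumping around.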
I am just collecting some thoughts about some observations and issues – while I am trying to understand Google Wave.
(see a demo on their site)
Google Wave is an integrated set of technologies (with protocols that allow semi-synchronous editing of outlines and their federation across several servers). With this approach Google Wave solves some difficult technical and infrastructural problems.
But it also generates some new problems that need to be solved to make Google Wave a success. Otherwise I think users will not adopt the system (which, in the case of Google, will seal the fate of this project, I suppose!).
1. Misty horizon
Google Wave is a framework-style solution for things people did not ask for and communication processes that no one is practicing yet (but no one really “asked” for the mouse as an input device either!). It is hard to see where Google Wave is going. This breeds creativity, but it also challenges the non-developer. There will be best practices, but it will take a lot of time to identify use cases that people can learn “to wave” with.
So Google Wave challenges the imagination – and few people will be able to answer the “What is it all about?” question easily. The horizon is shrouded in mist.
Possible approach: A potential solution to this is to start with guided tours (a LOT of them) showing very common and powerful use cases for different scenarios. This is probably going to happen when Wave gets closer to the public beta.
2. Asynchronous patterns
We have learned to communicate in a turn-taking fashion. It is polite to let someone speak until he has finished before starting to respond. It is not polite for everyone to speak up at any time. Waves allow people to reply or edit without obeying the turn-taking pattern. This can cause “stress” and also a lot of misunderstanding. People could reply to a text that is going to change without them noticing. Their reply suddenly becomes nonsense – the playback feature could become the only way to perceive a conversation properly. But playback is new – people have learned that the threaded view is a chronology – but in Google Wave it is not (or not necessarily).
Even with the playback feature, people need to become aware of the asynchronicity in Google Wave – and learn how to recap conversations correctly.
Possible approach: Find a very good way to understand the chronology of a wave (e.g. making playback as fundamental to navigating a wave as scrolling).
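A small Python sketch may help to show why a reply can turn into nonsense and why playback matters. It assumes a wave is stored as a log of edits that can be replayed up to any point in time; the `Edit` type and `state_at` function are hypothetical names and say nothing about how Google Wave really stores its data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edit:
    time: int         # logical timestamp of the edit
    message_id: str   # which message in the wave was written or changed
    text: str         # full text of that message after the edit

def state_at(log, time):
    """Replay all edits up to and including `time`.
    Returns a dict mapping message_id to its text at that moment."""
    state = {}
    for edit in sorted(log, key=lambda e: e.time):
        if edit.time > time:
            break
        state[edit.message_id] = edit.text
    return state

log = [
    Edit(1, "a", "Shall we meet on Monday?"),
    Edit(2, "b", "Yes, that day works for me."),  # reply to the Monday version
    Edit(3, "a", "Shall we meet on Friday?"),     # original changed afterwards
]
```

In the final state, message "b" seems to agree to Friday. Only `state_at(log, 2)` shows what the reply actually referred to, so the threaded view alone is not a reliable chronology.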
3. Information (over)flow
While Google Wave may integrate many messaging systems – it also generates a lot of density. Means of communication that were apart from each other – using different URLs and applications for each – are now combined. The crucial part of that is to understand which option is suited for what purpose.
With Waves being set to “updated” by displaying them in bold typeface and sorting them to the top of the inbox, this also means that things are brought to my attention that should remain buried for a good reason. Google Wave users would have to learn how to manage and understand the “Inbox” and the “Active” areas properly to be able to get the most out of it.
Possible approach: Allow users very powerful and fine grained control over the way they are informed about updates.
4. Scattered spaces and fragmented scopes
One of the things that really can make things too complex to be comprehended properly is that people can read & write to waves – but replies can extend or narrow the scope (e.g. who may read and reply to a new item. Who is reading? Who am I replying to? Is this part really private or not? Am I releasing a secret to the public accidentally?)
Seen from a different angle: What I can see within a wave may be different from what someone else is seeing. To make my communication appropriate to the situation I need to be able to “read” from a different standpoint. This is required to understand when communication could fail on the receiving end.
Whenever I want to understand the perspective of someone else – I need to be able to represent his/her view in my mind. The change of scope for parts of a wave within that wave can make this difficult.
Possible approach: Make any changes of the scope (e.g. recipient list) within a wave very visible and allow users to navigate them.
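As a rough illustration of this approach, the following Python sketch walks through the messages of a thread and reports every point where the set of readers changes. The data layout (a list of author and recipient pairs) and the function name `scope_changes` are assumptions made for this example only.

```python
def scope_changes(messages):
    """Yield (index, added, removed) wherever the recipient set
    differs from that of the previous message in the thread."""
    previous = None
    for index, (author, recipients) in enumerate(messages):
        current = frozenset(recipients)
        if previous is not None and current != previous:
            yield index, sorted(current - previous), sorted(previous - current)
        previous = current

thread = [
    ("anna", {"anna", "ben", "carl"}),
    ("ben", {"anna", "ben"}),            # private aside: carl cannot read this
    ("anna", {"anna", "ben", "carl"}),   # back to the full group
]

changes = list(scope_changes(thread))
```

A user interface could render each of these change points as a visible marker in the wave, so that the author of the third message notices that carl never saw the second one.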
The developers of the Webkit HTML rendering engine (the one that is used in the Apple Safari Browser) have added 3D styles to CSS. It allows layers to be rotated, scaled and moved in a 3D space.
You need to download a nightly build of the browser to see it working.
There are quite a number of applications for this I can think of. I wonder if this approach will be adopted by the W3C for a new CSS standard.
I have been thinking about Project Natal over the weekend. I do not want to discredit some of the innovations Microsoft has created over the last two decades – but for the most part Microsoft has not been able to create innovations on its own (it has rather mimicked or bought stuff from outside). There may be some advances like C# and .NET – but generally this is insider stuff – meaning nothing to a wider public.
Project Natal may be the first true innovation with a Microsoft stamp on it. Fifteen years ago I saw programmers trying to recognize 2D movements of arms and legs from a video – with results that were respectable – but never a game changer. Too much CPU power was required back then to be relevant in the consumer market.
Including the third dimension in the motion detection is such a game changer. Combined with voice and face recognition, this takes the controller out of the control: your full persona is represented in the system – not just your fingertip. This is radical – and it has been a dream for many, many years.
Just look at this example from game designer Peter Molyneux from Lionhead:
The device is so complex that a developer will need access to an SDK that allows simplified communication with the sensory system of Natal. Frameworks could provide automatic recognition of gestures to programmers – even in combination (so if you wave your arm, that would call a different function than waving your arm and saying “Bye!”).
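How such a framework might look to a programmer can be sketched in a few lines of Python. This is purely hypothetical and not based on any Natal SDK: the class `GestureDispatcher`, the event names and the rule "the most specific combination wins" are all assumptions for illustration.

```python
class GestureDispatcher:
    """Maps combinations of recognized events to handler functions."""

    def __init__(self):
        self._handlers = {}

    def on(self, *events):
        """Decorator: register a handler for one combination of events."""
        def register(func):
            self._handlers[frozenset(events)] = func
            return func
        return register

    def dispatch(self, events):
        """Call the handler of the largest registered combination that is
        fully contained in the observed events; None if nothing matches."""
        observed = frozenset(events)
        matches = [combo for combo in self._handlers if combo <= observed]
        if not matches:
            return None
        return self._handlers[max(matches, key=len)]()

dispatcher = GestureDispatcher()

@dispatcher.on("wave_arm")
def greet():
    return "hello"

@dispatcher.on("wave_arm", "say_bye")
def leave():
    return "goodbye"
```

With this setup, `dispatcher.dispatch(["wave_arm"])` calls `greet`, while `dispatcher.dispatch(["wave_arm", "say_bye"])` calls `leave`. The recognition itself, which is the hard part, would be hidden behind the event names.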
The level of precision could increase with future revisions. It could be combined with classical controllers. Maybe one day even finger positions, fluctuations/timbre of the voice, body temperature or point of view will be detected as well. Simple “lite” versions specialized on facial parameters could replace webcams in laptops.
So I do not look at Natal as a game controller – I see it as a completely new interface generation coming up.
Hats off to Microsoft!
Obviously Microsoft feels the need to claim back some market share the Nintendo Wii took away with a new controller type. Project Natal is utilizing a range of biometric sensors for body motion, face and voice recognition.
The video is more a vision than an actual feature presentation. But it is clear what the goals are.
Here is another video from the demonstration that shows what is possible right now:
- Better and more intuitive device interaction
- Everyday devices connected to the Internet
- Multi-touch, without touching the screen
- Interactive and intuitive user interfaces for better browsing
- Gesture based interfaces
- Interfaces aware of context
- New materials that will influence UI
While I agree with the list in general, there is something I do not like about it: this list is purely determined by technological advances.
We will see changes in almost all areas of society: how we shop, how we love, how we go about politics, what we regard as value, etc.
So I add some other (very speculative and spontaneous) ideas that are not so much based on hardware innovation:
- Laws that require that users have ideal control over privacy issues (hopefully!)
- Programmable operating systems on any device with good security
- Redundant storage on different locations that “logically cloud together” in a personal and searchable environment
- Working culture that permits more work “on the road” than before (specifically regarding the social aspects involved in this)
- Affordable plans for wireless connectivity and low-priced roaming
- Architectural advances that integrate media and new display/projection technologies into the interior environments
Johnny Chung Lee made an interesting piece of software that uses a Wii remote to allow interaction with multiple pens on any screen or projection.
One of the most interesting topics for information architecture is search. There are ways to find, explore, browse and discover things in digital domains. The value of information increases with the possibility of being found. So design for findability becomes the most important strategy to increase the value of information.
One of the distinguished experts in the field of Information Architecture is Peter Morville. He gave an interesting one-hour talk at Google about »Ambient Findability and the Future of Search«:
He talks a lot about the problem of search in general (he is speaking at a search engine company). How to enable better search and findability is a question of a) metadata and b) representation.
It is the representation aspect of searching and finding that is still a huge area for design innovations. While improving the Google search result page may be too difficult, there are a lot of very specific problems where searching and navigating an information domain becomes a very interesting and particular design issue.
A designer needs a good understanding of the fact that users have different approaches to locating things depending on
- the nature of the information,
- the structure and relations,
- the quantity of data,
- their habit of solving things systematically and
- the prior knowledge about the domain.
There constantly is a discussion about making the computer feel more “natural” to the user. I do think that this approach led to the graphical user interface we are all used to today and it is philosophically the right approach to deal with technology. But I also do believe technology is not yet ready to allow “natural” interaction in most occasions.
If you don’t believe me this video will hopefully bring this discussion to an end:
And don’t miss this Second Life parody:
While researching some information about user interfaces for video sharing websites I stumbled over a statement in an article at news.com by Alonso Vera of NASA Ames Research:
»Design is starting to change who succeeds and who fails. A few years ago that wasn’t true. If I had a better algorithm, I would win!«
I had a conversation about this yesterday. Users are fed up with lousy interfaces. Whenever I do research into a UI field – mobile phones, websites, software applications – I run into these strange situations in which the user is left alone with an error message and literally no practical advice on what to do. It really baffles me that obvious pitfalls are left open until the release of the software or product and remain unaddressed for months, if not years. So if design becomes a market driver, that is a very good sign!
Example: If you are a Macintosh user (and some are indeed) and you go to hi5.com – a music/video community website with 50 million users (!) and you want to upload a video file (one of the core features), you will be presented with a form where you can enter title, tags and category for your video. Fine.
But after clicking you will see this error screen (above) telling you that someone named VideoEgg (What by the way do they have to do with it?) does not support the browser you are using and that you need to go to videoegg.com to check a list of supported browsers.
While I wondered why I was not presented with that list (or at least a link to that list) directly (again, when were hyperlinks introduced to the World Wide Web??), I almost got angry after unsuccessfully trying to find such a list anywhere on the proposed site. Remember: Hi5.com claims to serve 50 million users!
It took me half an hour to try all browsers available on my system – costing me time, quite some nerves AND money. And the effect of this really obvious and simple UI problem: I won’t give hi5.com a second chance or recommend that site.
What does that mean? Maybe the “web 2.0” market is one where idiots happily buy stuff from slightly smarter idiots?
Tobias Jordans pointed to a new astonishing video of an 8-by-3 inch two-panel multi-touch sensitive wall mounted screen:
The video makes evident that the interaction feels continuous and natural. And since Apple has shown with the iPhone that multi-touch can improve small interfaces as well, I think this technology could replace the mouse one day.
Update: There is also a short presentation from February 2006 by Jeff Han at the TED 2006 conference.