Windowed user interfaces have remained the dominant, all-pervasive interface since the mid-to-late 1980s for all but the most specialized computer systems. Irreverently dubbed “WIMP” interfaces (window, icon, menu, pointer), they are indeed versatile, flexible, and more-or-less intuitive, to say nothing of their 25+ year establishment as the prevailing graphical interface to virtually all general-purpose computers. WIMP-type interfaces are of course not limited to Microsoft’s line of Windows operating systems. In fact, the very first WIMP interfaces were created at the Stanford Research Institute and Xerox’s Palo Alto Research Center in the late 1960s and early 1970s.
The 1984 release of Apple’s Macintosh personal computer brought the WIMP-type interface into the mainstream, on the heels of which came Tandy’s DeskMate, Amiga’s Workbench, Microsoft’s Windows 1.0, and the X Window System for Unix and later Linux-based operating systems. As computer technology continued to improve through the 1990s, adding more memory, faster 32-bit CPUs, and significantly higher graphics capability, WIMP interfaces likewise advanced with improved resolution, higher color depth, and the ability to multitask between applications. Such advances were seen with Windows 95, Windows NT, Mac System 7, OS/2 Warp, and the X Window System for Unix and Linux. Yet the WIMP paradigm remained essentially the same. Even today, with Windows 7, Mac OS X, and the latest 3D advances to the X Window System (such as the Compiz desktop cube), WIMPs remain the undisputed champion of the graphical user interface. The obvious exception, of course, is the multi-touch interface used on the iPhone and similar small devices, as well as the forthcoming Apple iPad.
Two Possibilities for Stagnation
There are two possibilities why WIMPs have been the predominant human-to-computer interface for the past 25 years. Either WIMPs provide such a simple, flexible, and intuitive means of interaction that anything else imaginable is just a second-rate hack, or the entire computer industry is suffering from a chronic lack of innovative imagination. Apple challenges us to “Think Different” (or at least they did in previous marketing campaigns), and Apple does shine when it comes to imagining new methods of human-to-machine interactivity. And yet, the most innovative interface they have managed to come up with in the past 20 years is the iPhone/iPad multi-touch technique. But Mac OS itself is still a WIMP.
While it is true that WIMP-type interfaces are relatively simple, flexible, and intuitive (in the sense that most people are by now familiar with them), I do not believe they are perfect. Nor do I believe we will still be using WIMP-type interfaces in the 2020s. Something else will have taken their place. Computers will become so advanced in the next 10 to 15 years that a WIMP interface will be akin to trying to pilot an F-22 fighter with a biplane’s control stick — it would be far too restrictive, to say nothing of being bland and unimaginative. The technologies and innovations necessary for this “new interface” of the mid-to-late 2010s are already here. In fact, we’ve seen bits and pieces of it in one form or another, scattered here and there in such devices as touch screens, netbooks, rich web GUIs, voice recognition, the iPhone, the Wii-mote, and even the complex-yet-intuitive interactive menus seen in the latest multi-player online games. All of these interfaces, while presently diverse, inconsistent, and highly divergent, will soon come together to usher in the “Next Big Thing” in the evolution of the personal computer.
False Starts a-Plenty
Of course, every company in the computer industry is looking for the “Next Big Thing”. Since the invention of the mouse, there have been countless attempts to contrive new and better ways for humans to interact with computers. Most have failed. Devices such as the trackball sought to replace the mouse, and although some people still use them they have seen only marginal adoption. Stylus tablets are used only by graphic artists or in computer-aided design applications. Touch screens have significant potential, but until very recently, have only seen widespread utilization in specialized niches such as ATMs and kiosks. Voice recognition is still lousy and is not really practical in an office environment. Can you imagine what it would be like in a cube-farm of office workers dictating emails and spreadsheets to their computers all at once? The concept of 3-D mice and interactive glove devices, while nifty and interesting, proved to be dismal commercial failures. And then there is still the problem of the window, icon, menu, pointer paradigm itself — a paradigm that has proven extraordinarily difficult to surmount.
To be sure, there have been many valiant efforts (and quite a few horrible ones) to escape the flat 2-D, rectangular box world of the WIMP. Visionaries of the late 90s foresaw advanced 3-D virtual realms where users might interact with their computer environment through electronic sensor gloves to rotate objects, push and pull applications into view, and type, quite literally, “in the air” on virtual keyboards floating in a cyberspace around them. More down to earth, Microsoft tried its hand at Bob — perhaps the most embarrassing product ever released by the Redmond Giant. Apple tried to improve the WIMP interface with OS X, but the only real change was its impressive array of eye-candy. For Linux and other X Window System desktops there is Compiz, and despite its eye-popping “wow!” factor, Compiz really does nothing to escape the WIMP-type interface — it only adds a rotating cube and OS X-style eye-candy. Windows Vista introduced Flip 3D, but that only proved to be a sad, sad shadow of what Exposé and Compiz are able to do. Sun’s Project Looking Glass had the potential to become a true 3D desktop environment, but the project seems to have been abandoned since January 2007. One interesting new innovation for modern WIMPs that has caught on is the “desktop widget”. Widgets are single-purpose blocks that perform a useful function, such as displaying the weather, providing a note pad, or displaying information on the computer’s system resources. However, today’s widgets are simply jammed on top of existing WIMP interfaces. In fact, none of these innovations goes far enough to dethrone the haggard WIMP.
Prelude to the Next Big Thing
To begin with, we need a name to call this hypothetical class of user interfaces beyond the WIMP. Most of the time, such user interfaces are simply referred to as post-WIMP interfaces, but such a designation only describes what the interface is not and tells us nothing about what it is. From this point forward, I will call this post-WIMP interface NICE, which is short for Naturalistic Interactive Contextual Engagement. Cheesy, maybe, but who cares. NICE is nice.
There are four key technologies that underpin the coming advancement of NICE interfaces, and these are where the true innovators must focus: 1) the death of mice and keyboards, 2) multi-touch combined with haptic interaction, 3) naturalistic display of applications and information, and 4) seamless Web/OS/application environments. Of all of these, point number 1 is most likely to raise some eyebrows. Obviously, people still need input devices of some kind, so it would be more accurate to say “the death of mice and keyboards as we know them”, but I will get to that shortly.
Many other technologies will likely go into NICE interfaces, including voice recognition, facial recognition, gesture recognition, eye-tracking, further advances in 3D manipulation, better virtual environments, greater CPU power, and higher network speeds.
Toss that Mouse, Trash that Keyboard
Let’s make one thing clear: the keyboard has to go. After all, keyboards are just glorified typewriters based on technology dating back 150 years, with an alphanumeric placement scheme identical to the one popularized in the 1880s. Don’t get me wrong, I do like Victorian-era aesthetics, and once you learn to touch type on a QWERTY keyboard it is an amazingly fast method for inputting text and information. But, in all honesty, the device we call the keyboard has to go. For one thing, the keyboard is an extremely limiting device. Most have between 101 and 104 primary buttons, and a few “advanced” keyboards add some multi-media buttons and various hotkeys. Apart from entering text and hitting those few pre-defined buttons, what good is a keyboard? Another minor quibble is that they are really hard to clean, and after a few short months of use can become quite nasty. I dare you to inspect your keyboard closely. Go on — take a good look!
As for the mouse… well, it was certainly an innovative invention for its time, but that time has come and gone. The same holds true for simple mouse alternatives like trackballs, laptop trackpads, and stylus pads. Just as we should throw out the keyboard, so too should we toss the mouse. Yes, a mouse is a very simple and useful tool, but it is still a rather limiting pointing device with just a few buttons (usually just two or three, and maybe a scrolly thing). Plus, you can only use one mouse at a time. Most modern computers allow you to plug in more than one mouse at a time, but they do not provide a second pointer. Even if they did, using a second mouse with your off-hand is oddly difficult (give it a try!). Consider the fact that you can use both of your hands and all your fingers rather intuitively to point at things and manipulate physical objects in the real world. Why not on computer interfaces as well? The technology for this is already prevalent, and is currently in use on devices such as the iPhone and the new iPad.
You Can Touch This
So the mouse and keyboard are both mercifully gone — and good riddance. But what now? You are sitting in front of a computer screen with no keyboard and no mouse. What do you do? The most natural course of action is to simply touch something you see displayed on the screen. But this is nothing new. Touch screen technology has existed for decades, most prevalently on ATMs and kiosks. Touch screens are also found in in-car navigation systems, PDAs, cell phones, and of course the iPhone and iPad from Apple. In fact, touch screens for tablet PCs have existed for some time, long before the “marvelous” iPad, and can also be found for desktop PCs and even some laptops as well.
Touch screens have one key advantage over an indirect pointer — immediacy of interaction. We are born knowing how to manipulate objects with our hands and fingers. An object on a screen such as a box or button, or even a 3D object such as a sphere or cube, can be quickly and intuitively manipulated by hand. The only real downsides of touch screens are the added cost (which will certainly come down as demand rises) and getting fingerprints on the screen. As for that last point, touch screens are at least trivial to clean — just give them a quick wipe with a cloth.
Whereas the mouse and older touch screen technology are restricted to a single pointer, Apple’s iPhone and iPad, as well as Microsoft’s Surface, are the preeminent examples of multi-touch technology. Multi-touch allows a user to employ one, two, or more than two points of contact to control and manipulate objects. For instance, a user could press his finger on a photo, hold and drag it around, or he could touch two corners at the same time and “pull” the photo wider. He could place his fingers on two separate photos and drag them around independently, or pull them together to “merge” them into a single photo. This is but one small example — the number of possible multi-touch combinations is literally endless.
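To make the idea concrete, here is a minimal sketch of how a system might distinguish the gestures just described from raw touch data. This is purely illustrative pseudocode in Python — the function names and pixel thresholds are my own inventions, not any real touch API.

```python
import math

def classify_gesture(prev_points, curr_points):
    """Classify a gesture from two successive frames of (x, y) touch points.

    Returns 'drag', 'pinch-out', 'pinch-in', or 'none'. Thresholds (in
    pixels) are arbitrary illustrative values.
    """
    if len(prev_points) != len(curr_points):
        return "none"
    if len(prev_points) == 1:
        # One finger: a drag if the contact point moved appreciably.
        (x0, y0), (x1, y1) = prev_points[0], curr_points[0]
        return "drag" if math.hypot(x1 - x0, y1 - y0) > 2 else "none"
    if len(prev_points) == 2:
        # Two fingers: compare the distance between the contact points.
        def spread(pts):
            (ax, ay), (bx, by) = pts
            return math.hypot(bx - ax, by - ay)
        delta = spread(curr_points) - spread(prev_points)
        if delta > 2:
            return "pinch-out"   # fingers moving apart: "pull" the photo wider
        if delta < -2:
            return "pinch-in"    # fingers converging: e.g. "merge" two photos
    return "none"
```

A real multi-touch system would of course track many frames, handle finger lift-off, and recognize far richer combinations, but even this toy classifier shows how naturally gestures fall out of a handful of contact points.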
But what about the keyboard? Surely, no one is going to type on a little virtual keyboard on their touch screen. The first concession we must make is that many people may prefer to use keyboards even in a post-WIMP world of NICE interfaces. But there are alternatives for those who wish to use them. The best alternatives are the so-called “virtual keyboards”. A virtual keyboard could simply be a touch screen of roughly keyboard-sized dimensions (about 16 inches by 6 inches), which lies on the desk in the traditional keyboard position. This virtual keyboard — or control board, more accurately — can project anything imaginable, including a traditional QWERTY keyboard layout. The real advantage, of course, is that this control screen is not limited to a QWERTY keyboard layout. It could display the keys in alphabetic order, or display keys for the Greek alphabet, or even Chinese characters. It could display additional function keys for an active application such as save, print, and open buttons, or buttons to control font and formatting. A graphics artist could draw directly on the control board with a stylus. When playing a movie or music, the control board could show multimedia controls. When playing a game, the game could display a customized control board complete with buttons, dials, gauges, spinner controls, action macros, and anything else the game designers want. And it would all work with full multi-touch capabilities.
Of course, the major drawback of a touch screen control board is the fact that you cannot feel the keys or buttons. This is a legitimate concern which needs to be addressed. One possibility comes from the field of haptic technology (“haptic” meaning “sense of touch, or force”). With haptic technology, it may be possible to create a touch screen with thousands of small actuators that can generate subtle but focused vibrations at the point of contact. These vibrations would stimulate the nerves in the fingers upon contact, thus providing tactile feedback upon pushing the virtual button. If the control board were pressure-sensitive, such that lightly touching a button only activates the vibration but does not actually trigger the button, then the user could “feel” the buttons without having to look down. The user could then activate the button by applying more pressure with his finger — and more pressure would cause the virtual button to “push back” with greater vibration. The control board could also produce a “click” sound as well. This would simulate, at least roughly, the tactile feedback of a physical keyboard.
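The two-stage pressure scheme just described can be sketched in a few lines. The pressure thresholds and response values below are invented for illustration; a real haptic control board would tune them per user and per key.

```python
# Hypothetical two-stage pressure response for a haptic control board:
# a light touch produces only vibration (so the user can "feel" the key
# without looking down), while a firmer press actually fires the key.
FEEL_THRESHOLD = 0.15   # enough pressure to locate a key by vibration
PRESS_THRESHOLD = 0.60  # enough pressure to activate the key

def respond_to_touch(pressure):
    """Map a normalized pressure reading (0.0 to 1.0) to board behavior."""
    if pressure >= PRESS_THRESHOLD:
        # Firm press: fire the key, "push back" hard, and click audibly.
        return {"vibrate": 1.0, "click": True, "key_fired": True}
    if pressure >= FEEL_THRESHOLD:
        # Light touch: gentle vibration proportional to pressure, no action.
        return {"vibrate": pressure, "click": False, "key_fired": False}
    # Below the feel threshold: no response at all.
    return {"vibrate": 0.0, "click": False, "key_fired": False}
```

The essential design point is the gap between the two thresholds: it is that dead band that lets fingers rest on and explore the board, just as they rest on the home row of a physical keyboard, without triggering anything.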
Interactive multi-touch display screens in combination with touch screen control boards will go a long way toward eliminating the need for mice and keyboards, as well as jumping light-years beyond their capabilities. But the post-WIMP world of NICE interfaces certainly does not end there. Voice recognition has come a long way in the past 20 years, but it also has a long way to go. As voice recognition continues to improve, we will be able to interact with computers with voice commands, or at least dictate text with a reliable degree of accuracy. However, interacting with a computer by voice is not always practical or preferable, especially in office settings. Rather than typing or dictating, some people may prefer to write by hand (for those of you who do not know, I am referring to that ancient and lost art of handwriting). In a world of NICE interfaces, we may not even need a control board at all. The keyboard itself, or any other controls, could be entirely virtual: visible only with special glasses or projected onto a surface with laser beams (such a laser projection keyboard is already available from I-Tech). Perhaps a special glove or a device similar to a Nintendo Wii-mote could be used to interact with objects displayed in a full 3-D space and to enable gesture-type commands. Indeed, a special glove or device may not even be needed if the technology to “watch” a user’s hands becomes feasible.
Such 3-D gesture interactions combined with touch screens bring multi-touch capabilities into a whole new realm of possibilities. A user could tap the corner of a photo on his screen, pinch his finger and thumb together, and pull back — thus “peeling back” the edge of the photo. He could rotate his hand to flip the picture over. He could nudge his hand forward, pushing the photo away, or pull his hand toward himself, bringing the photo closer into view. The same could be done with any kind of object presented on the screen, with countless possible interactions enabled by an endless variety of multi-touch and 3-D multi-gesture combinations.
Think Outside the Window
Perhaps the most important goal of NICE interfaces will be the elimination of the prevailing WIMP paradigm itself — that is, the paradigm of windows, icons, menus, and pointers. The pointer issue has been addressed, but what about windows, icons, and menus? Surely these cannot be entirely eliminated!
In truth, it may not be possible to entirely eliminate windows, icons, and menus. They are, after all, pretty good ways of presenting things for a user to interact with. People simply expect to be able to select an application from either a menu or an icon and have it pop open in a window. But nothing says that computers have to behave that way. Applications and operating systems only work that way because people programmed them to work that way. They can be programmed to work differently — and, hopefully, in a more natural and intuitive way.
To begin with, we have to consider for what purpose people use computers. In short, people use computers to run applications that do specific things like word processing or viewing a movie. Sometimes people run more than one application at once, but rarely will they be actively using more than one at a time (except perhaps listening to music or watching a video while browsing the internet or checking email). When a user wishes to “do something”, how does he go about telling the computer what he wants to do or what application to open? With WIMP computing, the user must open a main menu, scroll through a list of high-level menus, drill down to a sub-menu, and click on an application name. He might even have to drill down through several sub-menus to find the application. The location in the menu tree may not always make sense to the user either, or may be organized by the name of the company that made the software itself. Yuck.
A hypothetical NICE interface would do away with the desktop altogether and simply show a visually attractive representation of the computer system’s available services and applications (similar to the iPod Touch/iPad, but going well beyond). What this representation looks like could vary widely, and the user may be allowed to choose his preferred interface style, or customize it as he sees fit. One possibility is a sphere of bubbles floating in a 3-D space, each bubble being an application, service, or utility. The bubbles could be big or small, near or far, depending on how often they are used or how important they are to the user. Selecting a bubble (touching it or using a gesture) would unravel the bubble into view. Another possible representation might take the form of a large rotating dial, or dials within dials, divided into functional areas such as “Work”, “Entertain”, “Communicate”, and so forth. Rotating the dial to one of these areas would cause a larger selection of applications to rotate into view. This expanded area could then be rotated to select the desired application. If 3-D environments are undesirable, or deemed excessive or impractical, simpler representations are of course possible as well.
In fact, why should a user need to select “word processor” or “email client” in the first place? Perhaps there is a more naturalistic and intuitive approach. What the user really wants to do is “get back to writing my thesis” or “send a note to my friend, Bob.” NICE interfaces will be based on this kind of natural interaction, rather than the traditional menu-driven interactions we are currently used to performing. For example, using a dial-type interface, the user would have a set of initial options such as “Work”, “Entertain”, “Communicate”, “Shop”, “Read”, and perhaps “Manage” (for managing the system itself). Turn the dial to “Work”, and the dial expands to show “Use Calendar”, “Do Some Writing”, “Financial Records”, and “Garden Plans”. Selecting “Use Calendar” will then load the calendar into view. If instead he turns to “Entertain” he might see such options as “Watch a movie”, “Watch digital TV”, “Listen to music”, “Play a game”, or “Open photo albums”. Selecting “Watch digital TV” would open a listing of TV channels he can view. Selecting “Watch a movie” or “Listen to music” would bring up a list, or additional search options, to find a particular movie or piece of music (whether it is on his computer or accessed through a remote service over the Internet). Back on the main dial, turning to “Communicate” would give him options for his contacts, a public directory, voice calls, video calls, text messages (email), text chat (instant messaging), and perhaps even for writing and sending a physical letter or card through an online service. The “Shop” option would entail all forms of shopping (various online stores organized by specialty or preference), as well as finding physical shops to go to in town, or perhaps even ordering pizza to be delivered. Turning to “Read” would bring up options for reading books, magazines, and newspapers from online services, or for visiting web sites and searching the internet for information.
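The dial hierarchy described above is, at bottom, just a task-oriented tree. A minimal sketch of it as a data structure makes the point — the category and task names come straight from the example, while the lookup helper is purely illustrative:

```python
# The hypothetical NICE dial as a nested mapping: top-level "areas" of
# intent, each expanding into a ring of task-oriented options.
DIAL = {
    "Work": ["Use Calendar", "Do Some Writing", "Financial Records",
             "Garden Plans"],
    "Entertain": ["Watch a movie", "Watch digital TV", "Listen to music",
                  "Play a game", "Open photo albums"],
    "Communicate": ["Contacts", "Public directory", "Voice call",
                    "Video call", "Text message", "Text chat"],
    "Shop": ["Online stores", "Local shops", "Order pizza"],
    "Read": ["Books", "Magazines", "Newspapers", "Search the internet"],
    "Manage": ["Preferences", "Browse for software"],
}

def turn_dial(category):
    """Rotating the dial to a category expands its task list into view."""
    return DIAL.get(category, [])
```

Note what is absent: no application names, no vendor names, no menu tree organized around the software industry rather than the user. The structure is organized by what the user wants to do, which is the whole point of a NICE interface.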
The option for “Manage” exists because he has to have a way to manage various preferences about the way he interacts with the computer, or to browse for new software. Adding a new option to the dial, or reorganizing where things are on the dial, may be as simple as moving the items around and snapping them into place.
When an application itself loads, it will still be a kind of window. Today, the term “window” is used in computing to mean a “window into the application through which users interact”. Really, then, a window is simply an interactive view. In a post-WIMP world, these views need not be square or rectangular: they could be a circle, a cube, a collection of separate floating components, a set of interlocking frames, a multi-layered grid, or an interactive 3D world. Of course, because most things that we interact with on computers, such as documents, emails, financial statements, books, and movies, are all inherently rectangular, most applications will likely also be rectangular. But the squares and rectangles need not necessarily be a restriction, and so the idea of “windows” should naturally give way to the idea of “interactive views”.
It should also be pointed out that power users, system administrators, and programmers should have views into the computer’s inner workings that are normally under the hood as far as general users are concerned. Users who are interested in browsing the file system, setting up a scheduled command script, optimizing the start-up sequence, or doing their own programming, should be able to do so through a system management area where more complex interfaces would exist.
Obviously, the example given here is purely hypothetical and is only meant to illustrate the point that we do not have to restrict ourselves to a window, icon, menu, pointer type interface.
Computer Power Without End
The final piece of the puzzle in developing a post-WIMP system is the computer itself — that big, noisy, power-hungry, failure-prone box sitting next to you. There has been, in recent years, a great deal of talk about “software as a service” and “cloud computing”. If these ideas come to fruition, and wireless and Internet bandwidth speeds continue to improve, and if certificate-based encrypted communications (or something similar) can be made “really, really, really secure”, then the box on the desk may eventually become nothing more than a 3-D graphics and information rendering box. All your applications, data, movies, music, games, indeed everything, will be stored in either your personal online data vault (through a system that must be extremely secure and tamper-proof), or available through subscription channels (i.e., music, books, movies, newspapers, and so forth). You may think “I don’t want everything online!” but consider the possibilities: where you go, so goes all your data and subscriptions. Sitting at a coffee shop, you could open up your cell phone or pocket computer and access any newspaper or movie service you are subscribed to. TVs will also be connected to your online subscriptions (indeed, TVs will just be big-screen computers), and can access all your movies, music, shows, and games. In your car, you could dial up whatever music services you are subscribed to — or have books, newspapers, or magazines dictated to you. If you are at a friend’s house and want to show him that great novel you are working on, you can do so from his computer if you have shared a read-only version of it with him (which, if you forgot, you could access through your cell phone). Did you forget to bring those adorable baby pictures to work to show to all your colleagues? Not a problem (except perhaps for them) — just call the pictures up from any workstation that is linked to your personal data vault.
This idea is already here, in a very limited way, with such services as Flickr, Google Docs, and various social networking sites. You can also store files online through various secure data storage services. But none of these services has expanded to its logical conclusion, nor are they in any way consistent or transparently interconnected. Virtually all computers today still have hard drives where most users continue to store all their documents, photos, movies, and music. For people who do not back up frequently (of which there are many, simply because the average person does not know how), the risk of losing everything on their computer is very high. With full online storage, that issue goes away.
With the addition of cloud computing, much of the number crunching that goes on inside your computer could, in theory, be handled by massive, distributed computer systems dedicated to offloading CPU-intensive tasks. Cloud computing also includes software-as-a-service, where complex facial recognition software, voice recognition software, or a “personal assistant AI” is not actually running on your computer, but on a supercomputer which has the horsepower to handle it. The supercomputer will likely not be a single computer, but rather a distributed system of thousands of servers. The distributed system may not even be owned by anyone, but could run on public nodes, with every computer online acting as a “node” capable of processing information. Or maybe not. Who knows what the future really holds?
The development of Apple’s iPhone multi-touch interface and the new iPad tablet mark the advent of the first generation of NICE interfaces in the coming post-WIMP world. They are not the end-all be-all of some revolutionary new idea, but rather should be understood as prototypes of what is yet to come, much as Motif on Unix and the Apple Lisa were prototypes of the WIMP interfaces of today. The next ten years will see this revolution move into full gear, and I predict that even Microsoft will deliver a partial post-WIMP interface with Windows 9 when it is released in 2016, although a true NICE interface may not be released until Windows 10 (or would that be… Windows X? Gasp!).
The early part of this decade is also a time for open source developers to take the lead and dive head first into the post-WIMP world. Linux, with the X Window System (X11 and X.org) and its desktop environments such as KDE, GNOME, and Xfce, has an opportunity to leverage its position as an open platform to usher in a Renaissance of post-WIMP interface designs, prototypes, and free experimentation — the wilder the better. Linux developers should take pride in their efforts and seize the initiative to become the leaders in this new “think outside the window” movement. After all, Linux is better than Windows, right? Why not be better than Icons, Menus, and Pointers as well?
The only real obstacle that I foresee is the implementation of a multi-touch control board to replace the traditional keyboard. Some efforts are already underway (for example, the Optimus Tactus keyboard from Art. Lebedev Studio); however, all are in the early prototype stages. When such control boards are released, they will likely be prohibitively expensive: perhaps $800 or more in 2011 or 2012. By 2015, the price will hopefully be in the $100 range, making them at least reasonable for the average consumer, considering the immense advantage they hold over clunky old keyboards.
Then, and perhaps only then, will we see the NICE interface revolution ascend to its ultimate potential.