July 05 2009

Why Current Operating Systems Suck: the Mouse

Tags: windows, touch, mouse

Background: Long Zheng mentioned an offer inviting people to get paid and provide feedback about their current experiences with operating systems. I can’t fault him for working to bridge the gap between OS makers and users (that’s gotta be the end result of any research, otherwise it’s about as useful as shouting at a canyon), but I thought I’d go in another direction and vent directly about what is broken with the current crop of operating systems. Partly because 20 minutes is not enough time to outline years of frustration and tired concepts, but mostly because the issues I have also have some historical context.

 

[All of the following should not be considered as fact. This is just the musings of a jaded geek. – Ed]

The Computer Mouse as a Concept

The current form of the mouse is a concept that came out of Xerox PARC in the early 70s. It had a metal ball underneath and three buttons on the top. There were some earlier versions which were much more radical (one even used a bowling ball as a component) but the essence of the mouse remains unchanged.

AltoMouse
I am glad that the mouse style improved, however. Source: DigiBarn Museum

So, why change something that has been around for 30+ years and is attached to just about every computer out there? After playing around with other types of input which are designed to better represent natural gestures, such as styluses, touchscreens, big damn tables, and even tyre-changing addons, I can’t help but feel that it is time to stop building operating systems and software designed specifically for the mouse, and focus on progressing towards user interfaces that aren’t tied to it.

 

You want specific reasons? Read on.

Fitts’ Law

Ever get frustrated that you have to move the mouse carefully so that you don’t overshoot a target on the screen (for example, the button to close a popup that’s appeared and distracted you)? That’s Fitts’ Law in action.

 

Having come from a Computer Science background, I was introduced to this theory in a subject called Event Driven Computing (which had a fair bit of work around user interfaces). Basically, this law states that the time to move the mouse cursor to a specific target depends on two things. The first is the distance (D) to travel, and the second is the width (W) of the target: the further away and the smaller the target, the longer it takes.

 

The law was first published in 1954, and has proven to be remarkably accurate in subsequent studies.

FittsLaw
Apologies for the math. It won’t happen again (until the next time).
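
For anyone who wants to play with the numbers, here is a rough Python sketch of the law. It assumes the commonly cited Shannon formulation, T = a + b * log2(D/W + 1), which may not be the exact form shown in the image above. The constants a and b are made-up placeholders; real values have to be measured for each device and user.

```python
import math

def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds (Shannon formulation).

    a and b are hypothetical constants; real ones are fitted from
    measurements for a given device and user.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# Same distance, smaller target: the predicted time goes up.
big_button = fitts_time(distance=800, width=100)
small_button = fitts_time(distance=800, width=16)
```

The absolute numbers mean nothing with invented constants, but the comparison holds: the 16px target is predicted to take longer to hit than the 100px one.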

It became apparent when the graphical user interfaces started to become popular that the issue needed to be addressed. The most common fix for this is to put menus and commonly-used items at the edge of the screen, so that the user cannot overshoot the target area.
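
Here is a toy sketch of why the edge trick works, assuming a one-dimensional screen and a cursor that gets clamped to the screen bounds. The screen height, menu sizes and function names are all invented for illustration.

```python
SCREEN_HEIGHT = 1080  # hypothetical screen height in pixels

def clamp_cursor(y):
    """Keep the cursor on screen, as happens at the screen edges."""
    return max(0, min(SCREEN_HEIGHT - 1, y))

def hits_target(y, target_top, target_bottom):
    """True if the clamped cursor lands inside the target band."""
    y = clamp_cursor(y)
    return target_top <= y <= target_bottom

# A 20px menu bar at the top edge versus one 100px down the screen.
edge_menu = (0, 19)
floating_menu = (100, 119)

# Aim at the middle of each menu and overshoot upwards by 50px.
overshoot = 50
edge_hit = hits_target(10 - overshoot, *edge_menu)           # clamped to y=0
floating_hit = hits_target(110 - overshoot, *floating_menu)  # lands at y=60
```

The edge menu is still hit after the overshoot, while the floating one is missed. In Fitts’ Law terms, the edge target behaves as if it were much deeper than it looks.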

 

I’ve included a screenshot of how the position affects the “target region” when comparing Safari on OS X and Windows. OS X works around Fitts’ Law by moving the menu to the top of the screen (just don’t try and close the window itself). Windows just ignores the issue and forces the user to aim for a small region.

TargetAreaComparison
Safari for OS X (top) and Windows 7 (bottom)

Some operating systems work around Fitts’ Law better than others, but real estate at the edge of the screen is always at a premium (and if you have multiple monitors you sacrifice some of that real estate to get a bigger desktop space), so I view this as a hack rather than a solution. Plus, most applications cannot push all their interactive elements to the outer edges (unless you’re doing something trivial like a document viewer, perhaps) so they need to design their applications so that the mouse gestures are optimised. That has led to conventions like menus.

MenuStructure
Sure, menus organise a lot of functionality. But can it be done better?

See all those keyboard shortcuts? That’s a good indicator that, after a while, users get sick of navigating menus with the mouse. Some applications that have a lot of functionality (this is from Visual Studio - you need to know shortcuts to maintain your sanity while using it) need to resort to using menus.

 

[Ed: I want to do some testing of whether Fitts’ Law applies to touch input – while it has some similarity to mouse input, I feel that it will perform better than using mouse input. Stay tuned.]

Why no Multi-User Interactions?

Ever get frustrated when you plug in a second mouse and expect (or desire) to see a second cursor appear?

 

Your average operating system is a single user experience. These days you can log in as different users (like in Unix and its derivatives) and you can access multiple desktops running concurrently (e.g. Remote Desktop). But at the desktop level, the standard is to have a single cursor and a single active window. That may have been something more suitable in yesterday’s world - where 640K was enough for anybody - but not for today’s world.

 

These days, I find myself wanting more. I do a fair bit of pair programming – whether it is throwing around ideas for implementing features or solving particular issues – but sharing a computer between two users is a tedious experience. Either one person coaches the other or we throw the keyboard and mouse between each other until the task is done.

 

To be honest, I’m not sure what I want specifically. But I do know that I want more freedom. I’m sure there are some human-computer interaction rules that are violated by having multiple users being active at once, but that is beside the point.

 

Touch Gesture != Mouse Click

The Engineering Windows 7 blog has a nice overview of how touch gestures in Windows 7 behave - but the underlying vibe I get from the gestures is that they’re focusing too much on making touch behave like mouse clicks, instead of deliberately separating these features out. Things like this:

 

“Drag – Touch and slide your finger on screen. Like a dragging with a mouse, this moves icons around the desktop, moves windows, selects text (by dragging left or right), etc. This works everywhere.

“Flicks – Flick left or right to navigate back and forward in a browser and other apps. This works in most applications that support back and forward.”

 

You’d think that these gestures are mutually exclusive, but what I’ve found while building applications is that the Flick is really specific (move X distance within a specific period of time), and the workaround of listening for a Drag on the window (and calling the navigation event you were expecting originally) introduces other issues – like FlowDocumentPageViewer text being highlighted.
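
To show how narrow that window is, here is a minimal sketch of a flick-versus-drag classifier. This is not how Windows 7 actually decides; the thresholds and names are made up purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the real values Windows 7 uses are not given here.
FLICK_MIN_DISTANCE = 100   # pixels
FLICK_MAX_DURATION = 0.3   # seconds

@dataclass
class Stroke:
    dx: float        # horizontal travel in pixels (negative = left)
    duration: float  # seconds between touch down and touch up

def classify(stroke):
    """Return 'flick-left', 'flick-right' or 'drag' for a horizontal stroke."""
    fast_enough = stroke.duration <= FLICK_MAX_DURATION
    far_enough = abs(stroke.dx) >= FLICK_MIN_DISTANCE
    if fast_enough and far_enough:
        return "flick-left" if stroke.dx < 0 else "flick-right"
    return "drag"

quick = classify(Stroke(dx=-150, duration=0.2))  # fast and long enough
slow = classify(Stroke(dx=-150, duration=0.5))   # same travel, too slow
```

With these invented numbers, the same 150px swipe is a flick at 0.2 seconds and a drag at 0.5 seconds, so a slightly lazy flick ends up on the drag path instead.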

 

“Zoom – Pinch two fingers together or apart to zoom in or out on a document. This comes in handy when looking at photos or reading documents on a small laptop. This works in applications that support mouse wheel zooming.”

 

This may seem silly, but when I manipulate objects on a touch surface, I perform the opposite gesture. This is also opposite to how the Surface ScatterView items behave - two fingers moving away from each other will increase the size of the item, two fingers moving towards each other will reduce the size of the item. [Ed: Here’s a quick video introducing the Surface concept that came up at REMIX Australia]

 

So which is it? I hope I’m wrong here (we ended up using our own code, rather than leaning on the default behaviours, so I can’t confirm that this is the default behaviour) but it shows that even a simple gesture such as zoom can be misinterpreted.
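
For reference, the ScatterView-style mapping described above can be sketched as a ratio of finger distances. This is an illustration in Python with made-up names, not the Surface SDK or the Windows 7 implementation.

```python
import math

def finger_distance(p1, p2):
    """Straight-line distance between two (x, y) touch points."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def pinch_scale(start_a, start_b, end_a, end_b):
    """Scale factor for a two-finger gesture.

    > 1 means the fingers moved apart (grow the item),
    < 1 means they moved together (shrink the item).
    """
    start = finger_distance(start_a, start_b)
    if start == 0:
        raise ValueError("fingers must start at different points")
    return finger_distance(end_a, end_b) / start

# Fingers start 100px apart, then spread to 200px or squeeze to 20px.
spread = pinch_scale((100, 100), (200, 100), (50, 100), (250, 100))
squeeze = pinch_scale((100, 100), (200, 100), (140, 100), (160, 100))
```

Whichever direction a platform picks, it comes down to whether it applies this ratio or its inverse, which is why the two conventions are so easy to confuse.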

 

My personal favourite:

 

“Press-and-hold – Hold your finger on screen for a moment and release after the animation to get a right-click. This works everywhere. Or, press-and-tap with a second finger – to get right-click, just like you would click the right button on a mouse or trackpad. This works everywhere.”

 

1) You took my Hold Gesture (which is supposed to be a left-click) and turned it into a different mouse button after an arbitrary period of time. The only way I saw to get around this is to move your finger before the animation shows up, and that’s introducing a Drag gesture instead.

2) This is not intuitive. I found the first way of doing this by accident, and I didn’t even know about the 2nd way until I read the article. And don’t give me this whole “Macs were doing it for ages” excuse either – if it’s supposed to be a touch gesture, don’t map it to a right-click. Call it a context menu event or something similar.
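
Written out as logic, the behaviour in complaint 1 looks something like the following sketch. The thresholds and the function name are invented; the quoted description only says “a moment”.

```python
# Hypothetical thresholds, chosen only for illustration.
HOLD_SECONDS = 1.0    # how long before a hold turns into a right-click
MOVE_TOLERANCE = 10   # pixels of wobble allowed before it counts as a drag

def interpret_touch(duration, distance_moved):
    """Map a single-finger touch to the mouse-style event it becomes."""
    if distance_moved > MOVE_TOLERANCE:
        return "drag"
    if duration >= HOLD_SECONDS:
        return "right-click"
    return "left-click"

tap = interpret_touch(duration=0.1, distance_moved=2)
long_press = interpret_touch(duration=1.5, distance_moved=2)
escape_attempt = interpret_touch(duration=1.5, distance_moved=40)
```

There is no branch where a long, stationary press stays a left-click. The only way out of the right-click is to move, and that lands you in a drag.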

 

I’m going to follow up on this particular topic as I’ve been throwing around some ideas with my boss (well, this was a while ago, I need to look back at what we actually discussed) about making gestures more discoverable. Surface is a great example of this – how do you guide the user through the functionality available without interfering with the user experience or requiring an instruction book to get started?

 

Got any solutions, angry guy?

As I said before, stay tuned. I want to delve a bit deeper into the current touch situation for Windows developers (I haven’t had a chance to try WPF 4.0 yet, I want to delve deeper into the Windows 7 Touch stuff once it RTMs, I want one of these to play around with) and I want to go into some greater detail about building applications that don’t rely on a mouse – and there are some tricky things, like text selection and menu navigation, where you’d think “No mouse? You must be mad!”.
