You know Dasher and Dancer
And Prancer and Vixen,
Comet and Cupid
And Donner and Blitzen.
Not that Dasher. Another one. A very cool text-entry system.
One of the limitations of following vaguely technology-related disciplines is that few people learn how to touch-type. We muddle through, with mediocre typing speeds. I grew up with computers, but only recently have I needed to type fast on a keyboard. I tried to teach myself to touch-type, but unfortunately, it did not work out. Nowadays I type using seven or eight fingers, while all the time looking at the keyboard to find the correct key.
The good news is that I can get up to pretty high speeds. The downside is that my fingers are following awkward movements.
The very bad news is that a few months ago, after a particularly intense period when I was using my awkward typing technique for 10 hours a day or more, I started to develop painful wrists and forearms, with symptoms which are collectively and rather broadly known as repetitive strain injury, or RSI.
Since my sole income now depends upon my ability to enter text into a computer, I have become rather concerned about this, and have sought ways to overcome the potential problems of long-term disability through RSI.
The treatment for RSI, as many of you will know, is rest. Weeks of rest.
My employer is good, but not that good. They would find it hard to allow me to continue in my current role if I were unable to convert my ideas and opinions into text stored as bits and bytes on a computer.
This has focussed my mind rather sharply on alternatives to the standard-issue QWERTY keyboard.
In strategic terms, my timing is good. If you listen to the tech broadcasts, the blogs and the grapevines, then you will have realised that the human/computer interface is going through something of a revolution.
The iPhone and its non-phoning derivative, the iPod Touch, have sparked a great deal of interest in alternative text-entry systems.
However, none of these is yet being marketed to people like me who have to enter a few thousand words each day into a computer.
My first attempt was with speech recognition.
My employers provide me with a Windows-based laptop that I can carry around and hook up to my home network. I found a cheap copy of IBM's ViaVoice v10 and installed it.
About five years ago, I installed ViaVoice v2 on an older PC, but it could not recognise my voice.
I thought that the increase in processing power and another five years of development time might have improved matters. Well, yes, things have improved, but a few hours of training the system and speaking in a monotone have convinced me that speech recognition is still unrealistic to use for that kind of volume of text.
Under perfect conditions, the machine would get nine words out of ten right, but under less than ideal conditions, it was getting every other word wrong. And often so badly wrong that it took more time and more keystrokes to correct its lacklustre attempts than it would have taken to type the whole thing in anyway.
No doubt some person out there will tell me that their favourite package -- Dragon NaturallySpeaking or some such -- is better than ViaVoice. And I have no reason to doubt them.
I've been around technology long enough to know that a piece of consumer software from one leading supplier is not going to be vastly better than the equivalent software from another leading supplier. Market leaders watch the well-known opposition all the time and if the opposition brings out something good, then smart companies tend to follow pretty quickly.
So I'm willing to believe that there is a better package than ViaVoice out there. I'm not willing to believe that anything can be so much better than ViaVoice that it will perform up to my expectations. More to the point, I don't believe any voice recognition software will allow me to enter 2000 words of text with fewer mistakes than I make using a QWERTY keyboard.
And even more to the point, at least when I make a typo in QWERTY, the reader can make a guess as to what I tried to type.
With voice recognition, that is all but impossible.
It's a shame, as I would dearly love to have a system which will accept an MP3 file of a conversation and convert it automatically into text. It looks like that particular dream will have to wait a few years.
So I looked for alternatives.
The most interesting was this rather odd Dasher program.
You use Dasher by making tiny, ultra-fine movements with the mouse. These movements allow the writer to "steer" through an on-rushing tide of letters, punctuation marks and other symbols, to produce accurately-spelled, formatted and properly grammatical text. And not just using the Roman alphabet. It works just as well in Cyrillic and other scripts.
You can control the speed of the onward rush by moving the mouse in one direction (left/right), and select individual letters or strings of letters by moving in the other (up/down).
There's no clicking, or selecting to be done. Just steer through the onrushing letters in a single, fluid process. A process which results in finished text.
There are, as you might expect, a number of clever refinements. First, the internal software uses a predictive text model. That is to say, it predicts which character you might want to select next. For example, if you have already steered through "t-e-x-" then the next letter is very likely to be "t".
In use, Dasher presents each option in its own square box. In the example, the "t" box is huge, while other letters have much smaller boxes. On the other hand, if I steer through "T-e-x-", then the most likely next letter is "a", and the word it expects to see is either Texas or Texan. In that case, the "t" box appears slightly smaller than the "a" box. Other boxes are much smaller. The idea being, obviously, that larger boxes are easier to steer toward.
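To make the box-sizing idea concrete, here is a rough Python sketch. It is not Dasher's actual model: it simply counts which letters follow a prefix in a tiny made-up corpus and shares out a hypothetical 400-pixel screen height in proportion. The corpus, the function names and the smoothing value are all my own assumptions for illustration.

```python
from collections import Counter

# Tiny illustrative corpus (an assumption; a real system trains on far more text).
CORPUS = "text texas texan texture next context textbook"

def next_char_counts(corpus, prefix):
    """Count which characters follow each occurrence of `prefix` in the corpus."""
    counts = Counter()
    start = corpus.find(prefix)
    while start != -1:
        nxt = start + len(prefix)
        if nxt < len(corpus):
            counts[corpus[nxt]] += 1
        start = corpus.find(prefix, start + 1)
    return counts

def box_heights(counts, alphabet, screen_height=400, smoothing=0.1):
    """Share out the screen height in proportion to each character's estimated probability.

    The small smoothing term keeps unseen characters reachable with a thin box.
    """
    total = sum(counts.values()) + smoothing * len(alphabet)
    return {ch: screen_height * (counts[ch] + smoothing) / total for ch in alphabet}

alphabet = sorted(set(CORPUS))
heights = box_heights(next_char_counts(CORPUS, "tex"), alphabet)

# Show the three biggest boxes after steering through "t-e-x-".
for ch, h in sorted(heights.items(), key=lambda kv: -kv[1])[:3]:
    print(repr(ch), round(h, 1))
```

With this corpus the "t" box comes out largest and the "a" box second, while every other character still gets a thin sliver so that it remains selectable.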
Even better, the predictive text model can be trained. If, like many people, you have a large amount of text that you have previously written, then you agglomerate as much as you want into a single text file and tell the software to import it as a training file. Once that is done, the programme bases its predictions on the writing it has been trained upon -- your own.
This training text can be used to set the language. Although English and French use broadly similar alphabets, French-language predictions would be useless in English and vice-versa. So the selection of training text is important for correct functioning of the software.
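A minimal sketch of why the training text matters, assuming a very simple model that predicts the next character from the previous two. The sample sentences and function names are invented for illustration, and Dasher's own language model is more sophisticated than this.

```python
from collections import Counter, defaultdict

def train(text, order=2):
    """Map each `order`-character context to counts of the character that follows it."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        model[text[i:i + order]][text[i + order]] += 1
    return model

def most_likely_next(model, context):
    """Return the most frequent next character for the context, or None if unseen."""
    counts = model.get(context)
    return counts.most_common(1)[0][0] if counts else None

# Invented training texts, one per language.
english = train("the weather there is rather wet and the winter then gets wetter")
french = train("le chat cherche le chien et le chien cherche le chat chez lui")

print(most_likely_next(english, "th"))  # the English model has seen "th" many times
print(most_likely_next(french, "th"))   # the French model has never seen it
print(most_likely_next(french, "ch"))
print(most_likely_next(english, "ch"))
```

The model trained on the French sentence has nothing useful to say about "th", and the English one knows nothing about "ch", which is the whole point of choosing training text in the language you write in.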
Contrary to the writeup above, the system does allow capitals and punctuation marks. You have to use the software preferences to select the relevant alphabet. The default option is a limited character set in English. But you can select the option with lots of other symbols. Once that is done, the display becomes more colourful, as the different boxes are colour-coded. Again, the alphabet is matched to the language. The Polish alphabet has characters such as Ł and Ź, while the German offers ü and ß.
In English, a bright green box contains all the punctuation marks. A yellow box contains all the capitals. A white box contains the space character and so on. You can tailor these colours to suit your own preferences. You can also select how fast the letters come at you. Although there are not many settings, if you want to adjust one of them, you probably can.
The software can be run on Windows PCs, Linux, Mac and other operating systems. It is a free download from http://www.inference.phy.cam.ac.uk/dasher/
Notes on using Dasher
First, although I have only had a few hours on the system, I can use it to write at around 25 words a minute. I get the impression that is close to the likely maximum I can achieve. That is half the speed I can go at using my QWERTY keyboard. However, there are substantial benefits. First, my tendons do not ache nearly so much after using Dasher. Second, the accuracy of Dasher is much higher. It encourages operators to spell correctly.
The developers describe Dasher as a tool to navigate through all possible books. While that is a bit over-optimistic, the term navigation is better than the term text-entry. Even better, I think, is that you use the software to steer through those onrushing letters and select the right choices as they speed toward you. Very often, the software will present a whole word -- the one you wanted to use -- after steering through just a few boxes. Then, you can accelerate through that word and on to the next. If you need to slow down, then it is easy and intuitive to do just that.
Second, the system is quite simply fun. It is such a radical departure from QWERTY text entry that you realise how dreadful the QWERTY keyboard is.
Dasher was developed as an information-efficient text-entry system. According to the developer, David MacKay, speaking at a presentation, a pointing finger can generate information at a rate of 14 bits per second, which equates to about 14 characters per second or 170 words per minute. That's the theory.
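The arithmetic behind those figures can be checked in a few lines. Two assumptions here are mine rather than stated in the talk as reported: that well-predicted English carries about one bit per character (which is what equating 14 bits with 14 characters implies), and that a "word" is five characters, the usual typing-speed convention.

```python
BITS_PER_SECOND = 14       # finger information rate quoted from the presentation
BITS_PER_CHARACTER = 1.0   # assumption: implied by 14 bits/s equalling 14 characters/s
CHARS_PER_WORD = 5         # assumption: common typing-speed convention

chars_per_second = BITS_PER_SECOND / BITS_PER_CHARACTER
words_per_minute = chars_per_second * 60 / CHARS_PER_WORD

print(chars_per_second, "characters per second")
print(words_per_minute, "words per minute")
```

Under those assumptions the result is 168 words per minute, which rounds to the "about 170" quoted.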
MacKay said a fingertip is capable of very high-resolution motion. But on a keyboard, the keys are either on or off, on a scale of a few millimetres. This, he said, is a waste of information content. Furthermore, most languages are inherently predictable. Once the first few characters of a word or sentence have been entered, the available choice of subsequent characters becomes increasingly limited as more characters are entered. Mobile phones can use this characteristic, but QWERTY keyboards do not.
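The predictability point is easy to see with a toy word list. This sketch, with a vocabulary I made up for the purpose, counts how many words are still possible as each letter of "texas" is entered.

```python
# Invented mini-vocabulary, purely for illustration.
WORDS = ["text", "texas", "texan", "tea", "ten", "table",
         "the", "then", "there", "dog", "door"]

def candidates(words, prefix):
    """Return the words that are still possible after typing `prefix`."""
    return [w for w in words if w.startswith(prefix)]

target = "texas"
remaining = [len(candidates(WORDS, target[:n])) for n in range(len(target) + 1)]

for n, count in enumerate(remaining):
    print(repr(target[:n]), count)
```

Each extra letter shrinks the list of possible words, so a system that models the language has fewer and fewer options to offer as you go.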
Put these factors together with some intelligent design and further refinement, and you come up with Dasher.
I started this piece by suggesting that voice-recognition software is not consistent enough for text-entry. Interestingly, Dasher can help there, too.
One of the projects the team at the University of Cambridge are working on is a system which links voice recognition with a Dasher-style interface.
Instead of the voice-recognition software making its best guess about a word it has tried to interpret and displaying that as the only option on screen, the team are working on a system where the voice-recognition software offers the user three or four alternative words, with boxes adjusted for probability in a Dasher-style interface.
In a simple example, where the user spoke the words 'the wood was theirs', the software would present the user with options for each word. Instead of the separate letters used in the main Dasher programme, the system would use complete words, allowing the user to steer through the jungle of possible words and delivering accurate, unambiguous finished text.
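Here is a rough sketch of how that word-level steering might look in code. Everything in it is hypothetical: the candidate words, the confidence scores and the function names are invented, and it says nothing about how the Cambridge team actually build their system.

```python
# Hypothetical recogniser output: for each spoken word, a few candidate words
# with confidence scores. All values are invented for illustration.
CANDIDATES = [
    {"the": 0.9, "a": 0.1},
    {"wood": 0.5, "would": 0.4, "word": 0.1},
    {"was": 0.8, "does": 0.2},
    {"theirs": 0.6, "there's": 0.3, "stairs": 0.1},
]

def box_shares(scores):
    """Normalise scores so each candidate's box is a fraction of the screen."""
    total = sum(scores.values())
    return {word: score / total for word, score in scores.items()}

def steer(candidates, choices):
    """Build the sentence from the word the user steers into at each step."""
    sentence = []
    for scores, choice in zip(candidates, choices):
        if choice not in box_shares(scores):
            raise ValueError(f"{choice!r} was not offered at this step")
        sentence.append(choice)
    return " ".join(sentence)

# The user steers toward the intended word each time, even when it is not
# the overwhelming favourite ("wood" versus "would").
result = steer(CANDIDATES, ["the", "wood", "was", "theirs"])
print(result)
```

The recogniser only has to get the right word somewhere among its top few guesses; the user resolves the ambiguity by steering.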
I have watched the progress of voice recognition over a 5-year time scale. It has improved, but it is still unacceptable. I have used Dasher for a few weeks and am still thrilled with it. Put those two together, and you have a real possibility for a workable voice recognition solution.
And just think about Dasher on a PS3 or Wii, where you can use the wand to build up text, rather than using a clumsy on-screen keyboard. Or think about it on a mobile phone instead of current systems.
Download the software, play with it and I defy you to avoid the same sort of infectious enthusiasm I have developed for the idea.
More information at the website: http://www.inference.phy.cam.ac.uk/dasher/