Voice-recognition system? Listen up!
By Carol L. Schlein
QUESTION: Do you have any thoughts on voice-recognition
technology? What are the options and is the “testing period”
over? Most important, do voice-recognition systems actually
save time, considering the dictation, editing and formatting
required? I think there are many attorneys interested in this
issue.
Christopher D. Byers
ANSWER: Anytime I get a question about a category of
hardware or software, my first instinct is to ask a little
more about the reason for the question. I am always concerned
when someone focuses on the type of software rather than its
intended purpose. If you were to ask about purchasing a
mega-sports utility vehicle, the salesperson would ask
questions like where you plan to use it, why this extra-large
SUV instead of a standard-sized SUV, van, truck or station
wagon. The salesperson already understands you need some
transportation to move from place to place, but hasn’t yet
learned what other needs you have beyond that. Do you need to
transport large items on a regular basis? Do you drive in
places where four-wheel drive is necessary? Well, you get the
idea. The same pattern is true with computer technology.
From your question, I gather you’re interested in shortening
the time it takes to prepare documents. There are many
different technological options, including voice recognition,
that can help accomplish this goal. The most important thing
going into this process (or any other technology purchase, for
that matter) is to be able to articulate your goals,
determine if they are reasonable and keep focused on those
goals during implementation to ensure success.
Many of my comments and observations are not based on personal
experience; they are taken from a session at the
American Bar Association’s Techshow in March. Tom O’Connor, a
consultant with Courtlink in Bellevue, Wash., and Steven
Jones, an attorney from Little Rock, Ark., were the speakers.
Their presentation was full of practical tips about using this
technology.
Speech recognition has been around for many years. Initially,
it was developed by companies like Kurzweil for
vision-impaired and disabled people. The early efforts were
hindered by substantial processor and memory requirements and
the limitations of the recognition algorithms. Algorithms are
the rules used by voice technology to understand and translate
spoken words into digital text or commands that can be used on
a computer. Until about two years ago, the technology was
limited to “discrete text” products. This meant that in order
for the computer to recognize your words, you _ had _ to _ say
_ them _ very _ slowly. This often meant multi-syllabic words
were not interpreted correctly. Instead, they would appear as
two separate words.
The three companies vying for market share are Dragon Systems,
with its Dragon Naturally Speaking products; IBM, with its line of
ViaVoice programs; and Lernout & Hauspie, with Voice Xpress. If the
name Lernout & Hauspie is not familiar, that’s because it is a
Belgium-based company that bought Kurzweil two years ago.
Looming large on the horizon is Microsoft, which has a
project code-named “Whisper” that is intended to integrate
voice-recognition technology into its entire line of products
through the operating system. As best I can tell, this may
debut with the next version of Windows NT, to be called
Windows 2000.
The Pentium III chip from Intel Corp., which makes the
processor chips for the majority of office-grade computers, is
optimized for voice recognition. Another trend to be aware of
is the blurring of lines between voice recognition and other
applications, such as telephone, voice mail, e-mail and cell
phones. (As an aside, watch for
products later this year that combine the Palm Pilot, a
hand-held calendar and Rolodex product, with your cell phone.
Talk about combining useful applications!)
IBM has a product called Home Page Reader, marketed to disabled
people, which uses speech synthesis to read Web pages aloud in
conjunction with your Internet browser. This should make its way
into general use in the next few years as well.
The other major trend is to have speech-recognition capability
available in a wide range of products. In addition to word
processors, you can expect to see it connecting to case
management programs and other law office tools over the next
two or three years.
Improvements
The current crop of products has benefited from improved
algorithms as well as cheap RAM. These allow continuous speech
dictation, which means you can dictate in a more natural
manner. They require much less training than earlier versions;
the initial training takes about a half-hour. The recognition
will improve as you work with it because it is constantly
learning and watching you work.
That doesn’t mean, however, that there aren’t problems with
the technology. These programs rely on user voice files. A bad
cold could make you unrecognizable to your computer. A change
of environment, such as moving among home, office and hotel, may
require significant retraining. One tip Jones shared at Techshow
was to consider creating separate user files for different
locations so you don’t mess up your office setup, which is the
one most frequently used.
During the ABA session, Jones attempted to demonstrate Dragon
Naturally Speaking with WordPerfect. Among his complaints
about this technology is that it never “goes to
sleep” (the command that turns the microphone off) when you
want it to, such as when the senior partner of the firm comes
in to talk about a crisis involving the firm’s largest client.
He also noted these programs create a pest even worse than
typos, which he referred to as “word-o’s.” Voice-recognition
programs will never misspell a word, so their errors are words
that are correctly spelled but not even close to what you
actually said. That is why they are so troubling: They can
elude your word processor’s spell-checker.
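To see why a word-o slips past a spell-checker, here is a toy sketch in Python. It assumes the spell-checker does nothing more than look each word up in a word list. The tiny word list, the sample sentences and the function name are made up for illustration and are not taken from any of the products discussed here.

```python
# Toy illustration: why a dictionary-based spell-checker misses "word-o's".
# KNOWN_WORDS is a tiny stand-in for a real dictionary (an assumption).
KNOWN_WORDS = {
    "the", "witness", "was", "a", "very", "influential", "person",
    "in", "flu", "season",
}


def misspelled(text):
    """Return the words in text that are not found in KNOWN_WORDS."""
    return [word for word in text.lower().split() if word not in KNOWN_WORDS]


intended = "the witness was a very influential person"
typo = "the witness was a very influencial person"           # keyboard typo
word_o = "the witness was a very in the flu season person"   # recognition error

print(misspelled(typo))    # the typo is flagged
print(misspelled(word_o))  # nothing is flagged, yet the sentence is wrong
```

Every word in the word-o sentence is a real word, so a check of this kind has nothing to flag. Only a careful proofread catches it.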
The current algorithms used in these programs are
context-sensitive so they are better than their predecessors
in terms of getting it right the first time. However, Jones
made an interesting observation. He said that your happiness
and success with voice recognition will depend on your style
of dictation. For single-draft writers like me, speech
recognition can be a hindrance because you’re unable to ignore
your mistakes. For those who like to get it down in rough form
and clean it up later, this can be a good tool. However, he
warned, this type of dictation is much more prone to word-o’s
that are hard to edit since they are correct words but not the
ones you intended. For instance, if you meant to say
“influential,” it might be interpreted as “in the flu season.”
One tip from an audience member was to create and save a sound
file of your actual dictation along with the document so you
can listen to your original thoughts again if you have trouble
understanding what has appeared on your screen.
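For readers curious about what “context-sensitive” might mean in practice, the following Python sketch shows one simplified idea: score each candidate phrase by how commonly its neighboring word pairs occur, then keep the best candidate. This is an assumption made for illustration, not a description of how Dragon, IBM or Lernout & Hauspie actually do it. The word-pair counts are invented sample data.

```python
# Toy sketch of context-sensitive selection between sound-alike candidates.
# PAIR_COUNTS is invented sample data, not taken from any real product.
PAIR_COUNTS = {
    ("an", "influential"): 8,
    ("influential", "client"): 5,
    ("in", "the"): 9,
    ("the", "flu"): 3,
    ("flu", "season"): 4,
}


def context_score(words):
    """Score a phrase by its least common adjacent word pair (weakest link)."""
    pairs = zip(words, words[1:])
    return min((PAIR_COUNTS.get(pair, 0) for pair in pairs), default=0)


def pick(candidates):
    """Return the candidate phrase with the highest context score."""
    return max(candidates, key=lambda phrase: context_score(phrase.split()))


candidates = ["an influential client", "an in the flu season client"]
print(pick(candidates))
```

The second candidate contains word pairs such as “an in” that never appear in the sample counts, so it scores zero and loses. Real products are far more sophisticated, but the sketch shows why surrounding words help the software guess right the first time.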
General tips
Jones and O’Connor offered several tips during their Techshow
session that apply regardless of which product you select. They suggested
you use the voice editor frequently to incorporate your own
often-used words, such as client names, into your user file.
They noted you can dictate directly into Corel’s WordPerfect 8
but not into previous versions of WordPerfect. In earlier
versions, you can use the commands from WordPerfect such as
Open File. He noted this technology is more effective for
longer documents, but can be used for short documents only
when the voice-recognition software already is loaded. The
implication of his comment was that these programs take a long
time to start, even on faster computers.
If you decide to use speech-recognition software, it is
generally recommended that you purchase a very fast computer
with a large hard drive and a lot of RAM: something in
the neighborhood of a 366- to 400-MHz processor with at least
128 MB of RAM. If you’re using it on an NT network, you will
want even more RAM. The good news is that RAM has become quite
inexpensive in recent years. You can anticipate more problems
with a laptop since the microphone is closer to the computer,
which makes some noise of its own that can cause
interference. Jones suggested getting a high-quality, noise-canceling
headset and microphone to improve your chances of getting the
right words into your documents.
Choices, choices
Each major player in the speech-recognition market has more
varieties of its products than Howard Johnson’s ice cream has
flavors (or so it seems). Depending on which word processor
you want to use with it, your options may be more limited and
your choice clearer. Dragon Naturally Speaking from Dragon
Systems (www.dragonsys.com;
(617) 332-9575) comes in a standard edition ($109); a
preferred edition ($229), which includes the Dragon Naturally
Mobile option for transcribing text from a hand-held unit; the
professional edition, which adds more macros and features
suited to longer documents ($695); and the legal suite, which
adds a 230,000-word legal dictionary and includes all the
features of the professional edition ($895). Dragon Systems
also offers a legal suite with the mobile option and recorder
for $1,195.
Corel bundles the personal edition with its WordPerfect 8
Legal Suite ($499; an upgrade from other versions of
WordPerfect costs $229). You may want to upgrade from the
version in the legal suite if you want the mobile capability.
One of the differences between the versions is the quality of
the headset and the ability to make your own scripts or macros
for specific tasks. From comments I’ve heard, Dragon products
work best with WordPerfect.
IBM (www.software.ibm.com/speech;
800-825-5263) offers ViaVoice in several versions as well.
There’s a home edition of ViaVoice 98 for $49.95 and an office
edition for $89. These allow you to dictate directly into
Microsoft Word or their own SpeakPad. An executive edition
allows direct dictation into many Windows applications in
addition to Microsoft Word. This edition can be used in a
network environment with other people sharing the same PC for
voice recognition. IBM also offers an add-on legal dictionary
for $149.
Lernout & Hauspie (www.lhs.com;
(617) 238-0986), which bought Kurzweil, offers standard
($49.99), advanced ($79.99) and professional ($149.95) versions
of Voice Xpress along with Voice Xpress for Legal ($249). Its
products are optimized to work best with Microsoft Word.
Microsoft Corp. owns approximately 8 percent of the company.
Lernout & Hauspie also recently signed an agreement with
Novell to further develop its products to work in the Novell
network environment.
For an excellent overview of how another lawyer uses voice
recognition, take a look at Jim Eidelman’s article, “Talk to
Your Computer: You Can Practice with Speech Recognition
Software,” in the November/December 1998 issue of Law Practice
Management Magazine, published by the ABA’s Law Practice
Management Section. One of his suggestions is to purchase a
headset that can serve as both the microphone for the
speech-recognition program and for your telephone so that with
the press of a button, you can switch from dictating to
answering calls. This might solve the problem of needing to
turn off the microphone if someone walks into your office.
The other accessory he discusses is the portable dictating
machine, which now can work as a remote entry device for
speech-recognition programs. Depending on which manufacturer’s
speech-recognition software you choose, you can select the
portable transcriber best able to work with that brand.
Hardware alternatives
If you’re looking at voice recognition and don’t have the
hardware, the money for new hardware or the time or patience
to train the software, there are some other interesting
alternatives. Several companies, such as CyberSecretaries (www.cybersecretaries.com)
and SpeechMachines (www.speechmachines.com), which offers CyberTranscriber,
operate Internet-based service bureaus that allow you to call
their service or use a dictating machine to dictate a letter.
They then use voice-recognition software along with
secretaries who proofread the results before returning the
file by e-mail to the word processor of your choice. On a
per-document basis, these services are expensive. However, if
you weigh the convenience when you need it against the
hardware, software and time investment of doing it yourself,
it may be worthwhile.
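One rough way to do that weighing is a break-even calculation: How many documents would you have to send to a service bureau before the fees equal what you would spend setting up your own system? The Python sketch below is only a starting point. The $895 figure is the Dragon legal suite price quoted earlier. Every other number is a hypothetical placeholder you should replace with your own, and the function name is mine.

```python
import math


def break_even_documents(setup_cost, fee_per_document):
    """Number of documents at which per-document fees reach the setup cost."""
    if fee_per_document <= 0:
        raise ValueError("fee_per_document must be positive")
    return math.ceil(setup_cost / fee_per_document)


software = 895        # legal suite price quoted earlier in this column
hardware = 2000       # hypothetical cost of a faster computer
training_hours = 10   # hypothetical time spent training the software
hourly_value = 150    # hypothetical value of an hour of your time
fee = 15              # hypothetical service-bureau fee per document

setup = software + hardware + training_hours * hourly_value
print(break_even_documents(setup, fee))
```

If you expect to dictate fewer documents than the result, the service bureau may be the cheaper route. If you expect to dictate many more, doing it yourself starts to look better, provided you have the patience for the training.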
Another alternative to keyboarding is the CrossPad by IBM (www.ibm.com/businesscenter/legal).
This looks like a letter-sized pad but with a notable
difference — when you write on it, it can save that text as a
file for your word processor. I haven’t actually seen or used
one of these, but they seem intriguing for lawyers still
huddling around their legal pads for initial document drafts.
I also could see where they could be useful for note-taking at
client meetings since they are much less disruptive than
pulling out a laptop.
Most attorneys who use voice-recognition software regularly
seem to use it primarily for getting the basic content into
their word processor rather than formatting. While the technology
is a lot better than it was, it still requires a time investment. I
wonder whether attorneys who haven’t invested the time to
learn basic keyboard skills by now will have the patience to
invest the time necessary to train and maintain a
voice-recognition software program.
Carol L. Schlein is president of Law Office Systems, a Montclair-based
training and consulting firm specializing in law firms. She
formerly chaired the Computer and Technology Division of the
ABA Law Practice Management Section. A lecturer for ICLE, she
can be reached at (973) 746-6454 or
carol@losinc.com.
Questions for Carol Schlein on law office technology may be
faxed to New Jersey Lawyer at (732) 750-0010 or mailed to “Law
Technology Questions,” New Jersey Lawyer, Koll Corporate
Plaza, 485B Route 1, Suite 100, Iselin, N.J. 08830.