Talking to the Machine

Finally, Software That Gets My Dichotomy — Or Is That Dike Hot to Me?

When I was 18, I decided to write a novel by dictating it into a tape recorder. The idea, as I recall, was to avoid the quagmire of writing a first draft, and jump directly to the revision stage. What I found, though, was that in seeking to make my writing process easier, I had made it far more difficult, for the taped draft, once transcribed, was full of endless sentences, rambling digressions, and other conversational misdirections that rendered it literally impossible to read. The trouble with dictation, I came to understand, had to do with the dichotomy between spoken and written speech, the way that, by its nature, talk is loose and formless, while writing cannot help but be more controlled. Even the most natural writer has two distinct voices, one for conversation and one for the page. And in trying to merge mine, I realized that the key to writing was writing, and that there could be no alternatives to putting in the necessary work.

Over the years, I've done my best to stick to that maxim, never again adopting dictation as a writing tool. Yet today, I'm leaving history behind to take another stab at translating talk into prose. My motivation here is simple: to test-drive Dragon NaturallySpeaking Preferred voice recognition software, a computer program that theoretically "recognizes" its user's speech and re-creates it in the form of written words. Experience to the contrary, such a notion continues to fascinate me, if for no other reason than my own laziness, my desire to find an easier way to work. After all, should it live up to even a fraction of its promise, voice recognition might represent a kind of electronic missing link between language and its electronic simulacra, enabling us to eclipse at least some of the distance between writing and speech. Sure, I tell myself, this is dictation, but unlike the static receptivity of the tape recorder, it's dictation of an interactive variety, in which you can see your sentences take shape as you frame them, much as you would with traditional text.

Of course, when it comes to voice recognition, interactivity takes place on a number of levels all at once. For me, the first involves actually having to leave my house and head to the office, where the NaturallySpeaking software has been installed on one of the paper's machines. It takes a lot of memory, after all, to run a voice recognition program, a lot more memory than I have at home. But if, on the one hand, I'm looking forward to seeing how NaturallySpeaking works on a computer that can handle it, I'm also more than a little wary about using it in what is, essentially, a public setting, with people working all around me as I sit in an office and talk to myself.

illustration: George Bates

That sense only increases when I settle in before the computer and prepare to customize, or "train," the software to identify the inflections of my words. First, I adjust the headset until the microphone is the proper distance from my mouth; then, I talk my way through a couple of preliminary steps, making sure, as NaturallySpeaking reminds me, to enunciate every syllable, as though I were talking to a recalcitrant child. After a minute or two, I start to feel more comfortable, but still, I can't keep from speaking quietly, at times so quietly the software can't "hear" me.

NaturallySpeaking asks me to spend 30 minutes reading into the microphone, offering a choice of pre-programmed texts, including Alice's Adventures in Wonderland and Charlie and the Chocolate Factory, which I recently read to my four-year-old son. There's something delightfully whimsical about discovering works like these in the middle of a software training process, and, as I begin to read from Charlie, I imagine a room full of computer programmers laughing somewhere, as if they and I are sharing a joke. Yet before too long, I feel the knife's edge of self-consciousness again, as if, in doing this, I've slipped the bounds of logic and fallen headfirst down the rabbit hole. In that regard, it might have been more appropriate to select Alice, more reflective of the effort to "communicate" with a computer— an idea that is, essentially, absurd.

How absurd begins to be apparent once I finish reading and start, as it were, to "write." I speak a sentence and watch as, seconds later, it emerges on my screen. Although it's a bit disconcerting to dictate punctuation marks— "When I was 18 comma," I say, "I decided to write a novel by dictating it into a tape machine period"— I am attracted by the computer's fluidity, its ability (or so it seems) to understand. That's the lure of voice recognition software, and, using it, I feel as if I'm living in science fiction, like I've become a character in a film. The first time I ever saw voice recognition was in the movie Being There, when the character played by Melvyn Douglas has a heart attack while dictating, and we watch his computer record his struggling gasps. And sitting here, with paragraphs flowing like water across my monitor, it's hard not to be tempted into believing that writing might one day be this simple, this unfiltered and direct.
