When speech becomes text, what happens to writing?


(I successfully put down the baby for her late morning nap half an hour ago. After running quietly around in sock feet trying to do things while she was out cold, I sat down to answer email and messages. As I entered this post into WordPress, she awoke again.)

It’s not easy to respond quickly and at volume using one hand or thumb, though I’ve gotten much better at both over the past five months with a baby daughter.

Over that time, I’ve been struck by how good the voice recognition in iOS on my iPhone has become. I’ve been able to successfully dictate a rough draft of a long article into the email interface and respond to any number of inbound inquiries that way.

That said, neither the soft keyboard nor voice-to-text on the device is yet a substitute for the full-size keyboard on my 15″ MacBook Pro when I want to write at length.

It’s mostly a matter of numbers: I can still type away at more than eighty words per minute on the full-size keyboard, far faster than I can produce accurate text through any method on my smartphone.

Capturing and sharing anything other than text on the powerful device, however, has become trivially easy, from images to video to audio recordings.

The process of “writing” has long since escaped the boundaries of tabulas, slate and papyrus, moving from pens and paper to explode onto typewriters, personal computers and tablets.

Today, I’m thinking about how modern bards will be able to reclaim the oldest form of storytelling, the spoken word, and apply it in a new context.

As we enter the next decade of rapidly improving gestural and tactile interfaces for connected mobile devices, I wonder how long it will be until the generations that preceded me can leave decades of experience with keyboards behind and simply speak naturally to connected devices to share what they are thinking or seeing with family, friends and coworkers.

Economist Paul Krugman seemed to be thinking about something similar this morning, in a blog post on “techno-optimism”, when he commented on the differences between economic and technological stagnation:

…I know it doesn’t show in the productivity numbers yet, but anyone who tracks technology has a strong sense that something big has been happening the past few years, that seemingly intractable problems — like speech recognition, adequate translation, self-driving cars, etc. — are suddenly becoming tractable. Basically, smart machines are getting much better at interacting with the natural environment in all its complexity. And that suggests that ~~Skynet will soon kill us all~~ a real transformative leap is somewhere over the horizon, maybe not this decade, but this generation.

Still, what do I know? But Brynjolfsson and McAfee have a new book — not yet out, but I have a manuscript — making this point with many examples and a lot of analysis.

There remain big questions about how the benefits of this technological surge, if it’s coming, will be distributed. But I think this kind of thing has to be taken into account when we try to imagine the future; I’m a great Gordon admirer, but his techniques necessarily involve extrapolating from the past, and aren’t well suited to picking up what could be a major inflection point.

That future feels much closer this morning.

[Image Credit: Navneet Alang, “Sci-Fi Fantasies, Real-Life Disappointments”]


Filed under blogging, journalism, research, scifi, technology

8 responses to “When speech becomes text, what happens to writing?”

  1. As someone who writes a lot (keeping a daily journal and cranking out the occasional short story), I’d never thought about voice-to-text as an option for writing. I do like the Galaxy Note 3’s handwriting recognition. Coupled with Evernote, it fulfills all my needs. I’ve tried voice-to-text recently, and while it’s scarily accurate, it doesn’t feel as fulfilling as good old-fashioned pen and paper. The Galaxy Note 3 comes close to that experience for me, yet captures my words digitally. I think the technology is awesome! Whatever your preference, it’s cool to have so many options to capture ideas.

  2. After watching physicians do speech-based transcription (though that has improved over the years), I worry that many emails, articles and written documents will be rife with the ‘ummm’.

  3. Reblogged this on Digital Me 2.0 and commented:
    Change can be scary, but we wouldn’t be where we are today without it!

  4. I’ve started using Dragon NaturallySpeaking. Not all the time, but it’s another option. I tend to edit when I type, and that can be counterproductive, so speech-to-text can be a good way to write a rough draft or even half-baked ideas. In other words, I think voice-to-text is great for more easily cranking out jumbled, badly-written thoughts. Which is great as long as you remember that it’s only a rough draft. Anything that moves you along and gets you unstuck is good. But technology doesn’t eliminate the need for slow, careful revision.

  5. A very thought-provoking question. Physicians and researchers have posed this question before, since so many rely on voice recorders while performing research or other tasks.

  6. I’m a newish Dad and can get frustrated that I can’t be as productive in my down time. My blog has suffered so I’ve turned to recording podcasts at night. Maybe I’ll give voice recog a go once they work out how to filter out baby noise.

  7. They have been improving voice recognition for many years, but they have not updated it for Indian accents.
    I am using it on Windows, but it is the worst. I cannot add new commands for my own tasks.
