I Watched My Voice Take Form on the Screen

One of my nighttime habits is to record myself speaking at the very end of the day before I go to sleep. I used to scribble notes, but after a day spent writing, the act of writing yet again at the very end of the day, just before sleep, can feel like one task too many. I generally sleep quite soundly, but part of preparing to sleep is winding down. To write, much as I enjoy writing — much as I am compelled to write — is to invoke work, which is not conducive to sleep. Also, my scribbles often prove illegible come morning, much as dreams can’t always be fully recalled.

In contrast, by simply recording stray thoughts with my voice at the end of the day, I can with ease unpack the day. To write is to work; to speak is to put work behind me. Speaking is unwinding, even if I’m only speaking to myself — well, to myself and to my phone. When I record my thoughts, I capture reflections on recent occurrences, and I make plans for the next day, and I collect extraneous bits of ideas. As with my scribbles, some of these I can’t even comprehend the next morning. If I’m particularly tired, the recordings can veer into the surreal, sometimes enjoyably so. (It can be an out-of-body experience, though that isn’t my goal.)

After years of simply listening to these recordings come morning, I started using (or, more to the point, beta-testing, a state many of us seem to be in perpetually) speech-to-text software tools. I spent a lot of time making the most of the tool built into Google Drive, and then the Recorder that comes with Android, and then the tool built into Apple Notes, among others. These are real-time recording tools: they transcribe as you speak. They trained me to speak more clearly, because as I spoke I watched my voice take form on the screen, and I self-corrected if the software was misunderstanding me. This was a positive feedback loop, but it also required me to observe my thoughts, which wasn’t as freeing as simply speaking aloud.

More recently I’ve gotten in the habit of using tools like MacWhisper and rev.com. These tools allow me to simply record something, and then after the fact have it transcribed into text. The quality of the results — the “fidelity,” to repurpose an audio term — is even higher, in my experience, than that of “real-time” tools such as Google Recorder and Apple Notes.

Now, one interesting thing about revisiting these auto-transcribed notes the next morning is that I also receive emotional cues: Was I terse or rhapsodic, prone to imagery or sticking to line items? I’m not recording my thoughts to keep track of my emotional state, but I can’t deny that is part of what I learn as the sun rises and I pull up the transcribed files. And, as it turns out, this is just as true about what happens between the words. The MacWhisper tool, in particular, lends an additional means by which I find myself gauging my emotional state: It actually characterizes my breathing and it notes the extended silences. The software is reading, so to speak, the way I communicate non-verbally, as then identified for me with brackets and parentheses: “[sighs],” “[breathing],” “(yawns),” etc. It is eerie, fascinating, and, at a basic level, informative. And in my experience, not incorrect about what it observes.

Published on September 24, 2023 21:10