On the Line: Physics, Design, LightSound

Some favorite recent sentences

“Deep neural network for learning wave scattering and interference of underwater acoustics”

Sometimes science is, to the rest of us, abstract poetry. The above is the title of a research paper to be published in the Physics of Fluids journal about an innovation “that harnesses the power of AI to accurately model how sound waves travel underwater [and] could help reduce the impact of noise pollution on marine life.”

. . .

"To find sources of real sound of people in pain is convincing, and the credibility of the sound you use is so important because you can fool the eye much more easily than you can the ear."

That is Oscar-nominated sound designer Johnnie Burn talking to Salon’s Gary M. Kramer about the production of the Holocaust film The Zone of Interest, based on a Martin Amis novel.

. . .

"This device isn’t just for a blind or low-vision person. It could also be a tool for a person that engages with data differently."

That is Harvard astronomy lab manager Allyson Bieryla talking to National Geographic’s Stephanie Vermillion about LightSound, “a smartphone-sized device that translates ambient brightness into sound.”

Sound Ledger: Deepfake Forensics

Audio culture by the numbers

250: Length in milliseconds of the segments employed to identify details about the headline-making deepfake of U.S. president Joe Biden

155: Total number of segments employed in the forensic effort

122: Number of text-to-speech engines (and other technological tools) included in the research data set

Source: pindrop.com.

Speak, Memory — or at Least Instapaper

On text-to-speech interfaces and app adoption

I’ve been trying out various apps and services that facilitate the collection of browser-based documents. I don’t need much in terms of long-term collation, mostly items on a week-by-week basis, mostly for my This Week in Sound newsletter. Instapaper is a leading such option. There are also Pocket, synced bookmarks (I’m a longtime user of pinboard.in, just shy of a decade), and the Read Later panel in the Safari browser, among numerous (some days it feels like countless) other options.

These tools have their pros and cons, their pluses and minuses, and they have their idiosyncrasies. Some, including Instapaper, allow you to hear a document read back to you — what’s referred to as “text-to-speech.” What struck me in particular about Instapaper is the voice implementation (perhaps archaic, as text-to-speech goes) and also an aspect of the text-to-speech interface, which displays the following:

It shows a pair of headphones next to the word “Speak.” The headphones icon suggests one listens, which is appropriate. The word “Speak” suggests — and I say this at the risk of, you know, jumping to a conclusion — that the user speaks, which is counterintuitive. Why, I ask, doesn’t this show headphones and the word “Listen”? Especially today, in a time of widespread speech-to-text utilization.

Complicating the scenario further is the phrase “Speed Read” directly above. This is, in fact, an especially idiosyncratic function within Instapaper. What it does is flash the words, one at a time. For some people this is apparently a means for reading quickly. For others it might feel like being a subject of brainwashing in A Clockwork Orange.

In any case, what matters is that the appearance of “Speed Read” above “Speak” gets confusing, because “Speak” is the tool that reads to you, and “Speak” in fact does have a submenu that lets you adjust the speed at which it reads to you. You can see how this can become not so much confusing as concerning. Odd incongruences can taint an interface and, by extension, a product.

There’s a thing about utilities like Instapaper, and I think of it as the “if only” factor, for every app has its little shortcomings: if only it were cheaper, if only it didn’t have a subscription cost, if only it had folders, if only the fonts appealed to me, if only it worked offline, if only it were cross-platform, and so forth. If Instapaper didn’t have this text-to-speech feature, I might not have missed it. Since the app has the feature, I find myself factoring my relative satisfaction with the implementation into my decision-making in terms of adoption.

I have a lot of friends who swear by Instapaper, and several of them didn’t even know about or at least don’t use the “Speak” function. It all comes down to personal habits, but personal habits in aggregate are a powerful force in the development and evolution of interface norms. It’s somewhat difficult to imagine that the word “Speak” next to a pair of headphones will become the universal symbol of text-to-speech. Note, above, how the New York Times, which recently launched its own audio-specific service, signals the same feature on its main website. And below is how The New Yorker, on its website, which has actual humans (not digital facsimiles toiling tirelessly syllable by syllable) doing the reading, highlights the availability:

And no, I still haven’t decided on an option. But such is digital life.

On Repeat: Birdsong, Iceberg, Cello

Home/office playlist

I try to at least quickly note some of my favorite listening from the week prior — things I’ll later regret having not written about in more depth, so better to share here briefly than not at all.

▰ This is a live performance reworking of sounds from the forest, emphasis on the birdsong, by Mark Harrop, aka UMCorps, based in Cornwall in the U.K. He employs various techniques on “Endless Woodland,” like pitch-shifting as well as the re-insertion of samples from the original material, turning the source audio into something cinematic, a combination of the everyday and the psychological experience of an imagined scenario.

▰ Heejin Jang’s new album, Human Iceberg, is the score to a collaborative project by that name that teams her with writer Lim Jina and visual artist Lee SunHo on what sounds, from the description, like a science fiction fable about climate change. Jang scored the project, expressing various settings, from the melting of an iceberg to technological failures to seemingly supernatural occurrences. While there are atmospheric moments, it gets loud; as Jang says on her Bandcamp page’s bio: “I make something noisy.” More at instagram.com/humaniceberg.official, best viewed on a laptop or desktop computer. Jang is based in Seoul, Korea.

https://heejinjang.bandcamp.com/album/human-iceberg

▰ Henrik Meierkord’s cello, sounding like it’s deep in a cavern, combines on “Warum” with the samples and tape work of Marco Lucchi, moaning swells that turn the piece into a sort of conversation between the instruments. Meierkord is based in Sweden, Lucchi in Italy.

Scratch Pad: Rain, Lullabies, Emoji

From the past week

I do this manually at the end of each week: collating (and sometimes lightly editing) most of the recent little comments I’ve made on social media, which I think of as my public scratch pad. Some end up on Disquiet.com earlier, sometimes in expanded form. These days I mostly hang out on Mastodon (at post.lurk.org/@disquiet), and I’m also trying out a few others. I take weekends and evenings off social media.

▰ The day has begun with what sounds like fake rain from a YouTube channel, too perfect to be real, though that’s probably just the result on my part of a combination of drought-induced confusion and too much research about field recordings

▰ Since you’re wondering, guitar class went well. It’s a small thing for most people learning music, but for me being able to transcribe a melody from an old-school Disney flick, sort the key, and guess at some chords felt nice. Been deep in “Stay Awake” and, now, “Baby Mine.”

Just to be very clear: I am taking guitar lessons, not giving them. :)

▰ I’m looking forward to listening to your next drone album but fair warning my microwave has really been upping its game. May be an end-of-life cycle for the machine, revisiting its hits with diminishing energy but greater depth.

▰ My only Oscars comment is a question about the best score category: will the dead guy and the elder statesman split the codger vote and thus give one of the other three the win?

▰ A consequence of studying lullabies for guitar practice is that, as with any song you play over and over (and over), they get lodged in your head. These being lullabies, you then spend the rest of the day trying not to fall asleep to the sleepy-time earworms playing endlessly in your mind.

▰ Next up in guitar practice, after “Stay Awake” and “Baby Mine”: transposing “Easy Living” to match a Chet Baker recording — a natural transition, Baker being lullaby incarnate

▰ Just proofed some liner notes I wrote for an upcoming album. Had no idea how cool the graphic design was going to be, how appropriate to the material, nor how large the tape cassette booklet. I’ll share when it’s out (or getting closer to the release date).

▰ The city got a reprieve from rain for good behavior so I did open the window this morning, first thing. In came dog barks, a passing bus’ rumble, and the inimitable hum of a driverless electric car (steadier than most humans manage). Well, the barks were persistent but my brain eventually muted ’em.

▰ There’s an emoji for taking a look that’s two eyes 👀. The closest listening approximation is one ear 👂 so I guess we’re listening in mono. There’s an ear with a hearing aid, which is inclusive 🦻 and may come to suggest a broader array of mediated listening. There’s a deaf symbol 🧏. And of course 🌽.

▰ Sometimes I find things on last.fm that seem to make no sense, and then they end up being recordings that haven’t been announced yet, let alone released, but someone somewhere had been “scrobbling” embargoed recordings

▰ Probably old news for some people but it’s pretty great to use a secondary device, like an iPad, as a music player, and direct the sound to play through your MacBook (my M1 Pro has amazing sound). You can keep an eye on what you’re playing, and keep your laptop screen reserved for working.

▰ I recently set up a Mac Mini as a Plex jukebox and it has totally transformed how I listen to music. I’ll detail how I set it up in a proper blog post soon.

(And yes, I realize the concepts both of listening to MP3s/equivalent and of writing a blog post seem antiquated to some, but so be it.)

▰ Finished reading one novel this week, my third of this year: Mick Herron’s The Secret Hours, maybe his best and I’ve read ’em all. It’s in the Slow Horses world and will mean more if you’ve read them but it also stands on its own. It dials back the humor slightly, and the action is less slapstick. One moment in particular made the hair rise on my arms. Herron cares more for words than most writers of thrillers, spy or otherwise. Meanwhile, I’m still working slowly through an ancient Oulipo novel, and I restarted one I’d begun from Steven Soderbergh’s list of what he read last year.

▰ Finished reading three graphic novels this week: two more volumes of Minami Katsuhisa’s The Fable manga (5: The author sure keeps stacking the deck against the guy, but then again that’s the point; 6: It gets both darker in some of the incidents and lighter in tone at other points. Minami, the illustrator-author, starts to introduce incredible two-page spreads, and the depictions of action, especially hand-to-hand, really take off), and Kingdom by Jon McNaught: Wow, just wow. Incredibly beautiful, graphically intense, not-quite-wordless, onomatopoeia-packed depiction of a family vacation in which both very little and quite a bit happen. If M.C. Escher and Eric Drooker had a baby, the kid might hope to draw like this.