DALL-E 3, "Photo of a surreal scene where an individual operates a machine while another tries out macarons. Beside them, a sign that reads 'PAX' glows. In the background, there's a larger-than-usual weighing scale labeled 'kilo kilo kilo'. Nearby, a water-filled vase with soil floats above, with a label 'Number 3'. People around them wear unusual restaurant uniforms, and among them, someone holds a tiny black object, looking slightly uncomfortable."

Whisper Hallucinations

OpenAI’s Whisper speech-to-text model is pretty terrific. It’s one of the few “open” AI models that OpenAI has actually released into the world, and it powers the voice feature in ChatGPT Plus. When it works, it’s better than magic. When it doesn’t, you get something like this.

This was an [under-recorded] interview I did with one of the volunteer mushroom identifiers at the Puget Sound Mycological Society’s Wild Mushroom Show this weekend.

[Oh, I forgot to say why it does this. When the audio is under-recorded, Whisper has only very sparse sound to match against what its model has learned, so it “fills in the gaps” and is more likely to hallucinate an unrelated but similar-enough word. My “Yeah,” which is apparently my go-to active-listening tool, gets picked up better because I was closer to the microphone, so that’s why it sounds like Beatlemania.]
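If you want to catch this sort of thing before publishing a transcript, here is a minimal sketch in Python. It assumes you ran the open-source `openai-whisper` package, whose `transcribe()` result includes a `segments` list where each segment carries `avg_logprob`, `no_speech_prob`, and `compression_ratio`. The function name `flag_suspect_segments` and the sample numbers are my own inventions for illustration, not measurements from this interview, and the cutoffs simply mirror Whisper's default thresholds rather than anything tuned.

```python
# Cutoffs borrowed from Whisper's default decoding thresholds (an assumption;
# tune them for your own recordings).
LOGPROB_FLOOR = -1.0        # below this, the model was mostly guessing
NO_SPEECH_CEILING = 0.6     # above this, the segment is probably not speech
COMPRESSION_CEILING = 2.4   # above this, the text is suspiciously repetitive


def flag_suspect_segments(segments):
    """Return (text, reasons) pairs for segments that look hallucinated.

    `segments` is a list of dicts shaped like the entries in
    whisper's transcribe()["segments"]. The function name is hypothetical.
    """
    suspects = []
    for seg in segments:
        reasons = []
        if seg["avg_logprob"] < LOGPROB_FLOOR:
            reasons.append("low confidence")
        if seg["no_speech_prob"] > NO_SPEECH_CEILING:
            reasons.append("probably not speech")
        if seg["compression_ratio"] > COMPRESSION_CEILING:
            reasons.append("repetitive")
        if reasons:
            suspects.append((seg["text"].strip(), reasons))
    return suspects


# With the real package you would get segments roughly like this
# ("interview.wav" is a placeholder filename):
#   import whisper
#   segments = whisper.load_model("base").transcribe("interview.wav")["segments"]

# Made-up example values, for illustration only.
example_segments = [
    {"text": " Yeah. Yeah. Yeah.", "avg_logprob": -0.2,
     "no_speech_prob": 0.05, "compression_ratio": 1.1},
    {"text": " kilo kilo kilo", "avg_logprob": -1.4,
     "no_speech_prob": 0.7, "compression_ratio": 2.6},
]

for text, reasons in flag_suspect_segments(example_segments):
    print(f"{text!r}: {', '.join(reasons)}")
```

This only flags segments for a human to re-listen to. It can't recover what was actually said, and a quiet recording may well trip every check at once.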

Yeah. Yeah. Yeah.
What you get used to this one, um, you have a total of the iron side of the kilo. Uh, I, I probably didn't.
Uh, you can't tell what it is.
Just say pax. Go back to operating. So, if there's other macarons like this one, that's a good one.
I'm so pretty close there now. Yeah. Yeah.
And the hell that this one is a larger kilo kilo kilo.
And number three is going on water soil.
Okay. All right. Yeah.
I want to see. There are no pressure buds that are dangerous. I want more of a little area.
Okay. Yeah. Some people wear a unique restaurant.
Okay. All right. How about the other one?
We have a black and black. And that might be uncomfortable.
These are teeny tiny.
I'm in a school. I don't have an exercise. I'll show you what I think about the other one. With a future, you got to go under and let your son take photo underneath.
Okay. So, I'm sweating.
I'm sweating.
Yeah. Yeah.
Yeah. Yeah.
Yeah. Yeah.

DALL-E 3, "Illustration of a whimsical school setting where a student doesn't exercise but instead gazes at two options before her. One shows the 'future' with a tunnel, hinting at going under, and a camera ready for a photo. The other option is unclear, leaving her contemplative. The ambient temperature seems high, suggesting the repeated mentions of 'sweating'."
Images generated by DALL-E 3 prompted with the Whisper transcript.