YouTube AI mistakenly inserts foul language into kids’ videos, study finds

(NEXSTAR) — Well, f—! Newly released data shows trusted kid-friendly YouTube videos may not be as safe as intended, due to explicit transcribing flubs on videos.

A new study performed at the Indian School of Business in Hyderabad, India, tested more than 7,000 videos from some of YouTube’s most popular kids’ channels. Researchers found that the automatic speech recognition (ASR) system sometimes replaced innocent words with explicit ones, an error they call an “inappropriate content hallucination.”

YouTube Kids filters content more closely, but automated captions are currently offered only on the standard YouTube site.

Researchers observed YouTube’s ASR transcribe “crab” as “crap,” “corn” as “porn,” “that is” as “panties,” and “brave” as “rape.” Several other four-letter words also turned up in the captions of children’s videos.

The researchers compiled a list of 1,300 “taboo” terms and found that at least 40% of the captions automatically generated for the sample videos contained words on the list. Among the most frequent mistakenly auto-generated explicit words were “b—-” and “penis.”
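As a rough illustration of that kind of measurement, the Python sketch below computes what share of caption transcripts contain at least one term from a word list. The sample captions, the two-word term list, and the function name `share_with_taboo_terms` are hypothetical placeholders, not the researchers’ data or code.

```python
import re

def share_with_taboo_terms(captions, taboo_terms):
    """Return the fraction of captions that contain at least one listed term."""
    taboo = {term.lower() for term in taboo_terms}
    if not captions:
        return 0.0
    flagged = 0
    for text in captions:
        # Split each caption into lowercase words before comparing to the list.
        words = set(re.findall(r"[a-z']+", text.lower()))
        if words & taboo:
            flagged += 1
    return flagged / len(captions)

# Hypothetical sample: two of the four captions contain a listed term.
sample_captions = [
    "look at the crap on the beach",
    "the farmer grows corn",
    "be brave little bear",
    "we love porn on the cob",
]
sample_terms = ["crap", "porn"]
share = share_with_taboo_terms(sample_captions, sample_terms)
print(share)
```

Matching whole words rather than substrings keeps an innocent word that merely contains a listed term from being counted.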

“Inappropriate content hallucinations” are not limited to YouTube, however. Wired reports that a transcript of a phone call changed a Persian woman’s name, Negar, to a racist slur. The researchers also tested ASR by Amazon, which made similar mistakes. Amazon spokesperson Nina Lindsey declined to comment to Wired but forwarded resources for developers on correcting such mistakes.

“We are continually working to improve automatic captions and reduce errors,” YouTube spokesperson Jessica Gibby told Wired. She added that YouTube recommends children under 13 use YouTube Kids.

The researchers stressed that the study, a preprint still pending peer review, is meant to highlight ways to improve ASR for children’s videos. Proposed solutions include programming the AI more carefully to avoid transcribing certain words and phrases, and keeping children in mind when developing ASR systems in the first place. The researchers also suggest that having a “human in the loop” to resolve discrepancies would help mitigate such issues.
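One way to picture the “avoid certain words” and “human in the loop” ideas together is a post-processing step that masks listed words in a caption and sets them aside for a person to check. The Python sketch below is only an assumption about how such a step might look, not the researchers’ or YouTube’s method; the function name `flag_for_review`, the `[unclear]` placeholder, and the sample caption are all hypothetical.

```python
import re

def flag_for_review(caption, taboo_terms):
    """Mask listed words in a caption and collect them for human review."""
    taboo = {term.lower() for term in taboo_terms}
    flagged = []

    def mask(match):
        word = match.group(0)
        if word.lower() in taboo:
            flagged.append(word)   # queue the word for a human reviewer
            return "[unclear]"     # hypothetical placeholder shown to viewers
        return word

    masked = re.sub(r"[A-Za-z']+", mask, caption)
    return masked, flagged

# Hypothetical caption in which "crab" was mis-transcribed.
masked, flagged = flag_for_review("The little crap walks sideways", ["crap"])
print(masked)
print(flagged)
```

A filter like this hides the error rather than correcting it, which is why a reviewer would still need to restore the intended word.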