OpenAI: ChatGPT can now 'see, hear and speak'

ChatGPT can now ‘see, hear and speak’ after major update

ChatGPT can talk back and forth with users, the company said Monday
With the new update, the chatbot will also be able to process images
The new features will be released in the next two weeks

Updated: Sep 25, 2023 / 02:13 PM CDT

FILE – The OpenAI logo is seen on a mobile phone in front of a computer screen displaying output from ChatGPT, March 21, 2023, in Boston. Several news organizations and writers’ and photographers’ groups are seeking regulations to govern the fast-moving artificial intelligence technology that threatens upheavals for their businesses. In an open letter sent on Wednesday, Aug. 9, 2023, they outlined priorities for setting rules on the technology, which is developing faster than regulators can keep up with. (AP Photo/Michael Dwyer, File)

(NewsNation) — ChatGPT is about to be a lot more chatty.

OpenAI, the company behind the popular chatbot, released an updated version Monday that lets users talk back and forth with the artificial intelligence tool.

The new update to the company’s iOS and Android apps will allow the AI chatbot to talk out loud in five different voices that were created with help from professional voice actors.

“Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate,” OpenAI wrote in a blog post on Monday.

ChatGPT also has a new set of eyes to go along with its voice.

For the first time ever, the chatbot will be able to respond to images. Users can simply snap a photo and ask ChatGPT questions about it.

For example, you can upload a photo of the food in your fridge, and the chatbot will come up with recipes based on those ingredients.

“Like other ChatGPT features, vision is about assisting you with your daily life,” the company wrote. “It does that best when it can see what you see.”

OpenAI isn’t the first company to roll out a digital assistant, but its underlying technology differs from that of others such as Amazon’s Alexa and Apple’s Siri. That’s because ChatGPT is built on a large language model, a type of artificial intelligence that learns skills by analyzing massive amounts of text, which means it’s able to respond on the fly.

Last week, Amazon announced the release of its new Alexa large language model, a sign that other digital assistants are set to become more like ChatGPT.

In a blog post, OpenAI acknowledged the possible risks that come with enhanced voice capabilities, like the “potential for malicious actors to impersonate public figures or commit fraud.”

For that reason, the company said, the tech is limited to a specific use case: voice chat.

The new voice and image features will be released to ChatGPT Plus and Enterprise users over the next two weeks.

Monday’s announcement is just the latest news out of OpenAI, which recently unveiled the third version of its generative AI visual art platform, DALL-E.

OpenAI has also partnered with Spotify to launch a tool that allows podcasters to create foreign-language versions of their episodes in their own voices.