Everyone's favorite chatbot can now see and hear and speak. On Monday, OpenAI announced new multimodal capabilities for ChatGPT. Users can now have voice conversations or share images with ChatGPT in real time.
Audio and multimodal features have become the next phase of the fierce generative AI competition. Meta recently launched AudioCraft for generating music with AI, and Google Bard and Microsoft Bing have both deployed multimodal features for their chat experiences. Just last week, Amazon previewed a revamped version of Alexa that will be powered by its own large language model (LLM), and even Apple is experimenting with AI-generated voices through its Personal Voice feature.
Voice capabilities will be available on iOS and Android. Like Alexa or Siri, you can tap to speak to ChatGPT and it will speak back to you in your preferred voice from five options. Unlike current voice assistants, ChatGPT is powered by more advanced LLMs, so what you'll hear is the same type of conversational and creative response that OpenAI's GPT-4 and GPT-3.5 are capable of creating with text. The example that OpenAI shared in the announcement is generating a bedtime story from a voice prompt. So, exhausted parents at the end of a long day can outsource their creativity to ChatGPT.
Multimodal recognition has been forecast for a while, and is now launching in a user-friendly fashion for ChatGPT. When GPT-4 was released last March, OpenAI showcased its ability to understand and interpret images and handwritten text. Now it will be a part of everyday ChatGPT use. Users can upload an image of something and ask ChatGPT about it, for example to identify a cloud or to make a meal plan based on a photo of the contents of their fridge. Multimodal features will be available on all platforms.
As with any generative AI advancement, there are serious ethics and privacy issues to consider. To mitigate risks of audio deepfakes, OpenAI says it is only using its audio recognition technology for the specific "voice chat" use case. It also says the voices were created with voice actors it has "directly worked with." That said, the announcement doesn't mention whether users' voices can be used to train the model when they opt in to voice chat. For ChatGPT's multimodal capabilities, OpenAI says it has "taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy." But the real potential for nefarious uses won't be known until the features are released into the wild.
Voice chat and images will roll out to ChatGPT Plus and Enterprise users in the next two weeks, and to all users "soon after."