Darshna Kalla

ChatGPT's New Voice and Image Update


ChatGPT new voice and image feature
Image credits to: OpenAI.com

INTRODUCTION


“ChatGPT can now see, hear and speak”


OpenAI is starting to roll out an update that adds voice and image capabilities to ChatGPT. It offers a more intuitive interface: users can hold a full voice conversation and even show ChatGPT what they are talking about or looking at.


DIVING INTO CHATGPT’S NEW VOICE AND IMAGE FEATURE


Now, if you are a movie buff or a nerd, like me, you probably immediately thought of a pop culture reference or a blockbuster movie like Her. Its protagonist struggles to form and maintain human relationships and is never happy with them. Then he comes across an AI that very quickly becomes his best friend and, eventually, his romantic interest. In Her, the protagonist can also show the AI what he is seeing in real time, converse with her and even have deep, meaningful talks at night.

While this movie remains an exceptional piece of art in expressing the complexity that lies within humans, and how this complexity carries forward to affect other aspects of life such as friendship, love and mental health, this new ChatGPT update is much, MUCH bigger than that. It could soon become a crucial part of everyday life.


VOICE


The voice and image upgrade makes ChatGPT more useful in daily life. “Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.” (OpenAI)

The possibilities are nearly endless, and they create a lot of opportunity for industries and businesses that integrate the feature effectively. Another thing worth noting is that the voice you talk to sounds human-like, because it was created in collaboration with professional voice actors. You can pick whichever of the five voices you find most comfortable.

This feature has begun rolling out and should be available on iOS and Android within a few days, if it is not already. To enable it, go to Settings -> New Features and opt into voice conversations. From there, tap the headphone button in the top right corner of the home screen and choose your preferred voice.


IMAGE


“You can now show ChatGPT one or more images. Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in our mobile app.” (OpenAI)

To use this feature, tap the photo button to capture an image (on mobile, tap the plus button first). You can also use the drawing tool to guide ChatGPT to the part of the image you want it to focus on.


TRANSPARENCY AND THE INTRICACIES


OpenAI wants to prevent misuse of this update, such as impersonating famous actors' voices to commit fraud. For that reason, it is rolling the technology out gradually rather than all at once, and is limiting it to one specific use case: voice chat. Voice chat was created with voice actors OpenAI worked with directly, and anything outside of that is not part of this update. The same voice technology is also being used by Spotify to help podcasters reach larger audiences by translating their podcasts into different languages in the podcaster’s own voice.

As for the image update, OpenAI wants to avoid issues such as hallucinations and over-reliance on the tool’s interpretation of images in high-stakes situations. It is therefore working with the Be My Eyes app to understand the uses and limitations of visual technology. The feature interprets images supplied by the user rather than images it creates itself, which could otherwise lead to hallucinations and false visuals. The rollout will rely on real-world usage and feedback to keep improving the technology and to keep the tool safe and useful.


TO SUM UP


ChatGPT also has many limitations, such as poor transcription of non-English text, especially text in non-Roman scripts, its limits as a research tool, and its accuracy in high-stakes situations. Both OpenAI and we at Codemasters Agency therefore advise using ChatGPT cautiously and intelligently. It is not something to depend on or use every minute of the day. Use it as a quick and easy tool to enrich your daily experiences and gain knowledge, but do not rely on it entirely for all your knowledge or for your relationships.


