We are beginning to roll out new voice and image capabilities in ChatGPT

OpenAI is introducing voice and image capabilities in ChatGPT, offering a more intuitive interface. Users can hold voice conversations with ChatGPT and show it images. Example uses include having a live conversation about a landmark while traveling or sharing a photo to troubleshoot a problem. To start a voice conversation, users opt into the feature in the mobile app settings and choose one of five voices. The voice capability is powered by a text-to-speech model, and the voices were created with professional voice actors. Image understanding is powered by multimodal GPT-3.5 and GPT-4 models. OpenAI is deploying these capabilities gradually so that it can refine them and address safety risks over time. Voice and image input introduce new risks, which OpenAI says it has taken measures to mitigate. Work on the vision feature was informed by collaboration with organizations such as Be My Eyes, with attention to both usefulness and privacy. Users should still be aware of the model's limitations and exercise caution in higher-risk use cases. The new capabilities will roll out to Plus and Enterprise users first and then expand to other groups.

https://openai.com/blog/chatgpt-can-now-see-hear-and-speak
