OpenAI surprised us in May of this year with the announcement of a voice mode in the purest ‘Her’ style for ChatGPT. The company led by Sam Altman promised at the time that the novelty would arrive in “the next few weeks”, but a month later announced that it needed a little more time to address some security challenges.
The days have passed and the long-awaited new voice mode it’s here. We are not facing a massive launch of a final version, but rather a rather restrained rollout of an alpha version for ChatGPT Plus users. This will run until August, when all users of the aforementioned paid plan should have access to the feature.
Waiting for vision capabilities
If you are one of the users chosen to test the new voice mode, you will receive an in-app message. Once activated, you will be able to interact with ChatGPT powered by GPT-4o much more naturally. Remember that one of the improvements over the original voice mode is that it is possible to interrupt him and that he can also have emotional conversations.
On a slightly more technical level, the original voice mode worked very differently. You converted the voice to text, GPT-4 processed that text, and the response was converted back to voice. GPT-4o is a multimodal modelso everything is processed directly. The consequence? As we have seen, it is extremely low latency.
One good news is that it won’t be limited to English alone. OpenAI says it has tested the voice mode with over 45 languages. However, there are some changes from what we saw at the launch day. Despite its ability to reproduce other voices, at least for now, there are only four voices available (Juniper, Breeze, Cove and Ember).
Sky, the voice that resembled Scarlett Johansson’s, It wont be availableThe actress who played Samantha in the film ‘Her’, had expressed “shock, anger and disbelief” after hearing a voice so similar to hers on ChatGPT. Johansson had rejected an offer from Altman to lend his voice to the famous artificial intelligence chatbot.
In the demo months ago we saw ChatGPT helping children with homework or describing what was in a room as OpenAI employees chatted fluidly. These features are powered by GPT-4o’s vision capabilities, but The firm says that “will arrive at a later date.”
Images | OpenAI
At Xataka | Instagram creators will be able to chat with all their followers. But it won’t be them, but their bots
Add Comment