14th May 2024 – (Cupertino) OpenAI’s latest offering, GPT-4o, has set a new benchmark for voice assistants, making Apple’s Siri appear archaic by comparison. Unveiled at the OpenAI Spring Update on 13th May, GPT-4o promises to change the way we interact with technology, with capabilities beyond those shown by any existing voice assistant. This article looks at GPT-4o’s headline features and explores why it is poised to leave Siri, along with competitors like Google Assistant and Amazon’s Alexa, struggling to keep up.

OpenAI’s GPT-4o is more than an incremental upgrade; it is a major leap for voice assistance. The new model showcases a range of advanced features, including expressive, emotive speech, real-time translation, and vision functions akin to Google Lens. These enhancements enable GPT-4o to assist with tasks ranging from solving linear equations to gauging a user’s mood from their facial expression.

One of the standout demonstrations at the OpenAI event involved Mark, an OpenAI researcher, who showcased GPT-4o’s ability to hold a spoken conversation in real time. When Mark said he was nervous about giving a live demo, GPT-4o responded in an upbeat, encouraging tone and guided him through a calming breathing exercise. The chatbot’s humorous remark, “you’re not a vacuum cleaner,” when Mark exaggerated his breathing, highlighted its ability to inject personality into interactions. Despite minor hiccups with audio synchronisation, the overall performance was impressive, particularly the way users could interrupt the assistant mid-sentence and steer the conversation in a new direction.

Another notable feature of GPT-4o is its ability to pick up on emotion and respond with emotive speech. During the demo, Mark instructed the assistant to read a bedtime story with more expressiveness and drama. GPT-4o responded with a passionate rendition, even switching to a robot voice on command. The capability extends to singing on request, showcasing the model’s versatility and its potential as a companion for a variety of tasks.

GPT-4o’s vision capabilities were demonstrated with a linear equation. Initially, the assistant jumped in before it had been shown the problem, humorously acknowledging its mistake with a “whoops, I got too excited.” The moment underscored the model’s ability to recognise and correct its errors, which made the exchange feel more human. When finally presented with the equation “3x + 1 = 4”, GPT-4o offered hints without revealing the answer, which suggests it could be a useful tool for education.

In a nod to its versatility, GPT-4o can also read and analyse code on a user’s computer. This feature, combined with its ability to give real-time feedback on graphs, makes it a useful assistant for professionals and students alike.

GPT-4o’s real-time translation capabilities were another highlight. When tasked with translating a conversation between English and Italian, the assistant responded with “Perfetto!” and performed the translation accurately and amiably. This function positions GPT-4o as a handy travel companion, capable of bridging language barriers in live conversation.

In a feat that truly sets GPT-4o apart, the assistant can detect emotions from selfies taken with a phone’s front camera. During the demo, it recognised a smile and inquired, “want to share the reason for your good vibes?” This feature not only enhances user interaction but also opens up new possibilities for mood-based applications.

With these innovative features, GPT-4o is light years ahead of Siri, Google Assistant, and Alexa. The competition is stiff, but the advancements presented by OpenAI have set a new standard. Apple, reportedly working on Siri 2.0, and Google, with its upcoming I/O event, now face immense pressure to catch up.

GPT-4o will roll out in the coming weeks, making its advanced features accessible to a broad user base. The update includes a new desktop client for Mac, with potential releases for Windows and Linux still under consideration. Both free and paid users of ChatGPT will benefit from this upgrade, with free users gaining access to features previously behind a paywall.

GPT-4o is built to work in 50 languages, which OpenAI says covers 97% of internet users. OpenAI also says it has strengthened its safeguards to account for the model’s expanded visual and audio capabilities, although specific details were not disclosed during the conference.

A significant shift in OpenAI’s strategy is its increased support for free users. By removing paywalls and opening up the GPT Store to all users, OpenAI risks alienating its paid user base but stands to gain from a burgeoning user community that, according to the company, has surpassed 100 million people. This move aligns with OpenAI’s stated vision of democratising AI and making advanced technology accessible to everyone.

GPT-4o represents a monumental leap in the world of voice assistants, offering features that are not just ahead of Siri but also of any other competitor in the market. With capabilities ranging from real-time translation to emotion detection and advanced code analysis, GPT-4o is set to transform how we interact with technology. As it rolls out globally, the landscape of digital assistants will undoubtedly shift, compelling tech giants like Apple and Google to innovate at an unprecedented pace.