
OpenAI releases GPT-4o for free, and boy is it fast!

By Ken Wong - on 14 May 2024, 11:36am

OpenAI says the new model will be able to respond to audio inputs in as little as 232 milliseconds, which is similar to human response time in a conversation.

From left to right: OpenAI CTO Mira Murati, Head of Frontiers Research Mark Chen, and Head of Post-training Barret Zoph.

During its Spring Update event, accompanied by a blog post and posts on Twitter, OpenAI announced the launch of its new flagship generative AI model, GPT-4o ("o" for "omni"), which has near-human response times. OpenAI says the new model will be able to respond to audio inputs in as little as 232 milliseconds, similar to human response time in a conversation.

While GPT-4o can accept queries in any combination of text, audio, and image, and generate responses in the same modalities, it was demoed using voice mode to talk to ChatGPT. On the blog page, you can have one of six preset voices (three male and three female) read the page to you.

The different voice presets available.

According to the demos, GPT-4o can recognise and respond to screenshots, videos, photos, documents, and uploaded charts, as well as facial expressions and handwritten information. Demos we found interesting included the ability to create a detailed summary of a video presentation and the ability to summarise a meeting with multiple attendees from a voice recording. Transcription tool Otter.ai could be in trouble here.

Developers can also now access GPT-4o in the API as a text and vision model at half the price and with 5x higher rate limits compared to GPT-4 Turbo.
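
For developers, requests to GPT-4o use the same chat completions endpoint as GPT-4 Turbo, just with the new model name. The snippet below is a minimal sketch of what a combined text-and-image request might look like, assuming the official openai Python SDK (v1.x); the prompt and image URL are placeholders, so check OpenAI's API documentation for exact details.

    from openai import OpenAI

    # The client reads the OPENAI_API_KEY environment variable by default.
    client = OpenAI()

    # Ask GPT-4o about an uploaded chart (the image URL here is a placeholder).
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarise the key trend in this chart."},
                    {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
                ],
            }
        ],
    )

    print(response.choices[0].message.content)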

There will also be a new desktop app on offer for the Mac, with a Windows version coming soon.
