all about the new multimodal AI model

The multimodal capabilities of GPT-4

This was one of the big questions around GPT-4. Will the new OpenAI model be able to interpret or generate a format other than text?

We now have the answer: GPT-4 is able to integrate a prompt composed of a text and an image. On the other hand, the results provided by GPT-4 will be limited to text format. Recent rumors, distilled by a Microsoft executive, gave hope for other possibilities related to the generation of videos. But the mix picture and text already represents an impressive novelty…

OpenAI shows an example of using its multimodal artificial intelligence. © Open AI

Creativity, image and context management with GPT-4

OpenAI presents the three major new features of its language model:

  • Creativity : GPT-4 is able to better meet the creative needs of its users. OpenAI evokes the conception of tasks such as musical composition, screenwriting and/or reproduction of the author’s style.
  • Size: GPT-4 therefore accepts images as input. This new capability allows you to generate legends, classifications or perform analyzes based on image interpretation.
  • Context : GPT-4 is capable of handling more than 25,000 words, which makes it possible to create longer texts, participate in richer conversations, carry out more comprehensive searches and analyzes of documents.
Creativity increased with GPT-4. © Open AI

The quality of the results obtained with GPT-4

OpenAI publishes research on GPT-4 so that the quality of the results obtained can be measured. Benchmarks have thus been produced in order to evaluate the texts proposed by the new language model. And unsurprisingly, GPT-4 vastly outperforms GPT-3.5 (ChatGPT’s model), on all tests.

GPT-4 greatly outperforms the results obtained with GPT-3.5. © Open AI

Another very interesting information, GPT-4 makes it possible to obtain very good results in many languages. In French, for example, the level of precision obtained with GPT-4 is higher than that obtained in English with GPT-3.5. Our language is one of the best managed by the new OpenAI model.

Cock-a-doodle Doo ! © Open AI

Beyond the numbers: This means that users will be able to get higher quality results using GPT-4, compared to GPT-3.5. We will see with use how much the quality of the answers is superior to those of ChatGPT. OpenAI indicates that it has been working for 6 months on the security of the answers provided by its new language model.

GPT-4 is 82% less likely to respond to requests for unauthorized content and 40% more likely to produce factual responses, compared to GPT 3.5.

OpenAI has integrated more human feedback, including that collected by ChatGPT, to improve “the behavior of GPT-4”. 50 experts were invited to improve AI safety and security. The publisher has also based itself on the observed uses of its previous models “in the real world”. OpenAI promises regular updates to continuously improve GPT-4.

The integration of GPT-4 in applications

When OpenAI presented its previous model (GPT-3), in May 2020, developers were able to access it 2 months later via the API. Users were therefore able to take advantage of this technology from the second half of 2020.

With GPT-4, it will go much faster: OpenAI has worked with several vendors to create new possibilities in popular applications. Duolingo, Be My Eyes, Stripe, Morgan Stanley, Khan Academy and the Government of Iceland are today unveiling new features based on GPT-4.

Duolingo is already presenting novelties based on GPT-4. © OpenAI / Duolingo

How to access GPT-4?

Want to access GPT-4? For developers, the new language model will be accessible via the OpenAI API. It’s by invitation at the moment: to get one, you must first register on the GPT-4 waitlist. You can also benefit from the superior capabilities of GPT-4 via ChatGPT Plus!

Access to GPT-4 is currently only for text input. For image-based and text-based prompts, OpenAI says it works with only one partner for now, Be My Eyes.

GPT-4, and after?

Although GPT-4 seems much more powerful – and more precise – than GPT-3 and 3.5, OpenAI recalls that a large number of limitations and risks associated with its technology remain present and known: social prejudices, unexpected responses… The publisher’s objective is to mitigate as much as possible the problems related to its language models in order to avoid malicious uses or problematic results. We will see in the coming months if the uses of GPT-4 make it possible to erase the limits of ChatGPT and other services based on OpenAI technologies.

The best alternatives to ChatGPT