Gemini AI

The page breaks down how different models perform across a range of reasoning tasks. Gemini Ultra posts strong results in natural image understanding, OCR on natural images, document understanding, infographic understanding, mathematical reasoning in visual contexts, English video captioning, and video question answering. It also outperforms GPT-4V on most of these tasks, which points to its strength in multi-discipline reasoning. On audio, Gemini Pro outperforms Whisper v2 in automatic speech translation and Whisper v3 in automatic speech recognition.

https://deepmind.google/technologies/gemini/
