How we used GPT-4o for image detection with 350 similar illustrations

This story describes how our small engineering team tackled a challenging project for a museum that wanted an app to match the car illustrations on its exhibition walls to related content. We initially planned to use web-AR technology, but ran into its limitations and pivoted to a MobileNet image classification model. After struggling with transfer learning and data augmentation, we eventually brought in a multimodal LLM such as GPT-4o, which successfully matched the images. This approach improved our image matching service and showed how strongly large language models are reshaping engineering, product development, and AI integration. The outlook is promising as AI tools become more accessible and versatile.

https://olup-blog.pages.dev/stories/image-detection-cars