Cohere has added image search to its Embed 3 search model, combining it with text retrieval for the first time. Businesses can now run a single search across both images and text in the same database, which simplifies managing large collections of product images, design documents, and reports.
Technically, the new system uses a unified storage architecture, so businesses no longer need to maintain separate databases for images and text. It supports the PNG, JPEG, WebP, and GIF image formats, with a size limit of 5MB per file. For now, the system supports only single-image queries; batch processing is still under development.
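The format and size limits above lend themselves to a simple client-side check before an image is submitted. The following is a minimal sketch, not Cohere's own validation logic: the function names are hypothetical, the format is detected from well-known file signatures, and "5MB" is assumed to mean 5 × 1024 × 1024 bytes.

```python
from typing import Optional

# Assumption: "5MB" is read as 5 * 1024 * 1024 bytes; the service's exact
# definition of the limit is not stated in the article.
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def detect_image_format(data: bytes) -> Optional[str]:
    """Return 'png', 'jpeg', 'gif' or 'webp' based on the file signature, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP files are RIFF containers with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def check_image(data: bytes) -> str:
    """Hypothetical pre-flight check: raise ValueError if the image is too large
    or not in one of the four supported formats; otherwise return the format."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"image is {len(data)} bytes; limit is {MAX_IMAGE_BYTES}")
    fmt = detect_image_format(data)
    if fmt is None:
        raise ValueError("unsupported format; expected PNG, JPEG, WebP or GIF")
    return fmt
```

Checking the signature rather than the file extension avoids rejecting or accepting files on the basis of a misleading name.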
The system converts business data into vector representations (embeddings), which makes complex business data more efficient to retrieve. Developers can access the new features through the existing Embed API, submitting images as Base64-encoded data URLs.
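To make the two ideas in this paragraph concrete, here is a minimal sketch in Python. The first function builds a Base64-encoded data URL from raw image bytes; the second ranks a mixed index of text and image entries by cosine similarity. The function names, the index layout, and the two-dimensional vectors are all illustrative assumptions, not output from the Embed model, and the call to the Embed API itself is deliberately left out.

```python
import base64
import math


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a data URL, e.g. 'data:image/png;base64,...'."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, index):
    """Sort (kind, name, vector) entries by similarity to the query, best first."""
    return sorted(index, key=lambda entry: cosine(query_vec, entry[2]), reverse=True)


# Toy vectors made up for illustration; real embeddings have many more dimensions.
index = [
    ("text", "q3_report.txt", [1.0, 0.0]),
    ("image", "shoe.png", [0.0, 1.0]),
]
best = rank([0.1, 0.9], index)[0]
print(best[0], best[1])  # prints "image shoe.png"
```

Because text and images live in the same vector space, a single ranking pass covers both kinds of content, which is what removes the need for separate databases.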
The updated model supports more than 100 languages and can be deployed on several platforms: in addition to Cohere's own platform, it runs on Microsoft Azure and Amazon SageMaker. The company, founded by a team specializing in the Transformer architecture, secured $500 million in funding last July.
As multimodal content search grows in importance, tech giants such as Google and OpenAI have launched similar products. Competition now centers on the processing speed, accuracy, and security that enterprise applications require.