A research team has demonstrated a visual place recognition technique called Revisit Anything, which identifies the location depicted in an image given only the picture as input.

The method combines SAM (Segment Anything Model) with DINO (self-distillation with no labels) to improve retrieval over image segments, which yields more precise place re-identification.

The core of the method is retrieval at the level of image segments. The research team used several datasets, including Baidu, VPAir, Pitts, and 17places, to provide a broad basis for testing. For a quick start, the researchers recommend beginning with the smaller 17places dataset.

When preparing the datasets, users must make sure the folder names match those in the configuration files so that the data can be read correctly.
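A quick pre-flight check can catch a folder-name mismatch before any extraction is run. The sketch below is only an illustration: the `EXPECTED_FOLDERS` mapping and the `missing_dataset_folders` helper are hypothetical and are not taken from the project's configuration files, whose actual keys and folder names should be read from the repository.

```python
from pathlib import Path

# Hypothetical mapping: dataset key -> folder name expected on disk.
# These values are placeholders, not the project's real configuration.
EXPECTED_FOLDERS = {
    "17places": "17places",
    "baidu": "baidu",
}


def missing_dataset_folders(data_root, expected=EXPECTED_FOLDERS):
    """Return the dataset keys whose expected folder is absent under data_root."""
    root = Path(data_root)
    return sorted(
        key for key, folder in expected.items() if not (root / folder).is_dir()
    )
```

Calling `missing_dataset_folders("/path/to/datasets")` returns an empty list when every expected folder is present, and otherwise lists the dataset keys that still need to be renamed or downloaded.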

Next, users run feature extraction with the DINO and SAM models and generate VLAD cluster centers. Generating cluster centers is optional, since existing centers can be loaded directly from the cache. After feature extraction, users fit the PCA model and then run the main SegVLAD pipeline to obtain the final results. All results can be saved for later offline retrieval calculations.
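For readers unfamiliar with VLAD, the aggregation and retrieval steps mentioned above can be illustrated with a dependency-free sketch. This is not the project's code: the function names are made up for this example, plain Python lists stand in for the tensors a real pipeline would use, and the PCA step is omitted. It assumes the cluster centers have already been computed (or loaded from a cache), and that in a segment-based setup `vlad` would be called once per segment on the descriptors that fall inside it.

```python
import math


def nearest_center(desc, centers):
    """Index of the cluster center closest to desc (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(desc, centers[k])),
    )


def l2_normalize(vec):
    """Scale vec to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def vlad(descriptors, centers):
    """Aggregate local descriptors into a single VLAD vector.

    Each descriptor is assigned to its nearest center and the residuals
    (descriptor minus center) are summed per cluster. Each cluster's sum
    is L2-normalized (intra-normalization), the results are concatenated,
    and the final vector is L2-normalized again.
    """
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for desc in descriptors:
        k = nearest_center(desc, centers)
        for i in range(dim):
            residuals[k][i] += desc[i] - centers[k][i]
    flat = [v for r in residuals for v in l2_normalize(r)]
    return l2_normalize(flat)


def best_match(query_vec, database_vecs):
    """Index of the database vector with the highest dot product with the query."""
    return max(
        range(len(database_vecs)),
        key=lambda i: sum(q * d for q, d in zip(query_vec, database_vecs[i])),
    )
```

Because the VLAD vectors are unit length, the dot product in `best_match` is the cosine similarity. Saving the database vectors to disk once is what makes later offline retrieval inexpensive, since only the query has to be encoded at lookup time.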

The work offers a new approach to visual place recognition and shows how modern deep learning models can be combined for image analysis.

Project entry: https://github.com/AnyLoc/Revisit-Anything

Key Points:

🌟 The study combines SAM and DINO technologies to introduce a novel method for visual place recognition.

📊 Users can quickly get started and run experiments by preparing specific datasets and setting up configuration files.

🔍 The research provides detailed steps and scripts to help users achieve efficient results with SegVLAD.