SAM 2: Segment Anything in Images and Videos

SAM 2 is a foundation model for promptable visual segmentation in images and videos, developed by Meta’s FAIR team; its authors include Nikhila Ravi and Valentin Gabeur. The model is a simple transformer architecture with streaming memory, which allows real-time video processing. It was trained with a data engine that improves both the model and the data through user interaction, and that engine was used to collect the SA-V dataset, which the authors describe as the largest video segmentation dataset to date. The authors report strong performance across a wide range of tasks and visual domains. The released code provides image and video prediction APIs for promptable segmentation and for tracking multiple objects in videos.
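To make "promptable segmentation" concrete, here is a minimal sketch of a single-click image prompt. The helper `segment_with_point` is hypothetical and not part of the repository. It assumes a predictor object exposing `set_image` and `predict(point_coords=..., point_labels=..., multimask_output=...)`, which is how the image predictor is used in the repository's README as far as we can tell. Check the linked repository for the exact signatures in your release.

```python
import numpy as np


def segment_with_point(predictor, image, x, y):
    """Return the highest-scoring mask for one foreground click at (x, y).

    `predictor` is assumed to follow the SAM 2 image predictor interface:
    set_image(image), then predict(...) returning (masks, scores, low_res).
    """
    # Compute the image embedding once; later prompts reuse it.
    predictor.set_image(image)

    # One point prompt in pixel coordinates; label 1 marks foreground
    # (label 0 would mark a background point).
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]], dtype=np.float32),
        point_labels=np.array([1], dtype=np.int32),
        multimask_output=True,  # ask for several candidate masks
    )

    # Keep the candidate the model scores highest.
    return masks[int(np.argmax(scores))]


# Usage with the real model (assumed names; paths and config files
# depend on the checkpoint you downloaded):
#
#   from sam2.build_sam import build_sam2
#   from sam2.sam2_image_predictor import SAM2ImagePredictor
#   predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))
#   mask = segment_with_point(predictor, image, x=500, y=375)
```

The helper takes the predictor as an argument rather than constructing it, so the prompt logic can be exercised without loading a checkpoint. Video tracking follows the same idea: prompts are added on one frame and then propagated through the remaining frames.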

https://github.com/facebookresearch/segment-anything-2