Meta Unveils SAM 3 and SAM 3D: Advanced AI Models for Visual Intelligence and 3D Reconstruction

Reviewed by Nidhi Govil

5 Sources

Meta releases its third-generation Segment Anything Models (SAM 3 and SAM 3D) featuring enhanced object detection, text-based prompting, and 3D reconstruction capabilities. The open-source models advance visual intelligence for content editing, wildlife conservation, and AR applications.

Revolutionary Visual Intelligence Technology

Meta has unveiled its third-generation Segment Anything Models (SAM 3 and SAM 3D), marking a significant advancement in computer vision and visual intelligence technology. [1] These models represent a major leap forward from traditional object detection systems, introducing "promptable concept segmentation" capabilities that allow users to identify and segment visual elements using natural language descriptions.

Source: SiliconANGLE

Enhanced Object Detection and Segmentation

The SAM 3 model addresses a fundamental challenge in computer vision: linking natural language to specific visual elements in images and videos. [2] Unlike previous models that were limited to fixed label sets and could only segment common concepts like "person" or "car," SAM 3 can handle nuanced descriptions such as "the striped red umbrella" or "yellow school bus." [3]

The model accepts various input types including text prompts, image exemplars, masks, bounding boxes, and points, providing users with multiple ways to specify what they want to detect or track. [4] This flexibility enables applications ranging from simple photo editing to complex video analysis tasks.
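As a rough illustration of what "multiple ways to specify" a target might look like in practice, the following Python sketch bundles the prompt types listed above into one request object. It is not the SAM 3 API: the class name `SegmentationPrompt`, its fields, and its validation rules are all hypothetical, chosen only to show how text, points, boxes, masks, and exemplars could coexist in a single prompt.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]            # (x, y) pixel coordinate
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max)


@dataclass
class SegmentationPrompt:
    """Hypothetical container for the prompt types the article lists.

    This is NOT the SAM 3 API; every field name here is an assumption.
    """
    text: Optional[str] = None                               # e.g. "yellow school bus"
    points: List[Point] = field(default_factory=list)        # clicks on the target
    boxes: List[Box] = field(default_factory=list)           # rough bounding boxes
    mask: Optional[List[List[int]]] = None                   # binary mask, rows of 0/1
    exemplar_boxes: List[Box] = field(default_factory=list)  # regions marking image exemplars

    def kinds(self) -> List[str]:
        """Return the names of the prompt types that were actually supplied."""
        supplied = []
        if self.text:
            supplied.append("text")
        if self.points:
            supplied.append("points")
        if self.boxes:
            supplied.append("boxes")
        if self.mask is not None:
            supplied.append("mask")
        if self.exemplar_boxes:
            supplied.append("exemplars")
        return supplied

    def validate(self) -> None:
        """Reject empty prompts and malformed boxes."""
        if not self.kinds():
            raise ValueError("at least one prompt type is required")
        for x0, y0, x1, y1 in self.boxes + self.exemplar_boxes:
            if x1 <= x0 or y1 <= y0:
                raise ValueError("box must satisfy x_min < x_max and y_min < y_max")


# Combine a text description with a single click on the object.
prompt = SegmentationPrompt(text="the striped red umbrella", points=[(120, 80)])
prompt.validate()
print(prompt.kinds())
```

Keeping every prompt type optional, and requiring only that at least one is present, mirrors the flexibility the article describes; a real client would follow whatever schema the released code defines.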

Source: CNET

3D Reconstruction Capabilities

SAM 3D introduces groundbreaking three-dimensional reconstruction capabilities, consisting of two specialized models: SAM 3D Objects for object and scene reconstruction, and SAM 3D Body for human pose and shape estimation. [3] These models can generate detailed 3D meshes and textured geometry from single 2D images, handling real-world complexity including occlusion and clutter.

The 3D reconstruction technology has immediate applications in augmented reality, robotics, and spatial computing. Users can potentially reconstruct 3D likenesses of people from photographs and import them into videos or virtual reality environments, opening new possibilities for creative content and memorial applications.

Training and Technical Architecture

Meta developed SAM 3 using an innovative data engine that combines human annotators with AI systems, including Llama-based captioners. [5] The model architecture builds upon Meta's Perception Encoder and incorporates components from the DETR model, which pioneered the use of transformers for object detection.

The data engine was used to annotate over 4 million visual concepts, and the approach significantly reduced annotation time compared to traditional methods. [5] Meta claims SAM 3 doubles cgF1 scores (a measure of concept recognition and localization) compared to existing models and consistently outperforms foundation models like Gemini 2.5 Pro.
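The article does not define cgF1, so the sketch below does not attempt to reproduce it. It shows only the ordinary F1 score over matched segmentation masks, which gives a feel for how a single number can reward both finding the right objects and localizing them well. The matching rule (greedy, IoU threshold of 0.5) and all function names are assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

Mask = FrozenSet[Tuple[int, int]]  # set of (x, y) pixels belonging to one object


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mask_f1(predicted: List[Mask], truth: List[Mask], threshold: float = 0.5) -> float:
    """Plain F1 over greedily matched masks (illustrative, not Meta's cgF1)."""
    unmatched = list(truth)
    true_pos = 0
    for pred in predicted:
        # Pick the still-unmatched ground-truth mask that overlaps this prediction most.
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= threshold:
            true_pos += 1
            unmatched.remove(best)
    false_pos = len(predicted) - true_pos
    false_neg = len(unmatched)
    denom = 2 * true_pos + false_pos + false_neg
    # Convention chosen here: nothing to find and nothing predicted counts as perfect.
    return 2 * true_pos / denom if denom else 1.0


def rect(x0: int, y0: int, x1: int, y1: int) -> Mask:
    """Helper that builds a rectangular mask covering [x0, x1) by [y0, y1)."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))


truth = [rect(0, 0, 4, 4), rect(10, 10, 14, 14)]
predicted = [rect(0, 0, 4, 3), rect(20, 20, 22, 22)]
print(mask_f1(predicted, truth))  # one hit, one false alarm, one miss -> 0.5
```

In this toy case the first prediction overlaps its target with IoU 0.75 and counts as a hit, the second matches nothing, and one ground-truth object is missed, giving an F1 of 0.5.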

Real-World Applications and Partnerships

The models are already being deployed in scientific research and conservation efforts. Meta partnered with ConservationX Labs and Osa Conservation to create the SA-FARI dataset, containing over 10,000 camera trap videos of more than 100 species with detailed annotations. [2] This collaboration demonstrates the technology's potential for wildlife monitoring and conservation research.

Source: Digit

Additionally, Meta is working with FathomNet and MBARI to advance AI tools for ocean exploration, providing segmentation masks and benchmarks tailored for underwater imagery. [2] These partnerships highlight the models' versatility across different scientific domains.

Platform Integration and Accessibility

Meta is integrating these technologies across its platforms, with SAM 3 powering enhanced editing capabilities in Instagram's Edits app and the Vibes platform. [4] The Facebook Marketplace "View in Room" feature utilizes SAM 3D to help users visualize furniture and home decor items in their own spaces before purchasing.

To democratize access to these advanced capabilities, Meta launched the Segment Anything Playground, an online platform where users can experiment with the models without technical expertise. [5] The company is releasing model weights, code, and evaluation benchmarks under the SAM License, supporting both research and commercial applications.

TheOutpost.ai

© 2025 Triveous Technologies Private Limited