3 Sources
[1]
The Star Trek Holodeck just got closer - Apple's new AI tool generates 3D scenes from your photos in under a second for VR memories
Remember when Apple introduced spatial Lock Screen photos in iOS 26? This feature added a stereoscopic effect to flat images on your Lock Screen, and it's neat, if a little gimmicky. Now, though, Apple has revealed a new trick that takes the effect to an entirely new level - and it could be a brilliant addition to your Apple device if it gets a wider roll-out.

That new tool is called SHARP, and it's just been unveiled in a research paper published by Apple. Titled "Sharp Monocular View Synthesis in Less Than a Second," the paper outlines a new tool that can turn 2D images into 3D spatial scenes in under one second. SHARP uses a neural network - artificial intelligence (AI), in other words - to quickly generate a 3D map of your image. That's the part that is performed in less than a second. Once that's complete, the image can be rendered in real time.

Apple says it trained the model on around eight million synthetic images created in-house and 2.65 million licensed photographs, with the result that SHARP could learn to discern depth and scale and apply that knowledge to input images. It does this while maintaining consistency in aspects like scale and distance, meaning you shouldn't see the sort of stretching and warping that can occur in a 2D-to-3D conversion. That's key to maintaining immersion and producing a 3D image that users actually want to keep.

Right now, SHARP is more a proof of concept than a prime-time feature, and there's no indication of when - or if - it will come to Apple's devices. While it's available to download on GitHub, it's not yet baked into the likes of iOS 26 or macOS Tahoe. That said, it seems like a natural evolution of the spatial photos feature that Apple has already released. If the Photos app on your iPhone lets you explore your images in this way, it could be an attractive selling point for a lot of people. Add it to the Vision Pro headset and it'll be even more immersive.
All that said, SHARP does have some drawbacks. For one thing, it's focused on rendering nearby scenes, meaning you can't stray too far from the original viewpoint before fidelity starts to suffer. But as a starting point, it's certainly promising, especially when you consider how quickly it can operate. Don't be surprised to see it roll out in Apple's operating systems at some point in the future.
[2]
You Can Now Try Apple's New AI Model That Creates A 3D Scene From A Single Image In Under A Second
Apple is continuing to expand its AI capabilities at a fairly rapid pace. As a case in point, consider the Cupertino giant's latest AI model, which is able to create an entire 3D scene from a single 2D image - and in under a second, at that.

Apple has now published a study titled "Sharp Monocular View Synthesis in Less Than a Second." The study details how Apple's engineers trained an AI model, called SHARP, to generate a "photorealistic" 3D view from a single 2D image. Critically, Apple claims that the view generation takes "less than a second on a standard GPU via a single feedforward pass through a neural network."

Essentially, SHARP predicts what a 3D scene, distilled from a 2D image, would look like by taking into account the image's "nearby viewpoints." The study notes: "The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements."

For the benefit of those who might not be aware, 3D Gaussian Splatting is a technique used to create photorealistic 3D scenes by representing them as millions of "splats," which are basically tiny colored blobs. Creating a full scene this way, however, often requires numerous 2D images captured from various angles. Apple's SHARP differs in that it is able to recreate an entire photorealistic scene from a single 2D image by predicting depth and colors - again, in under a second.

What's more, you can now try Apple's SHARP AI model for free by heading over to the dedicated GitHub page.
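To make the "tiny colored blobs" idea concrete, here is a heavily simplified sketch of how a splat-based renderer turns a handful of semi-transparent blobs into one pixel's color. This is not Apple's implementation - real Gaussian splatting projects anisotropic 3D Gaussians to screen space and evaluates a per-pixel Gaussian falloff - but the front-to-back alpha compositing shown here is the core idea, with each splat reduced to a flat opacity for readability:

```python
from dataclasses import dataclass

@dataclass
class Splat:
    """One Gaussian 'blob', simplified: real splats also carry
    3D position, scale, and orientation."""
    depth: float   # distance from the camera along this pixel's ray
    color: tuple   # RGB in [0, 1]
    opacity: float # alpha contribution in [0, 1]

def composite(splats):
    """Front-to-back alpha compositing of the splats hit by one pixel's ray."""
    splats = sorted(splats, key=lambda s: s.depth)  # nearest first
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for s in splats:
        w = transmittance * s.opacity
        for c in range(3):
            rgb[c] += w * s.color[c]
        transmittance *= (1.0 - s.opacity)
    return rgb

pixel = composite([
    Splat(depth=2.0, color=(0.0, 0.0, 1.0), opacity=0.5),  # blue, behind
    Splat(depth=1.0, color=(1.0, 0.0, 0.0), opacity=0.5),  # red, in front
])
print(pixel)  # red dominates: [0.5, 0.0, 0.25]
```

Because compositing a ray is just this cheap loop, scenes built from millions of such blobs can be rendered in real time once the splats exist - the expensive part SHARP accelerates is producing the splats in the first place.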
[3]
Apple's SHARP can turn a photo into a 3D scene in under a second
It seems all the biggest tech companies are working on AI-driven 3D tools at the moment, from Google and Meta to a host of smaller developers. But with the benefit of its own hardware in the form of the iPhone and Vision Pro, could Apple have the edge when it comes to developing an entire workflow? The Cupertino tech giant has published the code for SHARP, an experimental AI model that can quickly turn 2D images into 3D Gaussian splats that can then be viewed on Vision Pro. Some now think Apple's AI work may have been underestimated.

Instead of traditional polygons, Gaussian splatting uses millions of fuzzy 3D ellipsoids with defined position, size, orientation, colour and transparency to represent and render intricate 3D scenes in real time so that they look highly accurate from a particular viewpoint. Most techniques require lots - sometimes hundreds - of images of a scene from different angles. But Apple's SHARP uses AI to predict the scene from just one photo in under a second on a standard GPU.

Apple trained SHARP on swathes of synthetic and real-world data to teach it to identify frequent depth and geometry patterns, so it can predict the position and appearance of 3D Gaussians via a single forward pass through a neural network. According to the research paper, distances and scale remain consistent in real-world terms: the representation is metric, with absolute scale, supporting metric camera movements. The compromise is that SHARP only accurately renders nearby viewpoints, not unseen parts of the scene, which means users can't venture far from the original viewpoint.

The code has been published on GitHub, and people have been testing out the tool. This week also saw the launch of SpAItial AI's Echo, which can turn 2D images into editable 3D worlds on which users can apply different styles. The company hopes to add full prompt-based scene manipulation, allowing users to add, remove, rearrange, or restyle objects.
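The "metric, with absolute scale" claim has a concrete geometric meaning: if the network's predicted depth is in real-world metres, each pixel can be lifted to a 3D point whose coordinates are also in metres, so a request like "move the camera 10 cm to the left" is well defined. The sketch below shows only that standard pinhole back-projection step, not SHARP's network itself (the paper's architecture details are not covered in these articles); the camera intrinsics used are made-up example values:

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with metric depth -> 3D point.

    fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    Because depth_m is in metres, the returned (x, y, z) is in metres too,
    which is what makes metric camera movements meaningful.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# A pixel 100 px right of the principal point, 2 m away, with fx = 500 px:
pt = backproject(u=600, v=400, depth_m=2.0, fx=500, fy=500, cx=500, cy=400)
print(pt)  # (0.4, 0.0, 2.0): 0.4 m to the right, 2 m in front of the camera
```

In a splat pipeline, points lifted this way would seed the positions of the 3D Gaussians, with appearance attributes predicted alongside them.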
Apple unveiled SHARP, an AI model that converts 2D photographs into 3D spatial scenes in under a second. The tool uses a neural network trained on roughly 8 million synthetic images and 2.65 million licensed photographs to predict depth and geometry, creating photorealistic 3D scenes that can be rendered in real time on Vision Pro. The code is now available on GitHub for developers to test.
Apple has published a research paper introducing SHARP, an AI model that creates a 3D scene from a single image in under a second. Titled "Sharp Monocular View Synthesis in Less Than a Second," the paper details how Apple's new AI tool generates 3D scenes from photos using advanced neural networks [1]. The Cupertino tech giant trained the model on approximately 8 million synthetic images created in-house and 2.65 million licensed photographs, enabling SHARP to discern depth and scale with striking accuracy [1].

Apple's new AI model operates by predicting what a photorealistic 3D scene would look like by analyzing the image's nearby viewpoints [2]. The system uses 3D Gaussian Splatting, a technique that represents scenes as millions of tiny colored blobs called "splats" with defined position, size, orientation, color and transparency [2]. Unlike traditional methods requiring hundreds of images from various angles, SHARP can turn a photo into a 3D scene using just one image. The 3D Gaussian representation produced by SHARP maintains consistency in scale and distance, avoiding the stretching and warping common in 2D-to-3D conversions [1].

The Sharp Monocular View Synthesis process completes the 3D mapping in less than a second on a standard GPU via a single feedforward pass through a neural network [2]. Once the initial mapping is complete, the image can be rendered in real time, producing high-resolution photorealistic 3D scene outputs [1]. According to the research paper, the representation is metric with absolute scale, supporting metric camera movements while maintaining real-world consistency in distances and scale [3]. This speed and efficiency distinguish SHARP from competing 3D generation tools currently being developed by Google, Meta, and smaller developers [3].

While SHARP remains a proof of concept, the tool appears to be a natural evolution of Apple's existing spatial photo features introduced in iOS 26 [1]. The model can quickly turn 2D images into 3D Gaussian splats that can be viewed on Vision Pro, Apple's mixed reality headset [3]. If integrated into the Photos app on iPhone, users could explore their images in immersive 3D environments. The code has been published on GitHub, allowing developers and researchers to test the tool and experiment with the technology [2]. However, there's no indication yet of when or if SHARP will be integrated into iOS 26 or macOS Tahoe [1].

SHARP does have constraints that users should consider. The model focuses on rendering nearby scenes, meaning fidelity degrades when users stray too far from the original viewpoint [1]. The AI model only accurately renders nearby viewpoints rather than unseen parts of the scene, limiting exploration beyond the initial perspective [3]. Despite these limitations, the speed at which SHARP operates makes it a promising starting point for consumer applications. With Apple's control over both hardware like iPhone and Vision Pro, plus software ecosystems, some observers believe Apple's AI work may have been underestimated in the competitive landscape of 3D generation tools [3]. As AI-driven 3D tools continue evolving across the industry, watch for potential announcements about SHARP integration into Apple's operating systems and devices in future updates.

Summarized by Navi