[HN Gopher] The iPhone 15 Pro's Depth Maps
       ___________________________________________________________________
        
       The iPhone 15 Pro's Depth Maps
        
       Author : marklit
       Score  : 176 points
       Date   : 2025-06-04 17:57 UTC (5 hours ago)
        
 (HTM) web link (tech.marksblogg.com)
 (TXT) w3m dump (tech.marksblogg.com)
        
       | caseyohara wrote:
       | Cool article. I assume these depth maps are used for the depth of
       | field background blurring / faux bokeh in "Portrait" mode photos.
        | I always thought it was interesting that you can change the focal
        | point and control the depth of field via the "aperture" _after_ a
        | photo is taken, though I really don't like the look of the fake
        | bokeh. It always looks like a bad Photoshop.
       | 
        | I think there might be a few typos in the file format's name?
       | 
       | - 14 instances of "HEIC"
       | 
       | - 3 instances of "HIEC"
        
         | marklit wrote:
         | Fixed those. Cheers for pointing them out.
        
         | dheera wrote:
          | I think the reason it looks fake is that they actually have the
          | math wrong about how optics and apertures work; they make some
          | (really bad) approximations that, from a product standpoint,
          | are enough to please 80% of people.
          | 
          | I could probably make a better camera app with the correct
          | aperture math. I wonder if people would pay for it, or if mobile
          | phone users just wouldn't be able to tell the difference and
          | don't care.
        
           | dylan604 wrote:
            | Most people just want to see blurry shit in the background
            | and think it makes the photo look professional. If you really
            | want to see it fall down, put things in the foreground and
            | set the focal point somewhere in the middle. It'll still get
            | the background blurry, but it gets the foreground all wrong.
            | I'm guessing the market willing to pay for "better" faked
            | shallow depth of field would be pretty small.
        
             | dheera wrote:
              | Yeah, that's why I haven't written the app already. I feel
              | like the people who want "better faked depth" usually just
              | end up buying a real camera.
        
               | tene80i wrote:
               | Sample of one, but I'm interested. I used to use a real
               | camera and now very rarely do. But I also often find the
               | iPhone blurring very fake and I've never understood why.
               | I assumed it was just impossible to do any better, given
               | the resources they throw at the problem. If you could
               | demonstrate the difference, maybe there would be a
               | market, even if just for specific use cases like
               | headshots or something.
        
               | dylan604 wrote:
                | Lytro had dedicated cameras and inferior resolution, so
                | they failed to gain enough traction to stay viable. You
                | might have a better chance given that it's still on the
                | same device, but a paid-for app would be a tough sell.
               | 
               | However, you could just make the app connect to localhost
               | and hoover up the user's data to monetize and then offer
               | the app for free. That would be much less annoying than
               | showing an ad at launch or after every 5 images taken. Or
               | some other scammy app dev method of making freemium apps
               | successful. Ooh, offer loot boxes!!!
        
           | semidror wrote:
           | Would it be possible to point out more details about where
           | Apple got the math wrong and which inaccurate approximations
           | they use? I'm genuinely curious and want to learn more about
           | it.
        
             | dheera wrote:
              | It's not that they deliberately made a math error; it's
              | that it's a _very_ crude algorithm that basically just
              | blurs everything that's not within what's deemed to be the
              | subject with some triangular, Gaussian, or other
              | computationally simple kernel.
             | 
             | What real optics does:
             | 
              | - The blur kernel is a function of the shape of the
              | aperture, which is typically circular at wide apertures and
              | hexagonal at smaller apertures. Not Gaussian, not
              | triangular. And because the kernel is a function of the
              | depth map itself, it does not parallelize efficiently
             | 
              | - The amount of blur is a function of the distance to the
              | focal plane, and is typically closer to a hyperbola; most
              | phone camera apps just use a constant blur and don't even
              | account for this
             | 
             | - Lens aberrations, which are often thought of as defects,
             | but if you generate something _too_ perfect it looks fake
             | 
              | - Diffraction effects happen at the sharp corners of the
              | mechanical aperture, which create starbursts around
              | highlights
             | 
              | - When out-of-focus highlights get blown out, they blow out
              | more than just the center area; they also blow out some of
              | the blurred area. If you clip and _then_ blur, your blurred
              | areas will be less than blown out, which also looks fake
             | 
             | Probably a bunch more things I'm not thinking of but you
             | get the idea
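              | 
              | To make that concrete, here's a minimal sketch of the kind
              | of depth-weighted disc blur described above. The file
              | names, the constants, and the crude "blur a few depth
              | layers and composite them" scheme are all hypothetical; it
              | ignores occlusion and is nowhere near a physically correct
              | simulation, but it shows the disc kernel, the blur radius
              | growing with distance from the focal plane, and clipping
              | only _after_ blurring:
              | 
              |   import numpy as np
              |   from PIL import Image
              |   from scipy.signal import fftconvolve
              | 
              |   def disc_kernel(radius):
              |       # Circular aperture, unlike the Gaussian most apps use.
              |       if radius < 1:
              |           return np.ones((1, 1))
              |       r = int(round(radius))
              |       y, x = np.mgrid[-r:r + 1, -r:r + 1]
              |       k = (x * x + y * y <= r * r).astype(np.float64)
              |       return k / k.sum()
              | 
              |   img = np.asarray(Image.open("image.png").convert("RGB"),
              |                    dtype=np.float64) / 255.0
              |   depth = np.asarray(Image.open("depth.png").convert("L"),
              |                      dtype=np.float64) / 255.0
              | 
              |   img_lin = img ** 2.2  # work in roughly linear light
              |   focus = 0.7           # depth value treated as in focus
              |   max_radius = 12       # largest blur circle, in pixels
              |   layers = 8
              | 
              |   layer_idx = np.minimum((depth * layers).astype(int),
              |                          layers - 1)
              |   out = np.zeros_like(img_lin)
              |   weight = np.zeros(depth.shape)
              |   for i in range(layers):
              |       mask = (layer_idx == i).astype(np.float64)
              |       if mask.sum() == 0:
              |           continue
              |       # Blur radius grows with distance from the focal plane.
              |       centre = (i + 0.5) / layers
              |       k = disc_kernel(max_radius * abs(centre - focus))
              |       for c in range(3):
              |           out[..., c] += fftconvolve(img_lin[..., c] * mask,
              |                                      k, mode="same")
              |       weight += fftconvolve(mask, k, mode="same")
              | 
              |   out /= np.clip(weight[..., None], 1e-6, None)
              |   out = np.clip(out, 0, 1) ** (1 / 2.2)  # clip only at the end
              |   Image.fromarray((out * 255).astype(np.uint8)).save("bokeh.png")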
        
           | willseth wrote:
           | If it's all done in post anyway, then it might be a lot
           | simpler to skip building a whole camera app and just give
           | people a way to apply more accurate bokeh to existing photos.
           | I would pay for that.
        
           | lcrs wrote:
           | There are a few projects now that simulate defocus properly
           | to match what bigger (non-phone camera) lenses do - I hope to
           | get back to working on it this summer but you can see some
           | examples here: https://x.com/dearlensform
           | 
           | Those methods come from the world of non-realtime CG
           | rendering though - running truly accurate simulations with
           | the aberrations changing across the field on phone hardware
           | at any decent speed is pretty challenging...
        
       | andrewmcwatters wrote:
        | There's Reality Composer for iOS, which has a LiDAR-specific
        | feature allowing you to capture objects. I was bummed to find out
        | that on non-LiDAR-equipped Apple devices it does not in fact fall
        | back to photogrammetry.
       | 
       | Just in case you were doing 3d modeling work or photogrammetry
       | and wanted to know, like I was.
        
         | WalterGR wrote:
         | Polycam does fall back.
         | 
         | I've also heard good things about Canvas (requires LiDAR) and
         | Scaniverse (LiDAR optional.)
        
           | zevon wrote:
           | I've had pretty good success with https://3dscannerapp.com -
           | it's mostly intended for people with access to iDevices with
            | LiDAR and an Apple Silicon Mac, and in that combination it
            | can work completely offline by capturing via the iDevice and
            | doing the processing on the Mac (using the system API for
           | photogrammetry). AFAIK, there are also options for using just
           | photos without LiDAR data and for cloud processing but I've
           | never tried those.
        
           | andrewmcwatters wrote:
           | I'd really like to use Polycam, but it's unclear what
           | features are free and what's paid.
           | 
           | I'd be fine with paying for it, but it's clear that they want
           | to employ basic dark patterns and false advertising.
        
         | H3X_K1TT3N wrote:
         | I've had the most success doing 3d scanning with Heges. The
         | LiDAR works pretty well for large objects (like cars), but you
         | can also use the Face ID depth camera to capture smaller
         | objects.
         | 
         | I did end up getting the Creality Ferret SE (via TikTok for
         | like $100) for scanning small objects, and it's amazing.
        
           | tecleandor wrote:
           | Oh! $100 is a great price. I always see it at around $300-350
           | and I haven't bought it...
        
             | H3X_K1TT3N wrote:
             | I take it back; I double checked and it was more like $180.
             | Still worth it IMO.
        
           | klaussilveira wrote:
           | Does it scan hard surfaces pretty well, or does it mangle the
           | shapes? Think car parts.
        
       | itsgrimetime wrote:
        | The site does something really strange in Chrome on iOS - when I
        | scroll down the page, the font size swaps larger; when I scroll
        | up, it swaps back smaller. Really disorienting.
        | 
        | Anyway, never heard of oiiotool before! Super cool
        
       | layer8 wrote:
       | You can make autostereograms from those.
        
       | onlygoose wrote:
        | LiDAR itself has much, much lower resolution than the depth maps
        | shown. The depth map has to be synthesized from combined LiDAR
        | and regular camera data.
        
         | mackman wrote:
          | Yeah, I thought the LiDAR was used for the actual focus and the
          | depth map was then computed from the multi-camera parallax.
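          | 
          | For what it's worth, the parallax part reduces to the standard
          | stereo relation depth = focal length x baseline / disparity.
          | A minimal sketch, with made-up constants (not the iPhone's
          | actual values) and assuming you already have a per-pixel
          | disparity map:
          | 
          |   import numpy as np
          | 
          |   focal_length_px = 2800.0  # focal length in pixels (made up)
          |   baseline_m = 0.012        # camera spacing in metres (made up)
          | 
          |   disparity_px = np.load("disparity.npy")  # per-pixel disparity
          |   depth_m = (focal_length_px * baseline_m
          |              / np.clip(disparity_px, 1e-3, None))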
        
       | wahnfrieden wrote:
        | Anyone combining these with photos for feeding to GPT-4o to get
        | more accurate outputs (like for calorie counting, as a typical
        | example)?
        
         | cenamus wrote:
          | Calorie counting is never gonna be accurate. Just how would you
          | know what's hiding inside a stew or curry? How much oil or
          | meat? How much dressing is on the salad? There's a reason
          | people do calorie counting with raw ingredients (or pre-bought
          | stuff) and not by weighing and measuring plates of food
        
           | wahnfrieden wrote:
           | I know that (and I'm not building a calorie counter). The
           | question is about whether 4o can read photos better with
           | depth maps or derived measurements provided alongside the
            | original image. The example was chosen because it's
            | inaccurate but could perhaps be improved with depth map data
            | (even if not to the point of "accurate").
        
             | duskwuff wrote:
             | The answer to that question is "probably not".
             | 
             | First: the image recognition model is unlikely to have seen
             | very many depth maps. Seeing one alongside a photo probably
             | won't help it recognize the image any better.
             | 
             | Second: even if the model knew what to do with a depth map,
             | there's no reason to suspect that it'd help in this
              | application. The lack of accuracy in an image-to-calorie-
              | count app doesn't come from problems which a depth map can
              | answer, like "is this plate sitting on the table or raised
              | above it"; it comes from problems which can't be answered
              | visually, like "is this a glass of whole milk or non-fat"
              | or "are these vegetables glossy because they're damp or
              | because they're covered in butter".
        
           | criddell wrote:
           | It's probably accurate enough for most people most of the
           | time.
           | 
            | The labels on your food are only accurate to +/-20%. If you
            | are analyzing all of your meals via camera, it's probably not
            | too far off that over a week.
        
       | 1oooqooq wrote:
        | > *describes a top-of-the-line system*
        | 
        | > I'm running Ubuntu 24 LTS via Microsoft's Ubuntu for Windows on
        | > Windows 11 Pro
        | 
        | This is like hearing about someone buying yet another automatic
        | supercar.
        
         | mmmlinux wrote:
            | Yeah, that section seemed like some weird brag.
        
           | washadjeffmad wrote:
           | It's common practice in the sciences to include details about
           | any equipment used in your lab notes.
        
             | BobbyTables2 wrote:
             | True, but a physicist wouldn't normally document the shoes
             | they were wearing.
             | 
              | The details he documented don't seem relevant to what he
              | did, and nothing he performed would seem to stress even a
              | low-end, ancient system.
             | 
             | Definitely felt like a brag to me.
             | 
              | The only thing that makes me think otherwise is that he
              | also documented the line counts of the scripts. That seems
              | more like a bizarre obsession with minutiae... (it would
              | have been more meaningful to document the git commit/branch
              | for a GitHub project instead of the line count!!)
        
           | throitallaway wrote:
           | He's running an Nvidia 1080 GPU, a non-XD AMD processor, and
           | Windows 11. None of that is a brag.
        
             | BobbyTables2 wrote:
             | My system with a non-accelerated iGPU and 16GB DDR4 RAM
             | would differ...
        
       | just-working wrote:
       | Cool article. I read the title as 'Death Maps' at first though.
        
         | heraldgeezer wrote:
          | Me too! I wanted a world map of where iPhone 15 users died :(
        
           | kridsdale3 wrote:
            | That could be approximated pretty well just by combining
            | income data and age data.
        
             | bigyabai wrote:
             | Or by drawing a red circle around the United States
             | labelled "~95%"
        
       | Uncorrelated wrote:
       | Other commenters here are correct that the LIDAR is too low-
       | resolution to be used as the primary source for the depth maps.
        | In fact, iPhones use four-ish methods that I know of to capture
       | depth data, depending on the model and camera used. Traditionally
       | these depth maps were only captured for Portrait photos, but
       | apparently recent iPhones capture them for standard photos as
       | well.
       | 
       | 1. The original method uses two cameras on the back, taking a
       | picture from both simultaneously and using parallax to construct
       | a depth map, similar to human vision. This was introduced on the
       | iPhone 7 Plus, the first iPhone with two rear cameras (a 1x main
       | camera and 2x telephoto camera.) Since the depth map depends on
       | comparing the two images, it will naturally be limited to the
       | field of view of the narrower lens.
       | 
       | 2. A second method was later used on iPhone XR, which has only a
       | single rear camera, using focus pixels on the sensor to roughly
       | gauge depth. The raw result is low-res and imprecise, so it's
       | refined using machine learning. See:
       | https://www.lux.camera/iphone-xr-a-deep-dive-into-depth/
       | 
       | 3. An extension of this method was used on an iPhone SE that
       | didn't even have focus pixels, producing depth maps purely based
       | on machine learning. As you would expect, such depth maps have
       | the least correlation to reality, and the system could be fooled
       | by taking a picture of a picture. See:
       | https://www.lux.camera/iphone-se-the-one-eyed-king/
       | 
       | 4. The fourth method is used for selfies on iPhones with FaceID;
       | it uses the TrueDepth camera's 3D scanning to produce a depth
       | map. You can see this with the selfie in the article; it has a
        | noticeably fuzzier, lower-res look.
       | 
       | You can also see some other auxiliary images in the article,
       | which use white to indicate the human subject, glasses, hair, and
       | skin. Apple calls these portrait effects mattes and they are
       | produced using machine learning.
       | 
       | I made an app that used the depth maps and portrait effects
       | mattes from Portraits for some creative filters. It was pretty
       | fun, but it's no longer available. There are a lot of novel
       | artistic possibilities for depth maps.
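        | 
        | As a taste of what those mattes make easy, here's a minimal
        | sketch of one such filter: keeping the subject in colour while
        | turning the rest of the frame black and white. The file names
        | are hypothetical, and it assumes the matte has already been
        | exported as a grayscale image (white = subject):
        | 
        |   from PIL import Image, ImageOps
        | 
        |   photo = Image.open("photo.jpg").convert("RGB")
        |   matte = (Image.open("person_matte.png")
        |            .convert("L").resize(photo.size))
        | 
        |   # Use the matte as a mask: subject pixels come from the colour
        |   # photo, everything else from a desaturated copy.
        |   grey = ImageOps.grayscale(photo).convert("RGB")
        |   Image.composite(photo, grey, matte).save("colour_pop.jpg")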
        
         | snowdrop wrote:
          | For method 3, that article is 5 years old; for Apple's more
          | recent monocular depth estimation work, see:
          | https://github.com/apple/ml-depth-pro?tab=readme-ov-file
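          | 
          | If I remember its README correctly, running it on a single
          | photo is roughly this (treat the exact function names as
          | approximate and check the repo for the current API):
          | 
          |   import depth_pro
          | 
          |   # Load the model and its preprocessing transform.
          |   model, transform = depth_pro.create_model_and_transforms()
          |   model.eval()
          | 
          |   # Load an image; f_px is the focal length in pixels, if known.
          |   image, _, f_px = depth_pro.load_rgb("photo.jpg")
          |   image = transform(image)
          | 
          |   # Inference returns metric depth in metres.
          |   prediction = model.infer(image, f_px=f_px)
          |   depth = prediction["depth"]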
        
         | heliographe wrote:
         | > but apparently recent iPhones capture them for standard
         | photos as well.
         | 
         | Yes, they will capture them from the main photo mode if there's
         | a subject (human or pet) in the scene.
         | 
         | > I made an app that used the depth maps and portrait effects
         | mattes from Portraits for some creative filters. It was pretty
         | fun, but it's no longer available
         | 
         | What was your app called? Is there any video of it available
         | anywhere? Would be curious to see it!
         | 
         | I also made a little tool, Matte Viewer, as part of my photo
         | tool series - but it's just for viewing/exporting them, no
         | effects bundled:
         | 
         | https://apps.apple.com/us/app/matte-viewer/id6476831058
        
       | arialdomartini wrote:
        | Just wondering if depth maps can be used to generate stereograms
        | or SIRDS. I remember playing with stereogram generation starting
        | from very similar grey-scale images.
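        | 
        | (If anyone wants to try: the naive textbook version is roughly
        | this, assuming the depth map has been exported as a grayscale PNG
        | with white meaning near - invert it if the relief comes out
        | inside-out. The constants are arbitrary.)
        | 
        |   import numpy as np
        |   from PIL import Image
        | 
        |   depth = np.asarray(Image.open("depth.png").convert("L"),
        |                      dtype=np.float64) / 255.0
        |   h, w = depth.shape
        | 
        |   max_sep = 90   # pattern separation at the far plane (pixels)
        |   relief = 30    # how much nearer points shrink the separation
        | 
        |   rng = np.random.default_rng(0)
        |   out = rng.integers(0, 2, size=(h, w), dtype=np.uint8) * 255
        | 
        |   # Each pixel copies the pixel one "separation" to its left;
        |   # nearer points use a smaller separation, which the eyes decode
        |   # as depth when viewing wall-eyed.
        |   for y in range(h):
        |       for x in range(w):
        |           sep = int(max_sep - relief * depth[y, x])
        |           if x - sep >= 0:
        |               out[y, x] = out[y, x - sep]
        | 
        |   Image.fromarray(out).save("sirds.png")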
        
         | kridsdale3 wrote:
          | They can. The UI to do this is apparently only included in the
          | visionOS version of the Photos app. But you can convert any
         | photo in your album to "Spatial Format" as long as it has a
         | Depth Map, or is high enough resolution for the ML
         | approximation to be good enough.
         | 
         | It also reads EXIF to "scale" the image's physical dimensions
         | to match the field of view of the original capture, so wide-
         | angle photos are physically much larger in VR-Space than
         | telephoto.
         | 
         | In my opinion, this button and feature alone justifies the
         | $4000 I spent on the device. Seeing photos I took with my Nikon
          | D7 in 2007, in full 3D and at correct scale, triggers nostalgia
          | and memories I'd forgotten I had for many years. It was quite
          | emotional.
         | 
         | Apple is dropping the ball on not making this the primary
         | selling-point of Vision Pro. It's incredible.
        
       | kccqzy wrote:
        | I might be missing something here, but the article spends quite a
        | bit of time discussing the HDR gain map. Why is this relevant to
        | the depth maps? Can you skip the HDR gain map related processing
        | but retain the depth maps?
       | 
       | FWIW I personally hate the display of HDR on iPhones (they make
       | the screen brightness higher than the maximum user-specified
       | brightness) and in my own pictures I try to strip HDR gain maps.
        
         | jasongill wrote:
         | I thought the same about the article and assumed I had just
         | missed something - it seemed to have a nice overview of the
         | depth maps but then covered mostly the gain maps and some
         | different file formats. Good article, just a bit of a
         | meandering thread
        
       | yieldcrv wrote:
        | Christ, that liquid-cooled system is totally overkill for what he
        | does. I'm so glad I don't bother with this stuff anymore, all to
        | run his preferred operating system in virtualization because
        | Windows uses his aging Nvidia card better.
        | 
        | Chimera.
        | 
        | The old GPU is an aberration and an odd place to skimp. If he
        | upgraded to a newer Nvidia GPU, it would have Linux driver
        | support and he could ditch Windows entirely.
        | 
        | And if he wasn't married to ArcGIS, he could just get a Mac
        | Studio.
        
       | heliographe wrote:
        | Yes, those depth maps + semantic maps are pretty fun to look at -
        | and if you load them into a program like TouchDesigner (or
        | Blender, or Cinema 4D, or whatever else you want) you can make
        | some cool little depth effects with your photos. Or you can use
        | them for photographic processing (which is what Apple uses them
        | for, ultimately).
       | 
       | As another commenter pointed out, they used to be captured only
       | in Portrait mode, but on recent iPhones they get captured
       | automatically pretty much whenever a subject (human or pet) is
       | detected in the scene.
       | 
       | I make photography apps & tools (https://heliographe.net), and
       | one of the tools I built, Matte Viewer, is specifically for
        | viewing & exporting them:
        | https://apps.apple.com/us/app/matte-viewer/id6476831058
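        | 
        | If you'd rather stay in plain Python than open TouchDesigner, the
        | simplest depth effect of all is a haze that thickens with
        | distance. A rough sketch (the file names and the white-means-near
        | assumption are mine; tweak to taste):
        | 
        |   import numpy as np
        |   from PIL import Image
        | 
        |   photo_img = Image.open("photo.jpg").convert("RGB")
        |   depth_img = (Image.open("depth.png")
        |                .convert("L").resize(photo_img.size))
        |   photo = np.asarray(photo_img, dtype=np.float64) / 255.0
        |   depth = np.asarray(depth_img, dtype=np.float64) / 255.0
        | 
        |   # Blend each pixel towards a pale haze colour as it gets
        |   # further away (white in the depth map is treated as near).
        |   haze = np.array([0.85, 0.88, 0.92])
        |   t = 0.7 * (1.0 - depth)[..., None]
        |   out = photo * (1 - t) + haze * t
        | 
        |   Image.fromarray((np.clip(out, 0, 1) * 255)
        |                   .astype(np.uint8)).save("haze.jpg")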
        
       | kawsper wrote:
       | Aha! I wonder if Apple uses this for their "create sticker"
       | feature, where you press a subject on an image and can extract it
       | to a sticker, or copy it to another image.
        
       ___________________________________________________________________
       (page generated 2025-06-04 23:00 UTC)