How iPhone 11 Deep Fusion Works - AI Computational Photography

Apple's Deep Fusion is a major computational-photography feature in a camera phone built for the mass market. High-end camera makers have used similar techniques to stay relevant [1] as smartphone cameras began to rival them in both ease of use and image quality. With the arrival of Deep Fusion, that balance may shift further, or, as Apple's Phil Schiller put it, computational photography is "mad science" [5].

Phil Schiller presenting the latest features of the iPhone 11

How it works

Apple has included three cameras [2] (shown in the image below) in its latest device, each able to generate multiple frames at varying exposures so that as many attributes of the scene as possible are captured. In practice, when you press the shutter in this mode the camera works with a burst of frames rather than a single shot: four short frames, four secondary frames, and one long exposure. These frames are then combined into a single output image that preserves the best attributes of the captured scene.
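Apple has not published Deep Fusion's exact capture schedule, so the sketch below only models the widely reported behaviour (a buffered burst of short and secondary frames plus one long exposure at the moment of capture) as plain data. The exposure values and the `Frame` type are invented for illustration.

```python
# Hypothetical model of a Deep Fusion burst -- not Apple's actual pipeline.
from dataclasses import dataclass


@dataclass
class Frame:
    exposure_s: float   # shutter time in seconds (values here are made up)
    role: str           # "short", "secondary", or "long"


def capture_burst() -> list[Frame]:
    """Return a hypothetical nine-frame burst as described in the article."""
    burst = [Frame(exposure_s=1 / 1000, role="short") for _ in range(4)]
    burst += [Frame(exposure_s=1 / 250, role="secondary") for _ in range(4)]
    burst.append(Frame(exposure_s=1 / 30, role="long"))  # taken at shutter press
    return burst


if __name__ == "__main__":
    for i, f in enumerate(capture_burst(), start=1):
        print(f"frame {i}: {f.role:9s} {f.exposure_s:.4f}s")
```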

More detail and less noise with Deep Fusion.

In total, nine frames are fused to produce the best possible image. Fusion isn't compulsory: the system decides whether the frames captured at varying shutter speeds are worth fusing, or whether a single frame is good enough to be used on its own. Both the capture and the Neural Engine processing complete within about a second [3], which is remarkable [6] given the resolution of the images and the latency deep networks typically exhibit even at inference time.
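Apple's pixel-by-pixel fusion is proprietary, so the following is only a toy stand-in for the general idea: merge a burst by weighting each frame, per pixel, by its local detail, so that sharp, low-noise regions dominate the result. The gradient-based detail measure and the synthetic test burst are illustrative choices, not Apple's method.

```python
# Toy detail-weighted burst fusion -- a classical sketch, not Deep Fusion itself.
import numpy as np


def local_detail(img: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Gradient-magnitude map used as a crude per-pixel 'detail' score."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.sqrt(gx**2 + gy**2) + eps   # eps keeps every weight positive


def fuse(frames: list[np.ndarray]) -> np.ndarray:
    """Detail-weighted average of same-size grayscale frames."""
    stack = np.stack([f.astype(np.float64) for f in frames])   # (N, H, W)
    weights = np.stack([local_detail(f) for f in frames])      # (N, H, W)
    weights /= weights.sum(axis=0, keepdims=True)              # normalise per pixel
    return (weights * stack).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.random((64, 64))                                # synthetic scene
    burst = [clean + rng.normal(0, 0.05, clean.shape) for _ in range(9)]
    fused = fuse(burst)
    print("mean error, single frame:", float(np.abs(burst[0] - clean).mean()))
    print("mean error, fused burst :", float(np.abs(fused - clean).mean()))
```

Even this crude merge shows the point of working from nine frames: averaging across the burst suppresses random noise while the weighting keeps detailed regions from being smeared out.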

The power of three...

The Power - Under the Hood

Running inference at this resolution and speed is a remarkable achievement by any yardstick. The real magic, however, is not in the AI alone: much of the credit belongs to the A13 Bionic chip [4], which makes these calculations possible at real-world speeds and delivers results in real time. Computational photography has been around for a long time, but it was rarely usable in real time for practical applications because processing was simply too slow, even in the best case. The new chip gives Apple the edge needed to put the technology to work in real-time implementations.
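To get a feel for why dedicated silicon matters, the rough benchmark below times a single naive 3x3 convolution over a frame of roughly iPhone resolution on a general-purpose CPU. A deep network runs a great many such passes per image; the timing will vary by machine, and nothing here measures the A13 itself.

```python
# Rough latency illustration only -- results depend entirely on your hardware.
import time

import numpy as np
from scipy.ndimage import convolve

frame = np.random.rand(3024, 4032).astype(np.float32)   # ~12 MP, single channel
kernel = np.ones((3, 3), dtype=np.float32) / 9.0         # trivial 3x3 filter

start = time.perf_counter()
_ = convolve(frame, kernel, mode="nearest")
elapsed = time.perf_counter() - start
print(f"one 3x3 convolution over a 12 MP frame: {elapsed * 1000:.0f} ms")
```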

Comparison with the Next Best

Last year Google [7], one of the big names [8] in computational photography, introduced Night Sight [9], a marvelous combination of camera hardware and intelligent image processing that can transform images taken in near darkness into photos with convincing color and detail. Apple was expected to bring a product able to rival Night Sight, and its response is Night Mode [10] on the new iPhone 11. Night Mode can pull a great amount of detail out of scenes shot in almost complete darkness that, on previous iPhones, would have shown nothing at all, as the comparison between the previous iPhone and the new mode makes evident.
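Neither Night Sight nor Night Mode is publicly documented in detail. The snippet below only demonstrates the textbook principle both build on: averaging several aligned short exposures suppresses random sensor noise, which is what lets detail re-emerge from a dark scene once gain is applied. The frame count, noise level, and gain factor are arbitrary, and perfect alignment is simply assumed.

```python
# Textbook multi-frame low-light merge -- not Google's or Apple's pipeline.
import numpy as np

rng = np.random.default_rng(1)
scene = rng.random((64, 64)) * 0.1            # dim scene, values close to black
num_frames = 8
noisy = [scene + rng.normal(0, 0.05, scene.shape) for _ in range(num_frames)]

merged = np.mean(noisy, axis=0)               # alignment assumed perfect here
brightened = np.clip(merged * 8.0, 0.0, 1.0)  # simple digital gain after merging

print("noise in one frame :", float((noisy[0] - scene).std()))
print("noise after merging:", float((merged - scene).std()))
```

The standard deviation of the noise drops roughly with the square root of the number of frames merged, which is why a burst of short exposures can be brightened so aggressively without turning into grain.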

Comparison with and without Night Mode on the iPhone 11 Pro

How AI Helps you to Become a Better Photographer

An image carries a great many details, and the latest technology automates tasks that once demanded considerable time and dedication to learn. Autofocus [11] is one such example: a skill that took real effort to master a few years ago is now built into almost every device and executed effortlessly every day, with no training on the photographer's part. Deep Fusion aims to repeat that success, except that instead of hard-coded logic it uses artificial neurons that have learned important characteristics from a wide variety of images. These networks decide which features matter in a given frame and how strongly, extract that information from the set of images taken by the iPhone's different cameras with their different characteristics, and integrate it into a single representation of the frame with the information intact, producing a result that can rival a professional's.
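Autofocus itself is a nice, concrete illustration of a manual skill turned into an algorithm. The toy sketch below mimics contrast-detection autofocus: sweep candidate lens positions, score each resulting image by the variance of its Laplacian (a standard sharpness measure), and keep the sharpest. The blur model and focus values are invented for the example and stand in for real lens behaviour.

```python
# Toy contrast-detection autofocus -- illustrative model, not any phone's code.
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

rng = np.random.default_rng(2)
scene = rng.random((128, 128))               # stand-in for the world being photographed
true_focus = 0.6                             # hypothetical ideal lens position


def render(focus_position: float) -> np.ndarray:
    """Fake optics: the further from true focus, the blurrier the image."""
    blur = 0.1 + 5.0 * abs(focus_position - true_focus)
    return gaussian_filter(scene, sigma=blur)


def sharpness(img: np.ndarray) -> float:
    """Variance of the Laplacian, a common focus measure."""
    return float(laplace(img).var())


positions = np.linspace(0.0, 1.0, 21)        # candidate lens positions to sweep
best = max(positions, key=lambda p: sharpness(render(p)))
print(f"selected focus position: {best:.2f} (true focus at {true_focus})")
```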

Learn More

If you'd like to learn how to develop your own AI apps in Computer Vision (AI-CV) check out our training called Ultimate AI-CV Practitioners PRO. This course will teach you how to become a PRO AI-CV developer with: Object Detection, Instance Segmentation, Pose Estimation, Android and Raspberry Pi Development! Click the link or image below to attend the FREE Webinar. - Click Here

References


[1]. https://www.analyticsindiamag.com/how-nikon-canon-are-using-ai-to-stay-relevant-in-todays-instant-photography-ecosystem/

[2]. https://www.computerworld.com/article/3438537/iphone-11-what-is-deep-fusion-and-how-does-it-work.html

[3]. https://www.phonearena.com/news/Deep-Fusion-iPhone-11-camera-feature-explained_id118850

[4]. https://www.inverse.com/article/59169-iphone-11-pro-max-price-release-date-design-features-specs

[5]. https://www.cultofmac.com/650810/iphone-11-cameras/

[6]. https://en.wikipedia.org/wiki/Computational_photography

[7]. https://en.wikipedia.org/wiki/Google

[8]. https://www.pdnonline.com/gear/google-pulls-back-the-curtain-on-powerful-computational-photography-algorithm/

[9]. https://www.pcworld.com/article/3311528/pixel-3-xl-camera-5-awesome-features.html

[10]. https://www.igeeksblog.com/how-to-enable-dark-mode-iphone-ipad/

[11]. https://en.wikipedia.org/wiki/Autofocus
