How Oculus uses AI for Hand Tracking

Oculus has been working on the concept of hand tracking for quite some time now. The idea was sometimes dismissed as impractical by some and as too difficult by others. It therefore came as a surprise to many when Oculus announced that it had built hand tracking software and would roll it out on the Oculus Quest the following year. The tech is the brainchild of Facebook Reality Labs [1] and Oculus [2], who worked together to develop a state-of-the-art system, one of the only fully articulated hand tracking [3] systems for virtual reality [4] that works without any assistance from expensive hardware or sensors. Conventionally, the technology has relied on:

  • Depth Sensors
  • Specialized Gloves
  • Cables etc.
[Figure: Point detection with monochrome cameras]

How it is Better than Everything that came Before

The new tech holds many exciting things for consumers of VR. The new Oculus headset is free of any cable attachments and is the first of its kind to rely solely on computer-vision-based tracking, using monochrome cameras and no additional equipment. Earlier approaches not only required expensive hardware but, with gloves and similar accessories, also made the user experience unnatural and the tracking process cumbersome. The headset uses four cameras [5], together with new machine learning models that track the depth and position of the hands with great accuracy. This lets the new device be built at a fraction of the cost of existing technologies and improve on all the important parameters:
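Oculus has not published the details of its depth estimation, but the basic idea behind recovering depth from multiple monochrome cameras can be sketched with the classic stereo relation depth = focal length × baseline / disparity. The focal length and baseline values below are illustrative assumptions, not the Quest's actual camera parameters:

```python
# Toy stereo depth estimation: with two calibrated cameras, a point's depth is
# inversely proportional to its disparity (the horizontal shift of the point
# between the two camera images).
# NOTE: focal_px and baseline_m are made-up example values, not the actual
# parameters of the Oculus Quest's cameras.

def depth_from_disparity(disparity_px, focal_px=450.0, baseline_m=0.06):
    """Return depth in metres for a pixel disparity, given the focal length
    (in pixels) and the distance between the two cameras (in metres)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A hand feature that shifts 54 pixels between the two views is ~0.5 m away.
print(depth_from_disparity(54.0))  # 450 * 0.06 / 54 = 0.5
```

In practice the disparity itself comes from matching the same hand feature across the camera images, which is where the learned models do the heavy lifting; the geometry above is only the last step.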

  • Reduced size: makes for a more natural user experience while using the device.
  • Reduced weight: critical, since the device is meant to be worn for long periods of time.
  • Lower power consumption: lets the user spend more time with the device and supports applications and games that take longer to complete.
  • Lower cost: getting rid of the expensive hardware and sensors is the finest achievement, as it brings the price down.
  • On-device processing: one of the most outstanding features of the new tech is that, despite the use of high-end artificial intelligence for hand tracking, the processing is optimized and happens on the device itself.

How it all Works

Tracking has always been the most important component of VR/AR [6]: applications and equipment that can track the relative position of the user can then successfully project the image, or the augmented world, back to them. Positioning has mostly been dominated by conventional means of tracking; where computer vision was used at all, it was limited to conventional trackers, which had plenty of inbuilt problems and challenges [7] such as occlusion, background clutter [8], illumination changes, and scale variations, to name a few. These challenges hindered wider use of the technology. Oculus is using deep neural networks together with SLAM [9] (simultaneous localization and mapping), a technique Facebook has been developing in a number of lightweight implementations for tracking and similar tasks on mobile devices. The architecture follows a distinct pipeline for tracking:

  • Prediction and localization of the hands, including their important features such as the joints and fingertips.
  • These key points on the hand are used to reconstruct a 26-degree-of-freedom pose of the person's hand, where the distinct points and characteristics help to pinpoint the exact location of the fingers and other important joints (the architecture here is similar to the PoseNets [10] used for posture detection).
[Figure: PoseNet-style detections applied to hand tracking]
  • The data is then processed further, and a second network helps construct a 3D model that captures the geometry of the hand with great precision and enables interactions in the virtual world.
[Figure: 3D model reconstructed from an actual image with augmented reality]
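The exact 26-degree-of-freedom parameterization Oculus uses is not public, but hand models of this kind typically describe each finger joint by angles computed from the tracked 3D key points. A minimal sketch of one such entry, using hypothetical joint positions:

```python
import math

def joint_angle(a, b, c):
    """Interior angle at joint b (in degrees), given the 3D positions of
    three consecutive key points a-b-c along a finger."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    return math.degrees(math.acos(dot / (n1 * n2)))

# A fully extended finger: knuckle, middle joint, and fingertip in a line.
straight = joint_angle((0, 0, 0), (0, 1, 0), (0, 2, 0))
# A finger bent 90 degrees at the middle joint.
bent = joint_angle((0, 0, 0), (0, 1, 0), (1, 1, 0))
print(round(straight), round(bent))  # 180 90
```

Repeating this for every joint, plus the wrist's position and orientation, yields a pose vector on the order of the 26 degrees of freedom described above.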

What can go Wrong

The tech is relatively new and untested on a broader scale. While it has performed well in lab environments and in the initial phase, a more thorough performance review will only be possible once the actual product has been on the market for at least a few months. The challenges come from the very feature that makes the technology so intuitive in the first place: camera-based tracking. The 3D model created from the tracked joints is robust, but detecting even a single joint incorrectly can produce a very different and hard-to-augment result. A background that blends with the color of the skin will be especially challenging when trying to determine the depth or distance of the user's hands.
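Oculus has not described how it guards against single-joint misdetections, but a common mitigation in tracking systems is to gate low-confidence detections and fall back to a smoothed estimate from previous frames. A hypothetical sketch of that idea (illustrative only, not Oculus's actual filter):

```python
class JointFilter:
    """Exponentially smooths one joint's tracked position, ignoring frames
    whose detection confidence falls below a threshold.
    Illustrative technique only; Oculus's real filtering is not public."""

    def __init__(self, alpha=0.5, min_conf=0.3):
        self.alpha = alpha        # smoothing factor (1.0 = trust new frame fully)
        self.min_conf = min_conf  # detections below this count as misses
        self.state = None         # last good (smoothed) position

    def update(self, position, confidence):
        if confidence < self.min_conf:
            return self.state     # hold the last good estimate
        if self.state is None:
            self.state = list(position)
        else:
            self.state = [self.alpha * p + (1 - self.alpha) * s
                          for p, s in zip(position, self.state)]
        return self.state

f = JointFilter(alpha=0.5, min_conf=0.3)
f.update([0.0, 0.0], confidence=0.9)           # first good frame
f.update([2.0, 2.0], confidence=0.9)           # smoothed toward new frame
print(f.update([99.0, 99.0], confidence=0.1))  # spurious frame ignored -> [1.0, 1.0]
```

A filter like this trades a little latency for stability: one wildly wrong joint no longer corrupts the reconstructed hand, at the cost of reacting slightly slower to fast motion.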
Even with all the possible challenges, the truth remains that this is and will be the technology of the future, and with further improvements in AI-based tracking, the accuracy and the possibilities are only going to increase with time.

Learn More

If you'd like to learn how to develop your own AI apps in Computer Vision (AI-CV), check out our training, Ultimate AI-CV Practitioners PRO. This course will teach you how to become a PRO AI-CV developer with Object Detection, Instance Segmentation, Pose Estimation, and Android and Raspberry Pi development! Click Here to attend the FREE webinar.

References

  1. https://research.fb.com/category/augmented-reality-virtual-reality/
  2. https://www.oculus.com/?locale=en_US
  3. https://www.oculus.com/blog/introducing-hand-tracking-on-oculus-quest-bringing-your-real-hands-into-vr/?locale=en_US
  4. https://en.wikipedia.org/wiki/Oculus_Quest
  5. https://en.wikipedia.org/wiki/Augmented_reality
  6. https://arxiv.org/pdf/1812.07368.pdf
  7. https://arxiv.org/pdf/1812.07368.pdf
  8. https://en.wikipedia.org/wiki/Hidden-surface_determination#Occlusion_culling
  9. https://en.wikipedia.org/wiki/Simultaneous_localization_and_mapping
  10. https://www.tensorflow.org/lite/models/pose_estimation/overview
  11. https://www.digitaltrends.com/vr-headset-reviews/oculus-quest-hand-tracking-hands-on-review/
  12. https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5
