By: Dijam Panigrahi
We don’t know which of the world’s largest tech-forward companies will power the best future tools, technologies, and resources for manufacturing, healthcare, construction, and more. What is clear is that organizations are working hard to deliver changes that will meaningfully impact humanity, building on recent advances in artificial intelligence (AI) and immersive mixed reality technologies such as augmented reality (AR) and virtual reality (VR).
Although these technologies differ from each other, they increasingly work together in advanced three-dimensional (3D) applications and environments, to the benefit of companies and their customers.
In virtual reality, a user wears a headset that places them in a new world, one that may even imitate the real one. In a manufacturing context, VR can give users a visual and audible experience that duplicates a real-world setting.
Augmented reality is conceptually similar to virtual reality, but it displays digital content in the real world. A manufacturer of power, utility, or industrial equipment designing new machinery can see the virtual specs of the design, and also how it would function in a real utility or power generation environment.
Certainly, these technologies offer promise. The challenge, though, is that they require large volumes of data, the ability to process that data at remarkable speeds, and the ability to scale projects, demands that typical office IT environments rarely meet.
Immersive mixed reality calls for a precise and persistent fusion of the real and virtual worlds. Complex models and scenes must therefore be rendered in photorealistic detail at the correct physical location, with the correct scale and a precise pose. Using AR/VR to design, build, or repair components demands this kind of persistent accuracy and precision.
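As a minimal sketch of what "correct scale and precise pose" means in practice (function names and parameters here are illustrative, not from any particular engine), placing a virtual model amounts to applying a rigid transform plus uniform scale to every model vertex before projection:

```python
import math

def pose_matrix(scale, yaw_rad, tx, ty, tz):
    """Build a 4x4 transform: uniform scale, rotation about the
    vertical (y) axis, then translation to the physical anchor point."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [scale * c,  0.0,   scale * s, tx],
        [0.0,        scale, 0.0,       ty],
        [-scale * s, 0.0,   scale * c, tz],
        [0.0,        0.0,   0.0,       1.0],
    ]

def transform(point, m):
    """Apply the 4x4 matrix to a 3D point (implicit w = 1)."""
    x, y, z = point
    return tuple(
        m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3]
        for i in range(3)
    )

# Place a unit-cube corner at half scale, rotated 90 degrees,
# anchored 2 m in front of the viewer.
m = pose_matrix(0.5, math.pi / 2, 0.0, 0.0, 2.0)
print(transform((1.0, 1.0, 1.0), m))
```

If any of the three inputs, scale, rotation, or anchor position, is even slightly wrong, every rendered vertex lands in the wrong physical place, which is why pose estimation accuracy dominates overlay quality.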
This is currently achieved by rendering on discrete server-side GPUs and delivering the rendered frames wirelessly to head-mounted displays (HMDs) such as the Microsoft HoloLens and the Oculus Quest.
One of the main requirements for mixed reality applications is to precisely overlay an object with its model or digital twin. This way, work instructions can be provided for assembly and training, and manufacturing errors can be caught early. It also lets the user track the object and update the rendering as the work advances.
The majority of on-device object tracking systems use 2D image- and/or marker-based tracking. This severely limits overlay accuracy in 3D: 2D tracking cannot estimate depth with high accuracy, and therefore cannot recover scale and pose. A user may see what appears to be a good match from one angle or position, but the overlay loses alignment as the user moves around in six degrees of freedom (6DOF).
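A toy pinhole-camera calculation (all numbers made up for illustration) shows why a depth error that is invisible from one viewpoint becomes a visible misalignment once the user moves:

```python
def project(point, cam_x, focal_px=1000.0):
    """Pinhole projection of a 3D point onto the image plane of a
    camera at (cam_x, 0, 0) looking down +z. Returns u in pixels."""
    x, y, z = point
    return focal_px * (x - cam_x) / z

true_pt = (0.0, 0.0, 2.0)   # actual object point, 2 m away
est_pt  = (0.0, 0.0, 2.2)   # same viewing ray, depth misjudged by 10%

# From the original viewpoint the overlay looks perfect...
print(project(true_pt, 0.0) - project(est_pt, 0.0))    # 0.0 px

# ...but after the user sidesteps half a metre, it drifts visibly.
print(project(true_pt, -0.5) - project(est_pt, -0.5))  # ~22.7 px
```

Both points lie on the same ray from the first camera, so 2D tracking cannot tell them apart; only the depth (and hence the pose) differs, and the error only reveals itself under 6DOF motion.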
There is also the problem of object registration: detecting and identifying the object and estimating its scale and orientation. In most cases, this is done computationally or with simple computer vision methods and standard training libraries (for example, Google MediaPipe or VisionLib). That may work well for regular, smaller, simpler objects such as hands, faces, cups, tables, chairs, wheels, and other regular geometric structures. For the large, more complex objects in enterprise use cases, however, labeled training data (especially in 3D) is not easily available. As a result, using 2D image-based tracking to align, overlay, and persistently track such an object, and to fuse the rendered model with it in 3D, is extremely difficult, if not impossible. Enterprise-level users are overcoming these obstacles by integrating 3D environments and AI technology into their immersive mixed reality design-build projects.
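To make the scale-and-orientation part of registration concrete, here is a minimal sketch (function names are ours, and rotation estimation is deliberately omitted; a full pipeline would also solve for orientation, e.g. with Procrustes alignment or ICP) that recovers the uniform scale and translation mapping a reference model onto observed 3D points:

```python
import math

def centroid(pts):
    """Mean position of a list of 3D points."""
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(3))

def rms_spread(pts, c):
    """Root-mean-square distance of the points from centroid c."""
    return math.sqrt(sum(
        sum((p[i] - c[i]) ** 2 for i in range(3)) for p in pts
    ) / len(pts))

def register_scale_translation(model_pts, observed_pts):
    """Estimate the uniform scale and translation that map the
    model points onto the observed points (rotation omitted)."""
    cm, co = centroid(model_pts), centroid(observed_pts)
    scale = rms_spread(observed_pts, co) / rms_spread(model_pts, cm)
    t = tuple(co[i] - scale * cm[i] for i in range(3))
    return scale, t

# Observed points are the model doubled in size and shifted by (1, 0, 3).
model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
observed = [(2 * x + 1, 2 * y, 2 * z + 3) for x, y, z in model]
print(register_scale_translation(model, observed))  # (2.0, (1.0, 0.0, 3.0))
```

This works only because the correspondence between model and observed points is assumed known; establishing that correspondence for a large, complex object is exactly where the labeled 3D training data shortage bites.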
Deep learning-based 3D AI allows users to identify 3D objects of arbitrary shape and size, in various orientations, with high accuracy in 3D space. This approach scales to any arbitrary shape and is well suited to enterprise use cases requiring rendering overlay of complex 3D models.