At the recently concluded Google I/O 2016, Daydream, the search giant’s VR solution, received much media attention. On the developer side, however, Project Tango was given a fair amount of attention too. There were almost as many Tango developer sessions as Daydream sessions, and during those sessions Tango was frequently mentioned alongside VR. Moreover, Project Tango is one of the very few technologies for which there is actual hardware that developers can get started with (unlike the bulk of the other announcements at I/O 2016). In fact, Lenovo has just announced the first consumer-grade Tango device, the Phab 2 Pro.
So, what is Project Tango about, and why does it matter more than VR in the short run?
In its basic form, Project Tango is a mobile technology platform that uses specialized hardware and computer vision to accurately map, track, and measure a location without active use of GPS.
Before 2016, there were only two dev-kit devices. The first device, codenamed ‘Peanut’, had a phone form factor and was released in early 2014. However, what is still being sold online is the second device, codenamed ‘Yellowstone’. It has a 7-inch tablet form factor and is powered by an NVIDIA Tegra K1 processor. It originally cost US$1,024, but Google has since reduced the price to US$512.
The unit that the author owns is this ‘Yellowstone’ device.
From the front, the Project Tango dev kit looks just like an ordinary 7-inch tablet. In fact, it looks like a more compact variant of the NVIDIA Shield Tablet.
The true power of the Tango device is on its back. What distinguishes it from ordinary tablets is the presence of multiple sensors. The Yellowstone device has two cameras, one of which is a 4MP RGB-IR camera sensor. That might sound low-resolution, but the camera captures infrared information along with ordinary color images, and the IR data is important for depth sensing. The other camera has a fisheye lens and primarily tracks motion. In addition to these cameras, there is an IR blaster for longer-range depth sensing.
On the software side, the Tango tablet is still running on Android 4.4.2. Ancient indeed, but for developers of Tango apps, the focus is more on the Tango SDK than the Android OS.
Since the Tango tablet is more of a technology platform preview, we'll focus more on what Project Tango enables than on reviewing the hardware.
There are three fundamental features of Project Tango: motion tracking, area learning, and depth perception.
Imagine you are blindfolded. You make movements. Sure, you know how hard you walked. But do you know how fast you have moved, or where you are now relative to where you started? Most modern mobile devices have accelerometers and gyroscopes, and hence are able to track a user’s movement to a limited degree, such as hand movements and step counting. However, they do not perform well when tracking the user’s motion relative to the starting location. As such, these devices typically have to fall back on GPS to determine a user’s relative location and motion.
Project Tango solves that by combining vision with the motion sensors. In a nutshell, Tango estimates its current position by looking at the difference between two camera snapshots taken one timestep apart, combined with sensor information from the accelerometer and gyroscope. This allows Tango to record the actual distance and speed travelled. This process is known as visual-inertial odometry.
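To make the idea concrete, here is a minimal Python sketch of the bookkeeping involved. It is not Tango's actual algorithm or API: the function names, the 2D simplification, and the fixed blending weight are all assumptions made purely for illustration. Each timestep yields two estimates of how far the device moved (one from comparing camera frames, one from the inertial sensors); the sketch blends them and accumulates the result into position, distance, and average speed.

```python
import math


def fuse_step(visual_delta, inertial_delta, visual_weight=0.8):
    """Blend two (dx, dy) estimates of the same per-timestep displacement.

    visual_weight is a hypothetical tuning value, not a Tango parameter.
    """
    w = visual_weight
    return tuple(w * v + (1.0 - w) * i
                 for v, i in zip(visual_delta, inertial_delta))


def integrate_path(steps, dt):
    """Accumulate per-timestep displacements (metres) taken every dt seconds.

    Returns ((x, y), distance_travelled, average_speed).
    """
    x = y = distance = 0.0
    for dx, dy in steps:
        x += dx
        y += dy
        distance += math.hypot(dx, dy)
    elapsed = dt * len(steps)
    speed = distance / elapsed if elapsed > 0 else 0.0
    return (x, y), distance, speed


if __name__ == "__main__":
    # Ten timesteps in which the camera and the IMU disagree slightly.
    steps = [fuse_step((0.10, 0.0), (0.12, 0.0)) for _ in range(10)]
    print(integrate_path(steps, dt=0.1))
```

A real system would estimate full 3D position and orientation and would weight each source by its uncertainty rather than by a fixed constant, but the principle of composing small relative motions into a trajectory is the same.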
Despite the power of visual-inertial odometry, no device can escape what is termed sensor drift: over time, small sensor errors accumulate, so the sensors will detect the device as drifting very slowly even if it is stationary. This effect is inherent in all movement sensors. As such, over a long period, motion tracking on an ordinary mobile device will deviate from its true motion by a big margin.
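A short Python sketch shows why drift becomes so large. It assumes a perfectly stationary device whose accelerometer reports a small constant error (the bias value below is made up for illustration, not a measured figure for any device). Because position is obtained by integrating acceleration twice, the error grows roughly with the square of the elapsed time.

```python
def drift_from_bias(bias, dt, duration):
    """Position error (metres) from double-integrating a constant
    accelerometer bias (m/s^2) while the device is actually stationary."""
    velocity = 0.0
    position = 0.0
    for _ in range(int(round(duration / dt))):
        velocity += bias * dt      # first integration: acceleration -> velocity
        position += velocity * dt  # second integration: velocity -> position
    return position


if __name__ == "__main__":
    # Hypothetical bias of 0.01 m/s^2, sampled at 100 Hz.
    for seconds in (1, 10, 60):
        print(seconds, "s ->", round(drift_from_bias(0.01, 0.01, seconds), 3), "m")
```

With these assumed numbers the error is negligible after one second but reaches about 18 metres after a minute, in line with the closed-form estimate of one half times the bias times the elapsed time squared.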
Tango can correct this by first learning the area: it extracts various landmarks and features using a combination of visual and depth sensors and stores them in memory. When the user later navigates the area, Tango offsets the drift based on the landmarks and features previously extracted.
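The correction step can be pictured with a small Python sketch. This is an illustrative simplification, not how Tango is implemented: it works in 2D, assumes the device's heading is already known, and uses a single landmark. The function name and arguments are hypothetical. If the stored map says where a landmark is, and the device can see how far away that landmark is right now, the device's true position follows directly, and the gap between that and the odometry estimate is the accumulated drift.

```python
def correct_drift(estimated_pos, landmark_map_pos, observed_offset):
    """Correct a drifted position using one recognised landmark.

    estimated_pos:    (x, y) from odometry, which may have drifted
    landmark_map_pos: (x, y) of the landmark as stored when the area was learned
    observed_offset:  (dx, dy) from the device to the landmark as seen right now
    Returns (corrected_pos, drift).
    """
    corrected = (landmark_map_pos[0] - observed_offset[0],
                 landmark_map_pos[1] - observed_offset[1])
    drift = (estimated_pos[0] - corrected[0],
             estimated_pos[1] - corrected[1])
    return corrected, drift


if __name__ == "__main__":
    # Odometry says (5.3, 2.1); a stored landmark at (7.0, 2.0) is seen 2 m ahead.
    print(correct_drift((5.3, 2.1), (7.0, 2.0), (2.0, 0.0)))
```

In practice many landmarks would be matched at once and the correction would be estimated statistically, but the underlying idea is the same: known reference points pin the estimate back to reality.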
Such area learning can also be used to estimate a person’s position in an enclosed environment, without the need for hardware markers or GPS. To use a human analogy, imagine you are trying to navigate streets without GPS. How do you recognize whether you have visited a place before? You spot and remember various visual landmarks, such as a signboard or lamp post, so that you know when you have passed by the same place again while navigating around the streets.
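The signboard-and-lamp-post analogy can be turned into a toy Python sketch. It is only meant to illustrate the recognition idea: real systems match numeric visual feature descriptors rather than text labels, and the function name, the overlap score, and the threshold here are all assumptions. Each remembered place is stored as a set of landmarks, and the current view is matched to the place with which it shares the largest fraction of landmarks.

```python
def recognise_place(observed, known_places, min_score=0.5):
    """Match currently visible landmarks against remembered places.

    observed:     set of landmark labels seen right now
    known_places: dict mapping place name -> set of landmark labels
    min_score:    hypothetical threshold below which no match is reported
    Returns (place_name or None, best overlap score between 0 and 1).
    """
    best_name, best_score = None, 0.0
    for name, landmarks in known_places.items():
        union = observed | landmarks
        # Jaccard overlap: shared landmarks divided by all landmarks involved.
        score = len(observed & landmarks) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_score:
        return best_name, best_score
    return None, best_score


if __name__ == "__main__":
    places = {
        "lobby": {"signboard", "lamp post", "red door"},
        "corridor": {"fire hose", "exit sign"},
    }
    print(recognise_place({"signboard", "lamp post"}, places))
    print(recognise_place({"vending machine"}, places))
```

The threshold keeps the sketch from claiming a match when only a stray landmark happens to coincide, which mirrors how a person would hesitate to say "I have been here" on the strength of one common-looking lamp post.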
The inclusion of depth sensors in a Tango dev kit allows the device to perform 3D reconstruction of the surrounding environment. It also allows one to take rough measurements of a space without needing a measuring tape. Here's an example: