The educational AI forum started out in 1998 as a ‘Tomorrow Lab’. Here, some 350 data scientists, mathematicians, and developers work across more than 40 areas of expertise at any one time.
At the forum, the talk was of data lakes, random forests, and “boiling the ocean.” Nature metaphors aside, it was a chance to experience our newest techniques for analyzing and modeling data to help our clients solve some of their most intractable problems.
“The thinking in AI has changed from ‘What’s possible?’ to ‘How do I do this?’” explains Rafiq Ajani, the partner who leads the North American analytics group. The general session was followed by a whirlwind tour of demos from our experts.
Up first was Luke Gerdes, who began by developing analytics to help counter terrorist networks and insurgents.
Today, he focuses on natural language processing, a branch of artificial intelligence that uses analytics to derive insights from unstructured text—that is, text that is not organized into a structured table but appears in written documents.
These could be legal contracts, consumer complaints, social media—and, in one example, conversation logs between pilots and towers.
“Almost 90 percent of all data is unstructured,” he explains. “We are just at the tip of the iceberg in developing potential applications.”
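NLP pipelines vary widely by use case, but an early step is often to impose light structure on free text. A minimal sketch of that step, using hypothetical consumer-complaint snippets (all data invented for illustration): tokenize each document and count the most frequent content words.

```python
import re
from collections import Counter

def top_terms(documents, stopwords, n=3):
    """Tokenize free text and count the most frequent content words --
    a simple first pass many NLP pipelines take before deeper modeling."""
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z']+", doc.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts.most_common(n)

# Hypothetical complaint snippets (illustrative only).
complaints = [
    "The billing statement was wrong again this month",
    "Wrong charge on my billing statement",
    "Support never responded to my billing question",
]
stop = {"the", "was", "on", "my", "to", "this", "again", "never"}
print(top_terms(complaints, stop))  # "billing" surfaces as the dominant theme
```

Even this crude count hints at how unstructured text can be turned into a signal; production systems layer on entity recognition, topic models, and language models.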
Following next was Jack Zhang, who has been with the analytics group from its humble beginnings of Excel sheets and traditional statistical models. Today, he leads our work on AI-enabled feature discovery (AFD), which he describes as “boiling the ocean” to find the insights.
AFD is about testing every possible combination of variables in a data set to understand outcomes, such as why customers cancel a service or why patients make certain choices.
“It’s a statistical-modeling concept that’s been around for years—but automation has changed everything,” he says. “We can test every possible variation of immense data sets—hundreds of millions of variations—in a fraction of the time, and it highlights pockets of features we can dive into for new insights.”
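Zhang’s AFD tooling is not public, but the “boil the ocean” idea can be sketched as a brute-force search: enumerate every feature combination and score each as a predictor of the outcome. All names and data below are invented for illustration.

```python
from itertools import combinations

def best_feature_sets(rows, outcome, features, max_size=2):
    """Enumerate every feature combination up to max_size and score how
    well an 'all features true' conjunction predicts the outcome -- the
    exhaustive search that automation makes tractable at scale."""
    scored = []
    for r in range(1, max_size + 1):
        for combo in combinations(features, r):
            hits = sum(1 for row in rows
                       if all(row[f] for f in combo) == row[outcome])
            scored.append((hits / len(rows), combo))
    return sorted(scored, reverse=True)  # best-scoring feature sets first

# Hypothetical churn records (illustrative only).
rows = [
    {"late_fee": True,  "used_app": False, "churned": True},
    {"late_fee": True,  "used_app": True,  "churned": True},
    {"late_fee": False, "used_app": True,  "churned": False},
    {"late_fee": False, "used_app": False, "churned": False},
]
print(best_feature_sets(rows, "churned", ["late_fee", "used_app"]))
```

On four rows this is trivial; the point of AFD-style automation is running the same enumeration over hundreds of millions of combinations, then surfacing the highest-scoring pockets for human review.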
Adrija Roy, a geospatial expert, demoed our OMNI solution. It combines geospatial data (transit hubs, foot traffic, demographics) with customer psychographics (shopping history) and machine-learning techniques. Businesses are using OMNI to understand the economic value of each of their locations in the context of all of their channels. It’s guiding decisions on optimizing location networks.
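OMNI’s models are proprietary, but the core idea of blending geospatial and customer signals into one location-value score can be sketched with a hypothetical weighted combination (all signal names and weights invented for illustration; a real system would learn these from data).

```python
def location_score(geo, psycho, weights):
    """Blend geospatial and customer-psychographic signals into a single
    value score -- a hand-weighted stand-in for the machine-learning
    models a tool like OMNI would actually fit."""
    signals = {**geo, **psycho}
    return sum(weights[k] * signals[k] for k in weights)

# Hypothetical, pre-normalized signals for one store (illustrative only).
geo = {"foot_traffic": 0.8, "transit_access": 0.6}
psycho = {"repeat_shopper_share": 0.7}
weights = {"foot_traffic": 0.5, "transit_access": 0.2, "repeat_shopper_share": 0.3}
print(round(location_score(geo, psycho, weights), 2))
```

Scoring every location this way, in the context of all channels, is what lets a network be compared site by site and pruned or expanded accordingly.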
The final demo was about deep learning, presented by Vishnu Kamalnath, an electrical engineer and computer scientist, who did early work training humanoid robots.
In deep learning (DL), algorithms ingest huge sets of unstructured data, including text, audio, and video, and process them through multiple layers of neural networks, often producing insights humans or less complex models could not grasp. DL algorithms can detect underlying emotion in audio and text, are used in facial recognition, and can track small, fast-moving objects in satellite imagery, such as fake license plates in traffic.
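The “multiple layers” idea can be sketched as a toy forward pass in plain Python: each layer transforms the previous layer’s output, which is what lets deep networks build up increasingly abstract representations. The weights below are arbitrary placeholders; real systems learn them from data and stack many more layers.

```python
import math

def dense(x, weights, biases, activation):
    """One fully connected layer: weighted sums followed by a nonlinearity."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

relu = lambda v: max(0.0, v)
sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))

def forward(x):
    """A two-layer network: the hidden layer feeds the output neuron,
    so each layer operates on the previous layer's representation."""
    h = dense(x, [[0.5, -0.2], [0.1, 0.9]], [0.0, -0.1], relu)  # hidden layer
    return dense(h, [[1.2, -0.7]], [0.05], sigmoid)[0]          # output in (0, 1)

# Toy two-feature input (weights above are invented, for illustration).
print(round(forward([1.0, 2.0]), 3))
```

Swapping hand-written loops for GPU-backed frameworks, and two layers for dozens, is what turns this sketch into the systems behind speech, vision, and imagery applications.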
“People used to think of deep learning as a really expensive proposition, requiring complex hardware,” says Vishnu. “But the cost is dropping; tools are becoming commoditized. Don’t shy away from it as an esoteric solution.” Deep-learning applications are estimated to be $3.5 trillion to $5.8 trillion in value annually across 19 industries. While technologies are advancing at a healthy pace, one challenge remains unchanged: getting the right kind of data for the right use case at the right time. “The data component takes 80 percent of the time to gather, clean, and run in any project,” says Jack Zhang. “We still live by the maxim ‘garbage in, garbage out’.”