As we learned in our previous blog post, approximately 2.5 trillion million bytes of data are collected - both actively and "passively" - on a typical day here on Planet Earth. (To give you some perspective on the magnitude of that figure, 2.5 trillion million equals 2.5 quintillion[A], or 2.5 x 10^18.) And as we continue to develop additional data touch points to add to those that already intersect with our lives each day, such as sensors, retail checkout registers, social media, and RFID tags, the amount of data we collect will continue to grow exponentially.
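For readers who like to check the arithmetic, here is a minimal sketch in Python. The variable names are ours, chosen for illustration; the only inputs are the daily figure quoted above and the two naming conventions for "quintillion" described in the footnote.

```python
# Integer arithmetic avoids floating-point rounding at this scale.
TRILLION = 10**12
MILLION = 10**6

# 2.5 trillion million bytes, written as 25/10 to stay in whole numbers.
bytes_per_day = 25 * TRILLION * MILLION // 10

SHORT_SCALE_QUINTILLION = 10**18  # US, Canadian, and modern British usage
LONG_SCALE_QUINTILLION = 10**30   # traditional British and Continental European usage

quintillions_per_day = bytes_per_day / SHORT_SCALE_QUINTILLION

print(f"{bytes_per_day:,} bytes per day")
print(f"{quintillions_per_day} quintillion (short scale)")
```

On the short scale, the daily figure works out to 2.5 quintillion bytes; on the long scale, the same number of bytes would be only a tiny fraction of a quintillion.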
Even someone who does not work in a data-centric profession will likely recognize that all of that data has the potential to be assembled together, even if it was originally culled from disparate sources, and used for purposes far removed from those for which it was originally collected. Ideally, Big Data can be leveraged to help end users drive strategic business decisions more effectively or increase revenue, especially when customary data analytic techniques are insufficient. Of course, the extent to which different data sets - stored in the cloud or on servers around the globe - can be reliably combined depends upon how compatible or uniform the data are and how reliable their various sources may be. The goal is to aggregate the data so that no "seams" between the different data sets interfere with how we can slice, dice, and puree the information.
Some companies have already become quite proficient at leveraging the "Big Data" that they can access today, using the tools that are available to them. In today's post, we'll continue our exploration of several of these companies with some help from Gartner analyst Doug Laney; you'll see how various client companies have been able to expand their thinking to embrace Big Data as more than a buzzword, and use it to their strategic advantage over others within their own market space.
Company: Walmart
Market Sector: Retail
In 2012, retail giant Walmart launched Polaris, a revolutionary search engine. This powerful search engine is unique in its use of Big Data filtered through semantic search technology; this technology enables the search engine to anticipate the intent of a shopper's search, and to subsequently deliver highly relevant results. Its stellar performance is the result of a joint effort between experts in information retrieval, machine learning, and text mining, and people with experience at top search and e-commerce companies and renowned research institutions.
Polaris is based on the Social Genome project, a platform that connects people to places, events, and products and provides a more in-depth understanding of customers and products. Standard search engines typically return results based strictly on what a customer has typed into the search query box; in contrast, Polaris may return results that users didn't know were of interest until they saw them on the screen. The new search engine uses advanced algorithms, including query understanding and synonym mining, to glean user intent in delivering results. For example, when a user types in the word "denim", it returns results on denim jeans and denim shorts; if a user types "chlorine tablets", the results returned are related to swimming pool equipment.
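To make the idea of query understanding a little more concrete, here is a toy sketch in Python. It is not Walmart's implementation: the `RELATED_TERMS` table and the `normalize` and `expand_query` functions are hypothetical names we made up, and the table is hand-written from the two examples above, whereas a real system would mine such relationships from large volumes of data.

```python
import re

# Hypothetical, hand-written table of related terms (illustration only).
# A production system would learn these relationships from data.
RELATED_TERMS = {
    "denim": ["denim jeans", "denim shorts"],
    "chlorine tablets": ["swimming pool equipment"],
}


def normalize(query: str) -> str:
    """Lowercase the query and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", query.strip().lower())


def expand_query(query: str) -> list[str]:
    """Return the normalized query followed by any related search terms."""
    normalized = normalize(query)
    return [normalized] + RELATED_TERMS.get(normalized, [])


print(expand_query("  Denim "))
print(expand_query("chlorine   tablets"))
print(expand_query("patio furniture"))
```

A query with no entry in the table simply comes back unchanged, which mirrors the behavior of a standard search engine that matches only what was typed.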
Polaris focuses on engagement understanding, which takes into account how users interact with the site in order to provide the best results for them. Instead of a standard list of search results, it delivers a new and intuitive results page for browsing topics; this allows shoppers to discover items they may not have previously considered. When a user types in "patio furniture", they get a colorful page with multiple patio set options for the backyard, along with a banner ad showing related sale items.
Walmart says adding semantic search has increased the share of online shoppers who complete a purchase by 10% to 15%. Of course - as Gartner analyst Laney reminds us - in Walmart terms, that translates to billions of dollars.
Company: Edmunds.com, Inc.
Market Sector: Online Automotive Marketing
Online publisher of car-shopping information Edmunds.com, Inc. has been able to use Big Data to breathe new life into so-called "dark information". "Dark information" is the descriptive term for information assets that organizations collect, process, and store in the course of their regular business activity, but generally fail to use for other purposes; such information was previously considered too old to be useful by the time it was made available to business users for analysis. Thanks to advances in technology (like Hadoop clusters for distributed processing of large data sets and NoSQL databases to process large volumes of data), it is now feasible for Edmunds and other companies to incorporate such long-neglected information into big data analytics applications.
Quicker access to this formerly "dark" data has allowed workers who manage keyword acquisition for the company's paid-search and online advertising efforts to quickly assess incoming data; they are now able to determine more swiftly how changes in buying tactics will affect marketing initiatives. This speed has saved Edmunds.com a significant amount of money - reportedly nearly $2 million as of June 2014.
Market Sector: Law Enforcement
Law enforcement agencies nationwide are facing budget freezes and deep cuts; they are being mandated to manage their resources more effectively while still responding to public demand for crime prevention and reduction. In response, police departments across the United States, a team of educators, and a company called PredPol ("Predictive Policing") have taken an algorithm used to predict earthquakes, tweaked it, and begun feeding it local crime data stored in the cloud. Only three variables are needed: the type, place, and time of each crime; the algorithm requires no personal information at all. This is proving to be repurposing at its best: astonishingly, the software - which takes advantage of the power of adaptive computer learning - can now predict where and when crimes are most likely to occur, down to an area of 500 feet by 500 feet.
Many jurisdictions in California, of widely differing character, have reported substantial reductions in crime since deploying PredPol.
In contrast to technology that simply maps past crime data, the software applies advanced mathematics and adaptive computer learning to the incoming crime data. By building on the knowledge and experience that already exist, it has produced predictions twice as accurate as those made through previously existing best practices! It's also notable that PredPol uses no personal information about individuals or groups of individuals, which is intended to avoid civil liberties and profiling concerns.
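To illustrate how predictions can be built from nothing more than the type, place, and time of past crimes, here is a deliberately simplified sketch in Python. It is not PredPol's algorithm, which is proprietary and more sophisticated. We swap in a much simpler technique: divide the map into square cells, let each past incident add a score to its cell that fades exponentially with age, and rank the cells. The `Incident` type, the cell size, the decay rate, and the sample data are all assumptions made up for this example.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Incident(NamedTuple):
    crime_type: str  # e.g. "burglary"
    x_feet: float    # east-west position in a local coordinate system
    y_feet: float    # north-south position
    day: float       # time of the incident, in days


CELL_SIZE_FEET = 500.0  # assumed grid resolution
DECAY_PER_DAY = 0.1     # assumed rate at which an incident's influence fades


def cell_of(incident):
    """Map an incident's position to the (column, row) of its grid cell."""
    return (math.floor(incident.x_feet / CELL_SIZE_FEET),
            math.floor(incident.y_feet / CELL_SIZE_FEET))


def score_cells(incidents, today, crime_type):
    """Sum the exponentially decayed contributions of past incidents per cell."""
    scores = defaultdict(float)
    for inc in incidents:
        # Skip other crime types and anything dated after "today".
        if inc.crime_type != crime_type or inc.day > today:
            continue
        age_days = today - inc.day
        scores[cell_of(inc)] += math.exp(-DECAY_PER_DAY * age_days)
    return dict(scores)


def top_cells(incidents, today, crime_type, k=3):
    """Return up to k cells with the highest scores, best first."""
    scores = score_cells(incidents, today, crime_type)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up sample data, for illustration only.
history = [
    Incident("burglary", 120.0, 80.0, day=1.0),
    Incident("burglary", 450.0, 300.0, day=9.0),
    Incident("burglary", 1700.0, 900.0, day=8.0),
    Incident("vehicle theft", 1600.0, 950.0, day=9.5),
]

print(top_cells(history, today=10.0, crime_type="burglary"))
```

Note that the only inputs are the three variables mentioned above; nothing about any individual person appears in the data. Recent incidents count for more than old ones, so a cell with repeated, recent activity rises to the top of the ranking.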
As these case studies show, the intelligent collection and analysis of data - from a variety of seemingly unconnected sources - can lead to enhanced insights and potentially significant financial gains that were simply not available to businesses before.
For more information on how Nebu can help you with your Big Data challenges, please contact us or visit nebu.com/nebu-data-hub.
[A] Due to divergent naming conventions, in US, Canadian, and modern British usage, a quintillion = 1 x 10^18; in traditional British usage and in Continental Europe, however, a quintillion = 1 x 10^30. Either way, it's a BIG number!
Photo: Research Data Management by user jannekestaaks on Flickr.