Big data is all the data, both structured and unstructured, that a company owns and keeps receiving every day. An astonishing amount of data is created and stored around the world daily. The question is: what do we do with it?
Used correctly, big data can be analyzed for insights that lead to better decisions and smarter strategic moves. The point is that having data matters, but what is really crucial is how organizations and governments use it. It doesn't matter how much information they collect; what counts is how humans analyze it and what they do with it. Big data is a tool in human hands, not an end in itself. Data is just that: neutral information waiting to be analyzed. So human beings have to learn to properly understand and apply that data in order to build a safer, smarter society.
An example: using big data to prevent traffic jams
Big data is increasingly being used to analyze global problems in search of solutions, and predicting traffic jams is one of governments' favorites. In Boston, the administration uses Uber and Waze (a Google-owned smartphone app that crowdsources traffic updates from about 450,000 local users) to deploy bicycle cops to ticket and/or tow double-parked cars. Data from Uber, Waze and street cameras feeds into the city's traffic management center, so officials can adjust traffic lights based on what is happening in real time. This data is also compared with what has historically happened at the same spot over time.
The other big anti-jam effort is the Traffic Prediction Project, Microsoft's plan to predict traffic jams up to an hour in advance using big data. The giant has partnered with the Federal University of Minas Gerais, one of Brazil's biggest universities. The aim is to take all available traffic data, including historical figures where they exist, from transport departments, road cameras, Microsoft's Bing traffic maps, and even drivers' social networks, and see whether established patterns can help foresee traffic jams 15 to 60 minutes before they happen. Google aims to do the same by combining data from thousands of active cell phones, used to estimate how fast traffic is moving, with readings from three types of traffic sensors: radar, active infrared and laser radar.
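To make the cell-phone idea concrete, here is a toy sketch of how speed could be estimated from anonymized phone pings. Everything here (the ping format, the free-flow speed, the congestion thresholds) is an illustrative assumption, not how Google or Microsoft actually do it:

```python
def segment_speed_kmh(pings):
    """Estimate average speed from (timestamp_s, position_m) pings
    belonging to one phone travelling along one road segment."""
    pings = sorted(pings)
    (t0, x0), (t1, x1) = pings[0], pings[-1]
    if t1 == t0:
        return 0.0
    return (x1 - x0) / (t1 - t0) * 3.6  # m/s -> km/h

def congestion_level(speeds_kmh, free_flow_kmh=60.0):
    """Classify a segment by comparing the mean observed speed with an
    assumed free-flow speed (threshold values are made up for the example)."""
    mean = sum(speeds_kmh) / len(speeds_kmh)
    ratio = mean / free_flow_kmh
    if ratio > 0.8:
        return "free-flowing"
    if ratio > 0.4:
        return "slow"
    return "jammed"

# Three phones on the same segment, pinging every 30 seconds:
phones = [
    [(0, 0), (30, 120), (60, 250)],  # ~15 km/h
    [(0, 0), (30, 100), (60, 210)],
    [(0, 0), (30, 140), (60, 260)],
]
speeds = [segment_speed_kmh(p) for p in phones]
print(congestion_level(speeds))  # prints "jammed": ~14 km/h vs 60 km/h free flow
```

The real systems add the sensor readings, historical baselines and map matching on top of this, but the core signal is the same: many slow phones on one segment means a jam.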
First reactions to the "innovative" proposal didn't take long to appear. In fact, users already have a solution for avoiding traffic congestion, and it's quite simple: don't drive during rush hour. Pretty revolutionary, right? Once more, humans found the answer.
The paradigm of big data application: Google search
If there's a good example of someone who knows how to apply big data, it is of course Google. Have you ever wondered how Google can know you so well? Probably yes. Well, the answer is very simple: big data. You type a query into the search bar and, within milliseconds, you have millions of answers ranked by relevance. Google's dream has always been to create a search engine that thinks like a human, with the ability to understand a phrase and interpret each individual search query. And thanks to semantics, they made it happen.
To understand Google's process, first we need to know how Google search works. Google results come from two places: indexed pages and the Knowledge Graph database. On the one hand, the index is a collection of webpages stored to answer search queries; Google indexes around 20 billion pages per day. On the other hand, the Knowledge Graph is a separate database able to differentiate between words and phrases with different meanings and to work out their relationships to each other. Thanks to the Knowledge Graph, Google is increasingly learning to interpret your words: this is what we call semantic search.

This is very useful for SEO purposes as well: the trend is to shift to semantic keywords, meaning a page no longer needs to repeat one fixed keyword. That allows for a more human way of writing for the web, instead of robotic, SEO-stuffed text. Take a dental clinic as an example: imagine you have to write an article about dental implants, whose target keyword is "dental implants". Thanks to semantic search, the writer can use "dental implants", but also just "implants" or "implantology" throughout the text, and Google will recognize that all these words are connected within the same semantic field, so any of them can work as the main keyword.
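The semantic-field idea can be sketched in a few lines. The mapping below is a made-up toy, not Google's actual data: the point is simply that variant phrases on a page all count toward one target keyword:

```python
# Hypothetical semantic fields: each target keyword maps to the set of
# variant phrases that should count as matches for it.
SEMANTIC_FIELDS = {
    "dental implants": {"dental implants", "implants", "implantology"},
}

def mentions_keyword(text, target):
    """Return True if the text uses the target keyword or any variant
    phrase in its semantic field."""
    text = text.lower()
    return any(variant in text for variant in SEMANTIC_FIELDS[target])

article = "Our clinic specializes in implantology and modern implants."
print(mentions_keyword(article, "dental implants"))  # prints True
```

A real engine learns these fields from data rather than hard-coding them, but the effect for the writer is the one described above: natural synonyms are understood as the same topic.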
Now, when you type something into the search bar, Google analyzes the words with both a literal and a semantic search. For the literal one, the engine looks for a match with part or all of the phrase; the root of your search phrase is then found, examined and expanded upon to find better results. For the semantic search, Google tries to understand the context of your phrase by analyzing its terms and language against the Knowledge Graph database, so it can directly answer a question with specific information.
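The two-pass lookup above can be sketched as follows. The data structures are illustrative toys (real indexing and entity resolution are vastly more complex): a query is answered literally, against a tiny page index, and semantically, against a tiny knowledge-graph-style fact store:

```python
# Toy stand-ins for Google's indexed pages and Knowledge Graph.
INDEXED_PAGES = {
    "page1": "how tall is the eiffel tower history of paris",
    "page2": "eiffel tower tickets and opening hours",
}
KNOWLEDGE_GRAPH = {
    ("eiffel tower", "height"): "330 m",
}

def literal_search(query):
    """Rank pages by how many query words they contain."""
    words = query.lower().split()
    scores = {
        page: sum(w in text for w in words)
        for page, text in INDEXED_PAGES.items()
    }
    return sorted((p for p, s in scores.items() if s),
                  key=lambda p: -scores[p])

def semantic_search(query):
    """Answer directly when the query matches a known entity/attribute pair."""
    q = query.lower()
    for (entity, attribute), answer in KNOWLEDGE_GRAPH.items():
        if entity in q and attribute in q:
            return f"{entity} {attribute}: {answer}"
    return None  # no direct answer; fall back to ranked pages

query = "eiffel tower height"
print(semantic_search(query))  # direct answer from the graph
print(literal_search(query))   # ranked pages alongside it
```

In the real engine both passes run together and their results are blended, which is why a question like this returns a boxed direct answer above the usual list of links.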
After an intricate yet amazingly fast process, Google checks millions of websites and petabytes of data to provide you with the best results. Finally, a combination of results from the indexed pages and from the Knowledge Graph database brings you the most relevant outcome in almost no time. With Google understanding you and shifting to semantic search, the question is: how human can Google (or any machine) really become? A bit scary, isn't it? Follow this conversation on our Forum.