The pace of graph data science adoption in business is accelerating. While graph technology has traditionally been used for transactional workloads, it is increasingly at the heart of analytics work. As a result, developers have the chance to apply graph-powered analytics to their connected data, a science-driven approach termed ‘graph data science’.
Graph data science can extract useful business knowledge from the relationships and structures latent in your data. It is typically used to power strategic predictions that help data scientists answer tough business questions and explain outcomes. Through graph algorithms, it can reason about the ‘shape’ of the connected context around each piece of data, which in turn enables richer machine learning predictions.
Leveraging connected data in machine learning work
Graph technology has already made a distinct contribution in several use cases, from fraud detection to tracking a customer journey, by leveraging the connections between data points for more accurate and interpretable predictions. In a drug discovery use case, this means identifying new associations between genes, diseases, drugs and proteins, while providing context to assess the relevance or validity of any such discovery. For customer recommendations, it means learning from user journeys to make accurate, data-driven recommendations to customers for future purchases, and presenting options drawn from previous buying history.
Most data science teams in the corporate world are still learning how to leverage connected data in their machine learning work. However, adopters of this graph data science approach find their best machine learning work is unlocked with graph technology.
Real-world graph data science
One real-world example of graph data science in action is in the US healthcare sector. New York-Presbyterian Hospital's analytics team uses graph data science to track infections and take rapid action to contain them. Its developer team says graph data science offers an efficient way to connect the ‘what’, ‘when’ and ‘where’ dimensions of an event. Armed with this insight, the team created ‘time’ and ‘space’ trees to model all the treatment rooms on-site. This initial model revealed a large number of inter-relationships, while an event entity was included to connect the time and location trees. The resulting graph-based data model means the analytics team is able to analyse everything that happens in the hospital facilities, and to identify and contain any outbreaks before they spread.
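To make the idea of time and location trees joined by an event entity more concrete, here is a minimal in-memory sketch. It is not the hospital's actual model: the `Graph` class, the relationship names (`CHILD_OF`, `AT_TIME`, `IN_LOCATION`), the room labels, the dates and the events are all hypothetical, chosen only to show how one event node can sit at the intersection of a ‘when’ hierarchy and a ‘where’ hierarchy.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph that indexes each relationship in both directions."""

    def __init__(self):
        self.forward = defaultdict(set)   # (source, rel) -> targets
        self.backward = defaultdict(set)  # (target, rel) -> sources

    def add(self, source, rel, target):
        self.forward[(source, rel)].add(target)
        self.backward[(target, rel)].add(source)

    def targets(self, source, rel):
        return self.forward.get((source, rel), set())

    def sources(self, target, rel):
        return self.backward.get((target, rel), set())


def add_path(graph, path):
    """Link each node in a root-to-leaf path to its parent (builds a tree)."""
    for parent, child in zip(path, path[1:]):
        graph.add(child, "CHILD_OF", parent)


def add_event(graph, event_id, day, room):
    """An event connects one leaf of the time tree to one leaf of the location tree."""
    graph.add(event_id, "AT_TIME", day)
    graph.add(event_id, "IN_LOCATION", room)


def events_nearby(graph, event_id):
    """Other events on the same day and on the same floor as the given event."""
    (day,) = graph.targets(event_id, "AT_TIME")
    (room,) = graph.targets(event_id, "IN_LOCATION")
    (floor,) = graph.targets(room, "CHILD_OF")
    same_day = graph.sources(day, "AT_TIME")
    same_floor = set()
    for floor_room in graph.sources(floor, "CHILD_OF"):
        same_floor |= graph.sources(floor_room, "IN_LOCATION")
    return sorted((same_day & same_floor) - {event_id})


# Made-up sample data: a time tree (year > month > day) and a location tree
# (hospital > floor > room), plus four hypothetical events.
g = Graph()
add_path(g, ["2020", "2020-03", "2020-03-14"])
add_path(g, ["2020", "2020-03", "2020-03-15"])
add_path(g, ["Hospital", "Floor 3", "Room 301"])
add_path(g, ["Hospital", "Floor 3", "Room 302"])
add_path(g, ["Hospital", "Floor 4", "Room 401"])
add_event(g, "e1", "2020-03-14", "Room 301")
add_event(g, "e2", "2020-03-14", "Room 302")
add_event(g, "e3", "2020-03-14", "Room 401")
add_event(g, "e4", "2020-03-15", "Room 301")

nearby = events_nearby(g, "e1")
print(nearby)
```

Because both hierarchies are trees, widening the search (same month, same building) is just a matter of walking one level further up before collecting events. A production system would more plausibly express this as a query in a graph database than as hand-written set operations.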
Graph data science is also supporting the medical supply chain. Global medical device manufacturer Boston Scientific is using the technique to isolate the causes of product faults. Numerous teams, often in different countries, collaborate on the same problems, but staff previously had to resort to analysing data in spreadsheets. This led to inconsistencies and difficulty tracking down the underlying sources of defects. Boston Scientific says graph technology has given it a more streamlined means of analysing, coordinating and improving its manufacturing methods across its locations. Users can conduct meaningful, data science-enhanced searches, with query times dropping from two minutes to between 10 and 55 seconds, an improvement that speeds up the entire analytical process. The company can also identify specific components that are more likely to fail. Another benefit is that because the graph data model is so simple, it’s easy to communicate to others.
The UK government’s central online resource, GOV.UK, also uses graph data science: its data scientists are deploying their first machine learning model built with the help of graph technology. The resulting system automatically recommends content to users based upon the page they are visiting. From a data science perspective, the application learns continuous feature representations that can be used for various machine learning tasks, such as recommending content.
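As a rough illustration of recommending content from graph-derived feature vectors, the sketch below builds a page graph from user journeys, gives each page a numeric vector, and ranks other pages by cosine similarity. It is a deliberately simple stand-in, not the GOV.UK team's method: a real system would learn its representations with an embedding algorithm such as node2vec, whereas here each vector is just the page's weighted one-hop and two-hop neighbourhood. The page paths, the journeys and the `decay` parameter are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical user journeys: each list is the sequence of pages one user visited.
journeys = [
    ["/passport", "/passport-fees", "/passport-photo"],
    ["/passport", "/passport-photo"],
    ["/driving-licence", "/driving-test"],
    ["/driving-licence", "/driving-test", "/vehicle-tax"],
    ["/passport-fees", "/passport-photo"],
]


def build_graph(journeys):
    """Undirected page graph; edge weight = number of consecutive visits."""
    weights = defaultdict(lambda: defaultdict(float))
    for journey in journeys:
        for a, b in zip(journey, journey[1:]):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return weights


def features(weights, decay=0.5):
    """Feature vector per page: one-hop weights plus decayed two-hop weights."""
    vectors = {}
    for page in weights:
        vec = defaultdict(float)
        for nbr, w in weights[page].items():
            vec[nbr] += w
            for nbr2, w2 in weights[nbr].items():
                vec[nbr2] += decay * w * w2
        vectors[page] = dict(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(key, 0.0) for key, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(vectors, page, k=2):
    """The k pages whose vectors are most similar to the current page's vector."""
    scores = [
        (cosine(vectors[page], vec), other)
        for other, vec in vectors.items()
        if other != page
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [other for _, other in scores[:k]]


vectors = features(build_graph(journeys))
passport_recs = recommend(vectors, "/passport", k=2)
print(passport_recs)
```

In this toy data the passport pages and the driving pages form two disconnected clusters, so a visitor on `/passport` is only ever offered other passport pages. Learned embeddings serve the same purpose at scale, with the advantage that they compress each page's neighbourhood into a short dense vector.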
The government data scientists noted, “Through this process, we learned that creating the necessary data infrastructure which underpins the training and deployment of a model is the most time-consuming part.”
Finally, in a more commercial setting, a senior data scientist at leading media and marketing services company Meredith reports that graph algorithms are allowing the transformation of page views into pseudonymous identifiers with rich browsing profiles. This means Meredith can better understand its customers, an understanding that translates into significant revenue gains and an improved customer experience.
The foundation of modern data and analytics
Graph-enabled data science is set to become a key part of business analytics, delivering beneficial business insights, in 2021 and beyond. In its June 2020 ‘Top 10 Data and Analytics Technology Trends for 2020’ report, Gartner predicts that the ability to find relationships in combinations of diverse data using graph technology at scale “will form the foundation of modern data and analytics”.
Gartner has also polled companies about their use of AI and machine learning techniques and found a remarkably high 92% said they plan to employ graph technology within five years. Clearly, graph data science is set to become part of advanced data and analytics capabilities at the enterprise level, so now is a perfect time to evaluate its potential.
The author is Lead Product Manager – Data Science at the world’s leading graph database, Neo4j