What is a graph database? – Definition from WhatIs.com

A graph database, also known as a semantic database, is a software application designed to store, query, and modify network graphs. A network graph is a visual construct made up of nodes and edges. Each node represents an entity (such as a person) and each edge represents a connection or relationship between two nodes.
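As a minimal sketch of the node-and-edge idea, the Python snippet below models a graph as a set of nodes plus a list of labeled edges. The names `Edge` and `neighbors` and the sample people are illustrative assumptions, not part of any particular graph database product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str  # node where the relationship starts
    label: str   # kind of relationship
    target: str  # node where the relationship ends


# Hypothetical sample data: two people and one relationship between them.
nodes = {"Alice", "Carol"}
edges = [Edge("Alice", "knows", "Carol")]


def neighbors(node: str) -> list[str]:
    """Return every node reachable from `node` by following one outgoing edge."""
    return [e.target for e in edges if e.source == node]
```

Because edges are directed here, `neighbors("Alice")` finds Carol while `neighbors("Carol")` finds nobody; an undirected relationship would need an edge in each direction.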

Graph databases have been around in some form for a long time. For example, a family tree is a very simple graph database.

Using databases to digitally map relationships began to take hold in business around 2015, when increasing computing power, in-memory computing and agreed standards moved the concept from academia into commercial and enterprise computing.

Graph databases are well suited to analyzing interconnections, which is why there has been a lot of interest in using them to mine social media data. They are also useful for working with data in business disciplines that involve complex relationships and dynamic schemas, such as supply chain management, identifying the source of an IP telephony problem, and creating recommendation engines (“customers who bought this also looked…”).

The concept behind representing data as a graph is often attributed to the 18th-century mathematician Leonhard Euler.

The structure of a graph database

Traditionally classified as a type of NoSQL database, graph databases are sometimes referred to as triple stores. This is because this type of database uses a special index that stores information about nodes, edges and the relationships between them in groups of three.

A triple, which can also be called an assertion, has three main fields: a subject, a predicate and an object. Each subject, predicate, or object is represented by a uniform resource identifier (URI).
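A triple can be sketched in Python as a three-field record. In the example below, the `Triple` type and the `http://example.org/` namespace are placeholders assumed for illustration; a real system would use whatever URIs its vocabulary defines.

```python
from typing import NamedTuple

# Placeholder namespace, assumed for illustration only.
EX = "http://example.org/"


class Triple(NamedTuple):
    subject: str    # URI of the entity the statement is about
    predicate: str  # URI of the relationship
    object: str     # URI of the entity the relationship points to


fact = Triple(EX + "Bob", EX + "marriedTo", EX + "Julie")

# A triple unpacks into exactly three fields.
s, p, o = fact
```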

How information is indexed

In a triple store, the first database field contains the subject URI, the second field contains the predicate URI, and the third field contains the object URI. Although there are a number of different strategies that graph databases can use to store triples, most use an index that abbreviates the three main fields to {?s, ?p, ?o}.

For example, suppose a graph connects a few people to each other, to the music they listen to and to the companies they work for. The index for that graph will look like this:

Row   ?s        ?p                ?o
1     :Bob      :marriedTo        :Julie
2     :Bob      :brotherOf        :Steve
3     :Bob      :listensTo        :RockMusic
4     :Julie    :listensTo        :RockMusic
5     :Julie    :sisterInLawTo    :Steve
6     :Jim      :worksFor         :IBM
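The index rows above can be mirrored in code as a list of (?s, ?p, ?o) tuples. This is only a sketch: the identifier spellings follow the ones used in the query examples in the next section, and `add_triple` is a hypothetical helper, not an API of any real triple store.

```python
# Rows of the example index as (subject, predicate, object) tuples.
INDEX = [
    (":Bob", ":marriedTo", ":Julie"),
    (":Bob", ":brotherOf", ":Steve"),
    (":Bob", ":listensTo", ":RockMusic"),
    (":Julie", ":listensTo", ":RockMusic"),
    (":Julie", ":sisterInLawTo", ":Steve"),
    (":Jim", ":worksFor", ":IBM"),
]


def add_triple(index, triple):
    """Append a triple unless it is already present, so each one is stored once."""
    if triple not in index:
        index.append(triple)
    return index


# Print the rows with 1-based row numbers, matching the table.
for row, (s, p, o) in enumerate(INDEX, start=1):
    print(row, s, p, o)
```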

How information from a graph database is queried

Each triple in a graph database is stored only once in the index. As with relational databases, performing a direct lookup query in a graph database is a simple process.

  • If the query is for known information about Bob, the indexer only needs to search rows 1 through 3 of the index.
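One possible way an indexer could restrict a lookup to Bob's rows is to group row numbers by subject ahead of time. The sketch below assumes that approach; `rows_by_subject` and `facts_about` are hypothetical names, and production graph databases use more elaborate index structures.

```python
from collections import defaultdict

TRIPLES = [
    (":Bob", ":marriedTo", ":Julie"),
    (":Bob", ":brotherOf", ":Steve"),
    (":Bob", ":listensTo", ":RockMusic"),
    (":Julie", ":listensTo", ":RockMusic"),
    (":Julie", ":sisterInLawTo", ":Steve"),
    (":Jim", ":worksFor", ":IBM"),
]

# Group 1-based row numbers by subject so a lookup touches only that subject's rows.
rows_by_subject = defaultdict(list)
for row, (s, p, o) in enumerate(TRIPLES, start=1):
    rows_by_subject[s].append(row)


def facts_about(subject):
    """Return (row, predicate, object) for every triple with this subject."""
    return [
        (row, TRIPLES[row - 1][1], TRIPLES[row - 1][2])
        for row in rows_by_subject.get(subject, [])
    ]
```

Calling `facts_about(":Bob")` reads rows 1 through 3 only and never inspects the rows that belong to Julie or Jim.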

The real power and speed of a graph database comes from indexing combinations of triples. Here are some examples:

  • If the query is for the person Bob is married to, the indexer will look for the :marriedTo predicate in rows 1-3 and then retrieve the matching object. (Bob is married to Julie.)
  • If the query is to identify everyone who listens to the same genre of music as Bob, the indexer will first ask for {:Bob :listensTo ?o} and identify :RockMusic as the object. It will then look for every row that has :RockMusic as its object.

For the second query, the follow-up lookup of :RockMusic returns rows 3 and 4. The subject of row 3 is Bob himself, so the subject of row 4 is the other person who listens to rock music. (It turns out to be Julie, Bob’s wife.)
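The two example queries can be sketched as pattern matching over the triples, with `None` standing in for an unbound variable such as ?o. The `match` function is a hypothetical helper that scans every row; it illustrates the logic of the queries, not the indexing that makes a real graph database fast.

```python
TRIPLES = [
    (":Bob", ":marriedTo", ":Julie"),
    (":Bob", ":brotherOf", ":Steve"),
    (":Bob", ":listensTo", ":RockMusic"),
    (":Julie", ":listensTo", ":RockMusic"),
    (":Julie", ":sisterInLawTo", ":Steve"),
    (":Jim", ":worksFor", ":IBM"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None plays the role of ?s, ?p or ?o."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Who is Bob married to?  Pattern: {:Bob :marriedTo ?o}
spouse = [obj for _, _, obj in match(s=":Bob", p=":marriedTo")]

# Who listens to the same genre as Bob?
# Step 1: {:Bob :listensTo ?o} finds Bob's genres.
genres = [obj for _, _, obj in match(s=":Bob", p=":listensTo")]
# Step 2: {?s :listensTo <genre>} finds every listener, then Bob is filtered out.
fans = [
    subj
    for genre in genres
    for subj, _, _ in match(p=":listensTo", o=genre)
    if subj != ":Bob"
]
```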

Types of graph databases

Historically, graph databases have been divided into two categories: property graphs, which simply support nodes and edges, and knowledge graphs like the one above, which focus on the semantic aspects of the data and store information in triples. Generally speaking, the indexing strategies for both types are similar.

It is expected that over time knowledge graphs and property graphs will merge and the architectural distinctions between these two types of graph databases will blur.
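To make the two categories concrete, the sketch below stores one fact in a property-graph style (node and edge records carrying key/value properties) and then flattens it into triples. The dictionary layout, the node identifiers and the `to_triples` helper are assumptions chosen for illustration; real products define their own data models.

```python
# Property-graph style: nodes and edges are records with key/value properties.
property_graph = {
    "nodes": {
        "jim": {"label": "Person", "name": "Jim"},
        "ibm": {"label": "Company", "name": "IBM"},
    },
    "edges": [
        {"from": "jim", "type": "worksFor", "to": "ibm"},
    ],
}


def to_triples(graph):
    """Flatten a property graph into (subject, predicate, object) triples."""
    triples = []
    # Each node property becomes its own triple about that node.
    for node_id, props in graph["nodes"].items():
        for key, value in props.items():
            triples.append((node_id, key, value))
    # Each edge becomes one triple linking two nodes.
    for edge in graph["edges"]:
        triples.append((edge["from"], edge["type"], edge["to"]))
    return triples
```

That the same information can be expressed either way is one reason the boundary between the two categories is easy to blur.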

Use cases for graph databases

Current use cases for graph databases include the following:

  • Enable data analysts to federate datasets without having to create and run complex queries that join combinations of tables, as in the relational database model.
  • Help developers build the back-end for voice assistants by mapping possible user questions to correct answers.
  • Identify groups of events that are unusually connected to detect fraud.
  • Examine direct connections to identify potential indirect connections for recommendation engines.
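The recommendation use case, turning direct connections into indirect ones, can be sketched in a few lines. The purchase data and the `recommend` function below are hypothetical; the ranking rule (count how many similar customers bought an item) is one simple choice among many.

```python
from collections import Counter

# Hypothetical direct connections: (customer, item purchased).
PURCHASES = [
    ("ann", "guitar"), ("ann", "amp"),
    ("ben", "guitar"), ("ben", "strings"),
    ("cat", "guitar"), ("cat", "strings"),
]


def recommend(customer):
    """Rank items bought by customers who share a purchase with `customer`."""
    owned = {item for c, item in PURCHASES if c == customer}
    # Peers are customers directly connected to one of the same items.
    peers = {c for c, item in PURCHASES if item in owned and c != customer}
    # Items the peers bought that the customer lacks are the indirect connections.
    counts = Counter(
        item for c, item in PURCHASES if c in peers and item not in owned
    )
    return [item for item, _ in counts.most_common()]
```

Here `recommend("ann")` follows ann → guitar → ben and cat → strings, a two-hop path that a relational model would typically express as a self-join.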

The future of graph databases

Graph databases are expected to play a major role over the next decade in fields as diverse as machine learning, Bayesian analysis, data science, and artificial intelligence, as well as in helping to manage enterprise data and data exchange.

One of the most significant impacts on this type of database will come from improved data federation. When knowledge graphs can be easily federated, a database will be able to determine that it needs data it does not have and automatically retrieve that data from another knowledge graph. With this capability, federation is likely to help developers build blockchains that use relevant metadata to authenticate transactions in banking, finance, voting, and smart contracts.
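The federation idea, a graph noticing that it lacks data and fetching it from another graph, can be illustrated with a toy in-memory model. Everything in the sketch below is an assumption made for illustration: the `KnowledgeGraph` class, its `peers` list and the fallback rule are hypothetical, and real federation happens over network protocols between separate systems.

```python
class KnowledgeGraph:
    """Toy in-memory triple store that can defer to peer graphs."""

    def __init__(self, triples, peers=()):
        self.triples = set(triples)
        self.peers = list(peers)

    def objects(self, subject, predicate):
        """Return matching objects, asking peers only when nothing is found locally."""
        found = {o for s, p, o in self.triples if s == subject and p == predicate}
        if found:
            return found
        # Local data is missing, so retrieve it from the federated peers.
        for peer in self.peers:
            found |= peer.objects(subject, predicate)
        return found


hr = KnowledgeGraph({(":Jim", ":worksFor", ":IBM")})
social = KnowledgeGraph({(":Bob", ":marriedTo", ":Julie")}, peers=[hr])

# `social` holds no employment data, so this lookup is answered by `hr`.
employer = social.objects(":Jim", ":worksFor")
```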

See also: social graph, graph search