The Rise of the Knowledge Graph
Toward Modern Data Integration
and the Data Fabric Architecture
REPORT
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. The Rise of the
Knowledge Graph, the cover image, and related trade dress are trademarks of
O’Reilly Media, Inc.
The views expressed in this work are those of the authors, and do not represent the
publisher’s views. While the publisher and the authors have used good faith efforts
to ensure that the information and instructions contained in this work are accurate,
the publisher and the authors disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from the use of or
reliance on this work. Use of the information and instructions contained in this
work is at your own risk. If any code samples or other technology this work contains
or describes is subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
This work is part of a collaboration between O’Reilly and Cambridge Semantics. See
our statement of editorial independence.
978-1-098-10037-7
Executive Summary
While data has always been important to business across industries,
in recent years the essential role of data in all aspects of business has
become increasingly clear. The availability of data in everyday life—
from the ability to find any information on the web in the blink
of an eye to the voice-driven support of automated personal
assistants—has raised the expectations of what data can deliver for
businesses. It is not uncommon for a company leader to say, “Why
can’t I have my data at my fingertips, the way Google does it for the
web?”
This is where a structure called a knowledge graph comes into play.
A knowledge graph is a combination of two things: business data in a
graph, and an explicit representation of knowledge. Businesses man‐
age data so that they can understand the connections between their
customers, products or services, features, markets, and anything else
that impacts the enterprise. A graph represents these connections
directly, allowing us to analyze and understand the relationships
that drive business forward. Knowledge provides background infor‐
mation such as what kinds of things are important to the company
and how they relate to one another. An explicit representation of
business knowledge allows different data sets to share a common
reference. A knowledge graph combines the business data and the
business knowledge to provide a more complete and integrated
experience with the organization’s data.
What does a knowledge graph do? To answer that question, let’s
consider an example. Knowledge graph technology allows Google to
include oral surgeons in a list when you ask for “dentists”; Google
manages the data of all businesses, their addresses, and what they do
in a graph. The fact that “oral surgeons” are a kind of “dentist” is
knowledge that Google combines with this data to present a fully
integrated search experience. Knowledge graph technology is essen‐
tial for achieving this kind of data integration.
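To make this concrete, here is a minimal sketch using the Python rdflib library: a single piece of knowledge (oral surgeons are a kind of dentist) is combined with ordinary business data so that a query for dentists also returns the oral surgeon. The business names, properties, and query are invented for illustration; this is not a description of Google's actual system.

from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.com/")
g = Graph()
g.bind("ex", EX)

# Knowledge: an oral surgeon is a kind of dentist.
g.add((EX.OralSurgeon, RDFS.subClassOf, EX.Dentist))

# Data: two local businesses and what they do (illustrative).
g.add((EX.SmileClinic, RDF.type, EX.Dentist))
g.add((EX.SmileClinic, EX.address, Literal("12 Main St")))
g.add((EX.JawWorks, RDF.type, EX.OralSurgeon))
g.add((EX.JawWorks, EX.address, Literal("34 Oak Ave")))

# A search for "dentists" follows the subclass link, so JawWorks is included.
query = """
PREFIX ex: <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?business ?address WHERE {
  ?business a/rdfs:subClassOf* ex:Dentist ;
            ex:address ?address .
}"""
for row in g.query(query):
    print(row.business, row.address)

The only thing that makes the second business appear in the result is the explicit subclass link; the data itself never calls it a dentist.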
We’ll start by looking at how enterprises currently use data, and how
that has been changing over the past couple of decades.
Introduction
Around 2010, a sea change occurred with respect to how we think
about and value data. The next decade saw the rise of the chief data
officer in many enterprises, and later the data scientist joined other
categories of scientists and engineers as an important contributor to
both the sum of human knowledge and the bottom line. Google cap‐
italized on the unreasonable effectiveness of data, shattering expect‐
ations of what was possible with it, while blazing the trail for the
transformation of enterprises into data-driven entities.
Data has increasingly come to play a significant role in everyday life
as more decision making becomes data-directed. We expect the
machines around us to know things about us: our shopping habits,
our tastes, our preferences (sometimes to a disturbing extent). Data
is used in the enterprise to optimize production, product design,
product quality, logistics, and sales, and even as the basis of new
products. Data made its way into the daily news, propelling our
understanding of business, public health, and political events. This
decade has truly seen a data revolution.
More than ever, we expect all of our data to be connected, and con‐
nected to us, regardless of its source. We don’t want to be bothered
with gathering and connecting data; we simply want answers that
are informed by all the data that can be available. We expect data to
be smoothly woven into the very fabric of our lives. We expect our
data to do for the enterprise what the World Wide Web has done for
news, media, government, and everything else.
But such a unified data experience does not happen by itself. While
the final product appears seamless, it is the result of significant
efforts by data engineers, as well as crowds of data contributors.
When data from all over the enterprise, and even the
industry, is woven together to create a whole that is
greater than the sum of its parts, we call this a data fabric.
How do we get from the current state of affairs, where our data is
locked within specific applications, to a data fabric, where data can
interact throughout the enterprise?
The temptation is to treat this problem like any other application
development challenge: find the right technology, build an applica‐
tion, and solve it. Data integration has been approached as if it is a
one-off application, but we can’t just build a new integration appli‐
cation each time we have a data unification itch to scratch. Instead,
we need to rethink the way we approach data in the enterprise; we
have to think of the data assets as valuable in their own right, even
separate from any particular application. When we do this, our data
assets can serve multiple applications, past, present, and future. This
is how we make a data architecture truly scalable; it has to be dura‐
ble from one application to the next.
Data strategists who want to replicate the success of everyday search
applications in the enterprise need to understand the combination
of technological and cultural change that led to this success.
Weaving a data fabric is at its heart a community effort, and the
quality of the data experience depends on how well that communi‐
ty’s contributions are integrated. A community of data makes very
specific technological demands. So it comes as no surprise that the
data revolution came to life only when certain technological advan‐
ces came together. These advances are:
Distributed data
This refers to data that is not in one place, but distributed across
the enterprise or even the world. Many enterprise data strate‐
gists don’t think they have a distributed data problem. They’re
wrong—in any but the most trivial businesses, data is dis‐
tributed across the enterprise, with different governance struc‐
tures, different stakeholders, and different quality standards.
Semantic metadata
This tells us what our data, and the relationships that connect it, mean. We don't need to be overly demanding about meaning to get more value from our data; we just need enough meaning to let us navigate from one data set to another in useful ways.
Connected data
This refers to an awareness that no data set stands on its own.
The meaning of any datum comes from its connection to other
data, and the meaning of any data set comes from its connec‐
tions to other data.
The knowledge graph, made up of a graph of connected data along
with explicit business knowledge, supports all of these new ways of
thinking about data. As a connected graph data representation, a
knowledge graph can handle connections between data. The explicit
representation of knowledge in a knowledge graph provides seman‐
tic metadata to describe data sources in a uniform way. Also, knowl‐
edge graph technology is inherently distributed, allowing it to
manage multiple, disparate data sets.
In our context, recent breakthroughs in graph database technologies have proven to be key disruptive innovations, marked by vast improvements in data scale and query performance. These improvements are allowing graph technology to graduate rapidly from a niche form of analytics to the data management mainstream. With the maturity of
knowledge graph technology, today’s enterprises are able to weave
their own data fabric. The knowledge graph provides the underpin‐
nings and architecture in which data can be shared effectively
throughout an enterprise.
In the next sections, we’ll introduce the two key technology compo‐
nents you’ll need to build a knowledge graph—namely a graph rep‐
resentation of data and an explicit representation of knowledge. We
will explore how these two important capabilities, in combination,
support the realization of a data fabric architecture in an enterprise.
Semantic Systems
Knowledge graphs draw heavily on semantic nets. A semantic net is,
as its name implies, a network of meaningful concepts. Concepts are
given meaningful names, as are their interlinkages. Figure 3 shows
an example of a semantic net about animals, their attributes, their
classifications, and their environments. Various kinds of animals are
related to one another with labeled links. The idea behind a semantic net is to make both the concepts and the relationships between them explicit, so that people and machines alike can refer to them by name.
Data Representation
One of the drivers of the information revolution was the ability of
computers to store and manage massive amounts of data; from an
early date, database systems were the main driver of information
technology in business.
Early data management systems were based on a synergy between
how people and machines organize data. When people organize data
on their own, there is a strong push toward managing it in a tabular
form; similar things should be described in a similar way, the think‐
ing goes. Library index cards, for example, all have a book title,
an author, and a Dewey Decimal classification, and every customer
has a contact address, a name, a list of things they order, etc. At the
same time, technology for dealing with orderly, grouped, and linked
tabular data, in the form of relational databases (RDBMS), was fast,
efficient, and easily optimized. This led to advertising these data sys‐
tems by saying that “they work in tables, just the way people think!”
Tables turned out to be amenable to a number of analytic
approaches, as online analytical processing systems allowed analysts
to collect, integrate, and then slice and dice data in a variety of ways.
This provided a multitude of insights beyond what was apparent in
the initial discrete data sets. A whole discipline of formal data mod‐
eling, now also known as schema on write, grew up, based on the
premise that data can be represented as interlinked tables.
RDBMS are particularly good at managing well-structured data at
an application level; that is, managing data and relationships that are
pertinent to a particular well-defined task, and a particular applica‐
tion that helps a person accomplish that specific task. Their main
drawback is the inflexibility that results because the data model
required to facilitate data storage, and to support the queries neces‐
sary to retrieve that data, must be designed up front. But businesses
need data from multiple applications to be aggregated, for reporting,
analytics, and strategic planning purposes. Thus enterprises have
become aware of a need for enterprise-level data management.
Anti-money laundering
In finance today, money laundering is big business. On a surpris‐
ingly large scale, so-called bad actors conceal ill-gotten gains in the
legitimate banking system. Despite widespread regulation for
detecting and managing money laundering, only a small fraction of
cases are detected.
In general, money laundering is carried out in several stages. This
typically involves the creation of a large number of legal organiza‐
tions (corporations, LLCs, trust funds, etc.) and passing the money
from one to another, eventually returning it, with an air of legiti‐
macy, to the original owner.
Graph data representations are particularly well suited to tracking
money laundering because of their unique capabilities. First, the
known connections between individuals and corporations naturally
form a network; one organization owns another, a particular person
sits on the board or is an officer of a corporation, and individuals are linked to the organizations and accounts they control.
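One consequence is that a suspicious round trip can be found simply by following links. The sketch below, in plain Python with invented transfer data, flags money that leaves an entity and eventually returns to it through a chain of intermediaries. Real anti-money-laundering systems are far more sophisticated; the point here is only that a graph makes the pattern natural to express.

from collections import defaultdict

transfers = [                      # (from, to) pairs of entities that moved money
    ("AcmeLLC", "HoldCo1"),
    ("HoldCo1", "HoldCo2"),
    ("HoldCo2", "FamilyTrust"),
    ("FamilyTrust", "AcmeLLC"),    # funds return to the original owner
    ("HoldCo1", "Supplier"),       # an ordinary, non-suspicious payment
]

graph = defaultdict(list)
for src, dst in transfers:
    graph[src].append(dst)

def cycles_back(start, node=None, seen=None):
    """Depth-first walk: does money starting at `start` ever return to it?"""
    node = node if node is not None else start
    seen = seen if seen is not None else set()
    for nxt in graph[node]:
        if nxt == start:
            return True
        if nxt not in seen:
            seen.add(nxt)
            if cycles_back(start, nxt, seen):
                return True
    return False

for entity in list(graph):
    if cycles_back(entity):
        print(f"Suspicious round trip involving {entity}")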
Collaborative filtering
In ecommerce, it is important to be able to determine what products
a particular shopper is likely to buy. Advertising combs to bald men
or diapers to teenagers is not likely to result in sales. Egregious
errors of this sort can be avoided by categorizing products and cus‐
tomers, but highly targeted promotions are much more likely to be
effective.
A successful approach to this problem is called collaborative filtering.
The idea of collaborative filtering is that we can predict what one
customer will buy by examining the buying habits of other custom‐
ers. Customers who bought this item also bought that item. You
bought this item—maybe you’d like that item, too?
Graph data makes it possible to be more sophisticated with collabo‐
rative filtering; in addition to filtering based on common purchases,
we can filter based on more elaborate patterns, including features of
the products (brand names, functionality, style) and information about the customers themselves.
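The basic "also bought" computation can be sketched in a few lines of Python over invented purchase data. A production recommender would work over a graph with many more kinds of links, but the shape of the computation is the same: start from an item, walk to the customers connected to it, and walk back out to the other items they are connected to.

from collections import Counter

purchases = {                       # customer -> items they bought (illustrative)
    "alice": {"novel", "bookmark", "reading lamp"},
    "bob":   {"novel", "bookmark"},
    "carol": {"novel", "reading lamp"},
    "dave":  {"cookbook", "apron"},
}

def also_bought(item):
    """Rank other items by how often they appear in a basket with `item`."""
    counts = Counter()
    for basket in purchases.values():
        if item in basket:
            counts.update(basket - {item})
    return counts.most_common()

print(also_bought("novel"))
# [('bookmark', 2), ('reading lamp', 2)]; ties may appear in either order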
Identity resolution
Identity resolution isn’t really a use case in its own right, but rather a
high-level capability of graph data. The basic problem of identity
resolution is that, when you combine data from multiple sources,
you need to be able to determine when an identity in one source
refers to the same identity in another source. We saw an example of
identity resolution already in Figure 8; how do we know that Chris
Pope in one table refers to the same Chris Pope in the other? In that
example, we used the fact that they had the same name to infer that
they were the same person. Clearly, in a large-scale data system, this sort of inference needs stronger evidence than a matching name alone.
Figure 12. Two graphs with similar information that look very differ‐
ent. We already saw part (A) in Figure 3. Part (B) has exactly the
same relationships between the same entities as (A), but is laid out dif‐
ferently. What criteria should we use to say that these two are the
“same” graph?
Technology Independence
When an enterprise invests in any information resource—docu‐
ments, data, software—an important concern is the durability of the
underlying technology. Will my business continue to be supported
by the technology I am using for the foreseeable future? But if I do
continue to be supported, does that chain me to a particular tech‐
nology vendor, so that I can never change, allowing that vendor to
potentially institute abusive pricing or terms going forward? Can I
move my data to a competitor’s system?
For many data technologies, this sort of flexibility is possible in
theory, but not in practice. Shifting a relational database system
from one major vendor to another is an onerous and risky task that few organizations undertake willingly.
Figure 13. The graph from Figure 3 (repeated here as part [A]), broken
down into its smallest pieces (part [B]).
Once we have identified the triples, we can sort them in any order,
such as alphabetical order, as shown in Figure 14, without affecting
the description of the graph.
While there are of course more technical details to RDF, this is basi‐
cally all you need to know: RDF represents a graph by identifying
each node and link with a URI (i.e., a web-global identifier), and
breaks the graph down into its smallest possible parts, the triple.
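As a minimal sketch, the following Python code uses the rdflib library to build a small graph and list it as triples, sorted alphabetically. The animal facts are illustrative stand-ins for the content of Figure 3; the point is that each node and link is identified by a URI and that the order of the triples does not matter.

from rdflib import Graph, Namespace

EX = Namespace("http://example.com/animals/")
g = Graph()
g.bind("ex", EX)

g.add((EX.Otter, EX.isA, EX.Mammal))
g.add((EX.Otter, EX.livesIn, EX.River))
g.add((EX.Otter, EX.eats, EX.Fish))
g.add((EX.Fish, EX.livesIn, EX.River))

# The graph is just a set of triples, so sorting them changes nothing.
for subject, link, obj in sorted(g):
    print(subject, link, obj)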
Figure 15. Data about authors, their gender, and the topics of their
publications (A). The pattern (B) finds “Female authors who have pub‐
lished in Science.”
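The pattern in part (B) of Figure 15 can be expressed directly as a query over the graph. Here is a minimal sketch using rdflib and SPARQL; the author data and the property names are invented stand-ins for the figure.

from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.com/")
g = Graph()
g.bind("ex", EX)

g.add((EX.ada, EX.gender, Literal("female")))
g.add((EX.ada, EX.publishedIn, EX.Science))
g.add((EX.bert, EX.gender, Literal("male")))
g.add((EX.bert, EX.publishedIn, EX.Science))
g.add((EX.cleo, EX.gender, Literal("female")))
g.add((EX.cleo, EX.publishedIn, EX.Nature))

# The query is itself a small graph, with ?author left open as a variable.
query = """
PREFIX ex: <http://example.com/>
SELECT ?author WHERE {
  ?author ex:gender "female" ;
          ex:publishedIn ex:Science .
}"""
for row in g.query(query):
    print(row.author)   # only ex:ada matches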
Graph data on its own provides powerful capabilities that have made
it a popular choice of application developers, who want to build
applications that go beyond what has been possible with existing
data management paradigms, in particular, relational databases.
While graph data systems have been very successful in this regard,
they have not addressed many of the enterprise-wide data manage‐
ment needs that many businesses are facing today. For example, with
the advent of data regulations like GDPR and the California
Consumer Privacy Act (CCPA), enterprises need to have an explicit
catalog of data; they need to know what data they have and where
they can find it. They also need to be able to align data with the business concepts it describes.
What Is a Vocabulary?
A vocabulary is simply a controlled set of terms for a collection of
things. Vocabularies can have very general uses, like lists of coun‐
tries or states, lists of currencies, or lists of units of measurement.
Vocabularies can be focused on a specific domain, like lists of medi‐
cal conditions or lists of legal topics. Vocabularies can have very
small audiences, like lists of product categories in a single company.
Vocabularies can also be quite small—even just a handful of terms—
or very large, including thousands of categories. Regardless of the
scope of coverage or the size and interests of the audience, all vocab‐
ularies are made up of a fixed number of items, each with some
identifying code and common name.
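The W3C's SKOS standard, discussed later in this report, is a common way to publish such a vocabulary. As a minimal sketch, the following Python code uses rdflib to express a tiny, invented product-category vocabulary in SKOS, giving each term an identifying code and a common name.

from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.com/vocab/")
g = Graph()
g.bind("ex", EX)
g.bind("skos", SKOS)

g.add((EX.productCategories, RDF.type, SKOS.ConceptScheme))
for code, label in [("BOOK", "Books"), ("VID", "Videos"), ("MUS", "Music")]:
    term = EX[code]
    g.add((term, RDF.type, SKOS.Concept))
    g.add((term, SKOS.inScheme, EX.productCategories))
    g.add((term, SKOS.notation, Literal(code)))     # identifying code
    g.add((term, SKOS.prefLabel, Literal(label)))   # common name

print(g.serialize(format="turtle"))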
What Is an Ontology?
We have already seen, in our discussion of controlled vocabularies, a
distinction between an enterprise-wide vocabulary and the repre‐
sentations of that vocabulary in various data systems. The same ten‐
sion happens on an even larger scale with the structure of the
various data systems in an enterprise; each data system embodies its
own reflection of the important entities for the business, with
important commonalities repeated from one system to the next.
We’ll demonstrate an ontology with a simple example. Suppose you
wanted to describe a business. We’ll use a simple example of an
online bookstore. How would you describe the business? One way
you might approach this would be to say that the bookstore has cus‐
tomers and products. Furthermore, there are several types of prod‐
ucts—books, of course, but also periodicals, videos, and music.
Figure 16. A simple ontology that reflects the enterprise data structure
of an online bookstore.
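As a minimal sketch, the ontology described above can be written down with rdflib using the W3C's RDFS language (introduced later in this report). The class and property names follow the prose; the actual Figure 16 may differ in its details.

from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.com/bookstore/")
g = Graph()
g.bind("ex", EX)

# Classes: the kinds of things the bookstore cares about.
for cls in (EX.Customer, EX.Product, EX.Book, EX.Periodical, EX.Video, EX.Music):
    g.add((cls, RDF.type, RDFS.Class))

# Books, periodicals, videos, and music are all kinds of Product.
for sub in (EX.Book, EX.Periodical, EX.Video, EX.Music):
    g.add((sub, RDFS.subClassOf, EX.Product))

# One relationship between the classes: customers order products.
g.add((EX.orders, RDF.type, RDF.Property))
g.add((EX.orders, RDFS.domain, EX.Customer))
g.add((EX.orders, RDFS.range, EX.Product))

# Every node and link above has a URI, so it can be referenced from outside.
print(g.serialize(format="turtle"))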
Put yourself in the shoes of a data manager, and have a look at Fig‐
ure 16; you might be tempted to say that this is some sort of sum‐
marized representation of the schema of a database. This is a natural
observation to make, since the sort of information in Figure 16 is
very similar to what you might find in a database schema. One key
feature of an ontology like the one in Figure 16 is that every item
(i.e., every node, every link) can be referenced from outside the ontology itself, because each one has a web-global identifier.
Data catalog
Many modern enterprises have grown in recent years through merg‐
ers and acquisitions of multiple companies. After the financial crisis
of 2008, many banks merged and consolidated into large conglom‐
erates. Similar mergers happened in other industries, including life
sciences, manufacturing, and media. Since each component company brought its own enterprise data landscape to the merger, the result was a complex combination of data systems. An immediate consequence was that many of the new companies did not know what data they had, or where to find it.
Data harmonization
When a large organization has many data sets (it is not uncommon
for a large bank to have many thousands of databases, with millions
of columns), how can we compare data from one to another? There
are two issues that can confuse this situation. The first is terminol‐
ogy. If you ask people from different parts of the organization what a
“customer” is, you’ll get many answers. For some, a customer is
someone who pays money; for others, a customer is someone they
deliver goods or services to. For some, a customer might be internal
to the organization; for others, a customer must be external. How do
we know which meaning of a word like “customer” a particular
database is referring to?
The second issue has to do with the relationships between entities.
For example, suppose you have an order for a product that is being
sent to someone other than your account holder (say, as a gift).
What do you call the person who is paying for the order? What do
you call the person who is receiving the order? Different systems are
likely to refer to these relationships by different names. How can we
know what they are referring to?
An explicit representation of business knowledge disambiguates
inconsistencies of this sort. This kind of alignment is called data
harmonization; we don't change how these data sources refer to their customers and orders; instead, we map each source's terms to a shared, explicitly defined meaning, as sketched below.
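Here is a minimal sketch of that mapping idea in plain Python. The source systems, field names, and shared relationship names are all invented; the point is that each source keeps its own terms, while an explicit mapping carries them to a shared, well-defined meaning.

# Shared relationships and what they mean (the explicit business knowledge).
SHARED_MEANINGS = {
    "payingParty":    "the party that pays for the order",
    "receivingParty": "the party the order is delivered to",
}

# How each source's field names line up with the shared relationships.
FIELD_MAPPINGS = {
    ("billing_db",  "customer"): "payingParty",
    ("shipping_db", "customer"): "receivingParty",
    ("crm",         "account"):  "payingParty",
}

def harmonize(source, record):
    """Re-express a source record in terms of the shared relationships."""
    return {FIELD_MAPPINGS.get((source, field), field): value
            for field, value in record.items()}

print(harmonize("billing_db",  {"customer": "Chris Pope", "total": 42.0}))
print(harmonize("shipping_db", {"customer": "Dana Pope",  "city": "Omaha"}))

# The shared glossary travels with the mappings, so every consumer sees the
# same definition of each relationship.
for name, meaning in SHARED_MEANINGS.items():
    print(f"{name}: {meaning}")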
Data validation
As an enterprise continues to do business, it of course gathers new
data. This might be in the form of new customers, new sales to old
customers, new products to sell, new titles (for a bookstore or media
company), new services, new research, ongoing monitoring, etc. A
thriving business will generate new data all the time.
But this business will also have data from its past business: informa‐
tion about old products (some of which may have been discontin‐
ued), order history from long-term customers, and so on. All of this
data is important for product and market development, customer
service, and other aspects of a continuing business.
It would be nice if all the data we have ever collected had been
organized the same way, and collected with close attention to qual‐
ity. But the reality for many organizations is that there is a jumble of
data, a lot of which is of questionable quality. An explicit knowledge
model can express structural information about our data, providing
a framework for data validation. For example, if our terminology
knowledge says that “gender” has to be one of M, F, or “not pro‐
vided,” we can check data sets that claim to specify gender. Values
like “0,” “1,” or “ ” are suspect; we can examine those sources to see
how to map these values to the controlled values.
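As a minimal sketch, the following Python code checks a few invented records against the controlled values described above and flags the suspect ones for review.

ALLOWED_GENDER = {"M", "F", "not provided"}    # the controlled vocabulary

records = [                                    # illustrative incoming data
    {"id": "A1", "gender": "F"},
    {"id": "A2", "gender": "0"},               # suspect: not a controlled value
    {"id": "A3", "gender": " "},               # suspect: blank
    {"id": "A4", "gender": "M"},
]

suspect = [r for r in records if r["gender"] not in ALLOWED_GENDER]
for r in suspect:
    print(f"Record {r['id']}: gender value {r['gender']!r} needs review or mapping")

Because the allowed values are written down once, the same check can be applied to a web order form and to a decades-old database.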
Having an explicit representation of knowledge also allows us to
make sure that different validation efforts are consistent; we want to
use the same criteria for vetting new data that is coming in (say,
from an order form on a website) as we do for vetting data from ear‐
lier databases. A shared knowledge representation provides a single
point of reference for validation information.
Data and metadata standards can answer questions like these. Now,
has someone already developed standards of this sort, or do we have to build our own bespoke way of recording and sharing metadata in
our organization? Once again, the World Wide Web Consortium
(W3C) comes to the rescue, by providing standards for knowledge
sharing. In this report, we have identified two major types of knowl‐
edge that we want to share—reference knowledge and conceptual
knowledge. The W3C has two standards, one for each type of
knowledge. The first is called the Simple Knowledge Organization
System (SKOS), for managing reference knowledge. The second is
called the RDF Schema language (RDFS), for managing conceptual
knowledge.
Both of these languages use the infrastructure of RDF as a basis, that
is, they represent the knowledge in the form of a graph. This allows the knowledge and the data to be stored, connected, and queried with the same machinery.
(A note on naming: the link between a data item and its class is often labeled simply as type, but that can be confusing in examples like this, so in what follows we make clear that it is a relationship by calling it has type.)
The labels on the arrows in Figure 20 are the same as the column
headers in the table in Figure 19. Notice that the N/A values are not
shown at all, since, unlike a tabular representation, a graph does not
insist that every property have a value.
What is the connection between the data and the ontology? We can
link the data graph in Figure 20 to the ontology graph in Figure 16
simply by connecting nodes in one graph to another, as shown in
Figure 21. There is a single triple that connects the two graphs,
labeled has type. This triple simply states that SKU1234 is an
instance of the class Product. In many cases, combining knowledge
and data can be as simple as this; the rows in a table correspond
directly to instances of a class, whereas the class itself corresponds to
the table. This connection can be seen graphically in Figure 21: the
data is in the bottom of the diagram (SKU1234 and its links), the
ontology in the top of the diagram (a copy of Figure 16), and a sin‐
gle link between them, shown in bold in the diagram and labeled
with “has type.”
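As a minimal sketch, the following rdflib code builds the data node for SKU1234 with a few illustrative property values and then adds the single connecting triple, using the standard rdf:type link to play the role of has type.

from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.com/bookstore/")
g = Graph()
g.bind("ex", EX)

# The data: one row of the product table, as a node with its properties.
# (Bracket lookup avoids the str.format name clash on Namespace objects.)
g.add((EX.SKU1234, EX["format"], Literal("Book")))
g.add((EX.SKU1234, EX.pages, Literal(304)))
# N/A columns are simply omitted; a graph does not require every property.

# The single connecting triple: SKU1234 has type Product.
g.add((EX.SKU1234, RDF.type, EX.Product))

print(g.serialize(format="turtle"))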
But even in this simple example, we have some room for refinement.
Our product table includes information about the format of the
product. A quick glance at the possible values for format suggests
that these actually correspond to different types of Product, repre‐
sented in the ontology as classes in their own right, and related to
the class Product as subclasses. So, instead of just saying that
SKU1234 is a Product, we will say, more specifically, that it is a
Book. The result is shown in Figure 22.
There are a few lessons we can take from this admittedly simplistic
example. First, a row from a table can correspond to an instance of
more than one class; this just means that more than one class in the
ontology can describe it. But more importantly, when we are more
specific about the class that the record is an instance of, we can be
more specific about the data that is represented. In this example, the
ontology includes the knowledge that Books have pages (and hence,
numbers of pages), whereas Videos have runtimes. Armed with this
information, we could determine that SKU2468 (The Modeled Fal‐
con) has an error; it claims to be a video, but it also has specified a
number of pages. Videos don't have pages; they have runtimes.
Figure 22. Data and knowledge in one graph. In this case, we have interpreted the format field as indicating more specifically what type of product the SKU is an instance of. We include SKU2468 as an instance of Video, as well as SKU1234 as an instance of Book.
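The refinement, and the error check it enables, can be sketched with rdflib as follows. The subclass links let both SKUs still be found as Products, while a simple query flags any Video that claims a number of pages. Identifiers and values are illustrative.

from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.com/bookstore/")
g = Graph()
g.bind("ex", EX)

# Knowledge: Book and Video are subclasses of Product.
g.add((EX.Book, RDFS.subClassOf, EX.Product))
g.add((EX.Video, RDFS.subClassOf, EX.Product))

# Data: the two records, typed according to their format.
g.add((EX.SKU1234, RDF.type, EX.Book))
g.add((EX.SKU1234, EX.pages, Literal(304)))
g.add((EX.SKU2468, RDF.type, EX.Video))
g.add((EX.SKU2468, EX.pages, Literal(112)))   # suspicious: videos have runtimes

# Both SKUs still count as Products, via the subclass links.
products = g.query("""
PREFIX ex: <http://example.com/bookstore/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?p WHERE { ?p a/rdfs:subClassOf* ex:Product . }""")
print([str(row.p) for row in products])

# A structural check: flag Videos that specify a number of pages.
errors = g.query("""
PREFIX ex: <http://example.com/bookstore/>
SELECT ?p WHERE { ?p a ex:Video ; ex:pages ?n . }""")
for row in errors:
    print("Check this record:", row.p)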
Modular knowledge
Knowledge is better in context, and context is supplied by more
knowledge. A large-scale enterprise will manage a wide variety of
data about products, services, customers, supply chains, etc. Each of
these areas will have knowledge in the form of metadata and con‐
trolled vocabularies that allow the enterprise to manage its business.
But much of the interest in the business will lie at the seams between
these areas: sales are where customers meet products, marketing is
where features meet sales records, etc.
Merging data and knowledge in a single graph lets us treat knowl‐
edge as a separate, reusable, modular resource, to be used through‐
out the enterprise data architecture.
Self-describing data
When we can map our metadata knowledge directly to the data, we
can describe the data in business-friendly terms, but also in a
machine-readable way. The data becomes self-describing: its meaning travels with it, because we can query the knowledge and the data all in one place.
Customer 360
Knowledge graphs play a role in anything 360, really: product 360,
competition 360, supply chain 360, etc. It is quite common in a large
enterprise to have a wide variety of data sources that describe cus‐
tomers. This might be the result of mergers and acquisitions, or sim‐
ply because various data systems date to a time when the business
was simpler and do not cover the full range of customers that the
business deals with today. The same can be said about products,
supply chains, and pretty much anything the business needs to
know about.
We have already seen how an explicit representation of knowledge
can provide a catalog of data sources in an enterprise. A data catalog
tells us where we go to find all the information about a customer:
you go here to find demographic information, somewhere else to
find purchase history, and still somewhere else to find profile infor‐
mation about user preferences. The ontology then provides a sort of
road map through the landscape of data schemas in the enterprise.
The ontology allows the organization to know what it knows, that is,
to have an explicit representation of the knowledge in the business.
When we combine that road map with the data itself, as we do in a
knowledge graph, we can extend those capabilities to provide
insights not just about the structure of the data but about the data
itself.
Right to privacy
A knowledge graph builds on top of the capabilities of a data cata‐
log. As we discussed earlier, a request to be forgotten as specified by
GDPR or CCPA requires some sort of data catalog, to find where
appropriate data might be kept. Having a catalog that indicates
where sensitive data might be located in your enterprise data is the
first step in satisfying such a request, but to complete the request, we
need to examine the data itself. Just because a database has customer
PII does not mean a particular customer’s PII is in that database; we
need to look at the actual instances themselves. This is where the
capabilities of a knowledge graph extend the facility of a simple data
catalog. In addition to an enterprise data catalog, the knowledge
graph includes the data from the original sources as well, allowing it
to find which PII is actually stored in each database.
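As a minimal sketch, the following rdflib code holds both a tiny catalog of PII-bearing sources and a few records loaded from them, and asks which sources actually contain data about one named person. The source names, properties, and people are all invented.

from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.com/")
g = Graph()
g.bind("ex", EX)

# Catalog: two sources that may hold customer PII.
g.add((EX.crmDB, RDF.type, EX.CustomerPIISource))
g.add((EX.ordersDB, RDF.type, EX.CustomerPIISource))

# Instance data loaded from those sources.
g.add((EX.rec1, EX.fromSource, EX.crmDB))
g.add((EX.rec1, EX.customerName, Literal("Chris Pope")))
g.add((EX.rec2, EX.fromSource, EX.ordersDB))
g.add((EX.rec2, EX.customerName, Literal("Dana Pope")))

# Which sources actually hold records about Chris Pope?
query = """
PREFIX ex: <http://example.com/>
SELECT DISTINCT ?source WHERE {
  ?record ex:customerName "Chris Pope" ;
          ex:fromSource ?source .
  ?source a ex:CustomerPIISource .
}"""
for row in g.query(query):
    print(row.source)   # only ex:crmDB

A data catalog alone could say that both databases may hold PII; only the combined graph can say that this particular request involves the CRM database and not the orders database.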
Sustainable extensibility
The use cases we have explored all have an Achilles’ heel: how do
you build the ontology, and how do you link it to the databases in
the organization? The value of having a knowledge graph that links
all the data in the enterprise should be apparent by now. But how do
you get there? A straightforward strategy whereby you build an
ontology, and then painstakingly map it to all the data sets in the
enterprise, is attractive in its simplicity but isn’t very practical, since
the value of having the knowledge graph doesn’t begin to show up
until a significant amount of data has been mapped. This sort of
delayed value makes it difficult to formulate a business case.
A much more attractive strategy goes by the name sustainable exten‐
sibility. It is an iterative approach, where you begin with a simple
ontology and map a few data sets to it. Apply this small knowledge graph to one of the many use cases we've outlined here, or any other that delivers value quickly; then extend the ontology and map additional data sets as new needs arise.
Let’s take a look at how this new paradigm deals with our simple
example of NAICS codes. A large part of the value of a standardized
coding system like NAICS is the fact that it is a standard; making ad
hoc changes to it damages that value. But clearly, many users of the
NAICS codes have found it useful to extend and modify the codes in
various ways. The NAICS codes have to be flexible in the face of
these needs; they have to simultaneously satisfy the conflicting needs
of standardization and extensibility. Our data landscape needs to be
able to satisfy this sort of apparently contradictory requirement in a
consistent way.
The NAICS codes have many applications in an enterprise, which
means that the reusable NAICS data set will play a different role in
combination with other data sets in various settings. A flexible data
landscape will need to express the relationship between NAICS
codes and other data sets; is the code describing a company and its
business, or a market, or is it linked to a product category?
The problems with management of the NAICS codes become evi‐
dent when we compare the typical way they are managed with a
data-centric view. The reason why we have so many different repre‐
sentations of NAICS codes is that each application has a particular
use for them, and hence maintains them in a form that is suitable for
that use. An XML-based application will keep them as a document, a
database will embed them in a table, and a publisher will have them
as a spreadsheet for review by the business. Each application main‐
tains them separately, without any connection between them. There
is no indication about whether these are the same version, whether
one extends the codes, and in what way. In short, the enterprise does
not know what it knows about NAICS codes, and doesn’t know how
they are managed.
If we view NAICS codes as a data product, we expect them to be
maintained and provisioned like any product in the enterprise. They
will have a product description (metadata), which will include infor‐
mation about versions. The codes will be published in multiple
forms (for various uses); each of these forms will have a service level
agreement, appropriate to the users in the enterprise.