Document 4
Abbreviations
y
Grover and Leskovec, 2016
B. Weisfeiler, A. Leman
DEXPI is a machine-readable P&ID exchange format under development by the DEXPI
Initiative. The initiative consists of owner operators, engineering, procurement &
construction companies, software vendors and research institutions. The latest data
model and the associated DEXPI specification 1.3 (Theißen and Wiedau, 2021) were
published in 2021. The specification combines several international standards for
the description of engineering-relevant P&ID data, e.g. ISO 15926 (International
Organization for Standardization, 2013), ISO 10628 (International Organization for
Standardization, 2012a), IEC 62424 (International Electrotechnical Commission,
2016) and ISO 10209 (International Organization for Standardization, 2012b). In
particular, these cover plant breakdown structures, instrumentation, properties of
equipment and components, and piping topology. The DEXPI information model is
already offered by some manufacturers and can be exchanged via the Proteus XML
schema (Proteus XML, 2017). At the same time, DEXPI can serve as a platform for
digital plant data in the process industry (Wiedau et al., 2019), which can
significantly reduce the development time of chemical and biotechnological
production plants. Additionally, interoperability increases due to the continuous
integration of DEXPI into existing engineering software (Fillinger et al., 2017).
The uniform, machine-readable format and its increasing acceptance in the process
industry improve the potential for data science applications and allow the
application of artificial intelligence (Wiedau et al., 2021).
is defined (Bahdanau et al., 2014; Hamilton, 2020). In the context of this work, a
softmax approach (9) is used to calculate the weight factor
StellarGraph 2020
(4)
IEEE Trans. Signal Process., 45 (1997), pp. 2673-2681, 10.1109/78.650093
A. Grover, J. Leskovec
Separation units 15
Graph-based P&ID formats are a promising way to improve the machine readability of
important process information. The standardized DEXPI format, which is defined and
continuously improved by the DEXPI Initiative, is able to store the topology of
process plants as well as all apparatus specifications in a structured way and make
them available in a machine-interpretable manner. This standardization enables the
retrieval of information and a simple application of AI models. Thus, in addition
to making P&ID data accessible, two approaches have been presented in this work
that recognize patterns in P&IDs and use AI to support P&ID synthesis and increase
its efficiency. First, P&ID graphs are decomposed into sequences, and subsequent
components are predicted by RNNs based on the partial structure drawn so far. High
accuracies were achieved in this regard and, in particular, the use of AI-based
suggestion algorithms was quantified. Second, the use of GNNs allows for learning
correlations in plant topologies. In the present case, existing components in P&IDs
were classified for a consistency check. Here, too, good results were achieved in a
first feasibility analysis with a recursive GNN, and the data basis can be expanded
for the application of AI. Further fields of application for GNNs are possible, for
example the prediction of connections (e.g. pipelines or signal lines) and the
detection of subgraph structures, which could help to automatically detect
functional equipment assemblies (as a part of the engineering of modular plants).
Zaheer et al., 2017
When parsing the DEXPI files, it becomes apparent that the level of detail in the
P&ID description varies greatly depending on the user: the number of XML attributes
that are filled in differs from file to file. At the same time, the use of the
attributes leaves some room for interpretation, so that equivalent information was
mapped to different attributes. This requires a certain degree of robustness,
which has been considered in the DEXPI-2-graph implementation. For each piece of
information (e.g. design temperature, pressure, material, …), several attributes
are therefore searched in turn until the desired information for the respective
node is found.
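The fallback search over several attributes can be sketched as follows. This is a
minimal illustration and not the DEXPI-2-graph implementation: the attribute names
and the helper first_filled are hypothetical placeholders, and a node is
represented as a plain dictionary of attribute strings.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical candidate attribute names; real DEXPI exports may use others.
DESIGN_TEMPERATURE_KEYS = (
    "DesignTemperature",
    "DesignTemperatureMax",
    "TemperatureDesign",
)


def first_filled(attributes: Mapping[str, str],
                 candidates: Sequence[str]) -> Optional[str]:
    """Return the first non-empty value stored under one of the candidate keys."""
    for key in candidates:
        value = attributes.get(key, "")
        if value and value.strip():
            return value.strip()
    return None  # the information is missing for this node


# Two users storing the same information under different attributes.
vessel_a = {"DesignTemperature": "120 degC"}
vessel_b = {"DesignTemperature": "", "TemperatureDesign": " 120 degC "}

print(first_filled(vessel_a, DESIGN_TEMPERATURE_KEYS))
print(first_filled(vessel_b, DESIGN_TEMPERATURE_KEYS))
```

Ordering the candidate keys by reliability keeps the lookup deterministic when
more than one attribute is filled in.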
arbitrary node u of a graph
AI-based suggestions can be used to speed up the drawing of P&IDs. Sequences of
drawn and connected components are used to learn their typical order with the help
of a recurrent neural network (RNN). RNNs are neural networks whose structure
allows them to model time series (or, in this case, linearly structured sequences)
in the training data (Hu and Balasubramaniam, 2008). Based on the learned
correlations, a node prediction is carried out, which returns the most probable
subsequent P&ID components for a given input sequence. The modeling workflow is
explained in more detail in the following.
output
Further opportunities are safety assessment and HAZOP studies, where
machine-readable HAZOP scenarios could be mapped to graph-based plant topologies
using search algorithms. Furthermore, the future detection of subgraphs in P&IDs,
i.e. functional equipment assemblies, could facilitate the mapping between PFD
simulations based on unit operations and P&IDs. This could provide the foundation
for the automated generation of PFD simulations from a DEXPI plant topology.
u
The workflow of the node classification is shown in Fig. 8. First, all nodes of all
P&ID graphs in the dataset are divided into a training dataset (80%) and a test
dataset (20%) using a mask. The neural network is then provided with information
about the topology of the graph as well as attributes of the nodes and edges, e.g.
equipment class and connection type. From this information, the network generates
an embedding for each node and predicts the node class. The prediction is compared
with the real node class and the error is reduced via backpropagation. After
training, the network can be used for node classification of unseen data (nodes).
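The 80/20 split via a mask can be illustrated with a few lines of Python. This is
a sketch under assumptions: the function name split_mask, the fixed seed and the
rounding rule are illustrative choices, and only the node count of 1641 is taken
from the dataset description.

```python
import random
from typing import List


def split_mask(num_nodes: int, train_fraction: float = 0.8,
               seed: int = 0) -> List[bool]:
    """Return a boolean mask: True marks a training node, False a test node."""
    indices = list(range(num_nodes))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    num_train = round(train_fraction * num_nodes)
    train_indices = set(indices[:num_train])
    return [i in train_indices for i in range(num_nodes)]


mask = split_mask(1641)  # all nodes of all P&ID graphs share one mask
num_train = sum(mask)
num_test = len(mask) - num_train
print(num_train, num_test)
```

Because the mask is drawn over individual nodes, training and test nodes can sit
in the same graph, so the network still sees the full topology during message
passing while the loss is evaluated only on the masked-in nodes.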
Digital Chemical Engineering, Volume 3, 2022, Article 100028
node2vec
Declaration of Competing Interest
Hamilton, 2020
PHASuite: an automated HAZOP Analysis tool for chemical processes
Piping equipment 41
An agent-based environment for operational design
Introduction
Fig. 7. Example of the neighborhood aggregation of a GNN using a P&ID.
(5)
and CONCAT concatenates the individually calculated aggregations of each node.
Conclusion & outlook
International Organization for Standardization 2012a
l
weight by which the features of nodes influence each other
Check valves 60
References
The first step is sampling, during which the graph with its networked structure of
nodes and edges is transformed into linear, sequential input data. These input data
consist of lists of contiguous nodes, each of which represents a linear section of
the interconnected graph. In the sampling process, all possible turns based on the
number of outgoing edges are made at branches to obtain a reliable representation
of all node interconnections via random walks (Grover and Leskovec, 2016). The
sampling is performed with the BiasedRandomWalk class of the Python library
StellarGraph (stellargraph.data.BiasedRandomWalk, version v1.0.0rc1) (StellarGraph,
2020). The biased random walk requires four input parameters. The number of walks
defines how many walks are generated from each node in the graph. The walk length
specifies how many nodes are considered per walk. The remaining two parameters, the
return hyperparameter p and the in-out hyperparameter q, guide the walk: 1/p is the
unnormalized probability of returning to the previously visited node, while 1/q is
the unnormalized probability of moving on to nodes farther away from it, i.e. of
discovering new nodes in the graph. In this way, the depth of the search can be
controlled specifically (Grover and Leskovec, 2016). Since the generated samples
should represent a clean and linear section of the plant topology, the parameters
must be chosen such that the random walk jumps back as rarely as possible and
continuously explores new paths. Previous investigations have shown that convincing
results can be achieved with p = 1000 and q = 1. Smaller values of p lead to an
undesired probability of sampling against the flow direction. The sequential
samples represent the actual training data for AI modeling and have a previously
defined length l. They are divided in such a way that the first l-1 entries
represent the input sequence x, while the entry at position l is the corresponding
output y. The dataset used in this work is composed of a total of 4923 sequences,
each consisting of six nodes. For validation, 20% of the data set is randomly
retained as a test set. The remaining 80% is used to train the RNN.
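To make the role of p and q concrete, the following pure-Python sketch re-creates
the node2vec transition rule and the split of a walk into input x and output y. It
is an illustration only and not the StellarGraph code: the function names, the
undirected adjacency dictionary and the toy pipeline section are assumptions made
for this example.

```python
import random
from typing import Dict, List, Tuple


def biased_walk(adj: Dict[str, List[str]], start: str, length: int,
                p: float, q: float, rng: random.Random) -> List[str]:
    """Generate one node2vec-style second-order walk with at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = adj[current]
        if not neighbors:
            break  # isolated node, nothing to explore
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is unbiased
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the previous node
            elif candidate in adj[previous]:
                weights.append(1.0)      # stay at distance 1 from previous
            else:
                weights.append(1.0 / q)  # move farther away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def split_sequence(walk: List[str]) -> Tuple[List[str], str]:
    """First l-1 nodes form the input x, the node at position l is the output y."""
    return walk[:-1], walk[-1]


# Toy linear pipeline section; node names are illustrative.
adj = {
    "vessel": ["valve_1"],
    "valve_1": ["vessel", "pump"],
    "pump": ["valve_1", "check_valve"],
    "check_valve": ["pump", "heat_exchanger"],
    "heat_exchanger": ["check_valve", "valve_2"],
    "valve_2": ["heat_exchanger"],
}

rng = random.Random(42)
walk = biased_walk(adj, "vessel", length=6, p=1000.0, q=1.0, rng=rng)
x, y = split_sequence(walk)
print(x, "->", y)
```

With p = 1000 and q = 1, a step back receives a weight of 0.001 against a weight
of 1 for moving on, so at an inner node of this pipeline the walk reverses in
roughly one out of a thousand steps.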
(8)
2023, Computers and Chemical Engineering
M. Zaheer, S. Kottur, S. Ravanbakhsh, B. Poczos, R. Salakhutdinov, A. Smola
Datasets
Bahdanau et al., 2014
As mentioned before, the information from the P&ID is interpreted in the form of a
graph. This makes it possible to store the relationships between components and the
topology in an unambiguous and machine-interpretable way. However, to learn the
graph structure as a whole and to solve tasks such as node classification, edge
classification or link prediction, machine learning methods of graph analysis are
required that can deal with non-Euclidean data structures such as graphs. The
modeling of graph structures is particularly interesting in the field of P&ID
engineering. By learning connections (e.g. piping, signal lines, …) or components
(e.g. valves, equipment, …) based on their neighborhood with the help of AI, it
will be possible in the future to perform consistency checks and detect errors in
P&IDs. This could reduce the time needed for drawing P&IDs and thus for developing
a plant's documentation. To achieve this goal, Graph Neural Networks (GNNs), which
have become increasingly important in recent years (Zhou et al., 2020), can be used
for modeling. A GNN is based on a message passing algorithm that aggregates
arbitrary information from the neighborhood of a node and thereby performs a
convolution on the graph (Hamilton, 2020). In general, the message passing of a GNN
is analogous to the Weisfeiler-Lehman algorithm for testing the isomorphism of two
graphs (Weisfeiler and Leman, 1968), in which information is likewise aggregated
from the neighborhood of each node.
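The analogy to the Weisfeiler-Lehman algorithm can be made tangible with a small
label refinement sketch. It is a simplified illustration: labels are kept as
readable strings instead of being compressed into new colors, and the function
name wl_refine as well as the toy graph are assumptions made for this example.

```python
from typing import Dict, List


def wl_refine(adj: Dict[str, List[str]], labels: Dict[str, str],
              rounds: int) -> Dict[str, str]:
    """Refine node labels by repeatedly aggregating the labels of the neighbors."""
    current = dict(labels)
    for _ in range(rounds):
        updated = {}
        for node, neighbors in adj.items():
            # Aggregate: sorted multiset of neighbor labels (order-invariant).
            neighborhood = sorted(current[n] for n in neighbors)
            # Update: combine the own label with the aggregated neighborhood.
            updated[node] = f"{current[node]}({','.join(neighborhood)})"
        current = updated
    return current


# Toy graph: a vessel connected to two valves and a pump.
adj = {
    "V1": ["T1"],
    "T1": ["V1", "V2", "P1"],
    "V2": ["T1"],
    "P1": ["T1"],
}
labels = {"V1": "valve", "V2": "valve", "T1": "vessel", "P1": "pump"}

refined = wl_refine(adj, labels, rounds=1)
for node, label in refined.items():
    print(node, label)
```

A message passing GNN follows the same aggregate-and-update pattern, but replaces
the string concatenation with trainable, differentiable functions on feature
vectors.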
Zhao et al., 2005
Deep Learning and Practice with MindSpore, Cognitive Intelligence and Robotics
Fig. 1. Use cases of artificial intelligence to accelerate and improve the
synthesis of…
Another option is to form the message via a set pooling approach (7), which is
similar to the aggregation of a Graph Isomorphism Network (GIN) (Xu et al., 2019).
According to Zaheer et al. (2017), this uses an
Comput. Chem. Eng., 21 (1997), pp. S71-S76, 10.1016/S0098-1354(97)87481-9
(Bahdanau et al., 2014).
with trainable parameters
Machine Learning : A Probabilistic Perspective
In the following, several P&IDs in the standardized DEXPI format are used as
training data. They were exported using the program PlantEngineer from the software
vendor X-Visual Technologies GmbH and converted to graphs in GraphML format
(GraphML Project Group, 2017) according to chapter 2.1. In total, 35 P&ID graphs
from third parties (laboratory and industrial plants) with 1641 nodes and 1410
edges are used. The data set contains 92 different equipment classes (valves,
pumps, vessels, instrumentation, etc.) based on the DEXPI specification (Theißen
and Wiedau, 2021) and has three different classes of edges (pipes, signal lines,
process connection lines). The ratio of nodes to edges shows that, as expected for
P&IDs, these are very linear graphs with rather low connectivity. On closer
inspection, there are usually many single nodes along a pipeline (e.g. valves,
vessels, pumps, heat exchangers, measuring points, etc.), which result in a kind of
dead end. Additionally, some P&IDs show inconsistencies in their drawn structures,
which in some cases lead to isolated nodes or several smaller graphs. However,
these inconsistencies were deliberately kept in the data set, as the data is
intended to represent the current state of machine-readable P&IDs in the process
industry and thus to yield representative results. The influence of the
inconsistencies on the results is examined in more detail in chapter 4.
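The structural properties mentioned here (edge-to-node ratio, isolated nodes,
several smaller graphs) can be checked with a short script. The following is a
sketch on a toy graph; the function name graph_summary and the example nodes are
illustrative, and only the totals of 1641 nodes and 1410 edges are taken from the
text.

```python
from typing import Dict, List, Set, Tuple


def graph_summary(nodes: List[str], edges: List[Tuple[str, str]]):
    """Return the edge/node ratio, the isolated nodes and the connected components."""
    adj: Dict[str, Set[str]] = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    isolated = [n for n in nodes if not adj[n]]

    components: List[Set[str]] = []
    seen: Set[str] = set()
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first search over one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        components.append(component)

    return len(edges) / len(nodes), isolated, components


# Toy P&ID graph with two unconnected symbols (a drawing inconsistency).
nodes = ["tank", "valve", "pump", "sensor", "spare_valve"]
edges = [("tank", "valve"), ("valve", "pump")]
ratio, isolated, components = graph_summary(nodes, edges)
print(ratio, isolated, len(components))

# Every component with n nodes needs at least n - 1 edges, so a graph with
# N nodes and M edges has at least N - M connected components.
min_components = 1641 - 1410
print(round(1410 / 1641, 3), min_components)
```

Applied to the reported totals, the bound gives at least 231 connected components
across the 35 P&IDs, which is consistent with the isolated nodes and fragmented
graphs described for this data set.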
International Electrotechnical Commission 2016
The results show that RNNs are generally able to learn patterns in sequences from
P&ID graphs. The SimpleRNN provides the best results with a validation accuracy of
78.36%. If a prediction is counted as correct whenever the true equipment is among
the five most likely predictions, an accuracy of 95.2% is achieved. The BRNN
reaches an accuracy of 94.39% for the five most likely equipment types. The LSTM
and GRU have slightly lower accuracy, suggesting that the vanishing gradient does
not have a significant effect on the training for the short sequences involved. At
the same time, it should be noted that training the GRU took less than one-third
of the time of the SimpleRNN model. Given the currently small amount of data, this
is not a decisive factor in the current setting. However, should the models be
trained on large data sets or continuously in the future, more attention should be
given to this aspect, as the use of GRUs or LSTMs can save time and resources
(Strubell et al., 2019), which should be considered with respect to sustainable
process development.
M. Wiedau, L. von Wedel, H. Temmen, R. Welke, N. Papakonstantinou
Introduction
1
In the following, the different RNN models are trained with the P&ID graphs
generated in chapter 2.2 according to the presented workflow. The implementation
is done in Python using the Keras library (Chollet, 2020). The "Adam" optimizer
(Kingma and Ba, 2014) is used for all training runs and the loss is calculated with
the "categorical cross entropy" (Murphy, 2012). The prediction accuracy is used as
an evaluation metric and is defined as follows.
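As an illustration of such a metric, the sketch below assumes the common
definition of accuracy as the share of samples whose true class is the top-ranked
prediction, and extends it to a top-k variant as used when the five most likely
predictions are considered. The function name and the toy probabilities are
illustrative and not taken from the study.

```python
from typing import Sequence


def top_k_accuracy(probabilities: Sequence[Sequence[float]],
                   targets: Sequence[int], k: int = 1) -> float:
    """Share of samples whose true class is among the k most probable classes."""
    hits = 0
    for scores, target in zip(probabilities, targets):
        # Class indices sorted from most to least probable.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Toy model output for four samples and three equipment classes.
probabilities = [
    [0.70, 0.20, 0.10],
    [0.10, 0.50, 0.40],
    [0.35, 0.40, 0.25],  # true class 0 is only ranked second here
    [0.20, 0.30, 0.50],
]
targets = [0, 1, 0, 2]

print(top_k_accuracy(probabilities, targets, k=1))
print(top_k_accuracy(probabilities, targets, k=2))
```

For a suggestion tool, the top-k value is arguably the more relevant figure,
since several candidate components can be offered to the engineer at once.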
Towards a systematic data harmonization to enable AI application in the process
industry
ISO 10628-2 - Diagrams for the Chemical and Petrochemical Industry – Part 2:
Graphical Symbols
Manaswi, 2018
Apress, Berkeley (2018), 10.1007/978-1-4842-3516-4
Digital Chemical Engineering
W.L. Hamilton
For ease of use, the DEXPI-2-graph converter is equipped with a graphical user
interface (GUI), which is shown in Fig. 4. The figure also shows a visualization of
an extracted P&ID graph. The folder containing the DEXPI P&IDs to be converted is
selected and the conversion is started via buttons. A console window shows the
progress, errors and the generated GraphML files. In addition, a plot window allows
the generated P&ID graphs to be checked directly. The converter including the GUI
is published as an open-source Python application and is available on GitHub at
https://github.com/TUDoAD/DEXPI2graphML (Oeing, 2022).
Artificial intelligence; Process synthesis; P&ID; Process engineering; Detail engineering
embedding/feature of a node u
Fig. 9. Results of the node classification in a P&ID graph via recursive GNN
grouped by the applied aggregation functions.
The reduction of a graph to canonical form and the algebra which appears therein
Weisfeiler and Leman, 1968
Wiedau et al., 2019
ENPRO data integration: extending DEXPI towards the asset lifecycle
input
Fig. 4. GUI of the DEXPI-2-graph converter
Hu, 2008
Chem. Ing. Tech., 91 (2019), pp. 240-255, 10.1002/cite.201800112
(3)
into a one-hot vector, i.e., a vector consisting of zeros except for a single entry
that is set to one (Gulli and Pal, 2017). The sum of these vectors makes it
possible to determine exactly which one-hot vectors were used to generate it.
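The property that a sum of one-hot vectors can be decoded again is easy to verify
in code. The following sketch uses an illustrative subset of four equipment
classes; the class list and the helper names are assumptions made for this
example.

```python
from typing import Dict, List

CLASSES = ["valve", "pump", "vessel", "heat_exchanger"]  # illustrative subset


def one_hot(label: str) -> List[int]:
    """Encode a class label as a vector with a single entry set to one."""
    return [1 if label == c else 0 for c in CLASSES]


def sum_vectors(vectors: List[List[int]]) -> List[int]:
    """Element-wise sum of equally long vectors."""
    return [sum(column) for column in zip(*vectors)]


def decode_sum(total: List[int]) -> Dict[str, int]:
    """Recover how often each class contributed to the summed vector."""
    return {c: n for c, n in zip(CLASSES, total) if n > 0}


neighbors = ["valve", "valve", "pump"]
total = sum_vectors([one_hot(n) for n in neighbors])
print(total)
print(decode_sum(total))
```

A mean instead of a sum would lose this information: one valve and one pump would
give the same averaged vector as two valves and two pumps.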
Fig. 2. Structure of the Python DEXPI-2-graph converter.
Acknowledgment
Yoshinari Hashimoto, Hiroto Kase
Volume 4, September 2022, 100038
Results – node prediction
Fig. 3. P&ID topology representing GraphML structure used for further training
GraphML Project Group 2017
Fig. 6. Results of the training of following P&ID equipment with different RNN
models
Murphy, 2012
v
The design and engineering of piping and instrumentation diagrams (P&IDs) is a very
time-consuming and labor-intensive process. Although P&IDs show common patterns
that could be reused during development, the drawing is usually created manually
and built up from scratch for each process. The aim of this paper is to recognize
these patterns with the help of artificial intelligence (AI) and to make them
available for the development and the drawing process of P&IDs. In order to achieve
this, P&ID data is made accessible for AI applications through the DEXPI format,
which is a machine-readable, manufacturer-independent exchange standard for P&IDs.
It is demonstrated how deep learning models trained with DEXPI P&ID data can
support the engineering as well as drawing of P&IDs and therefore decrease labor
time and costs. This is achieved by assisted prediction of equipment in P&IDs based
on recurrent neural networks as well as consistency checks based on graph neural
networks.
(9)
Elsevier
Neural Comput., 9 (1997), pp. 1735-1780, 10.1162/neco.1997.9.8.1735
Keywords
Fig. 5. Workflow of an RNN-based model for predicting subsequent equipment in
P&IDs.
Algorithmische Graphentheorie, De Gruyter Studium
Heat exchangers 86
Proceedings of the NIPS (2017), p. 17
C. Zhao, M. Bhushan, V. Venkatasubramanian
Chollet, 2020
Deep Learning with Keras: Implementing Deep Learning Models and Neural Networks
With the Power of Python
International Electrotechnical Commission, 2016. IEC 62424, Representation of
process control engineering – requests in P&I diagrams and data exchange between
P&ID tools and PCE-CAE tools. International Electrotechnical Commission, Geneva.
To better understand the modeling of plant topology by message passing GNNs, an
example is given in Fig. 7 that relates the aggregation of neighborhood information
to a snippet of a P&ID. The example shows the aggregation by a two-layer neural
network. Since the plant topology is to be learned, we focus in the following on
the equipment information, such as the classes of each component in the P&ID. Thus,
in a first step (k = 1), inferences can be made about the vessel based on the
information from the valve and the heat exchanger. In a second step (k = 2), a
valve's and a temperature sensor's information can be aggregated for the embedding
of the heat exchanger, while the valve's embedding is influenced by the connected
drive and flow control.
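Which components can influence a node after k message passing layers can be
sketched by computing its k-hop neighborhood. The topology below is a hypothetical
reading of the described snippet (Fig. 7 is not reproduced here), and the function
name receptive_field is chosen for illustration.

```python
from typing import Dict, List, Set


def receptive_field(adj: Dict[str, List[str]], node: str, layers: int) -> Set[str]:
    """Nodes whose information reaches `node` after `layers` aggregation steps."""
    reached = {node}
    frontier = {node}
    for _ in range(layers):
        # Expand by one hop and keep only newly discovered nodes.
        frontier = {n for f in frontier for n in adj[f]} - reached
        reached |= frontier
    return reached - {node}


# Assumed topology around the vessel; names are illustrative.
adj = {
    "vessel": ["valve", "heat_exchanger"],
    "valve": ["vessel", "drive", "flow_control"],
    "heat_exchanger": ["vessel", "valve_2", "temperature_sensor"],
    "drive": ["valve"],
    "flow_control": ["valve"],
    "valve_2": ["heat_exchanger"],
    "temperature_sensor": ["heat_exchanger"],
}

print(sorted(receptive_field(adj, "vessel", layers=1)))
print(sorted(receptive_field(adj, "vessel", layers=2)))
```

With one layer the vessel only sees the valve and the heat exchanger; with two
layers the drive, the flow control, the second valve and the temperature sensor
also contribute indirectly through the embeddings of its direct neighbors.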