Petri Nets for AI?

A Petri Net is one of the less well-known types of value propagation networks.  Like Artificial Neural Networks, Petri Nets are capable of producing extremely complex behavior.  The good news is that they are more stable in recurrent operation than Artificial Neural Networks.  The bad news is that, unlike ANNs, there is no widespread and well-founded body of work on how to train them effectively.  The training methods that have been used tend to be either extremely general, like evolutionary methods, or extremely ad hoc, in the sense of “somebody tried this and found that sometimes, for some problems, it seems to sort of work.”

However, in a quest for general AI, classic Artificial Neural Network training techniques are inappropriate anyway.  Classic neural net training algorithms assume a constant set of connections and a fixed topology; in the absence of either, they simply cannot operate.  If we assume that we’re trying to create something that has the same kind of emergent fine detail and self-organizing architecture as a brain, then we have to consider the neuroplasticity of biological brains.  Neurons, once thought never to be replaced in humans, do in fact sometimes wear out and get replaced.  They get replaced with neurons that don’t connect in exactly the same way, but which continue to develop, change, and refine the emergent structure.  Existing neurons develop new dendrites or even axons that change their connectivity, and the way the brain uses them, over time.

We have strong hints that this kind of neuroplasticity is not only useful in recovering from brain lesions and injuries, but also a fundamental mechanism in the development of our brains’ fine structure.  Among other things, humans – the only species to develop the kind of flexible intelligence we’re talking about – have neuroplasticity to a degree astonishing among the species of Earth; our brains undergo many orders of magnitude more structural change over time than the brains of, e.g., horses or elephants or turtles.

When we talk about neural network training algorithms that can utilize plasticity – neural rerouting and network topology changes, in our terms – we are mostly talking about some form of Evolutionary Algorithms anyway.  And if we’re talking about an Evolutionary Algorithm, we can train Petri Nets with it as easily as we can train Neural Networks.

For those unfamiliar with them, a Petri Net, like a Neural Network, is a directed graph of nodes and edges in which values (in this case integer values considered as counts of ‘tokens’) are transmitted from node to node via the edges, according to rules associated with each edge.  But there is an important difference.  When tokens are transmitted along an edge in a Petri Net, they are subtracted from the source node and added to the destination node.  You can think of this as a conservation-of-energy rule, and this conservation is why Petri Nets are easier to keep stable than recurrent Artificial Neural Networks.  Over time, the total number of tokens in a Petri Net does not change.  Petri Nets are naturally recurrent, in that the tokens circulate among the nodes.
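To make that concrete, here is a minimal sketch in Python – the names and the single rule shown are my own invention, not any standard Petri Net library:

    from dataclasses import dataclass

    @dataclass
    class Edge:
        src: int        # index of the source node
        dst: int        # index of the destination node
        strength: int   # number of tokens moved per firing

        def can_fire(self, tokens):
            # Simplest possible rule: the source must hold enough tokens.
            return tokens[self.src] >= self.strength

        def fire(self, tokens):
            # Conservation: whatever leaves the source arrives at the
            # destination, so sum(tokens) never changes.
            tokens[self.src] -= self.strength
            tokens[self.dst] += self.strength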

In a ‘classical’ Petri Net, the edges are unordered.  At each step in time one of them is selected at random, its rule is checked, and if the rule’s conditions are met, the edge ‘fires’ – tokens are moved from source to destination.  In practice, if you want your Petri Net to run efficiently, you can first check the edges attached to the nodes whose token counts have just changed; otherwise you can check all edges in any sequence, ordered or random.  Some Petri Nets are capable of ‘deadlock’ – reaching a state in which no edge can fire – and some are not.  A ‘deadlock’ in a Petri Net is analogous to ‘quiescence’ in a recurrent Artificial Neural Network – the moment when no node fires and the whole system comes to a permanent halt.
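Using the Edge sketch above, one time step of the classical semantics – pick one enabled edge at random and fire it – might look like this (again just a sketch, with a deliberately deadlock-free ring as the example):

    import random

    def step(tokens, edges, rng=random):
        # Classical semantics: one randomly chosen enabled edge fires per step.
        enabled = [e for e in edges if e.can_fire(tokens)]
        if not enabled:
            return False              # deadlock: no edge can fire
        rng.choice(enabled).fire(tokens)
        return True

    # Two tokens circulating forever around a three-node ring.
    tokens = [2, 0, 0]
    ring = [Edge(0, 1, 1), Edge(1, 2, 1), Edge(2, 0, 1)]
    for _ in range(100):
        assert step(tokens, ring)     # the ring never deadlocks
    assert sum(tokens) == 2           # conservation holds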

A related condition is called ‘saturation’.  Saturation, in an Artificial Neural Network, means that every node fires, every time, creating positive feedback – a repetitive state every bit as meaningless as quiescence.  Classical Hebbian training of a neural network (which always increases and never decreases the connection weights) always leads to saturation unless an additional step is taken to downregulate the network weights.  In a Petri Net, the analogous state would be one in which every edge can fire, and no sequence of firings could reach a state in which some edge cannot.
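To see why in miniature – this is a toy of my own, not anybody’s published training rule – compare a bare Hebbian update with one that adds a decay step:

    # Bare Hebbian update: with nonnegative activations, w only ratchets up.
    w, lr = 0.5, 0.1
    for pre, post in [(1.0, 1.0)] * 50:
        w += lr * pre * post               # grows without bound

    # The same update with multiplicative decay stays bounded:
    w2, decay = 0.5, 0.9
    for pre, post in [(1.0, 1.0)] * 50:
        w2 = decay * w2 + lr * pre * post  # converges toward lr / (1 - decay)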

But both of these conditions are extremely simple to avoid in a Petri Net.  These two invariants result in a Petri Net that provably avoids both deadlock and saturation:

1.  The edges include a fully connected subgraph – an edge in each direction between every pair of nodes – whose rule is that one token may be transferred from the source to the destination if the source has more than zero tokens and the destination has zero or fewer tokens.

2.  The total number of tokens in the network is greater than zero and less than the number of nodes.

A positive total number of tokens means that there is always at least one node with a positive number of tokens.  A total number of tokens less than the number of nodes means that there is always at least one node with zero or fewer tokens (if every node held at least one token, the total would equal or exceed the number of nodes).  A fully connected subgraph contains all the nodes, and therefore at all times at least one edge of that subgraph must connect a node with a positive number of tokens to a node with zero or fewer tokens.  Therefore at all times, there exists at least one edge that can fire.  And because each firing actually transfers a token, no node can keep a negative number of tokens forever, and no node can keep a positive number of tokens forever.  So no matter what else happens in the network, every member of this set of edges will sometimes have its conditions met for firing (no deadlock) and sometimes not (no saturation).
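Here is the construction as a sketch, with my own naming; note that node counts are allowed to go below zero, as the ‘zero or fewer’ wording implies:

    from itertools import permutations
    import random

    def make_backbone(n_nodes):
        # Invariant 1: an edge for every ordered pair of nodes, so the
        # backbone is a fully connected subgraph containing all the nodes.
        return list(permutations(range(n_nodes), 2))

    def backbone_step(tokens, backbone, rng=random):
        # Backbone rule: move one token from a source with more than zero
        # tokens to a destination with zero or fewer.
        enabled = [(s, d) for (s, d) in backbone
                   if tokens[s] > 0 and tokens[d] <= 0]
        s, d = rng.choice(enabled)    # invariant 2 guarantees non-empty
        tokens[s] -= 1
        tokens[d] += 1

    # Invariant 2: total tokens strictly between 0 and the node count.
    tokens = [1, 1, 0, 0, 0]          # 0 < 2 < 5
    backbone = make_backbone(5)
    for _ in range(1000):
        backbone_step(tokens, backbone)
    assert 0 < sum(tokens) < 5        # token transfer preserves the total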

To these edges you can add any number of other edges, with whatever rules you like.  As long as every firing transfers a positive number of tokens – conserving the total, and so preserving the second invariant – any edge whose rule can be satisfied with one token at its source and zero at its destination will sometimes fire and sometimes not, and the network as a whole will never saturate and never deadlock.

The number of tokens transferred is the strength of the edge, similar in concept to the weight of a connection in an Artificial Neural Network.  Adjusting strengths, like adjusting weights in ANNs, is one of the ways a Petri Net can be trained.  This can be done via evolutionary methods as easily as evolutionary methods can be applied to training an ANN, or with ‘swarm’ methods or similar ad hoc hill-climbing techniques; but Petri Nets have nothing closely comparable to the broad experience and the proven, elegant, reliable training algorithms that exist for ANNs.
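As a sketch of what the evolutionary route looks like – the mutation rate and operator here are arbitrary choices of mine, not established practice – mutating the strengths of the Edge sketch from earlier is nearly a one-liner:

    import random
    from dataclasses import replace

    def mutate_strengths(edges, rate=0.1, rng=random):
        # Nudge each edge's strength up or down by one token with
        # probability `rate`, never letting it fall below one.
        return [replace(e, strength=max(1, e.strength + rng.choice((-1, 1))))
                if rng.random() < rate else e
                for e in edges]

A full evolutionary loop would evaluate each mutated net on the task at hand and keep the fitter variants; the mutation operator above is the only Petri-Net-specific part.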

Petri Nets can have extremely complex rules.  They can have ‘edges’ that connect more than one source and/or more than one destination, or several different ‘colors’ of tokens, with rules that determine when an edge can fire based on the counts of several different colors, or that transfer only tokens of a particular color.  Individual nodes can have maximum or minimum numbers of tokens they can hold.  But as Petri Nets grow more complex, so do the sets of invariants necessary to maintain a proof that the network can neither deadlock nor saturate.  So far there is little evidence that such complications actually help much with training or evolved behavior.  Evidence of their value may be discovered if and as we develop a proper body of knowledge about training and operating Petri Nets, but it is at least as likely that it will not.
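To show the flavor of such extensions – this is purely illustrative, not standard colored-Petri-Net notation – a color-sensitive, capacity-limited rule might look like this:

    # Tokens are counted per color at each node.  The (arbitrary) rule
    # below needs a red and a blue token at the source, transfers only
    # the red one, and respects a capacity limit at the destination.
    def can_fire_colored(src, dst, capacity=5):
        return (src["red"] > 0 and src["blue"] > 0
                and sum(dst.values()) < capacity)

    def fire_colored(src, dst):
        # Transfer one red token; the blue token only gates the firing.
        src["red"] -= 1
        dst["red"] += 1

    a = {"red": 2, "blue": 1}
    b = {"red": 0, "blue": 0}
    if can_fire_colored(a, b):
        fire_colored(a, b)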

I am very much aware that the nodes in our ANNs are extremely simplified models that only vaguely remind real neurobiologists of the way the neurons of a biological brain operate.  But if ANNs hold up a cloudy and cracked mirror to the structure of biological brains, Petri Nets are holding up a bowl of pudding, or something equally random.  There is absolutely nothing in a Petri Net that resembles anything about the way biological brains work.

Lately I have been thinking about Petri Nets, and about ways to model or modify them so that there are tractable methods for training them, because my recurrent Artificial Neural Networks, which I’m training via evolutionary methods, frequently become unstable or fall into self-reinforcing repetitive patterns.