
What happened back then?

21 October 2022

An article on applying existing software engineering practices to smart industry solutions.

How often have you asked that question, “What happened back then?”, after discovering a problem that was caused by something that happened way back in the past?
It helps a lot to have data available from the time the problem occurred. However, as I explained earlier in an article called “Data, data, everywhere”, it’s not easy to collect the right data without drowning in a sea of seemingly unrelated data.
In that article I looked at data collection from two sides: first the technology available to collect data, and second the fact that data should be collected ‘wisely’.
What I didn’t go into there were possible ways to do the second: avoiding too much data, or the wrong data. In this article I’m going to fill that gap by putting something in between the functional need to gather data and the technologies I mentioned in “Data, data, everywhere”: some software engineering practices.

The role of software
In factory automation and smart industry it’s no surprise that data gathering, handling and analysis are largely done in software. After all, in a way data itself is also software. At least, that is what we called music and video at Philips 25 years ago.

When we use software to collect data in a factory, we can collect data at all levels of operation: at machine control level from our PLCs or other controllers, at process control level from the MES system, and at factory level from the relevant parts of the ERP system. All of these are connected to a network, so the data is relatively easy to access. We can use dedicated interfaces, or read directly from the databases of these systems. We can also analyse the data on the fly, or gather it in a central place and optimise it for further analysis. Data scientists (specialists in gathering and analysing data) call that last activity ‘data cleaning’ or ‘data optimisation’.

For cleaning, analysing and reporting, lots of different software solutions are available. If a commercial or open source application can’t help us, we can always write our own using one of the many data science development kits.
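To give an idea of what such a cleaning step can look like, here is a minimal sketch in Python using pandas; the column names and plausibility limits are made up for illustration.

# A minimal sketch of a 'data cleaning' step, assuming raw machine logs
# loaded into a pandas DataFrame. Column names are illustrative only.
import pandas as pd

def clean_machine_log(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()
    # Parse timestamps and drop rows where parsing fails
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    df = df.dropna(subset=["timestamp"])
    # Remove exact duplicates produced by repeated polling
    df = df.drop_duplicates()
    # Keep only physically plausible temperature readings
    df = df[(df["temperature_c"] > -40) & (df["temperature_c"] < 250)]
    return df.sort_values("timestamp").reset_index(drop=True)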

Still, having these opportunities sidesteps the real question: how can we reduce the amount of data we gather to just what we need? The short answer is that we can’t, simply because we don’t know what data we may need at a given point in the future. However, we can increase the chance that we have the right data, without just randomly collecting everything.

Back to the functional level
In order to achieve that, we have to take a step back and take off our technologist hat for a little while. After all, when an operator or manager in a factory wonders “What happened there?”, he or she is most likely not thinking about what exactly the hardware or software was doing. Instead, their main concern is whether somebody made a mistake, whether the temperature in a silo became too high, or whether a delivery of materials did not arrive. Yes, these examples are completely different from each other and relate to completely different processes or parts of the factory. And yes, they do have something in common: they are events that occur in the factory. In this case, events of the form ‘something went wrong’. There are also events of the type ‘a planned or controlled action was completed’, which occur when a step in a process or procedure is completed. It’s these events that can help us select the data that is relevant.

Before going into that, let me first introduce a term from software development that’s been around for over a decade now: Event Sourcing.
Event sourcing is easiest explained using something we all use every day: our bank account.

If you go to the app or web site of your bank and open the overview of your account, you’ll see the current balance and the most recent transactions. Each transaction is an event that has occurred and that resulted in a change of your account balance, a delta. A positive delta occurs when money was added, a negative one when money was withdrawn. With each transaction, there’s an indication of the delta, a timestamp and the identification of the other party involved.
Basically, that’s all we need to keep track of what is going on in our bank account. Of course there is a lot more going on in the software that handles the transactions, but as a customer of the bank, what is in the transaction overview is what really matters to me.
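To make that concrete, here is a minimal sketch in Python of such a transaction list and the balance derived from it; the amounts and counterparties are of course made up.

# A minimal sketch of the bank-account example: each transaction is an
# event with a delta, a timestamp and the other party involved.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Transaction:
    timestamp: datetime
    counterparty: str
    delta: float  # positive = money added, negative = money withdrawn

transactions = [
    Transaction(datetime(2022, 10, 1, 9, 0), "Employer Ltd", 2500.00),
    Transaction(datetime(2022, 10, 3, 14, 30), "Supermarket", -87.45),
    Transaction(datetime(2022, 10, 5, 8, 15), "Energy company", -120.00),
]

# The current balance is simply the sum of all deltas since the account was opened.
balance = sum(t.delta for t in transactions)
print(f"Current balance: {balance:.2f}")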

Event sourcing is a software architecture pattern based on this idea. Or actually, it is based on double-entry accounting, described by the 15th-century Italian mathematician Luca Pacioli. He documented how to register every transaction in a business, next to the current balance. Keeping both of these is why it is called double-entry accounting.
In Event Sourcing, the focus is on the transaction side of this approach. Instead of storing the status of an object (e.g. a production order), only its status changes are stored. For a production order, that could lead to registering the order being created, then scheduled, then produced, then stored in the warehouse and finally shipped.
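As an illustration, such a production order could be stored as a list of events rather than as a single record with a ‘status’ field. This is just a sketch; the event types and details shown are examples, not a fixed schema.

# Illustrative only: status changes of a production order stored as events,
# never as an updated 'current status' field.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class OrderEvent:
    order_id: str
    event_type: str   # e.g. "created", "scheduled", "produced", "stored", "shipped"
    timestamp: datetime
    details: dict     # whatever extra data makes the event useful later on

events = [
    OrderEvent("PO-1001", "created",   datetime(2022, 10, 3, 8, 0),   {"product": "X-42"}),
    OrderEvent("PO-1001", "scheduled", datetime(2022, 10, 3, 9, 30),  {"line": "L2"}),
    OrderEvent("PO-1001", "produced",  datetime(2022, 10, 4, 15, 10), {"quantity": 500}),
    OrderEvent("PO-1001", "stored",    datetime(2022, 10, 4, 16, 0),  {"warehouse": "W1"}),
    OrderEvent("PO-1001", "shipped",   datetime(2022, 10, 6, 11, 45), {"carrier": "ACME"}),
]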

Starting from point 0, when the system was first turned on, the current status can then be calculated by adding up all transactions. That may become rather a lot of calculation for objects that have a long lifespan, but I’ll get back to that later on.
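Building on the sketch above, deriving the current status is then a matter of replaying the events in order; the last status-changing event determines the state.

# Replaying the events from 'point 0' to derive the current status of an order.
# Assumes the OrderEvent list from the sketch above.
def current_status(order_id, events):
    status = None
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.order_id == order_id:
            status = event.event_type  # the last applied event determines the state
    return status

print(current_status("PO-1001", events))  # -> "shipped"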

This idea of Event Sourcing provides a very good basis for what we want to achieve in a factory: collect data that is relevant, and avoid collecting too much data.

For smart industry projects, this seems a very useful approach. It opens up a way to collect data for problem analysis as well as for setting up a process of continuous evaluation and improvement, which is what we try to achieve for our customers at Shinchoku.

For each object, regardless of whether it is a production order, a product recipe or a transport document, we can use this approach. We need to do two things to make this work. First, identify which events are important for the people running the factory. Second, determine what information should be included with each event to make storing it useful. Some data is needed for problem analysis; for continuous improvement, something more may be needed.

The essence of this is that we stop collecting seemingly random data items at regular time intervals, which is still what happens a lot in practice, especially at PLC and MES level. Instead, we start collecting data in the form of meaningful event traces.

A few challenges
There are some things to consider, however. Although ERP and MES systems are written by software people, not all of them implement event sourcing. In fact, most of them don’t. Most systems are built in the ‘traditional’ way, around a database that holds the current state of every object instead of the full history.
There are some features in there that come close, but they do not completely fulfil the need described above. For example, most ERP systems have an audit module that can be enabled for financial functions. Also, MES systems in some branches, like food production, often keep track of how ingredient mixes are produced because of traceability requirements.

In order to apply event sourcing in combination with these systems, we’ll have to look at how to interface with them and collect the appropriate event data. This is not impossible, and I expect that at some point we’ll be standardising interfaces for such things. With the lifespan of MES and ERP systems being 10+ years, it will take a while before it’s common practice though.
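One possible way to interface, sketched below, is a small adapter that periodically polls the current-state tables of an MES or ERP database and turns observed changes into events. The table and column names here are hypothetical; a real system would need its own mapping.

# One possible adapter pattern (illustrative only): poll a current-state
# table in an MES/ERP database and turn observed changes into events.
# The table and column names are hypothetical.
import sqlite3
from datetime import datetime, timezone

def poll_order_status(conn, last_seen):
    """Compare current order statuses with what we saw on the previous poll
    and return a list of change events."""
    events = []
    rows = conn.execute("SELECT order_id, status FROM production_orders").fetchall()
    for order_id, status in rows:
        if last_seen.get(order_id) != status:
            events.append({
                "order_id": order_id,
                "event_type": f"status_changed_to_{status}",
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            last_seen[order_id] = status
    return events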

Also, there is the issue of deltas versus current state at the PLC level. PLCs are used to control machines and to gather data from sensors that are integrated with those machines. By nature, sensors give absolute, current values instead of deltas. This is something we can handle easily. It’s not in line with the ‘rules’ of event sourcing, but if we have the absolute values and the related timestamps, we can always calculate the deltas ourselves during analysis. On top of that, if we look at what operators and managers want to know, there is no need to collect sensor data continuously. Getting operational information like ‘set temperature reached’ or ‘safety boundary on power exceeded’, accompanied by a timestamp and some other relevant data, may be more useful in a lot of cases.
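As a sketch of that idea: instead of storing every sample, a small piece of software between the PLC and the data store could watch the absolute values and only emit an event when a meaningful boundary is crossed. The thresholds below are made up for illustration.

# Illustrative sketch: turn a stream of absolute sensor readings (value +
# timestamp) into operational events instead of storing every sample.
# Thresholds are hypothetical.
SET_TEMPERATURE = 180.0   # degrees C
POWER_LIMIT = 75.0        # kW

def readings_to_events(readings):
    """readings: iterable of (timestamp, temperature_c, power_kw) tuples."""
    events = []
    temp_reached = False
    power_exceeded = False
    for timestamp, temperature_c, power_kw in readings:
        if not temp_reached and temperature_c >= SET_TEMPERATURE:
            events.append({"event": "set temperature reached",
                           "timestamp": timestamp, "value": temperature_c})
            temp_reached = True
        if not power_exceeded and power_kw > POWER_LIMIT:
            events.append({"event": "safety boundary on power exceeded",
                           "timestamp": timestamp, "value": power_kw})
            power_exceeded = True
    return events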

Finally, an issue with event sourcing, as I wrote earlier, is that recalculating the current status from a lot of past events may become time-consuming. If an object has a long life cycle (machines may remain switched on for weeks or months), we probably want to avoid this. Event sourcing solves this by creating snapshots of the current status at regular intervals. This allows us to apply only the events that occur after the last snapshot when calculating the current status. In a manufacturing environment with pre-existing MES and ERP systems, this is less of an issue. Since these systems are most often based on current state, we already have a snapshot at all times. In fact, combining this with event sourcing would be closer to double-entry accounting than to plain event sourcing.
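A sketch of the snapshot idea, assuming the event objects from the earlier production order example: store the status and the moment the snapshot was taken, and only replay the events that came in afterwards.

# Sketch of the snapshot idea: store a periodic snapshot of the current state,
# then only replay the events that occurred after the snapshot was taken.
def status_from_snapshot(snapshot, recent_events):
    """snapshot: dict like {"status": "...", "taken_at": datetime};
    recent_events: OrderEvent-like objects, possibly newer than the snapshot."""
    status = snapshot["status"]
    for event in sorted(recent_events, key=lambda e: e.timestamp):
        if event.timestamp > snapshot["taken_at"]:
            status = event.event_type
    return status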

Work in progress
This idea is likely not new, but it is also not widely implemented yet. In many cases, companies working on smart industry or Industry 4.0 solutions have already achieved a certain level of digitalisation, including the aforementioned MES and ERP systems that are based on ‘current state’. There, data cleaning before analysis is the chosen option, because it is often faster to implement than event sourcing inside or on top of existing systems. In more greenfield environments, like the ones we work in at SME manufacturers that are only making their first digitalisation steps, it could work out very well.