Understanding Industrial Process Control
Buried among the design blueprints and volumes of handbooks, there are two documents of great significance to anyone who cares about ICS security. The names may be slightly different from what I’m calling them here, but the concept is the same.
First is the Process Description. It is an overview of how the process is supposed to work when everything goes right. It is the introduction to what the plant does.
Second is the Control System Narrative. It is a very detailed description of exactly what the controllers and process do under all adverse conditions. It describes details such as overflow weirs, over-speed conditions, failure of a valve to close in a timely fashion, and so on.
Everyone in OT needs to read and discuss these documents from cover to cover regularly. They were carefully crafted, before the process was designed, to describe exactly what a controller is supposed to do. If you are not familiar with every single tweak and quirk of each process, you’ll be very late to recognize behavior that deviates from the norm.
With guidance from these documents, one can design more resilient networks and segment the controllers so that it is possible to break the process apart into more functional pieces. The documents should also give a very clear hint as to what nominal traffic would look like on that network.
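As a rough sketch of how that hint about nominal traffic might be put to use, the expected conversations between devices can be written down as an allowlist and compared against what is actually observed. Everything named here is hypothetical: the device names, the set of flows, and the assumption that the controllers talk Modbus/TCP on port 502. A real list would come from the plant's own Control System Narrative.

```python
# Hypothetical allowlist of (source, destination, TCP port) flows that the
# Control System Narrative implies should exist. Port 502 is Modbus/TCP;
# assuming this plant uses it is an illustration, not a given.
EXPECTED_FLOWS = {
    ("hmi-1", "plc-aeration", 502),
    ("plc-aeration", "plc-blowers", 502),
    ("historian", "plc-aeration", 502),
}


def unexpected_flows(observed):
    """Return the observed flows that the narrative does not account for."""
    return sorted(set(observed) - EXPECTED_FLOWS)


# Example observations, also made up for illustration.
observed_today = [
    ("hmi-1", "plc-aeration", 502),
    ("eng-laptop", "plc-blowers", 502),  # nothing in the narrative explains this
]

for src, dst, port in unexpected_flows(observed_today):
    print(f"Unexpected flow: {src} -> {dst} on port {port}")
```

The point of the design is that the baseline comes from the documents rather than from whatever happened to be on the wire during a learning period.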
Knowing the details of how the process works, one can determine where the instrumentation overlaps. For example, if you do a volumetric flow calculation, it should line up with what the metered chemical pump says it put into the process. If it doesn’t, that’s a clear indication that something isn’t right. It could be that the tank was filled, or an instrument was out of calibration. It could be that there is a slow leak somewhere. Or it could be that someone is messing around with the metered pump controls.
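The cross-check described above can be sketched in a few lines. This is a minimal illustration, assuming a vertical cylindrical chemical tank, a made-up 5% tolerance, and hypothetical function names and readings; none of these come from a real plant.

```python
import math


def tank_volume_drop_liters(diameter_m, level_start_m, level_end_m):
    """Volume that left a vertical cylindrical tank, from the level change."""
    area_m2 = math.pi * (diameter_m / 2) ** 2
    return area_m2 * (level_start_m - level_end_m) * 1000.0  # m^3 to liters


def flows_agree(calculated_l, metered_l, tolerance=0.05):
    """True when two independent volume figures agree within the tolerance.

    The tolerance is a fraction of the larger of the two figures. The 5%
    default is an assumption for illustration only.
    """
    reference = max(abs(calculated_l), abs(metered_l))
    if reference == 0:
        return True  # both report nothing moved
    return abs(calculated_l - metered_l) / reference <= tolerance


# Hypothetical readings: tank level fell from 1.50 m to 1.40 m in a 2 m tank,
# while the metered pump reports that it delivered 310 liters.
calculated = tank_volume_drop_liters(2.0, 1.50, 1.40)
metered = 310.0

if not flows_agree(calculated, metered):
    print("Mismatch between tank level change and metered pump total")
```

A mismatch only says that the two instruments disagree. Deciding whether the cause is a refill, a calibration drift, a leak, or tampering still takes someone who knows the process.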
The Process Description’s Role
If disaster hits and you need to hand documents to first responders, the Process Description should be among them. At a waste-water plant, for instance, the first several sentences of the Process Description should give a quick overview of what is there. For example:
This plant is designed for a modified Bardenpho process with Clarifiers and Tertiary Filtration. Blower setpoints are set seasonally between 40 and 60 PSI and then control valves modulate air flows in each basin according to setpoints from each dissolved oxygen probe…
Okay, you may not know what these words mean, but they give someone in the business a complete overview of what to look for and how the plant is supposed to work. If there are questions concerning the Control System Narrative, this is the document that provides guidance on how things are supposed to perform.
The Control System Narrative’s Role in Security
Another aspect of the Control System Narrative is that it describes all the contingencies in case of an attack. This is a living document, and senior OT specialists should negotiate changes with Engineering and Operations. This is where the policies and plans for continuing operation of the plant start. As such, from a security perspective, this document, along with the Process Description, is an absolutely essential part of any security effort in an ICS.
Unfortunately, I rarely see security people mention the Control System Narrative and the Process Description. They blandly refer to policies and procedures, but they don’t indicate where these policies and procedures are derived from. They come from these two documents.
First Responders and what to tell them
This is ultimately why you need to read, re-read, and keep these documents up to date. If you are ever attacked, these documents are what will ultimately get you back online. It’s not just a matter of backups, though those certainly help. The problem is that even after restoring a backup, people need to know whether the process is doing what it is supposed to do. These are the documents that provide that answer.
So, did your most recent risk assessment review these documents? I’ll bet the answer was no. Most people don’t even realize that such documents should exist. Instead, the assessors go mewling about for policies and procedures, as if Standard Operating Procedure (SOP) documents would give them the answer to every question. There may be many SOP documents, but that’s not how the PLC code was written. Remember, someone had to design the background that led to the SOP in the first place. Nobody should have to reverse engineer the SOP documents to figure out what the process does and how it does it.
The Control System Narrative is that description. We can’t write SOP documents for every possible contingency, but we can at least describe the process, and what it will do in response to various stimuli.
Determining the difference between instrumentation failures and a hack
This brings me to my final, crucial point: many hacks look like instrumentation or mechanical failures. Knowing what the automation is supposed to do, how it will do it, and when is crucial for telling the difference between an equipment failure and an attack.
Why do I rely on process behavior? Because most self-learning network traffic monitors are not aware of what the process does or when it might deviate from the norm. They baseline what someone thinks is a routine traffic level, and then the SIEM generates many useless alerts until people loosen the alarm levels to the point where it stops nagging everyone so frequently. But the loosening of those parameters is not based upon expected performance; it is based upon raw observation that may already be tainted by an Advanced Persistent Threat.
However, if you have the Process Description and the Control System Narrative, you can design a process-aware SIEM that may be able to alarm and react better to changing conditions.
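To make "process-aware" a little more concrete, here is a minimal sketch of one rule taken from the kind of detail a Control System Narrative records: how long a valve is allowed to take to close. The class name, the tag FV-101, and the 30-second limit are all hypothetical; a real rule set would be transcribed from the plant's own narrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValveCloseRule:
    tag: str                  # hypothetical instrument tag
    max_close_seconds: float  # limit assumed to come from the narrative


def check_valve_close(rule: ValveCloseRule,
                      commanded_at: float,
                      closed_at: Optional[float],
                      now: float) -> Optional[str]:
    """Return an alarm message, or None if behavior matches the narrative.

    Times are seconds on a common clock. closed_at is None while the
    closed limit switch has not yet been seen.
    """
    if closed_at is not None:
        elapsed = closed_at - commanded_at
        if elapsed > rule.max_close_seconds:
            return (f"{rule.tag}: took {elapsed:.1f}s to close, "
                    f"limit is {rule.max_close_seconds:.1f}s")
        return None
    waited = now - commanded_at
    if waited > rule.max_close_seconds:
        return f"{rule.tag}: still not closed after {waited:.1f}s"
    return None


rule = ValveCloseRule(tag="FV-101", max_close_seconds=30.0)
alarm = check_valve_close(rule, commanded_at=0.0, closed_at=None, now=45.0)
if alarm:
    print(alarm)
```

An alarm like this is based on expected performance rather than on observed averages, so it does not need to be loosened just because the network is busy. It still cannot say whether the slow valve is a mechanical fault or an attack; it only says the process has departed from what the narrative promises.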
My advice to those of you eager to make risk assessment policy all over everything is to STOP! Go find these documents and start from there. If the Operations staff don’t know they exist, you may find that they’ll be very curious about them.
Pilots have a saying about this situation where you may not be fully aware of what is going on. They call it “getting behind the airplane.” You’re supposed to be thinking of the next two things ahead and know what to look for. Without these documents, you’re not just behind, you’re clueless. If they don’t exist, then it is up to you to reverse engineer the process and create them. Do not continue operations without the Process Description and the Control System Narrative.