Let’s fast forward into the near future. The network configuration is well documented. The hashes of all code are known and recorded. Firewalls have been installed in appropriate places. The network traffic rates and patterns are known and monitored. Physical LAN port statuses are monitored. In other words, the control system’s integrity is monitored and known.
You are the plant superintendent. It’s Friday afternoon before a long weekend, close to quitting time, and the OT security specialist walks into your office and says, “I think you’d better have a look at this.”
So you look. Yup. It is pretty obvious there is some kind of malware in your control system and it wouldn’t have been noticed except that one of the edge switches in a panel alarmed on a fleeting duplicate MAC address on a trunk.
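To make that alarm concrete, here is a minimal sketch of the kind of check involved, assuming you already collect MAC address table sightings from your switches as (timestamp, MAC, port) records. The `Sighting` record, the `find_duplicate_macs` function, and the five-second window are all hypothetical names and values chosen for illustration, not any switch vendor's API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Sighting(NamedTuple):
    """One MAC table observation (hypothetical record format)."""
    timestamp: float  # seconds since some common reference
    mac: str
    port: str


def find_duplicate_macs(sightings: Iterable[Sighting],
                        window_s: float = 5.0) -> Dict[str, List[str]]:
    """Return {mac: ports} for MACs seen on different ports within window_s.

    A device that is legitimately moved between ports will also show up
    here, so treat a hit as a reason to look, not as proof of malware.
    """
    by_mac = defaultdict(list)
    for s in sightings:
        by_mac[s.mac.lower()].append(s)

    suspects: Dict[str, set] = {}
    for mac, seen in by_mac.items():
        seen.sort(key=lambda s: s.timestamp)
        # Compare each sighting with the next one in time order.
        for earlier, later in zip(seen, seen[1:]):
            close_in_time = later.timestamp - earlier.timestamp <= window_s
            if earlier.port != later.port and close_in_time:
                suspects.setdefault(mac, set()).update({earlier.port, later.port})
    return {mac: sorted(ports) for mac, ports in suspects.items()}


observed = [
    Sighting(0.0, "00:11:22:33:44:55", "Gi0/1"),
    Sighting(1.0, "00:11:22:33:44:55", "Gi0/24"),   # same MAC, other port, 1 s later
    Sighting(0.5, "AA:BB:CC:DD:EE:FF", "Gi0/2"),
    Sighting(100.0, "AA:BB:CC:DD:EE:FF", "Gi0/3"),  # moved much later, outside window
]
alarms = find_duplicate_macs(observed)
```

With the sample data above, only the first address is flagged, because its two sightings on different ports fall inside the window.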
Now what? All this effort went into becoming aware of this sort of attack, and here you appear to be detecting one. Do you have a plan to deal with it? Well, the first thing that has to happen is to disconnect from the office networks. But you can’t be certain that there isn’t some device in your system calling home through a smartphone or Wi-Fi interface, so this may not be terribly effective.
You will probably think this is the time to start flipping controls to manual. Nice try. Thanks to automation, you don’t have those extra people at the plant to run everything manually anymore.
So what options do you have? This is where you should have brought the control systems engineers into the picture. But the way most people discuss this, they think this is almost entirely an OT thing, right?
WRONG!
The goal of control system security is to maintain availability. To do that we need fall-back systems with minimal networks and minimal automation. For example, consider the fly-by-wire flight control system on most Airbus airliners. In normal operation the computers sit between the pilots and the control surfaces, interpreting the pilots’ commands and enforcing protections. But if the system’s integrity checks fail, it drops back to simpler control laws with fewer protections. In the simplest of them, the control stick translates the pilot’s commands more or less straight to the flight controls without significant intervention. Airbus calls these degraded modes Alternate Law and, at the simplest level, Direct Law.
The plants need control systems engineers to design a similar form of Alternate Law for the industrial processes. The OT staff can’t do this. They need guidance from those who actually know what the process should be doing.
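As a rough illustration of what a plant-side version of that idea might look like, here is a minimal sketch of a supervisor that latches into a fall-back mode when any integrity check fails and returns to normal only on an explicit operator reset. The `ModeSupervisor` class, the mode names, and the check names are assumptions made up for this example. What the fall-back mode actually does to the process is exactly the part the control systems engineers have to design.

```python
from enum import Enum
from typing import Dict, List


class Mode(Enum):
    NORMAL = "normal"      # full automation and full networking
    FALLBACK = "fallback"  # minimal automation, minimal network


class ModeSupervisor:
    """Latches into FALLBACK on any failed check (illustrative only)."""

    def __init__(self) -> None:
        self.mode = Mode.NORMAL
        self.reasons: List[str] = []

    def update(self, checks: Dict[str, bool]) -> Mode:
        """checks maps a check name to True (passed) or False (failed)."""
        failed = sorted(name for name, ok in checks.items() if not ok)
        if failed and self.mode is Mode.NORMAL:
            self.mode = Mode.FALLBACK
            self.reasons = failed
        # Deliberately no automatic return to NORMAL: a check that passes
        # again does not prove the system is clean.
        return self.mode

    def operator_reset(self, checks: Dict[str, bool]) -> Mode:
        """Return to NORMAL only if a person asks and every check passes."""
        if all(checks.values()):
            self.mode = Mode.NORMAL
            self.reasons = []
        return self.mode


supervisor = ModeSupervisor()
supervisor.update({"firmware_hash": True, "traffic_baseline": True})
supervisor.update({"firmware_hash": False, "traffic_baseline": True})
```

The latch is the design choice worth arguing about: the sketch assumes you would rather stay in the simple mode until a person decides otherwise than bounce back automatically.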
Many talk about the gulf between the OT and Engineering staff. This is precisely what they should be talking about. Everyone is very hot about getting all the right documentation and network hygiene in place. But there is shockingly little discussion about how this is all supposed to maintain process availability.
There needs to be a lot more discussion about how the networks within the control system can break apart into useful pieces. For example, the engineers might find a way to shut down the HMI VLAN if there is sufficient machine-to-machine (M2M) communication between controllers to keep things running. The local OIT stations can then be used to command each PLC and record performance. Someone else can integrate those records into the reports when things calm down.
Someone might also set up manual control stations with embedded PID controllers to handle the loops that are of primary concern.
One may also have ways to push a batch-oriented process to a safe state where it can remain stable for a few hours or a day while others locate and attempt to remove the malware.
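For readers who have not looked inside one, the logic of such a stand-alone loop is small. Here is a minimal sketch of a textbook discrete PID step with output clamping and a simple anti-windup rule. The class name, gains, and the 0 to 100 percent output range are assumptions for illustration only, and a real loop would need tuning and testing by the engineers who know the process.

```python
class PID:
    """Textbook discrete PID with clamped output (illustrative sketch)."""

    def __init__(self, kp: float, ki: float, kd: float,
                 out_min: float = 0.0, out_max: float = 100.0) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self._integral = 0.0
        self._prev_error = None  # no derivative term on the first step

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        """Return the next controller output; dt is the step time in seconds."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        error = setpoint - measurement
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        candidate_integral = self._integral + error * dt
        output = (self.kp * error
                  + self.ki * candidate_integral
                  + self.kd * derivative)
        clamped = min(max(output, self.out_min), self.out_max)
        # Anti-windup: keep the new integral only if the output is not saturated.
        if clamped == output:
            self._integral = candidate_integral
        self._prev_error = error
        return clamped


loop = PID(kp=2.0, ki=0.5, kd=0.0)
first = loop.step(setpoint=50.0, measurement=40.0, dt=1.0)
second = loop.step(setpoint=50.0, measurement=45.0, dt=1.0)
```

Nothing here needs a network, which is the point: a loop like this can keep running on a local station while the rest of the system is being picked apart.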
And above all, there should be staff who can be called upon to help you through this morass. For example, the water industry has WARN, the Water/Wastewater Agency Response Network. This sort of mutual aid should be practiced regularly so that everyone gets familiar with the plants run by their neighbors.
So do you have anything like this? Does anyone?
Why not?