In their Water and Wastewater Systems Security Recommendations, CISA touched on a subject that I rarely ever see anywhere:
Before working on security, it helps to make the automation and the process more resilient. Even more important, the automation should actively refuse certain toxic moves. Yes, CISA was recommending that all automation be made safer. This is the long-overdue convergence of safety and security with automation. We shouldn’t use these practices only in safety systems; we should borrow these concepts and apply them elsewhere, where practical.
The designers of an industrial process frequently over-build and over-size things because they would like their designs to be somewhat capable of handling extreme conditions. Thus chemical dosing pumps and flow meters may be a bit larger than you might expect, pipe sizes are conservatively large, and so forth.
However, the control system does not need to operate over that entire range. In fact, it may be better to have the control system fault and alert operators that it has been asked to do something outside the range of normal control.
Many were worried that the control system would do exactly what it was commanded to do, which was to dramatically increase the rate of sodium hydroxide dosing. That’s like being worried about what would happen if your car’s cruise control were set to 200 mph. It won’t be able to go that fast (well, unless you happen to be driving a supercar like a Dodge Demon). Furthermore, modern cruise controls will look for the car ahead of you and slow down, regardless of where they were set, so as to maintain a safe distance.
We need that in industry. Water chemistry usually doesn’t change that dramatically. If setpoints are entered that are significantly beyond routine operations (say, by a certain percentage from nominal), the control system should ignore the request and, if necessary, go to a safe mode.
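As a rough illustration of that idea, here is a minimal sketch of a setpoint guard in Python. Everything in it is an assumption made for the example: the class name `SetpointGuard`, the 25 % band, the three-strikes rule for entering safe mode, and the 100 ppm nominal value. A real controller would implement this in its own logic, and what "safe mode" actually does is site-specific. The sketch also assumes a positive nominal value.

```python
from dataclasses import dataclass, field


@dataclass
class SetpointGuard:
    """Hypothetical guard: accept a setpoint only within a band around nominal."""
    nominal: float              # routine operating value (assumed units, e.g. ppm)
    max_deviation: float        # allowed fraction from nominal, e.g. 0.25 = 25 %
    active: float               # setpoint currently in force
    max_rejections: int = 3     # consecutive rejections before safe mode (assumed)
    rejected_in_a_row: int = 0
    safe_mode: bool = False
    alarms: list = field(default_factory=list)

    def request(self, value: float) -> float:
        """Apply the requested setpoint if reasonable; otherwise keep the old one."""
        low = self.nominal * (1 - self.max_deviation)
        high = self.nominal * (1 + self.max_deviation)
        if low <= value <= high:
            self.active = value
            self.rejected_in_a_row = 0
            return self.active
        # Out of band: ignore the request, keep running as before, alert a human.
        self.rejected_in_a_row += 1
        self.alarms.append(f"Rejected setpoint {value}: allowed {low} to {high}")
        if self.rejected_in_a_row >= self.max_rejections:
            self.safe_mode = True  # placeholder flag; the real action is site-specific
        return self.active


guard = SetpointGuard(nominal=100.0, max_deviation=0.25, active=100.0)
guard.request(110.0)     # within the band, so it is accepted
guard.request(11000.0)   # far outside the band, so it is ignored and alarmed
```

The point of the design is that a bad request never changes the running setpoint. It only produces an alarm, and repeated bad requests escalate to whatever safe state the process defines.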
Why are control systems designed with full range like this? There is a certain implicit trust that engineers place in operators, giving them the authority to go to extremes if needed. But that same trust is not warranted with automation. If things are really that bad, the automation should stop, and alert a human to get out there and verify what is going on.
What this also means is that automation should be designed with self-integrity features. For example, if the flow going into a process stage does not equal the flow coming out, within some margin of error, get someone out there to have a look.
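A flow balance check of that kind could be sketched as follows. The function name `flow_balance_ok` and the 5 % default tolerance are assumptions for illustration. A real check would likely also average readings over time and account for storage (a tank filling or draining) between the two meters.

```python
def flow_balance_ok(flow_in: float, flow_out: float, tolerance: float = 0.05) -> bool:
    """Return True if inflow and outflow agree within a fraction of the larger reading."""
    reference = max(abs(flow_in), abs(flow_out))
    if reference == 0:
        return True  # no flow either way, so nothing to compare
    return abs(flow_in - flow_out) <= tolerance * reference


# Hypothetical readings in the same units (for example, gallons per minute).
if not flow_balance_ok(100.0, 80.0):
    print("Flow mismatch: send someone to have a look")
```

A failed check does not say what is wrong (a leak, a stuck valve, a bad meter); it only says the process no longer agrees with itself, which is reason enough to involve a human.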
We need this level of instrumentation because if we don’t have it, there is no assurance that we can safely leave a process unattended.
What this also means is that if another intrusion attempt happens, or if the operator has slippery fingers and keys in a number incorrectly, the control system will bark and continue trucking along as it was set before. At the City of Oldsmar water treatment facility, there probably wasn’t any good reason to routinely be able to turn the sodium hydroxide injection levels up to roughly 11,000 ppm, any more than there is a good reason to set the cruise control of your car to 200 mph.
The more self-integrity features and reasonable process limits we include, the safer we will be. Doing so also involves humans in the process earlier, enables better process awareness, and helps to catch problems as early as possible.