Improving Control Systems Management

Operational Technology Cybersecurity is a symptom of a much larger problem. It is a problem that extends beyond just securing networks, keeping forensic logs, and managing software and embedded systems. It is MUCH deeper than that. It goes to the core of industrial systems and the people who manage and operate them.

HOW DID WE GET HERE?

In the beginning, most automation was primitive. If there was automation, it was done with control relays. There were physical ladders of relays, timers, pressure switches, and 4-20 mA analog PID controllers. Some systems were pneumatic or hydraulic. Most were designed by Electrical Engineers. I have worked on a few relic systems like that (in the 1980s). Then Dick Morley said “Let there be PLCs.” And they were good. For some time after that, Engineers were responsible for programming these things because the processing power was limited, and the needs were pretty basic.

But as time went on, the automation and the PLC programs got bigger and more unwieldy because they could. Many engineers back then had Boolean algebra and programming classes in college. They managed. But the PLC software began to get a bit hairy. Some of that hairiness was mitigated by the ‘1131 standard (known today as IEC 61131), with several different programming languages and a Sequential Function Chart language for managing state machines.

However, Engineers may not be getting the education that they once did. Most ABET-accredited engineering curricula offer programming and software design as electives. Networking, operating system design, and many other topics such as virtualization, version control, basic cryptography, and so on are taught rarely if at all. So it is entirely possible that an engineer may graduate with minimal exposure to foundational software and systems design subjects.

Fortunately, many people with an education in IT topics are arriving in this space. Let’s be honest: The automation complexity, networks, and embedded software components are too much for most engineers. We need an OT specialist who knows something about programming, operating systems, networks and the like. But we also need to recognize that those folk from IT are facing a very significant knowledge fire hose. That would include topics such as Engineering standards, Chemistry, Physics, Differential Equations, Linear Algebra, Biology, Fluid Mechanics, Thermodynamics, and Finance. It turns out that if these things are taught at all, they are electives in the ABET-accredited IT-oriented curricula. The IT people coming to the OT space are just as lacking as the engineers.

Worse, neither of these highly educated groups is familiar with basic plant bureaucracy such as Safety Standards, Standard Operating Procedures, Process Narratives, asset management, work orders, condition monitoring, Legal Requirements, and so much more.

We’re expecting a lot from the people who choose the instrumentation, design the process, program these controllers, and develop the HMI displays, the networks and the information flows. And it all has to be good enough that even a sleep-deprived operator can figure out who to call even while stress is sky high.

THE GAP OF KNOWLEDGE

There are no educational tracks that offer this level of diverse training, experience, culture, and so on. Culture? Why, yes. You may not like the buddy you were paired up to work with. But you need to watch their back and they need to watch yours to keep the two of you safe. Put your ego aside. If someone sees something they don’t like, neither you nor they should be embarrassed about stopping for a moment to understand what each of you perceives. That’s safety culture and you should never fault someone for doing this. Talk about it at the shop at the end of the day, but while you’re in the field, you work with each other. The people who play stupid political games are not welcome on a plant.

There isn’t even an informal mentorship program to bring most people up to speed. And then on top of all this we somehow expect these engineers and programmers to understand cryptography, networks, protocols, APIs, Kernels, and Domain Controller management (among many other things), so that they can build safe and secure automation, and write secure, maintainable code for automation systems.

I have read many anecdotes of how people with limited knowledge and experience do things. I don’t have a magical experience wand I can wave to bring pragmatic ideas to these people. There is so much to know and so much context to sort through. I have tried developing course work for local university curricula, but they’re not interested. I don’t think University leadership understands the scope of this problem.

I can’t easily explain to someone whose specialty is software the problems with back-watering the hydraulic jump in a flume for flow measurement (in fairness, I have known engineers who lost sight of that problem too). I also can’t easily explain to a person whose background is engineering why setting the peer-to-peer RPI (Requested Packet Interval) as short as possible isn’t necessary and may actually make things more difficult to work with.
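
To put rough numbers on the RPI point, here is a minimal sketch in Python. It assumes that each cyclic connection sends about one packet per RPI in each direction, which is a simplification. The connection count and the packets-per-second capacity are hypothetical values chosen for illustration, not figures from any vendor's data sheet.

```python
def packets_per_second(rpi_ms: float, connections: int, directions: int = 2) -> float:
    """Approximate cyclic packet rate for a set of connections.

    Assumption: every connection sends one packet per RPI in each direction.
    """
    if rpi_ms <= 0:
        raise ValueError("rpi_ms must be positive")
    return connections * directions * (1000.0 / rpi_ms)


def utilization(rpi_ms: float, connections: int, capacity_pps: float) -> float:
    """Fraction of a (hypothetical) packets-per-second capacity consumed."""
    return packets_per_second(rpi_ms, connections) / capacity_pps


if __name__ == "__main__":
    CONNECTIONS = 20          # hypothetical number of peer-to-peer connections
    CAPACITY_PPS = 10_000.0   # hypothetical interface capacity, packets/second
    for rpi_ms in (2, 10, 50, 200):
        pps = packets_per_second(rpi_ms, CONNECTIONS)
        load = utilization(rpi_ms, CONNECTIONS, CAPACITY_PPS)
        print(f"RPI {rpi_ms:>3} ms: {pps:>8.0f} packets/s, {load:.0%} of capacity")
```

The traffic scales inversely with the RPI, so shortening the interval by a factor of five multiplies the packet load by five. If the process itself only changes over seconds, that extra traffic buys nothing and eats into the headroom everyone else on the network needs.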

Despite efforts by security proponents to encapsulate functionality, we still need to be able to talk to each other. We need better cross-training. When emergencies hit, we need to be able to function as a team, not as a bureaucracy.

BRIDGING THE GAP

I have two suggestions: First, we need to find new divisions and frankly, new bureaucracy, to split up this automation design and diagnostic discipline. Then we need to train people, and establish reasonable handoffs between the new disciplines based upon how we do diagnostics, development and configuration management. The way we do things now requires too much knowledge and experience from a very few key people.

Second, we need to include overviews of each sub-discipline to coordinate among a team. We need to summarize all this for operators and technicians so that they know which people to call for what.

The days when we could send a technician into the field with a voltmeter, two-way radio, and hand tools are gone. But for the love of all that is good, if we have to send a fully equipped committee into the field to diagnose every automation problem, we’ll all be out of a job. As the software people sometimes say, we need to refactor this.

One idea worth borrowing from IT is the change management meeting. There are too many things going on at once. For what it’s worth, most people on a plant detest meetings. These meetings should not be sit-down affairs. There will be no doughnuts. Meet with the senior staff first thing in the morning, capture the conditions from the night before, create the work-orders, discuss any interlocking interests, and work out a plan for the day. Then convey that to the operators so that they know who to expect working where and what they’re going to do.

Above all, don’t be a compartmentalized mystery to the mechanics, electricians, instrumentation technicians, and operators. You’re a team. Work together as one. Just because you do software, don’t be shy about putting on a hard hat and PPE to go into the field to work on things. You need to know what the people in the field are seeing. If someone sold OT to you as an office job, they lied. Take the same safety classes with the other staff. And don’t be shy about lending a hand.

Yes. I am a professional engineer. Yet I have examined instrumentation in a waste-water storage basin. I have climbed around a belt filter press to examine problems with a control program for a sludge dewatering process. I have climbed tanks to check on instrumentation problems and into underground vaults to examine a problem with a flume. And I have climbed towers to inspect antenna systems to ensure they’re performing as designed. I’m not ashamed of breaking a sweat or getting dirty. If you’re serious about working in OT, you shouldn’t be, either. This is how you learn what is REALLY behind those numbers you gather in the control system.

Further, we need to establish better and more organized training at each industrial company to bring people up to speed on how things get done. We need mentors in various subjects. We need cross-training. Operators should know something about what happens when they push a button on a screen. Likewise, programmers need to know, not only how to make things work, but what to do when they don’t react as expected. They also need to know the quirks and limitations of the instruments that return the data, and how to cross check those instruments in a process. Engineers need to know how the networks are organized, what components the operating system has, and what the PLC or RTU protocols do, what APIs are in use, how the time is distributed through the various computing systems, and so on.
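
As one illustration of cross-checking instruments, here is a minimal sketch in Python that compares a measured tank level change against what the inflow and outflow meters say should have happened. It assumes a vertical-walled tank with a constant cross-sectional area and steady flows over the interval. The function names, the units, and the 0.05 m tolerance are all hypothetical choices for this example.

```python
def expected_level_change_m(inflow_m3_per_h: float, outflow_m3_per_h: float,
                            hours: float, tank_area_m2: float) -> float:
    """Level change implied by the flow meters (simple volume balance).

    Assumption: vertical-walled tank, steady flows over the interval.
    """
    if tank_area_m2 <= 0:
        raise ValueError("tank_area_m2 must be positive")
    net_volume_m3 = (inflow_m3_per_h - outflow_m3_per_h) * hours
    return net_volume_m3 / tank_area_m2


def cross_check(measured_change_m: float, inflow_m3_per_h: float,
                outflow_m3_per_h: float, hours: float, tank_area_m2: float,
                tolerance_m: float = 0.05):
    """Return (agrees, error_m) comparing the level transmitter to the flow meters."""
    expected = expected_level_change_m(inflow_m3_per_h, outflow_m3_per_h,
                                       hours, tank_area_m2)
    error_m = measured_change_m - expected
    return abs(error_m) <= tolerance_m, error_m


if __name__ == "__main__":
    # Hypothetical readings: 120 m3/h in, 100 m3/h out, 2 hours, 80 m2 tank.
    agrees, error_m = cross_check(0.30, 120.0, 100.0, 2.0, 80.0)
    print(f"instruments agree: {agrees}, level error: {error_m:+.2f} m")
```

A disagreement does not say which instrument is wrong. It only says that someone should go look, which is exactly when knowing the quirks of each instrument matters.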

Why do all this? When an instrument goes bad, we all need to understand what will happen. When a patch comes along we all need to understand what changes to look for, and what tests to make to ensure that the patched device is acceptable to return to service. An operator needs to understand what normal process behavior looks like so they don’t annoy people every time they see something they don’t understand. And Technicians need to know who to call when they work on an instrument or a fiber-optic network segment. And finally, we need professors and instructors from local universities as well as public utility commissioners to spend time visiting these plants so that they understand the engineering, the software, the legal and performance requirements, the finances, and the societal expectations.

This kind of teamwork rarely happens organically. It needs support and organization from the top-down. It is even more important than the cybersecurity people are. You cannot secure something you do not understand.

http://www.infracritical.com

With more than 30 years of experience at a large water/wastewater utility and extensive experience with control systems, substation design, SCADA, RF and microwave telecommunications, and work with various standards committees, Jake still feels like one of those proverbial blind men discovering an elephant. Jake is a Registered Professional Engineer of Control Systems. Note that this blog is Jake's opinion ONLY. No employers, past or present, were ever consulted with regard to these posts. These are Jake's notions. Don't blame anyone else for them.