In my earlier blog about self-integrity monitoring at the Physical Layer of a SCADA system, I wrote that there would be more on this subject.
In this discussion I will outline some things we’re doing to test for self-integrity at the RTU level of a SCADA system.
Some of you may know that I am chair of the DNP Users Group. I am therefore quite partial to the DNP3 protocol and my remarks will be made in that context. However, what I say here can be applied to other common SCADA protocols.
In the DNP3 protocol, there are two octets (bytes) of flags, known as the Internal Indications (IIN), sent back with RTU responses to a READ function or in an unsolicited report. These are single-bit flags that can carry useful information.
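As a rough sketch of what "single-bit flags" means in practice, the two flag octets (called Internal Indications, or IIN, in the DNP3 standard) can be decoded into flag names like this. The bit positions below are my reading of the standard's IIN layout and should be checked against your copy of IEEE 1815 and your RTU vendor's documentation; the function name decode_iin is only illustrative.

```python
# Assumed bit layout of the two IIN octets (verify against IEEE 1815).
IIN1_FLAGS = {
    0: "BROADCAST",
    1: "CLASS_1_EVENTS",
    2: "CLASS_2_EVENTS",
    3: "CLASS_3_EVENTS",
    4: "NEED_TIME",
    5: "LOCAL_CONTROL",
    6: "DEVICE_TROUBLE",
    7: "DEVICE_RESTART",
}
IIN2_FLAGS = {
    0: "NO_FUNC_CODE_SUPPORT",
    1: "OBJECT_UNKNOWN",
    2: "PARAMETER_ERROR",
    3: "EVENT_BUFFER_OVERFLOW",
    4: "ALREADY_EXECUTING",
    5: "CONFIG_CORRUPT",
}


def decode_iin(iin1, iin2):
    """Return the set of flag names that are set in the two IIN octets."""
    flags = set()
    for bit, name in IIN1_FLAGS.items():
        if iin1 & (1 << bit):
            flags.add(name)
    for bit, name in IIN2_FLAGS.items():
        if iin2 & (1 << bit):
            flags.add(name)
    return flags
```

For example, decode_iin(0x80, 0x00) reports only DEVICE_RESTART, which is the flag discussed next.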
For example, the DEVICE_RESTART flag indicates that the RTU has been restarted (rebooted). This flag is extremely important. You should not see it routinely. There are two benign reasons for seeing this flag: First, someone is known to be working on the RTU. Second, there was an extended power outage and the backup power source ran down. If neither of these is known to be the cause, you should investigate right away. It may be something as mundane as hardware failure, but it may also be someone messing around with a remote station. Either way, you must investigate.
Another couple of flags of interest are the DEVICE_TROUBLE and LOCAL_CONTROL flags. These flags are basically there for the RTU software to use as it wishes. They may indicate that the RTU isn’t able to see some of the networks or hardware it is configured for, or that someone is working on the RTU (perhaps downloading a new configuration). The exact meanings of these flags depend upon what the RTU vendor says they mean. In any case, however, if you see these flags and you don’t know of anyone working on that RTU, it is a call for help by the RTU, so you should send someone there to investigate.
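The rule for these three flags is simple enough to sketch as a master station alarm: if the flag is set and nobody has told you why, investigate. The sketch below assumes the master already has the set of flag names from a response and keeps a hypothetical set, explained_rtus, of RTU addresses with a known work order or known power outage. Neither name comes from any real product.

```python
# Flags that call for a site visit unless there is a known explanation.
WATCHED_FLAGS = ("DEVICE_RESTART", "DEVICE_TROUBLE", "LOCAL_CONTROL")


def flags_to_investigate(flags, rtu_address, explained_rtus):
    """Return the watched flags that have no known explanation.

    flags          -- set of flag names reported by the RTU
    rtu_address    -- DNP3 address of the reporting RTU
    explained_rtus -- addresses with known maintenance or a known outage
                      (a hypothetical list kept by the operator)
    """
    if rtu_address in explained_rtus:
        return []
    return [name for name in WATCHED_FLAGS if name in flags]
```

An empty list means nothing to do; anything else is a reason to send someone out.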
Then there are other flags indicating that what you’re asking the RTU to do does not match how it was configured. In normal operation, you should not see these flags either. They are NO_FUNC_CODE_SUPPORT, OBJECT_UNKNOWN, and PARAMETER_ERROR. The first means that you’re asking the RTU to do something it doesn’t understand. For example, you might send an authentication challenge, even though the RTU doesn’t know how to authenticate. The second means that the RTU might know what you’re asking, but it doesn’t know what sort of object you’re trying to read. For example, you could be requesting an RTU to report a string, but it is not equipped to receive or report strings. The last error flag is where an RTU knows what you’re asking for and has objects like that, but it doesn’t have anything with that address. An example is where you want to read the state of relay number 8, but it only has addresses 0 through 7.
These flags should never be seen in normal operation. If you do see them, it indicates that something is wrong with your records of how the RTU is configured. It also might mean that someone is probing your RTU to see what I/O it has! So if you didn’t issue any commands to the RTU and all of a sudden you see a PARAMETER_ERROR flag, you may well want to start sweeping the network to see who else is on there.
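The distinction drawn above, between a records problem and possible probing, can be sketched as a small check. It assumes the master can tell whether it had a request of its own outstanding when the flags arrived; the function name and the returned strings are made up for illustration.

```python
# Flags meaning the request did not match what the RTU supports or contains.
ERROR_FLAGS = {"NO_FUNC_CODE_SUPPORT", "OBJECT_UNKNOWN", "PARAMETER_ERROR"}


def assess_error_flags(flags, request_outstanding):
    """Suggest an action for the request-error flags in one response.

    flags               -- set of flag names reported by the RTU
    request_outstanding -- True if our own master sent the request that
                           this response answers (assumed to be known)
    """
    seen = ERROR_FLAGS & set(flags)
    if not seen:
        return "ok"
    if request_outstanding:
        # Our own request was wrong: the configuration records are suspect.
        return "check RTU configuration records"
    # Nobody here asked for anything, yet the RTU is complaining.
    return "possible probing: sweep the network"
```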
Another flag worth watching is the BROADCAST flag. Generally I do not recommend the use of the BROADCAST features of any SCADA protocol because there is no guarantee that each RTU will receive the message. Such messages might be for things like resetting counters, freezing analog values for event reporting, and the like. There are other ways to generate such events in the RTU. However, I do not wish to completely preclude the use of broadcasts. Load shedding commands don’t have to be received by every RTU. If a few miss it, it’s not a huge problem. Time broadcasts could also be reasonable. If a device doesn’t get the time broadcast, it can just pick it up on the next broadcast.
Nevertheless, if an unexpected BROADCAST flag shows up, you might want to investigate. It could be due to the remote receiving traffic from a distant SCADA master belonging to another utility. I reported a case recently where we received “alien SCADA from New Jersey.” It turns out that they were properly licensed and so were we, but due to unusual radio wave propagation conditions caused by weather, we were hearing their traffic and they were probably hearing ours.
Fortunately for both of us, we were using different SCADA protocols, so there were no significant concerns other than a slight increase of data errors. But had they been using the DNP3 protocol, things could have been very different.
This brings me to another concern: Ladies and Gentlemen, we try to be organized and neat by assigning RTU addresses starting from some low number. Please stop. The DNP3 RTU address space is from 0 through 65519. You can use any address you like. It is worth noting that the SHODAN search engine presumes that most people start from the low addresses and searches them up to around 100. Most SCADA systems typically do have an RTU or a Master in that range. That’s how they get discovered. But if you start numbering from around 8000 or even 49233, you’re less likely to get discovered quickly if you’re exposed to the Internet. Very few crawlers will search the whole DNP3 address space. Remember, the address only has to be unique in your communications system. There is nothing significant about low address numbers.
There is another reason such an addressing practice will help you: you may accidentally receive traffic from an adjacent SCADA system that also happens to be using DNP3 (or whatever other protocol you happen to be using). It would be a good idea to look up neighboring radio licenses on your frequency and introduce yourself to them. If you have interference problems (whether caused by your neighbor or not) you can help each other. You can also coordinate with those who use the same protocols as you do to ensure that if you hear each other’s radio traffic over a data radio, you won’t react to it.
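One way to get out of the habit of numbering from 1 is to let a program hand out addresses. This is only a sketch: the low_cutoff of 1000 is an arbitrary assumption, the upper bound is the 65519 quoted above, and the function name is hypothetical.

```python
import random

DNP3_MAX_ADDRESS = 65519  # top of the address range quoted in the text


def pick_rtu_addresses(count, in_use=(), low_cutoff=1000, rng=None):
    """Pick unused addresses at random, staying away from the low numbers.

    count      -- how many new addresses are needed
    in_use     -- addresses already assigned in this communications system
    low_cutoff -- lowest address to consider (an arbitrary assumption)
    rng        -- optional random.Random instance, mainly for testing
    """
    rng = rng or random.SystemRandom()
    taken = set(in_use)
    candidates = [a for a in range(low_cutoff, DNP3_MAX_ADDRESS + 1)
                  if a not in taken]
    return sorted(rng.sample(candidates, count))
```

Keep a record of what was assigned; the addresses only need to be unique within your own system, and it helps to be able to share the list with a radio neighbor.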
To summarize: an RTU that returns a BROADCAST flag when you’re not expecting to see one may not be the end of the world; but you may want to look around to see what else is going on. Some people may choose to use broadcasts to mess with your SCADA system. One particularly egregious use of the broadcast feature is to broadcast a restart command. Lightly configured RTUs will react to such commands and it will be a significant problem. If you see that broadcast flag, it’s worth a look.
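If you do use broadcasts deliberately, the master knows when it sent them, so an alarm can separate expected BROADCAST flags from unexpected ones. A minimal sketch, assuming timestamps in seconds and an arbitrary 60 second tolerance (both assumptions, as is the function name):

```python
def broadcast_flag_expected(flag_time_s, our_broadcast_times_s, tolerance_s=60.0):
    """Return True if a BROADCAST flag follows one of our own broadcasts.

    flag_time_s           -- when the RTU reported the BROADCAST flag
    our_broadcast_times_s -- times at which our master sent a broadcast
    tolerance_s           -- how long after a broadcast the flag is still
                             considered explained (arbitrary assumption)
    """
    return any(0 <= flag_time_s - sent <= tolerance_s
               for sent in our_broadcast_times_s)
```

A False result is the case worth a look: the RTU heard a broadcast that you did not send.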
Many RTUs have features that allow the RTU to be selective as to which master station addresses are allowed to send traffic to it. Most system integrators want to be lazy and not bother with such things because it’s a hassle. Nevertheless, this feature is worth configuring! It prevents the remote from reacting to other SCADA traffic it doesn’t know about. Purists will argue that this is security through obscurity, and it is. However, it can also prevent problems from other SCADA systems accidentally sending stuff to your RTUs or from casual drive-by attacks.
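The logic behind such a feature is nothing more than an allow-list on the source address. Real RTUs implement this in their own configuration tools; the class below is only an illustration of the idea, with made-up names, and it also counts rejected sources so that someone can see who has been knocking.

```python
from collections import Counter


class MasterAllowList:
    """Accept traffic only from known master station addresses."""

    def __init__(self, allowed_masters):
        self.allowed = frozenset(allowed_masters)
        self.rejected = Counter()  # source address -> number of rejected frames

    def accept(self, source_address):
        """Return True if the frame's source address is an allowed master."""
        if source_address in self.allowed:
            return True
        self.rejected[source_address] += 1
        return False
```

The rejected counter is the useful part for monitoring: a nonzero count for an address you don’t recognize is either a neighbor or a visitor.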
Another flag worth watching is the NEED_TIME flag. Some systems prefer to set the time before this flag is raised. Most use it as an indication that the remote is requesting the time from a master. But it shouldn’t do this too frequently. If the RTU has a local clock reference, it shouldn’t raise this flag at all. If it raises the flag too frequently, something is misconfigured.
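"Too frequently" has to be turned into a number before it can be alarmed on. A sketch of a sliding-window counter follows; the limit of 3 requests per 24 hours is an arbitrary assumption, and the caller is assumed to record only the moment the flag goes from clear to set, since the flag stays set in every response until the time is written.

```python
from collections import deque


class NeedTimeMonitor:
    """Flag an RTU that raises NEED_TIME more often than expected."""

    def __init__(self, max_requests=3, window_s=86400.0):
        self.max_requests = max_requests  # assumed limit per window
        self.window_s = window_s          # assumed window length in seconds
        self._seen = deque()              # timestamps of recent requests

    def record(self, timestamp_s):
        """Record one NEED_TIME request; return True if the rate is excessive."""
        self._seen.append(timestamp_s)
        # Drop observations that have aged out of the window.
        while self._seen and timestamp_s - self._seen[0] > self.window_s:
            self._seen.popleft()
        return len(self._seen) > self.max_requests
```

For an RTU with a local clock reference, max_requests=0 makes any NEED_TIME request an alarm.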
There is a common theme here that you should pay attention to: you cannot remain ignorant of your communications protocols any longer. If you use TCP/IP, you need to know how it works. If you use Ethernet, you need to know how it is framed and switched. If you use DNP3, you need to know what it does and how it does it.
Yes, I am strident about knowing how your SCADA system works. If you are actually attacked, you need to understand what has been compromised. Discovering how your SCADA system works while you’re being attacked is extremely confusing. You also need to know what the precursors to an attack look like. The days when you could remain blissfully ignorant of how your SCADA system does what it does are over. We do not have magic OT fairy dust hardware that we can sprinkle on all of this to make it better. Just as it is with a car, you may not like working on one, but you cannot be a safe driver while remaining ignorant of what is under the hood.
There are still more things one can do to determine SCADA system integrity. I’ll get to those in a future blog. The ones here are the sorts of things that you can do today, just by adding some alarms on your master station.
Keep reading; there will be more goodness to come.
(Updated to fix some grammatical errors and to improve sentence structure)