Hart Energy Publishing

Best practices for SCADA security

With the threat of cyber attack growing in recent years, oil and gas pipeline operators are well advised to look for ways to protect their SCADA systems.

September 1, 2009

With the threat of cyber attack growing in recent years, oil and gas pipeline operators are well advised to look for ways to protect their SCADA systems. To that end, there are a number of best practices operators can adopt to secure their SCADA systems and RTU installations, and to detect and prevent both physical and cyber-related threats. The goal here is to describe the nature of the growing threat, and some products and practices operators can deploy to protect their systems against it.

The nature of the threat

Over the past few years, people have continued to ask this author whether there have been any attacks or intrusions into SCADA systems. The answer is yes. One notable attack occurred in Australia a few years ago, and it remains the most famous. Key to this attack is the fact that it was aimed at the remote sites, hence our focus on this particular aspect of a SCADA system. As reported by news.com.au and other sources, a disgruntled former employee of the contracting firm that installed a computer system for the Maroochy Shire Council, near Brisbane, later hacked into the system. According to a court statement, “He applied for a job with the council but was rejected[,] and later hacked into the council’s sewage control computers, using radio transmissions to alter pump station operations. Up to one million liters of raw sewage flowed into the grounds of the Hyatt Regency Resort at Coolum and nearby Pacific Paradise, where it ended up in a storm water drain.” The court statement went on to describe the extensive environmental damage those attacks caused. Could this sort of attack happen to your system?

Securing RTUs at remote sites

In SCADA systems, the control zones are normally in remote areas, away from control center zones. As a result, they have a number of characteristics that differ notably from those of control centers and plant processes. We will consider both the cyber and physical threats and offer measures for monitoring intrusions as well as preventing them. The term “RTU” will be used for the electronic monitoring and control device at these locations. Please keep in mind that the device could actually be a PAC, a PLC, or a product that goes by some other three-letter abbreviation.

Preventing cyber threats

In many systems, it is simply too easy to gain access via an RTU local serial port or, even worse, a dial-up, radio or other network link that makes the RTU accessible from practically anywhere in the world. How important is this aspect compared to the rest of the SCADA system? In the episode in Australia, the attacker targeted the remote stations by using a radio to access serial ports and was able to operate pumps.

RTU ports fall into one of two groups: local and remote. Local ports are wired directly to nearby equipment such as analyzers, flow meters, pressure transmitters and a PC or other HMI device. Wireless interfaces are becoming more popular for local links, e.g. WirelessHART between an RTU and a pressure transmitter, and Bluetooth between a laptop PC and the RTU.

If the RTU is not in a physically secure zone, a major risk is that anyone can plug in to (or wirelessly access) the local port that is intended for configuration, taking readings and other local operations via a PC.

Unfortunately, it is too easy to say that it is mandatory for the RTU to be physically secure and be done with it. Today’s trend toward wireless communications, even for “local” functions, re-introduces the risk of intrusion because the radio range can extend beyond the physically secure zone. A wireless local link, thus, shares a major risk with a remote port, which is defined as one with a modem, radio or other physical connection to a wide area network.

Since much of a SCADA wide area network is located, both physically and logically, outside of any of the operator’s secure zones, this is a major cause for concern. Authentication has emerged as the cyber security provision-of-choice when it comes to remote port access.

In some cases, protocol standards are being amended to adopt authentication. The DNP (Distributed Network Protocol) Users Group Steering Committee recently ratified a security extension that mandates the authentication of master devices through the use of one-way cryptographic hash functions employing a shared key in order to access critical DNP functions. These critical functions include write, select, operate, direct operate, cold restart, warm restart, initialize application, start application, stop application, enable unsolicited responses, disable unsolicited responses, record current time and activate configuration. Authentication ensures that messages arriving at the RTU come from the control center or another legitimate asset in the SCADA system. Since the SCADA wide area network can be located mostly outside of any security zones, it is also subject to eavesdropping.
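The idea of authenticating a master with a keyed one-way hash can be sketched in a few lines of Python. This is only an illustration of the challenge-response principle, not the DNP3 message format: the function names, the 16-byte challenge, the payload text and the choice of HMAC-SHA-256 are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical pre-shared key; in practice it is provisioned securely in both devices.
SHARED_KEY = b"example-key-known-to-master-and-rtu"


def make_challenge() -> bytes:
    """RTU side: generate a fresh random challenge for a critical request."""
    return secrets.token_bytes(16)


def sign_request(key: bytes, challenge: bytes, request: bytes) -> bytes:
    """Master side: keyed one-way hash over the challenge and the request."""
    return hmac.new(key, challenge + request, hashlib.sha256).digest()


def verify_request(key: bytes, challenge: bytes, request: bytes, tag: bytes) -> bool:
    """RTU side: recompute the hash and compare it in constant time."""
    expected = hmac.new(key, challenge + request, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


challenge = make_challenge()
request = b"OPERATE pump_1 ON"  # illustrative payload, not a real DNP3 frame
tag = sign_request(SHARED_KEY, challenge, request)

print(verify_request(SHARED_KEY, challenge, request, tag))                # genuine master
print(verify_request(SHARED_KEY, challenge, b"OPERATE pump_1 OFF", tag))  # altered command
print(verify_request(b"wrong-key", challenge, request, tag))              # sender without the key
```

Because the RTU issues a new challenge for each critical request, a message recorded by an eavesdropper should not verify when it is replayed later.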

A number of years ago, Bill Rush of the Gas Technology Institute (GTI) proposed SCADA message encryption to address this risk. As Bill pointed out, if someone can eavesdrop and learn to recognize messages, the party can likely also practice “spoofing,” that is, inject commands, which can operate process equipment or corrupt proprietary information.

This is the thrust behind the SCADA encryption standardization effort, which was originally proposed as American Gas Association (AGA) Report No. 12. Since then, the technical standards community has favored authentication over encryption, primarily because it is much less resource-intensive and can more reasonably be retrofitted into existing systems.

In any event, encryption standardization efforts continue and encryption is finding its way into new installations. Some data communication devices, such as radios, offer it as an option. Many IP-based systems use encryption and, for those users replacing direct-wire local links with wireless, it is also a feature of Bluetooth.

Monitoring and detecting cyber threats

At a minimum, the RTU must be able to log all activity on local or modem ports and report it to operators on the SCADA network. The NERC (North American Electric Reliability Corporation) CIP (Critical Infrastructure Protection) 005-1 standard requires 24/7 logging at all access points to the electronic security perimeter.

The Simple Network Management Protocol (SNMP) is emerging as a vehicle for security monitoring in SCADA networks. Traditionally used by IT to monitor components such as routers, servers and switches, SNMP is now being employed to monitor remote sites. For example, such control zone parameters as main power status, battery voltage, cabinet temperature, and door switch status can be reported via SNMP.

Similarly, SNMP can report activity on RTU serial ports. That information can be used for intrusion detection. SNMP operates over TCP/IP links and can function concurrently with other SCADA protocols. While the DNP3 or IEC 60870-5 protocols are used to transfer process or operational information between the SCADA server and the RTUs, SNMP runs over the same physical network in a background mode, transferring “shadow data” that is used for system health monitoring and security.

In one such architecture, a Semaphore RTU is “Industrial Defender Enabled.” The Industrial Defender Risk Mitigation platform is a central monitoring system for the health, status and security state of critical cyber assets. By using Industrial Defender to maintain an ongoing inventory of cyber assets, automatic reporting is provided for CIP-005-1 compliance. The monitoring and reporting feature within Industrial Defender greatly reduces the manual reporting burden on the entity’s IT staff.

Preventing physical threats

Below are a number of measures that can help operators physically secure the RTU installations in their SCADA system.

The best practice for RTU location is to place it in a physically secure area. Risk is significantly decreased if the RTU is installed in a location with access control. Keep information about RTU locations secured. Risk is also significantly decreased if as few people as possible know the location of the RTU in the first place. Similarly, power and network cabling should be kept secure and out of sight. Information on their routing and termination locations should be secured.

In case of a main power failure, the RTU should include adequate battery backup to continue all operations for a time you determine. This time depends on how long you feel it could take to restore main power. Note that this does not mean how long it could take for operators to find out about the problem. The alarm system must inform operators of a main power failure immediately. Typical RTU backup times are between eight and 72 hours—the latter taking three-day, holiday weekends into consideration.

The backup batteries should be secured inside a locked cabinet with ventilation. For outdoor locations, the most appropriate rating is NEMA 3R or IP14. You must periodically maintain the batteries on a schedule provided by the battery supplier. You can expect a maximum lifetime of five years from lead-acid batteries, but you should check them at least once per year. In areas in which temperatures are often at the extremes of the operating range, battery lifetime is significantly reduced. The RTU should continually monitor the batteries and set an alarm if they lose their charge. If their condition is in doubt, replace the batteries.

Include line filters and surge suppression on the power input. Accidentally or otherwise, and battery-backed or otherwise, power problems should not take the RTU out. Always keep RTU cabinet doors closed and secured. Once the door is opened, it is just too easy to cause any number of problems.

If the RTU is not in a physically secure area, then you must keep keypads, pushbuttons, and switches secured. Users should have to open up a door that is secured by access control—which could be as simple as a key lock—in order to access these devices.

Of course, this is all easy to say, but what do you do about an existing installation? In most cases, it has been feasible to secure the room or building in which the RTU is located. In cases where this has been impossible, it was better to secure the RTU inside a locked cabinet or put a gate around it. Ideally, both the room and the RTU enclosure are secured. However, you may have to settle for one or the other.

Finally, be on the alert for innovative methods of disabling the RTU. In other industries, computer equipment has been disabled through the use of fire extinguishers, other chemical spray, excessive dust or sand, flooding, sprinkler systems, radio interference and surges on wiring. Vulnerability assessments must include such scenarios, even though they would likely be far down the list in terms of risk. Best practices in terms of locating and physically securing the RTU should prevent these problems.

Monitoring and detecting physical threats

The RTU should detect entry into the physically secure zone via an access control device, that is, when a door or gate is opened, and alert operators via an alarm. The RTU should also continually monitor main power and report an alarm on main power failure. The device must also be able to report that a user has plugged a handheld device or PC into the local port, or gained access via Bluetooth or another local wireless link. This could be an alarm, but some users simply log it as an event.

Log an event when the user signs on by entering a password. Log an event for each value change the user makes. Operators must be aware that value changes are being made locally. Log an event when the user signs off, and either log an event or clear/reset the alarm when the user unplugs the handheld device or PC. If the user forgets to sign off, the RTU should automatically do this after a set time.
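The local-port logging described above, including the automatic sign-off, might be sketched as follows. The class name, event wording and 600-second timeout are assumptions for illustration only; timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LocalPortSession:
    """Logs each step of a local-port session as a time-stamped event."""
    timeout_s: float = 900.0  # assumed automatic sign-off time
    events: List[Tuple[float, str]] = field(default_factory=list)
    user: Optional[str] = None
    last_activity: float = 0.0

    def _log(self, now: float, text: str) -> None:
        self.events.append((now, text))

    def connect(self, now: float) -> None:
        self._log(now, "ALARM SET: device connected to local port")

    def sign_on(self, now: float, user: str) -> None:
        self.user = user
        self.last_activity = now
        self._log(now, f"EVENT: {user} signed on")

    def change_value(self, now: float, tag: str, old, new) -> bool:
        if self.user is None:
            self._log(now, f"EVENT: change to {tag} rejected, no user signed on")
            return False
        self.last_activity = now
        self._log(now, f"EVENT: {self.user} changed {tag} from {old} to {new}")
        return True

    def sign_off(self, now: float, reason: str = "by user") -> None:
        if self.user is not None:
            self._log(now, f"EVENT: {self.user} signed off ({reason})")
            self.user = None

    def tick(self, now: float) -> None:
        """Called on each RTU scan; signs the user off after inactivity."""
        if self.user is not None and now - self.last_activity >= self.timeout_s:
            self.sign_off(now, reason="automatic timeout")

    def disconnect(self, now: float) -> None:
        self.sign_off(now, reason="device unplugged")
        self._log(now, "ALARM CLEARED: device disconnected from local port")


session = LocalPortSession(timeout_s=600)
session.connect(0)
session.sign_on(5, "tech1")
session.change_value(60, "setpoint_psi", 750, 760)
session.tick(300)   # user still within the timeout
session.tick(660)   # 600 s since last activity: automatic sign-off
session.change_value(700, "setpoint_psi", 760, 900)  # rejected, nobody signed on

for timestamp, text in session.events:
    print(timestamp, text)
```

Every entry in `events` is a candidate for reporting to the SCADA host, so operators see local changes as they are made.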

Clear or reset the alarm when the door closes. What if the user forgets to close the door? The original alarm, set upon opening of the door, should continue to be displayed as a live alarm. As a further provision, you can consider escalating that alarm after a certain time.
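A door alarm that stays live while the door is open and escalates after a set time can be sketched as below. The state names and the 1,800-second escalation time are assumptions for the example; the right delay depends on your site procedures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DoorAlarm:
    escalate_after_s: float = 1800.0   # assumed escalation delay
    opened_at: Optional[float] = None  # time the door was first seen open

    def update(self, door_open: bool, now: float) -> str:
        """Return 'NORMAL', 'ALARM' or 'ESCALATED' for the current scan."""
        if not door_open:
            self.opened_at = None      # the alarm clears only when the door closes
            return "NORMAL"
        if self.opened_at is None:
            self.opened_at = now       # the alarm is set on opening
        if now - self.opened_at >= self.escalate_after_s:
            return "ESCALATED"
        return "ALARM"


alarm = DoorAlarm(escalate_after_s=1800)
states = [
    alarm.update(True, 0),      # door opened
    alarm.update(True, 900),    # still open, alarm remains live
    alarm.update(True, 1800),   # left open too long
    alarm.update(False, 2000),  # door closed
]
print(states)  # ['ALARM', 'ALARM', 'ESCALATED', 'NORMAL']
```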

Coordinate the alarms mentioned thus far with operating procedures. These procedures should include schedules for site visits and ways to keep operators informed about them. Don’t disable alarms just because operators know that a site visit is taking place. Keeping alarming active reinforces procedures and allows the alarms to be kept in a history.

The RTU should not only report alarms over the SCADA network on a priority basis, it should also keep a date-and-time-stamped record of all alarms and events locally in memory. The memory must be non-volatile. RAM must be backed up by a battery; Flash, which does not require battery backup, is now being used more often. Many of today’s RTU products incorporate data logging capability, including maintenance of an alarm/event log. In the gas flow computer business, this is known as the “audit trail.”

One problem with an alarm/event log is a “noisy” alarm condition whose recurring messages fill it up. Not only is this very annoying but, worse, meaningful messages drop out and are permanently lost. In most cases, it is simple to automatically filter out these transitions or disable the alarming characteristic of the misbehaving input.
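One simple way to keep a chattering input from flooding the log is to count its transitions inside a sliding time window and suppress the excess. The sketch below assumes a limit of three transitions per ten seconds purely for illustration; the class and parameter names are hypothetical.

```python
from collections import deque


class ChatterFilter:
    """Suppresses log entries from an input that changes state too often."""

    def __init__(self, max_transitions: int = 5, window_s: float = 60.0):
        self.max_transitions = max_transitions
        self.window_s = window_s
        self._times = deque()   # times of recent transitions
        self.suppressed = 0     # how many entries were kept out of the log

    def allow(self, now: float) -> bool:
        """Return True if a transition at time `now` should be logged."""
        # Forget transitions that have aged out of the window.
        while self._times and now - self._times[0] > self.window_s:
            self._times.popleft()
        self._times.append(now)
        if len(self._times) > self.max_transitions:
            self.suppressed += 1
            return False
        return True


filt = ChatterFilter(max_transitions=3, window_s=10)
results = [filt.allow(t) for t in (0, 1, 2, 3, 4, 30)]
print(results)          # [True, True, True, False, False, True]
print(filt.suppressed)  # 2
```

Reporting the `suppressed` count as a single summary event is one way to keep operators aware of the misbehaving input without losing meaningful messages.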

The alarm/event log is an excellent backup in case of problems with the SCADA host or network, which could cause alarm reports and event logs to be lost. Typically, it allows the user to access all such information, locally. In addition, many RTUs will allow the audit trail, as well as historical averages and totals, to be transmitted to the SCADA host once communication is restored.
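The store-and-forward behavior described above can be sketched with a bounded local log plus a queue of records still waiting to reach the host. The class name, the capacity of 1,000 records and the `send` callback are assumptions for the example; a real RTU would hold the log in non-volatile memory.

```python
from collections import deque


class AuditTrail:
    """Keeps a bounded local log and forwards records missed during an outage."""

    def __init__(self, capacity: int = 1000):
        self.log = deque(maxlen=capacity)      # local copy, oldest dropped when full
        self.pending = deque(maxlen=capacity)  # records not yet delivered to the host

    def record(self, timestamp, text, link_up, send) -> None:
        entry = (timestamp, text)
        self.log.append(entry)
        if link_up:
            self.flush(send)   # deliver any backlog first to preserve order
            send(entry)
        else:
            self.pending.append(entry)

    def flush(self, send) -> int:
        """Send all pending records; return how many were forwarded."""
        count = 0
        while self.pending:
            send(self.pending.popleft())
            count += 1
        return count


host = []  # stands in for the SCADA host's event database
trail = AuditTrail()
trail.record(1, "door open", True, host.append)
trail.record(2, "main power fail", False, host.append)  # network down
trail.record(3, "battery low", False, host.append)      # network still down
forwarded = trail.flush(host.append)                    # communication restored

print(forwarded)  # 2
print(host)
```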

You have seen that many of the security tactics described here involve the use of the RTU for alarm reporting. Please be aware that a common problem with SCADA alarm systems is that engineers are tempted to define too many points as alarms. These quickly become “nuisance” alarms, which are ignored. You should avoid this situation because the alarm system should never lose credibility with operators for any reason. Far worse than that, it creates a situation in which an operator can be easily overloaded and overlook an important development. It is even possible that a security violation can occur because operators are decoyed by a deliberate overload. Your alarm system design should define alarm points as sparingly as possible, and it should use alarm management as a further measure to reduce the quantity of alarms generated from any process or zone.

Finally, for remote site security, using the RTU to report alarms for fire, smoke, water spray or water flooding is also very feasible. The RTU can also be put in the security loop through interfaces with access control devices and video cameras.

Best design practices

Best practice system design calls for provisions in case of various failures (or breaches) of the SCADA system. In case the host computer or network fails, the RTU should independently monitor and control the process. Remote processes, today, should not depend on the availability or performance of the network.

The RTU should continue operating even if the network is jammed or one or more ports are kept busy. While this would amount to a denial-of-service attack on the RTU, we have seen many cases in which the SCADA network was simply overloaded. The multitasking kernels in today’s RTUs prioritize tasks and allow the measurement and control functions to continue even with heavy activity on the network.

You should also consider a redundant network. Competition in the communications industry has brought down prices for hardware that includes cellular radio, licensed radio, spread spectrum radio and wireless Ethernet. Some users will scoff at this because they’ve found that selecting even one network is difficult enough. But, increasingly, users are installing redundant SCADA networks. Most SCADA software will automatically switch over to a standby network if the primary network fails. At the RTU, the standby network uses a separate communication port that is not affected by problems on the primary network port.
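The switchover logic amounts to trying each network in priority order. The sketch below is a simplified illustration, not how any particular SCADA package does it; the link functions are stand-ins invented for the example.

```python
def poll_with_failover(links, request):
    """Try each (name, send) link in priority order; return (name, reply)."""
    errors = []
    for name, send in links:
        try:
            return name, send(request)
        except (TimeoutError, ConnectionError) as exc:
            errors.append(f"{name}: {exc}")  # note the failure, try the next link
    raise ConnectionError("all links failed: " + "; ".join(errors))


def primary_radio(request):
    # Hypothetical primary network that is currently down.
    raise TimeoutError("no reply from RTU")


def standby_cellular(request):
    # Hypothetical standby network on a separate RTU port.
    return b"ACK:" + request


link, reply = poll_with_failover(
    [("primary", primary_radio), ("standby", standby_cellular)],
    b"READ",
)
print(link, reply)  # standby b'ACK:READ'
```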

To detect tampering with process equipment, you can use sanity limits or sanity condition tables to validate commands or process conditions. Even though no RTU includes expert system software, you can still put your expertise in the RTU program, whatever the programming language. If you know that all three compressors shouldn’t be on when line pressure is 800 psi or higher, put that in the RTU. The RTU should know that more than one meter run valve should be open if the run #1 differential pressure is 50 inches of water or less. 

Your first reaction might be that this would add too much complexity to the RTU but some languages make the programming almost as easy as making the statement. If access control is violated and someone manually changes a process equipment setting, the RTU could detect it and report an alarm.
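A sanity condition table can be as simple as a list of rules, each pairing an alarm message with a test on the current process state. The sketch below encodes the compressor example from the text; the state field names and table layout are assumptions, and further rules can be added in the same way.

```python
from typing import Callable, Dict, List, Tuple

# Each rule: (alarm text, function that returns True when the state is NOT sane).
Rule = Tuple[str, Callable[[Dict], bool]]

SANITY_RULES: List[Rule] = [
    (
        "All three compressors running with line pressure at or above 800 psi",
        lambda s: s["line_pressure_psi"] >= 800 and sum(s["compressors_on"]) == 3,
    ),
]


def check_sanity(state: Dict, rules: List[Rule] = SANITY_RULES) -> List[str]:
    """Return the alarm text of every rule the current state violates."""
    return [text for text, violated in rules if violated(state)]


normal = {"line_pressure_psi": 760, "compressors_on": [True, True, True]}
suspect = {"line_pressure_psi": 815, "compressors_on": [True, True, True]}

print(check_sanity(normal))   # []
print(check_sanity(suspect))  # one alarm message
```

Running the check on every scan means a setting changed by hand at the site can raise an alarm even if the change never passed through the SCADA network.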

Finally, best practices for system design call for provisions in case of RTU failure, regardless of security issues. Upon failure, what happens to the control outputs, with or without power, is a basic design issue. If power remains available, many devices allow selection of a “safe mode” for the outputs. Process equipment continues to run in a reasonable manner. You also need a separate provision to cover the case in which the RTU fails and all power is lost.  Equipment run using backup power must have a “safe” default setting. Many users have rock solid procedures for activity at the sites in response to any failure or security breach in the SCADA system. You need to be in this category.

Conclusion

Today, widely available information, together with products and technologies now on the market, allows SCADA system operators to install and maintain very secure systems. Utilities need to be well aware of NERC CIP, which requires compliance in their planning, processes and procedures. Another important standard is ANSI/ISA-99 (Parts 1 and 2). Entitled “Terminology, Concepts and Models,” Part 1 (ANSI/ISA-99.00.01-2007) is part of the overall standard entitled “Security for Industrial Automation and Control Systems.” This standard lays a solid groundwork for upcoming standards on establishing and operating a security program and on technical security requirements, but it is definitely a work in progress. Part 1, which is now available, establishes important common ground in definitions of security-related concepts, assets, risks, threats, and vulnerabilities. Today, users can assess threats, both physical and cyber-related, and implement measures for detection as well as prevention of intrusions and attacks in their SCADA systems.

Acknowledgment

Based on a paper presented at the ENTELEC 2009 Conference and Expo, held April 29-May 1, 2009, in Houston, Texas.