Ensuring Cyber Resiliency for OT Systems


Cyber resilience is the ability for an entity to continuously deliver the intended outcome despite cyber-attacks. In this case, the “entity” could likely be your plant and the “intended outcome” is the results produced by your operational technology (OT) efforts. Stated simply, being cyber resilient means your operations stay in operation even though they may be under cyber-attack.

“Cyberworthiness” is an assessment of the resilience of a system from cyber-attacks. It is applicable to software and hardware elements like standalone software, code deployed on an Internet site, browsers, manufacturing equipment or Industrial Internet of Things (IIoT) devices.

Whether intentional—as in a cyber-attack—or unintentional—as in a failed software update—adverse cyber events negatively impact the availability, integrity, or confidentiality of networked OT and information technology (IT) systems and associated services.
 

Cybersecurity versus cyber resilience

Cyber resilience is designed to prevent systems and networks from being derailed in the event that security is compromised. The manufacturing line, refinery or pipeline “stays” operational. Cyber resilience means that cybersecurity is effective without compromising the usability of OT systems (Figure 1).

Figure 1: Cyber resilience means that cybersecurity is effective without compromising the usability of OT systems.

According to Phil Tonkin, field CTO at Dragos, cybersecurity is concerned with the protection of digital systems, whereas cyber resilience considers the real-world implications of cyber events, extending beyond the digital defense perimeter to encompass the ability of an organization to maintain its core functions and recover swiftly from any form of cyber disruption. “In the world of OT, infrastructure owners as asset managers are concerned with the integrity and reliability of their assets. An electric company needs to worry about keeping a reliable, efficient and clean energy supply to its customers; how they achieve that is resilience. It’s not just protecting the system against compromise but managing the risks of downstream effects.”

Greg Hale, editor and founder of ISSSource, said that resiliency is a plan to find ways to keep the plant/network/system up and running despite an ongoing attack. It is related closely to the business continuity plan. “Cybersecurity, on the other hand, is the overall general idea of protecting assets. The government says resilience entails the ability of a system to anticipate, withstand, recover from and adapt to cyberattacks and natural or accidental disruptions,” he said.

Hale wrote in a recent article in The Source: “A core meaning behind cybersecurity is keeping systems up and running and secure against any kind of attack. But when an organization does suffer a hit, the next step in the ladder of protection needs to be resilience—how to stay up and running no matter the type of assault.”

“Cybersecurity focuses on the implementation of capabilities and controls such as identification, detection, protection and so on, whereas resilience relates to the ability to withstand attacks, bring appropriate response and ability to recover swiftly,” said Mansur Abilkasimov, vice president of Cyber and Product Security Strategy and Governance at Schneider Electric.


Need for cyber resilience is real

Hale points out that one of the classic cases of a lack of cyber resilience is the Colonial Pipeline incident in 2021 (Figure 2). “There was a ransomware attack on the company’s IT department and while OT systems remained up and capable of running, the company shut down completely for about four or five days ‘out of an abundance of caution.’ The real reason was the company’s billing system was run on the IT side and if that was held for ransom, the company could not bill its customers and therefore not make any money, so they had to shut everything down. Even though OT was not affected, they had no plan on what they should do to stay running in case of an attack.”

Figure 2: One of the classic cases of a lack of cyber resilience is the Colonial Pipeline incident.

Roy Kok, senior partner and Alliances specialist CLPA at Mitsubishi Electric Automation Inc., said that cyber resilience becomes an interesting challenge for Mitsubishi Electric going forward “because we’re the only company that’s offering combined networking. Most industrial automation companies have a control network and an information network, the control network being focused on deterministic performance and also being dedicated to doing the control. And then of course, the information network is open to the IT world, performance management, quality and so on.”

With combined networking, cyber resilience is increasingly important. “Our protocol is called CC-Link IE TSN. IE stands for ‘industrial Ethernet.’ TSN [time-sensitive networking] is the enhancement to the Ethernet spec that happened back in 2016, which allows you to have deterministic performance. It’s like setting up a private channel on Ethernet that guarantees that your control will have deterministic performance regardless of anything else on the network. The spec has been enhanced to allow scheduling of communications, which means that devices on a network know when they have an opportunity to speak: traffic shaping.”
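The scheduling idea Kok describes can be illustrated with a toy model. The sketch below is not CC-Link IE TSN or any real TSN configuration: the 1,000-microsecond cycle, the two traffic classes and the names `Window`, `open_window` and `worst_case_wait_us` are all assumptions made for illustration, and frame transmission time is ignored. It shows only the basic principle of a repeating cycle divided into reserved time windows, which is what puts a ceiling on how long control traffic can be kept waiting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    traffic_class: str  # hypothetical label, e.g. "control" or "best_effort"
    start_us: int       # offset from the start of the cycle, in microseconds
    length_us: int      # how long this class may transmit


# Hypothetical repeating cycle: control traffic owns the first 200 us.
CYCLE_US = 1000
SCHEDULE = [
    Window("control", 0, 200),
    Window("best_effort", 200, 800),
]


def open_window(time_us: int) -> str:
    """Return the traffic class allowed to transmit at an absolute time."""
    offset = time_us % CYCLE_US  # the schedule repeats every cycle
    for w in SCHEDULE:
        if w.start_us <= offset < w.start_us + w.length_us:
            return w.traffic_class
    raise ValueError("schedule does not cover the whole cycle")


def worst_case_wait_us(traffic_class: str) -> int:
    """Longest wait for this class's window to open again.

    Simplification: assumes exactly one window per class per cycle, so the
    worst case is a frame that arrives just as its window closes.
    """
    windows = [w for w in SCHEDULE if w.traffic_class == traffic_class]
    if len(windows) != 1:
        raise ValueError("sketch assumes one window per traffic class")
    return CYCLE_US - windows[0].length_us
```

In this toy schedule the control class never waits longer than 800 microseconds for its slot, however much best-effort traffic is queued. The wait is bounded by the schedule, not by the load.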

The tie-in with cyber resilience is Mitsubishi Electric’s push to bring these security efforts to CC-Link and TSN. “By combining the networks,” Kok continued, “there are little things that we take for granted. When you make a device that is compliant with our protocol, you get SNMP [simple network management protocol] support in the device as well. And SNMP lets IT systems ping and communicate with all kinds of endpoint devices. Those endpoint devices used to be isolated on a control network but are now exposed because they’re on a combined network.”

There is greater access to information. “It gives you greater ability to manage all the devices on your network,” said Kok. “Cybersecurity tends to be more important in that world. We're creating the opportunity for smarter machines because you have better communications with every aspect of the machine from its control devices to its PLCs [programmable logic controllers].”

Abilkasimov said the cybersecurity threat landscape is continuously evolving, and as a next step organizations should validate whether their cybersecurity controls can respond to their current environment or threat landscape. Schneider Electric’s cybersecurity resiliency approach is multifaceted. “This strategy starts at the top. The cybersecurity objectives are set by the Global CISO [Chief Information Security Officer], and the implementation of the strategy is carried out by the executive management team as a whole. A key element of the initiatives is the employees, so the resilience strategy includes robust training and education of all its employees. The strategy is a company-wide, risk-informed approach that has preventative (breach readiness) and response (breach resilience) measures in place for potential incidents,” he said.

Schneider’s program includes:

  • Employee training and awareness: The company aims to raise employee cybersecurity awareness, provide relevant training and create a culture to empower employees across IT and OT to act in a secure manner. The training includes an annual baseline awareness course for all employees and role-based trainings for specialized populations including cybersecurity site leaders.
  • Enterprise risk management (ERM) framework: Schneider Electric categorizes and translates cybersecurity risks into business and operational scenarios and exposure. This exposure is communicated with the C-suite to drive investments in risk mitigation initiatives. This framework is aligned to National Institute of Standards and Technology (NIST) Cybersecurity Framework and increases the company’s overall level of cyber resilience.
  • Incident response capabilities: Schneider Electric is constantly testing and improving its capacity to respond to operational disruption, damage to customers, compliance issues and IP theft. Its incident response plans are defined, and stress-tested routinely to ensure preparedness. The Security Operations Center (SOC) operates 24/7/365 and is staffed with security analysts leveraging security incident and event management (SIEM) capabilities with OT scenario-based playbooks and responders.
  • Crisis simulation exercises: Crisis simulations aim at training senior executives through operational roles, enhancing external collaboration and internal coordination and reviewing internal processes around crisis resolution. The company’s simulation activities follow a comprehensive framework with realistic and risk-based scenarios for the best outcomes and learning. The goal is for simulations to go beyond testing and training and focus on examining and improving operational processes while enhancing readiness for future crises through experiential learning.

The combination of these programs ensures that cybersecurity risk is not an afterthought for the organization but rather an intentional practice to ensure cybersecurity resilience.

“Dragos emphasizes the importance of understanding the specific threats and vulnerabilities that could impact critical systems and assets and ensures that important context is built into its technology,” said Tonkin. “This begins with a thorough assessment to identify the ‘crown jewels’ or most critical components of an organization’s operations. Based on this assessment, Dragos advocates implementing controls that are proportionate to the actual threats and vulnerabilities identified.”

For example, a prominent water utility, responsible for managing 20 dams and 2,000 kilometers (1,243 miles) of pipelines, recognized the critical nature of its infrastructure and took steps to adopt a proactive cybersecurity stance to get ahead of potential threats. Audits pinpointed areas that needed improvement, raising leadership’s awareness of the importance of OT cybersecurity.

When seeking a cybersecurity provider, the utility prioritized OT-specific expertise and reputable providers. The water utility adopted the Dragos OT cybersecurity platform to streamline and advance its cybersecurity programs to ensure the secure delivery of water to more than 5,000 commercial customers and enable critical projects in collaboration with industry, mining and government agencies.

The partnership with Dragos has resulted in increased efficiency, productivity and cybersecurity readiness. The utility is prepared to counter evolving cyber threats and plans to expand the footprint of the Dragos Platform in the future by adding sensors at prioritized sites.


Automate—with caution

In an ISAGCA blog post titled “The Danger of Overreliance on Automation in Cybersecurity,” Zac Amos, features editor at ReHack and frequent contributor to the ISAGCA Blog, wrote: “Automation is critical in enhancing cybersecurity efforts, and speed is one of its key benefits. Most cyberthreats spread quickly, such as ransomware or worm attacks, and automated systems can detect and respond to them much faster than humans can. AI [artificial intelligence] also ensures consistency because it can do repetitive tasks with high accuracy. However, it’s easy to rely too heavily on automation to provide cybersecurity. The volume of logs, alerts, and incidents is multiplying exponentially, and automated tools can analyze vast amounts of data without getting overwhelmed. This can be a double-edged sword, though. Companies should have a healthy balance of tech and human talent when keeping systems safe.”

Amos warns that some of the dangers of being overly dependent on automation in cybersecurity include a false sense of security, false positives and/or negatives, lack of context, reduction in human expertise and reliability concerns, to name a few. “Believing that automated systems will catch every threat can make organizations complacent. No system is perfect, and new, unforeseen threats are always emerging,” he said.

“Automated systems can generate false positives, which can desensitize security teams if they happen frequently,” Amos said. “Conversely, false negatives, where a genuine threat goes undetected, can have severe implications.” In addition, “automated systems lack the human intuition and context needed to evaluate the risk and importance of a particular alert. A seasoned security expert can differentiate between a benign activity that looks suspicious and a genuine threat. Over-relying on automation reduces the need for human experts, which means an organization might have fewer experts who fully understand the system. This can be dangerous if things fail or are compromised.”
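The trade-off between false positives and false negatives can be put into numbers. The short sketch below is illustrative only: the function name `alert_quality` and the weekly counts are invented for the example and do not come from Amos or any real tool. Precision is the share of raised alerts that were genuine threats, and recall is the share of genuine threats that raised an alert.

```python
def alert_quality(true_pos: int, false_pos: int, false_neg: int) -> dict:
    """Summarize how trustworthy and how complete a set of alerts was.

    true_pos  - alerts that were genuine threats
    false_pos - alerts that turned out to be benign activity
    false_neg - genuine threats that raised no alert
    """
    flagged = true_pos + false_pos   # everything the tool alerted on
    actual = true_pos + false_neg    # everything that was really a threat
    return {
        "precision": true_pos / flagged if flagged else 0.0,
        "recall": true_pos / actual if actual else 0.0,
    }


# Hypothetical week: 400 alerts, 8 of them real, and 2 threats missed.
week = alert_quality(true_pos=8, false_pos=392, false_neg=2)
```

With these made-up counts, only 2% of alerts are genuine (the desensitizing effect Amos describes), while 20% of real threats still go undetected. Tuning a tool to improve one figure often worsens the other, which is one reason human review remains part of the loop.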

Reliability is always a concern when using automation to bolster cyber resilience. “Like any technology, automated systems can fail. Overreliance without redundancy can lead to exposure when these systems experience downtimes,” Amos said.


Becoming cyber resilient: awareness

When it comes to cyber resilience, the biggest difference now from three or four years ago is awareness. “Companies understand they can’t fight off all attacks and some will get in. What kind of plan they have and how they approach it remains up to the individual company,” said Hale.

Hale said that organizations’ approach must shift from a futile quest for absolute invulnerability to a more realistic strategy of resiliency in which they can control the impacts of failures. Resilience means organizations need to identify the most critical assets and determine what they find as an acceptable return to operations. “Today, organizations are more aware and more tuned into the idea that attacks are going to happen so they better be protected and then understand, and have a plan, as to what they should be doing and what should happen if an attack makes it in and starts to create issues. This is also where quality segmentation and micro segmentation come into play… Three years ago, they were running around putting out fires and trying to ward off attacks. Today, companies have realized attacks are going to happen, so let’s figure out what are the most important areas we need to protect and then create a plan around that.”
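One way to make “identify the most critical assets and determine an acceptable return to operations” concrete is a simple register that can be checked after a recovery drill. The sketch below is a minimal illustration under stated assumptions: the asset names, the 1-to-5 criticality scale, the outage targets and the drill results are all hypothetical, and it is not a method attributed to Hale or to any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    criticality: int         # hypothetical scale: 1 (low) to 5 (crown jewel)
    max_outage_hours: float  # acceptable time to return to operation


# Hypothetical plant register.
ASSETS = [
    Asset("historian", 2, 72),
    Asset("safety_plc", 5, 1),
    Asset("scada_server", 5, 4),
    Asset("billing", 3, 24),
]


def recovery_priority(assets):
    """Order assets for recovery: most critical first, then tightest target."""
    return sorted(assets, key=lambda a: (-a.criticality, a.max_outage_hours))


def missed_targets(assets, drill_hours):
    """Names of assets whose drill recovery time exceeded the target.

    An asset with no drill result counts as a miss, since its recovery
    has never been demonstrated.
    """
    return [
        a.name
        for a in assets
        if drill_hours.get(a.name, float("inf")) > a.max_outage_hours
    ]


# Hypothetical drill results, in hours to restore each asset.
DRILL = {"safety_plc": 0.5, "scada_server": 6, "billing": 20}

order = [a.name for a in recovery_priority(ASSETS)]
gaps = missed_targets(ASSETS, DRILL)
```

Here the recovery order comes out as safety PLC, SCADA server, billing, historian, and the drill exposes two gaps: the SCADA server took longer than its four-hour target and the historian was never tested. Both are the kind of finding a resilience plan is meant to surface before a real incident does.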

“Industry is maturing in its understanding of cybersecurity. Gone are the days of lacking broad attention for the topic when it was viewed as a technical issue rather than a strategic one,” said Tonkin. “Today, the subject of managing cyber risks to improve operational integrity and resilience is becoming much more aligned with the overall risk management of organizations. This maturation in approach reflects a deeper understanding of the interconnectedness between cybersecurity and business continuity. Organizations are now more proactive in identifying and protecting critical assets, assessing vulnerabilities and implementing comprehensive cybersecurity measures that support resilience. This includes not just technological solutions but also organizational and procedural changes to enhance the ability to withstand and recover from cyber incidents.”

This column originally appeared in the April 2024 issue of InTech digital magazine.

About The Author


Jack Smith is senior contributing editor for Automation.com and InTech digital magazine, publications of ISA, the International Society of Automation. Jack is a senior member of ISA, as well as a member of IEEE. He has an AAS in Electrical/Electronic Engineering and experience in instrumentation, closed loop control, PLCs, complex automated test systems and test system design. Jack also has more than 20 years of experience as a journalist covering process, discrete and hybrid technologies.
