SINGAPORE – The main cause of two recent IT system outages at public healthcare institutions was the failure of hardware devices in data centres and, based on investigations so far, there have been no indications of security compromise to the affected systems.
Senior Minister of State for Health Janil Puthucheary revealed this in Parliament on Monday, where he outlined what was known so far and said the Ministry of Health was investigating the incidents.
Public healthcare IT infrastructure is housed in more than one data centre to enhance resilience and provide redundancy. Within this infrastructure, there are multiple nodes. Nodes are hardware devices that operate in tandem so that if one fails, its share of the data traffic is taken over by the other nodes.
This is to ensure that operations remain uninterrupted, and this system has generally been working well until the recent outages, Dr Janil said.
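The failover behaviour Dr Janil described, in which the remaining nodes absorb the traffic of a failed one, can be illustrated with a minimal sketch. This is not a model of the actual healthcare infrastructure: the node names, the number of nodes, the load figure and the even split of traffic are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical hardware node that is either healthy or failed."""
    name: str
    healthy: bool = True


@dataclass
class Cluster:
    """A hypothetical group of nodes sharing one traffic load."""
    nodes: list[Node] = field(default_factory=list)

    def fail(self, name: str) -> None:
        # Mark the named node as failed; other nodes are untouched.
        for node in self.nodes:
            if node.name == name:
                node.healthy = False

    def distribute(self, total_load: float) -> dict[str, float]:
        # Split the load evenly across healthy nodes (an assumed policy).
        # An empty result means no node is left to serve traffic.
        alive = [node for node in self.nodes if node.healthy]
        if not alive:
            return {}
        share = total_load / len(alive)
        return {node.name: share for node in alive}


# Illustrative figures only: four nodes sharing 120 units of traffic.
cluster = Cluster([Node("node-a"), Node("node-b"), Node("node-c"), Node("node-d")])
before = cluster.distribute(120.0)

cluster.fail("node-a")
after_one_failure = cluster.distribute(120.0)

print(before)
print(after_one_failure)
```

In this sketch, each surviving node carries a larger share once a node fails, and service continues as long as at least one node remains healthy. It does not capture what happens when a restoration attempt itself goes wrong.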
On Aug 25, one of the nodes failed. Another node failed on Aug 26, but the systems and services continued functioning. On Aug 27, engineers tried to restore the two failed nodes, but the attempt was unsuccessful. This led to the outage that day.
The outage on Sept 5 was caused by the simultaneous failure of two further nodes.
The hardware that was involved is also used in other parts of government systems, but in relatively limited numbers, Dr Janil said.
The Government is working with the various agencies involved in public sector technology, including the Cyber Security Agency, the Government Technology Agency and others, to share information. They are looking to scan the various infrastructures to detect potential vulnerabilities.
The outage on Aug 27 affected 26 IT applications including electronic medical records, appointments, pharmacy and laboratory systems. A total of 17 public healthcare institutions, including acute care hospitals, community hospitals, specialist outpatient clinics and all polyclinics, were affected.
On Sept 5, another outage affected eight public healthcare institutions and two of the three polyclinic groups. Recovery of the system took longer this time, hence operations and services were switched to the back-up infrastructure, Dr Janil said.
Both incidents had a significant impact on operations. Staff had to work doubly hard, patients experienced longer wait times of up to an hour at the affected institutions, and some even had their outpatient appointments rescheduled.
Fortunately, there was no compromise to urgent care services and nobody was turned away from the emergency departments, Dr Janil said.
The failure of the nodes was caused by bugs in the firmware of the devices, Dr Janil said. The bugs have since been identified by the manufacturer, Cisco, and the devices have been patched.
The two further nodes that failed simultaneously on Sept 5 were of the same model from the same manufacturer, but the way in which they failed was noted to be different from the Aug 27 incident.
It was assessed that it would take longer to restore operations, hence the decision was made to switch operations to the back-up systems. The root cause of why these two nodes failed is still under investigation, Dr Janil said.