Major internet outage

Major incident · Network Core · Nationwide Ethernet · Nationwide FTTx · Basingstoke · Farnborough · Langley · Rothwell
2023-10-02 16:10 BST · 3 hours, 47 minutes

Updates

Post-mortem

This incident was caused by third-party engineers working in our rack space. They were installing new fibre optic SFPs into unused slots in our equipment.

As the engineers shut the rack doors, the router detected a power-off request from its on-board power button, causing the system to shut down.
As this wasn’t a clean shutdown, the rest of the network suffered several BGP flaps and timeouts. The external peers were configured with hold times of 180 seconds, which caused traffic to be black-holed for 3 minutes after the router had shut down.
Engineers reacted promptly, directing traffic over alternative routers.
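
For readers wondering why a 180-second hold time translates into roughly 3 minutes of black-holed traffic, the arithmetic can be sketched as below. This is an illustration only, not a description of our configuration: the 60-second keepalive interval is an assumption (one third of the hold time is a common default), and the function name `blackhole_window` is hypothetical. It also assumes an otherwise idle session where only keepalives reset the peer's hold timer.

```python
from __future__ import annotations


def blackhole_window(hold_time_s: int, keepalive_interval_s: int) -> tuple[int, int]:
    """Return (shortest, longest) time in seconds that a peer keeps
    forwarding traffic to a router that stopped without closing its
    BGP session.

    The peer restarts its hold timer each time it receives a keepalive,
    and only withdraws the routes once that timer expires.
    """
    if not 0 < keepalive_interval_s < hold_time_s:
        raise ValueError("keepalive interval must be positive and below the hold time")
    # Router died just before its next keepalive was due: the peer's
    # timer has already been running for one full keepalive interval.
    shortest = hold_time_s - keepalive_interval_s
    # Router died just after sending a keepalive: the peer waits
    # for the entire hold time.
    longest = hold_time_s
    return shortest, longest


# 180 s is the hold time from this incident; 60 s is an assumed keepalive interval.
shortest, longest = blackhole_window(hold_time_s=180, keepalive_interval_s=60)
print(f"Peers detect the failure after {shortest} to {longest} seconds")
```

Under these assumptions, each external peer would have continued sending traffic to the powered-off router for between 2 and 3 minutes, which matches the black hole duration described above.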

During the investigation, the router was powered back up, at which point it began announcing routes to our upstream providers. However, its internal BGP session did not establish correctly, causing a further black hole event for some traffic that lasted approximately 4 minutes whilst this was resolved.
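
The failure mode in this second event (announcing routes externally before the internal sessions are up) can be expressed as a simple readiness check. The sketch below is hypothetical and is not the change we are deploying: the `BgpSession` type, the `safe_to_announce` function and the peer names are all invented for illustration. The state names follow the standard BGP session states.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BgpSession:
    peer: str   # hypothetical peer name
    kind: str   # "ibgp" (internal) or "ebgp" (external)
    state: str  # BGP session state, e.g. "Idle", "Active", "Established"


def safe_to_announce(sessions: list[BgpSession]) -> bool:
    """Return True only if the router has at least one internal session
    and every internal session is Established.

    A router that announces routes upstream without working internal
    sessions attracts traffic it cannot deliver.
    """
    ibgp = [s for s in sessions if s.kind == "ibgp"]
    return bool(ibgp) and all(s.state == "Established" for s in ibgp)


# Example resembling the second event: upstream session up, internal session not.
after_power_up = [
    BgpSession(peer="upstream-a", kind="ebgp", state="Established"),
    BgpSession(peer="core-1", kind="ibgp", state="Active"),
]
print(safe_to_announce(after_power_up))
```

With the internal session stuck in a non-Established state, the check returns `False`, which is the condition under which announcements to upstream providers would ideally be held back.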

The router was shut down again as soon as the issue was noticed. The restoration of the network was completed last night during an emergency change window.

The network has been stable and operating as expected since this work. There was no interruption to traffic flow whilst this work was completed.

This issue was caused by human error on the part of a third party, and we are developing changes to the infrastructure to reduce the impact should this happen again.

If you have further questions or concerns, please do reach out to us.

October 3, 2023 · 08:54 BST
Resolved

This incident has been resolved.

Our NOC team will continue to monitor the network as part of business as usual. Any changes that will aid network stability will be rolled out in due course, with notifications via this website.

October 2, 2023 · 19:56 BST
De-escalate

Upstream network transit is back online.

We are continuing to investigate the root cause.

Further work will follow from this incident. We apologise for any inconvenience caused.

October 2, 2023 · 17:51 BST
Issue

There is an ongoing incident which has caused loss of internet connectivity from our London infrastructure.

We are in the process of restoring connectivity. Apologies for the outage.

More updates as we investigate.

October 2, 2023 · 15:52 BST
