Episerver Customer-Centric Digital Experience Platform (DXP; formerly Digital Experience Cloud™ Service - DXC Service) is the cloud-based offering from Episerver based on Microsoft cloud technology. It is a solution that delivers high availability and performance, easy connectivity with other cloud services and existing systems, the ability to manage spikes in customer demand, and a platform that is ready to seamlessly adopt the latest technology updates.
Starting on 25th February 2020, a limited subset of customers using instances of App Service hosted in North Europe may have experienced HTTP 503 responses on their websites. Root cause analysis has been provided by Microsoft, and the following report describes additional details around the event.
Between 19:14 UTC on 2020-02-25 and 09:46 UTC on 2020-02-26, a subset of customers using App Service hosted in North Europe may have experienced HTTP 503 response codes when accessing App Service.
2020-02-25 19:14 UTC - First alerts for client websites are received and investigation is initiated by Episerver.
2020-02-25 19:46 UTC - Support ticket raised with Microsoft.
2020-02-25 21:05 UTC - Issue mitigated for the initial customers who were impacted.
2020-02-25 23:12 UTC - StatusPage updated.
2020-02-26 06:16 UTC - Mitigation efforts ongoing by Episerver & Microsoft for remaining impacted Clients.
2020-02-26 09:46 UTC - Mitigation was completed successfully. Issue was monitored.
2020-02-27 08:41 UTC - Incident closed. Microsoft continues the investigation to establish the full root cause.
2020-03-12 07:51 UTC - Microsoft officially provided root cause analysis.
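For reference, the durations implied by the timeline above can be worked out with a short Python sketch. The timestamps are copied from the timeline; the helper and variable names are illustrative only and are not part of the original report.

```python
from datetime import datetime, timezone


def utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' timeline entry as a UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


# Timestamps taken from the timeline entries above.
first_alert = utc("2020-02-25 19:14")
initial_mitigation = utc("2020-02-25 21:05")
full_mitigation = utc("2020-02-26 09:46")

# Time from the first alert until the initially impacted customers were mitigated.
time_to_initial_mitigation = initial_mitigation - first_alert

# Total window during which customers may have seen HTTP 503 responses.
impact_window = full_mitigation - first_alert

print(time_to_initial_mitigation)  # 1:51:00
print(impact_window)               # 14:32:00
```

The impact window spans roughly 14.5 hours, of which the first 1 hour 51 minutes passed before the initially impacted customers were mitigated.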
The issue happened because of a platform change. Microsoft engineers determined that the issue was related to a configuration update occurring as part of a deployment on Microsoft Azure. The update caused the data roles to switch, which in turn caused a null token exception. Once the deployment was completed, the error disappeared.
A subset of customers may have seen HTTP 503 response codes while accessing App Service.
Since the root cause was discovered, the necessary fixes have been implemented to prevent the issue from recurring.
Microsoft is continuously taking steps to improve the Microsoft Azure Platform and their processes to help ensure such incidents do not occur in the future. In this case, this includes (but is not limited to):
• Improving resiliency to these types of errors to allow for graceful error reporting and potential recovery paths.
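
As a rough illustration of what "graceful error reporting and potential recovery paths" could mean for a missing token, the following Python sketch retries a token lookup a bounded number of times and, if the token is still missing, returns an explicit, retryable HTTP 503 instead of letting a null reference surface as an unhandled exception. This is a hypothetical sketch only: the function names, the retry settings, and the `Retry-After` value are assumptions for illustration and do not describe Microsoft's actual implementation or fix.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# (status code, headers, body); a simplified stand-in for a real HTTP response.
Response = Tuple[int, Dict[str, str], str]


def fetch_token_with_retry(
    get_token: Callable[[], Optional[str]],
    attempts: int = 3,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Hypothetical recovery path: retry the lookup before giving up."""
    for attempt in range(attempts):
        token = get_token()
        if token is not None:
            return token
        if attempt < attempts - 1:
            sleep(delay_seconds)  # brief pause before the next attempt
    return None


def handle_request(get_token: Callable[[], Optional[str]], **retry_options) -> Response:
    """Hypothetical handler: report a missing token explicitly."""
    token = fetch_token_with_retry(get_token, **retry_options)
    if token is None:
        # Graceful error reporting: a clear, retryable response rather than
        # an unhandled null token exception.
        return 503, {"Retry-After": "30"}, "Token unavailable, please retry shortly"
    return 200, {}, "OK"
```

With this shape, a transient gap (for example, a token that becomes available on the second attempt) is absorbed by the retry, while a persistent gap is reported to the caller with a hint about when to retry.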
We apologize for the impact to affected customers. We have a strong commitment to delivering high availability for our services and we will do everything we can to learn from the event and to avoid a recurrence in the future.