DXP Management Portal (PaaS) deployment issue
Incident Report for Optimizely Service
Postmortem

SUMMARY

Episerver Customer-Centric Digital Experience Platform (DXP; formerly Digital Experience Cloud™ Service - DXC Service) is Episerver's cloud-based offering, built on Microsoft cloud technology. It delivers high availability and performance, easy connectivity with other cloud services and existing systems, the ability to manage spikes in customer demand, and a platform that is ready to seamlessly adopt the latest technology updates.

On November 26, 2020, a subset of DXP customers experienced timeouts during deployments. The following information provides details of this incident.

DETAILS

A limited subset of deployments started to time out. Most deployments continued to run properly.

TIMELINE

November 26, 2020

08:46 UTC: First deployment is affected by an Azure Automation incident in the West Europe region.

10:47 UTC: Incident is escalated to the Episerver engineering team.

11:37 UTC: Active Automation account is switched to North Europe.

11:41 UTC: Episerver publishes a notification about the issue to DXP Management Portal users.

13:43 UTC: No more timeouts seen with deployments and the incident is closed.

ANALYSIS

All of the deployments that timed out were located in West Europe, so we suspected a problem within that region.

IMPACT

A subset of customers experienced timeouts during deployments to their DXP environments. Most deployments ran properly, and retrying a timed-out deployment worked as a workaround.
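
For readers who script their deployments, the retry workaround can be sketched as below. This is a minimal illustration only, not Episerver's tooling: `DeploymentTimeout`, `deploy_with_retries`, and the `start_deployment` callable are hypothetical names, and the attempt count and delays are arbitrary assumptions.

```python
import time


class DeploymentTimeout(Exception):
    """Hypothetical error raised by the caller's code when a deployment times out."""


def deploy_with_retries(start_deployment, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call start_deployment until it succeeds or max_attempts is reached.

    start_deployment is any zero-argument callable supplied by the caller
    (an assumption for this sketch); it should raise DeploymentTimeout on timeout.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return start_deployment()
        except DeploymentTimeout:
            if attempt == max_attempts:
                # Out of attempts: surface the last timeout to the caller.
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so that the backoff can be exercised without real waiting. Only timeouts are retried; any other error propagates immediately.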

CORRECTIVE MEASURES

Deployments previously ran in both North Europe and West Europe. To mitigate the issue, they were changed to run only in North Europe.
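
The region change described above amounts to removing the affected region from the set that deployments may run in. The sketch below illustrates that idea in general terms; it is not the actual DXP or Azure Automation configuration, and `pick_active_regions` is a hypothetical helper.

```python
def pick_active_regions(configured_regions, unhealthy_regions):
    """Return the configured regions that are not marked unhealthy, in order.

    Both arguments are plain collections of region names; how a region is
    judged unhealthy is outside the scope of this sketch.
    """
    active = [region for region in configured_regions if region not in unhealthy_regions]
    if not active:
        # Refuse to continue with no region left to run deployments in.
        raise RuntimeError("no healthy region available for deployments")
    return active


# With West Europe marked unhealthy, only North Europe remains active.
regions = pick_active_regions(["North Europe", "West Europe"], {"West Europe"})
```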

FINAL WORDS

We place the utmost importance on, and take pride in, achieving and sustaining the highest level of availability for our customers, and we regret any disruption in service you have experienced. We continue to work tirelessly to ensure that service disruptions are prevented or mitigated, and we will use this incident to further these efforts so that you receive a reliable and positive experience.

Posted Dec 11, 2020 - 08:58 UTC

Resolved
This incident has been resolved.
Posted Nov 26, 2020 - 13:49 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 26, 2020 - 11:40 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Nov 26, 2020 - 11:34 UTC
Investigating
We are currently investigating an issue with DXP deployments.
Posted Nov 26, 2020 - 11:09 UTC