Between Oct 18th and Oct 21st, Galaxy’s user interface and deployment functionality were sporadically impaired, for as long as 1.5 hours at a time. During these periods, some users were not able to deploy new applications, deploy updates to existing applications, or load the interface that shows the status of their applications. Applications that were already running were not affected.
We have identified the root cause of the issue. A code release that added more instrumentation degraded the performance of the Galaxy management application. We identified the problematic code and rolled out fixes on Friday, Oct 21st.
Going forward, we have identified improvements to our monitoring that will catch performance degradations like these earlier. We have also improved our performance analysis tooling to reduce the time needed to identify the cause of a similar event. We apologize for the service interruption, and we are following up on these issues to ensure Galaxy’s uptime improves in the future.