Elevated level of application errors
Incident Report for Core Practice
Resolved
Summary of Impact: On 28 August 2024, between 6:10 PM and 6:40 PM (AEST), a significant number of users intermittently encountered application errors that prevented them from accessing the Core Practice system.

Root Cause: Upon investigation, we identified that a small number of dependency requests were returning 500 errors, preventing the Core Practice application from loading successfully. The issue was exacerbated by the recent migration of Core Practice to a new infrastructure architecture and took longer than expected to diagnose. We found that some server pools were functioning as expected, and we rerouted all traffic to these unaffected pools. Further investigation revealed that one of the servers in the affected pool had failed to load, so we rebuilt that server.

Mitigation: The Core Practice team is committed to improving our infrastructure through the new server migration, which enhances security and provides redundancy by routing traffic to healthy pools. We are currently working to improve our emergency procedures to reduce switchover downtime, and we are reviewing our processes so that repairs can be automated without manual intervention. We apologize for the inconvenience caused and thank you for your cooperation.
Posted Aug 28, 2024 - 21:04 AEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Aug 28, 2024 - 19:00 AEST
Update
We are continuing to work on a fix for this issue.
Posted Aug 28, 2024 - 18:54 AEST
Identified
We have identified the affected servers and rerouted all traffic to unaffected servers. We are currently investigating the root cause.
Posted Aug 28, 2024 - 18:48 AEST
Investigating
We're experiencing an elevated level of application errors and are currently investigating the issue.
Posted Aug 28, 2024 - 18:27 AEST
This incident affected: Core Practice Application.