Myphoner - Notice history

Web Application - Operational

100% - uptime
Feb 2021: 100.0% · Mar 2021: 100.0% · Apr 2021: 100.0%

Myphoner Voice - Operational

100% - uptime
Feb 2021: 100.0% · Mar 2021: 100.0% · Apr 2021: 100.0%

Marketing and Documentation Pages - Operational

Support Widget/Chat - Operational

Support Backend - Operational

Notice history

Apr 2021

Connections closed without response
  • Update

    First, our apologies for the inconvenience this incident has caused our customers and their clients. During our investigation, we found that related issues had also occurred on Thursday last week, but at a smaller scale, and they went undiscovered by us. We understand how important a reliable platform is for our customers, and we do everything we can to build, scale and maintain a stable and fast product. We have now found the root cause of the issues, corrected it, set up monitoring and alarms, and adjusted relevant procedures to avoid similar issues in the future. We have included the more detailed story below for anyone interested.

    On Monday afternoon, we became aware that customers were occasionally getting errors when trying to access the app. We started investigating and were puzzled by the symptoms, which led us to believe we were dealing with an issue in the underlying infrastructure, and we escalated to our infrastructure partner within 15 minutes of first becoming aware of the issue. There were no load issues, no spikes, and no other anomalies or outliers in our monitoring tools. We continued investigating, trying different remedies, and eventually found that spawning more web server instances seemed to resolve the issue, even though the web instances already running were nowhere near capacity.

    Over the following day, as we continued to look for the root cause, our infrastructure partner politely pointed us toward an interesting error message in the logs, which led us to discover an application server config with an outdated default setting for its maximum number of worker connections. Our application server had gone out of sync with the HTTP/reverse proxy server, which caused connections to be dropped before they reached the application stack. That is why our normal monitoring tool did not reveal the issues.

    Earlier today we deployed an updated configuration to the application server and scaled the number of web instances back down, carefully monitoring the impact. As expected, our servers are now serving all requests without error. In the future, all default settings will be checked and updated accordingly whenever we update the application server. In addition, we have set up monitoring of error response codes that occur before the application stack.

  • Resolved

    We have had no errors for an hour and a half and are therefore closing this issue. We are continuing to monitor systems and to work with our infrastructure partner on the details of the issues.

  • Monitoring

    Errors have been declining steadily, and we have had none in the past half hour. We are continuing to monitor all systems closely and are waiting for feedback from our infrastructure partner on the details of the issues.

  • Update

    We are still seeing occasional errors and are working with our infrastructure provider to identify and resolve them. The next update will be in one hour unless we have something new to add.

  • Update

    We have been ruling out issues with the app itself and continue to monitor closely. We suspect issues with the underlying infrastructure of Myphoner and are working with our infrastructure provider to identify the problem more specifically.

  • Update

    We are continuing to investigate and have also escalated the issue to our infrastructure provider, as we are seeing indications that it could be an infrastructure issue.

  • Update

    We are continuing to investigate this issue.

  • Investigating

    We are currently investigating reports of long response times and timeouts for some customers.
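
The post-incident update for the April 2021 incident mentions monitoring error response codes that occur before the application stack. The following is a minimal sketch of how such a check could look, not a description of Myphoner's actual monitoring. It assumes the reverse proxy writes access logs in the common/combined log format and that a 1% share of 5xx responses is a reasonable alert threshold; both are illustrative assumptions, and all function names are hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it in a
# common/combined log format line, e.g. '"GET /leads HTTP/1.1" 502 157'.
# The log format is an assumption; adjust the pattern to the real proxy logs.
STATUS_RE = re.compile(r'"[A-Z]+ \S+ [^"]*" (\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class ('2xx', '5xx', ...) in access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)[0] + "xx"] += 1
    return counts


def error_ratio(counts):
    """Return the share of 5xx responses among all counted responses."""
    total = sum(counts.values())
    return counts["5xx"] / total if total else 0.0


def should_alert(lines, threshold=0.01):
    """Return True when the 5xx share exceeds the (hypothetical) threshold."""
    return error_ratio(count_status_classes(lines)) > threshold
```

Because the counts come from the proxy's own log rather than from the application, a check like this can surface connections that were rejected before any application-level monitoring had a chance to see them.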

Mar 2021

No notices reported this month

Feb 2021 to Apr 2021
