Elevated timeouts and errors

Uptime Impact: 9 minutes
Resolved
Updated

I am following up on this morning's incident.

An abnormal amount of traffic was passing through our cache and placing an unexpected load on our origin servers. This was enough to upset our database connection and cause increased request errors and timeouts for around 9 minutes.

Our edge cache and origin shield usually protect us from this kind of issue. However, it seems that some malformed requests coming from a particular source were slipping through the net.

We've now made some configuration changes to patch this hole, so it shouldn't cause us any further issues.

Robert Rawlins
Resolved

We've not seen any further issues with performance or availability, so I'm now treating this incident as resolved.

I think we've managed to identify what caused the earlier wobble, but the team continue to work on this, so I shall post again once we have a more complete picture.

Robert Rawlins
Recovering

We saw an increased number of timeouts starting about ten minutes ago, at around 11:10 UTC. Things look to be back to normal for now, but we're digging in to try and find out what caused the initial instability.

Robert Rawlins
Began at:

Affected components
  • Management UI
  • REST API
  • Monitoring Automation
    • Inbound Mail
    • Pingdom Sync
  • Message Distribution
    • Email
      • Email by Sorry™
      • MailChimp
      • Mailgun
      • SendGrid
      • Postmark
    • Microsoft Teams
    • Slack
    • SMS
    • Twitter
    • Website Plugin
    • Intercom Messenger App