Summary

On July 18, 2023, Hookdeck's ingestion service was disrupted by an outage of Cloudflare's Workers KV store, which caused an increase in HTTP 500 responses at our ingestion endpoints. The issue began at 05:49 UTC and was resolved by 07:08 UTC, a window of 1 hour and 19 minutes. It was contained to a handful of sources and localized mainly in the UK data center, with some impact in the Sydney data center. During that interval, 1.23% of our total traffic was affected.

Timeline

Root Cause

The increase in 500 errors at ingestion was traced to Cloudflare's Workers KV store, which began returning an elevated rate of HTTP 500 errors. Hookdeck relies on Workers KV to retrieve ingestion endpoint preferences, integrations, and quotas, so failed reads from the store surfaced as 500 responses at our ingestion endpoints.
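
The failure mode described above can be illustrated with a minimal sketch. This is not Hookdeck's actual code: `FakeKV`, `SourceConfig`, `KVUnavailable`, and `ingest` are hypothetical names, and the in-memory store only stands in for a remote key-value store. It shows how a request that depends on a configuration read ends up as a 500 when that read fails.

```python
from dataclasses import dataclass


class KVUnavailable(Exception):
    """Raised by the stand-in store when a read fails (hypothetical)."""


@dataclass
class SourceConfig:
    # Hypothetical shape; the field names are assumptions for illustration.
    preferences: dict
    integrations: list
    quota_per_minute: int


class FakeKV:
    """In-memory stand-in for a remote key-value store."""

    def __init__(self, data, healthy=True):
        self._data = data
        self.healthy = healthy

    def get(self, key):
        # Simulate an upstream outage: every read fails while unhealthy.
        if not self.healthy:
            raise KVUnavailable(f"read failed for {key!r}")
        return self._data.get(key)


def ingest(kv, source_id):
    """Return an HTTP-like status code for one ingestion request."""
    try:
        config = kv.get(source_id)
    except KVUnavailable:
        # Without the source's configuration the request cannot be
        # processed, so the store's failure becomes a 500 at ingestion.
        return 500
    if config is None:
        return 404  # unknown source
    return 200
```

With a healthy store, `ingest(kv, "src_1")` returns 200 for a known source. Setting `kv.healthy = False` makes the same call return 500, which mirrors the dependency described in this section.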

Lessons Learned

What went wrong

Where we got lucky

Corrective actions

Conclusion

We understand the crucial role we play in our customers' operations and sincerely apologize for the inconvenience this outage may have caused. We take this incident very seriously and are taking proactive measures to prevent similar occurrences in the future. We appreciate your understanding and trust as we continue working to provide a reliable service.