I'm wondering if this is related to the Garmin outage rather than being a Strava issue? Once Garmin came back up it probably caused a large influx of activities to Strava, which in turn caused a large number of webhook calls.
On my site, I deal with this by pushing all pending webhook calls into my database and having a separate processing job handle them more slowly. That way the thread handling the webhook can respond with HTTP 200 quickly after a few basic checks, without using up too many resources. The processing job is limited to a small number of threads and also checks API limits. That prevents any kind of DDoS-type issue, and it also makes sure that when a large number of calls come in at once, as happened here, I don't use up all my API calls; once the limit is hit, the processing job pauses until the next 15-minute (or 24-hour) interval.
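For anyone who wants to try the same approach, here is a minimal sketch of the idea in Python. It is not my production code: the table name, the `WindowBudget` class, and the limit of 100 calls per window are all assumptions made for illustration, and the window here simply starts when the budget is created rather than being aligned to the clock.

```python
import json
import sqlite3
import time

# Hypothetical values for illustration; use your own app's actual quota.
LIMIT_PER_WINDOW = 100
WINDOW_SECONDS = 15 * 60


def open_queue(path=":memory:"):
    """Open (or create) the table that holds pending webhook events."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return db


def enqueue(db, event):
    """Webhook side: store the event and return the status code right away."""
    db.execute("INSERT INTO pending (payload) VALUES (?)", (json.dumps(event),))
    db.commit()
    return 200


class WindowBudget:
    """Allows at most `limit` API calls per `window` seconds."""

    def __init__(self, limit=LIMIT_PER_WINDOW, window=WINDOW_SECONDS, clock=time.time):
        self.limit, self.window, self.clock = limit, window, clock
        self.window_start = clock()
        self.used = 0

    def try_spend(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has begun, so the budget resets.
            self.window_start, self.used = now, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def drain(db, budget, handle):
    """Processing job: handle events oldest first until the queue is empty
    or the budget runs out. Returns the number of events processed."""
    done = 0
    while True:
        row = db.execute(
            "SELECT id, payload FROM pending ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None or not budget.try_spend():
            return done
        handle(json.loads(row[1]))
        db.execute("DELETE FROM pending WHERE id = ?", (row[0],))
        db.commit()
        done += 1
```

The processing job would call `drain` on a timer; when it returns early because the budget is spent, the remaining rows just wait in the table until the next window.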
Thank you for the quick response. I have implemented something like you proposed as a ‘firewall’.
It doesn’t seem like a Strava issue, but it probably affects them too. I had never seen this before, and since we normally get 1 call per minute, a sustained 50 events per second for maybe 20 minutes seemed very strange. It is probably some kind of third-party integration that is updating activities incorrectly.
@jllave I saw the same thing happen around the same time. I think it’s possible the issue was related to Garmin’s outage, but I also think Strava was at fault, because I looked at the events and there were many duplicates.
I have a generic endpoint that responds quickly with a 200 and then passes the event to another function to do the processing.
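Roughly, that pattern looks like the sketch below. This is a simplified in-process version using a background thread; `process` is a placeholder for the real work, and the names are made up for the example rather than taken from my actual code.

```python
import queue
import threading

events = queue.Queue()
processed = []  # stands in for whatever the real processing produces


def process(event):
    """Placeholder for the slow part (API calls, database writes, ...)."""
    processed.append(event)


def worker():
    """Runs in the background and handles events one at a time."""
    while True:
        event = events.get()
        try:
            process(event)
        finally:
            events.task_done()


def handle_webhook(event):
    """Endpoint side: hand the event off and answer immediately."""
    events.put(event)
    return 200


threading.Thread(target=worker, daemon=True).start()
```

Note that an in-memory queue like this loses any unprocessed events if the process restarts, which is one reason to persist them instead.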
I haven’t implemented a queuing solution that ignores duplicates but I might.
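If I do, it would probably look something like this sketch. It collapses pending events that share the same `object_type`, `object_id`, and `aspect_type` into one entry and merges their `updates` fields. The `CoalescingQueue` class is hypothetical, and the approach assumes the processing step only needs the combined changes for an activity, not every individual event.

```python
from collections import OrderedDict


class CoalescingQueue:
    """FIFO queue that keeps at most one pending event per key."""

    def __init__(self):
        self._pending = OrderedDict()

    @staticmethod
    def key(event):
        return (event["object_type"], event["object_id"], event["aspect_type"])

    def push(self, event):
        """Add an event. Returns True if it was new, False if it was merged
        into an event that is already waiting."""
        k = self.key(event)
        old = self._pending.get(k)
        if old is None:
            self._pending[k] = dict(event)
            return True
        merged = dict(event)
        # Combine the changed fields; newer values win on conflict.
        merged["updates"] = {**old.get("updates", {}), **event.get("updates", {})}
        # Reassigning an existing key keeps its original queue position.
        self._pending[k] = merged
        return False

    def pop(self):
        """Return the oldest pending event, or None if the queue is empty."""
        if not self._pending:
            return None
        _, event = self._pending.popitem(last=False)
        return event

    def __len__(self):
        return len(self._pending)
```

With something like this, a burst of title and type updates for one activity would result in a single processing pass instead of several.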
@renschler They may not actually be duplicates, though I can’t say for certain in your case. A lot of people have multiple apps connected, and Strava will send a webhook for changes to activity title, description, type, and privacy. In addition, as noted on the webhook page, some of those updates are done asynchronously, so you’ll get multiple webhooks for a ‘single’ change (e.g. changing activity type + title in one edit may result in 2 separate ‘update’ webhook events). My guess would be that once Garmin came back up and sent out activities to Strava, multiple apps received webhooks, made updates, and triggered additional webhook update events, which were then broadcast out, causing a massive burst of webhook calls as everything caught back up.