TobiasHermann
Pico de Orizaba

The following article restates the problem with how segment effort times are currently measured, suggests a solution, and provides an example implementation: Accurate timing of Strava segments

16 Comments
Status changed to: Open To Voting
Soren
Denali

Thanks for submitting your idea to improve segment effort precision. It has been reviewed by our moderation team and is now open to voting.

Jan_Mantau
Superuser

This approach could probably improve things in some cases, but making it work would be a nightmare. The article simply assumes that all GPS points of the rider and of the segment lie on a straight line, but reality is much fuzzier. Many segments start and end beside the road because of poor reception, and the GPS points of the activities have the same problem. Therefore you will never really know whether the GPS point recorded nearest to the start of a segment is lucky, unlucky, or even spot on. Bringing in other GPS points that are even further away can reduce the accuracy in many cases.

TobiasHermann
Pico de Orizaba

@Jan_Mantau 

Hi, and thanks for the feedback.

The proposed algorithm neither assumes (nor requires) that the recorded GPS points lie on a straight line, nor that the activity path passes exactly through the segment start/end:
- current/old approach: Use the recorded activity point closest to the segment start/end point.
- proposed/new approach: On the polygonal chain defined by the recorded activity points, use the (potentially interpolated) point that is closest to the segment start/end point.

The only approximation here is that the parts of the polygonal chain between two recorded points are linearly interpolated, while in reality the athlete probably moved somewhat non-linearly between them. Still, it should be much better than no interpolation at all.
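A minimal sketch of the proposed approach, not the article's actual implementation: it assumes the GPS points have already been projected into a local planar frame in metres (real latitude/longitude data would need such a projection first), and the names `Sample` and `crossing_time` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    x: float  # metres east in a local planar frame (assumption)
    y: float  # metres north in the same frame
    t: float  # seconds since the start of the activity


def crossing_time(track, px, py):
    """Return (time, distance) of the point on the polygonal chain
    through `track` that is closest to the target point (px, py)."""
    first = track[0]
    best_t = first.t
    best_d2 = (first.x - px) ** 2 + (first.y - py) ** 2
    for a, b in zip(track, track[1:]):
        dx, dy = b.x - a.x, b.y - a.y
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0.0:
            f = 0.0  # duplicate point, nothing to interpolate
        else:
            # Projection of the target onto the line piece, clamped so the
            # result never leaves the piece between the two recorded points.
            f = ((px - a.x) * dx + (py - a.y) * dy) / seg_len2
            f = max(0.0, min(1.0, f))
        cx, cy = a.x + f * dx, a.y + f * dy
        d2 = (cx - px) ** 2 + (cy - py) ** 2
        if d2 < best_d2:
            best_d2 = d2
            best_t = a.t + f * (b.t - a.t)  # linear time interpolation
    return best_t, best_d2 ** 0.5


track = [Sample(0, 0, 0), Sample(10, 0, 5), Sample(20, 0, 10)]
t, dist = crossing_time(track, 4.0, 3.0)
print(t, dist)
```

In this toy track the recorded point nearest to (4, 3) is the sample at t = 0, whereas the closest point on the chain lies at t = 2.0, 3 m from the target.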

(Analogy: Let's say you measure the height of a tree on every first of March (recorded activity GPS points in the analogy). 2022-03-01 it was 30 m tall. 2023-03-01 it was 36 m. Now you're interested in how tall it probably was on 2022-07-01 (segment start point in the analogy). The currently implemented approach would just take the nearest neighbor measurement (2022-03-01) and say "On 2022-07-01 the tree was likely 30 m tall.", while the new approach would say "On 2022-07-01 the tree was likely 32 m tall.", which is better.)
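The arithmetic behind the analogy can be checked with a few lines. This is only an illustration of linear interpolation over dates; the function name `interpolate_height` is invented here.

```python
from datetime import date


def interpolate_height(d0, h0, d1, h1, d):
    """Linearly interpolate a height for date d between two measurements."""
    frac = (d - d0).days / (d1 - d0).days
    return h0 + frac * (h1 - h0)


h = interpolate_height(date(2022, 3, 1), 30.0, date(2023, 3, 1), 36.0, date(2022, 7, 1))
print(round(h, 1))
```

2022-07-01 lies 122 of 365 days into the interval, so the interpolated height is roughly 30 m + (122/365) · 6 m ≈ 32 m, whereas the nearest-neighbor answer stays at 30 m.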

Could you please provide an example of a segment and an activity that you think would break the new approach, so I can test the implementation with it?

Jan_Mantau
Superuser

Hi Tobias, I'm sure that in many or even most cases the interpolation brings the matched section of the activity nearer to the segment start and end points than just using the recorded GPS points. But imagine that the nearest GPS point used by Strava was in reality spot on the segment start or end, but due to measurement inaccuracy the recorded value was nowhere near it. Any attempt to use another (calculated) GPS point here would make a perfect activity-to-segment match worse.

Maybe this solution should be used anyway (at least for future activities, to avoid recalculating decades of past activities), but I only want to point out that it's not better in every case.

TobiasHermann
Pico de Orizaba

@Jan_Mantau 

If a recorded GPS point of the activity is exactly on the segment start/end (up to floating-point-number precision) or the closest point (on the polygonal chain) overall, the new approach will choose exactly this point and not do any interpolation.

The calculated virtual point is always either an exact original point or the interpolation between two adjacent points. More than two points are never used, so additional points cannot dilute the result.

Jan_Mantau
Superuser

@TobiasHermann 

>If a recorded GPS point of the activity is exactly on the segment start/end (up to floating-point-number precision) or the closest point (on the polygonal chain) overall, the new approach will choose exactly this point and not do any interpolation.

The interpolation WILL use another point if the recording was imprecise and a calculated point is nearer than a recorded point. It will do that even if the nearest recorded point was actually perfect and only seemed to be further away than the interpolated point.

TobiasHermann
Pico de Orizaba

@Jan_Mantau What do you mean by "the nearest recorded point was actually perfect"?

TobiasHermann
Pico de Orizaba

There are two main factors that currently make segment times inaccurate:
- A: low sampling frequency in a stream of recorded GPS points
- B: imperfect accuracy of the position of single GPS points

The proposed solution is only intended to solve problem A, not problem B.

TobiasHermann
Pico de Orizaba

@Jan_Mantau 

After having thought about it a bit more (during a run 😀), I think I now understand what you mean:

Assuming the athlete actually physically recorded a GPS point at exactly the segment start/end, but the coordinates were somewhat inaccurate (problem "B" from above), it would actually be more correct to use this original point instead of an interpolated one.

So, yes, I agree. I too think you have correctly identified a situation in which the old approach would give a more correct result than the new one. ✔️

On average (and in most cases), however, I'm still convinced that the new approach should be much better than the old one.
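This situation can be reproduced with a one-dimensional toy example. All numbers are invented: the athlete truly crosses the segment start (position 0 m) at t = 5 s, exactly when a sample is recorded, but that sample's recorded position is off by 8 m. The helper names are hypothetical as well.

```python
def nearest_sample_time(samples, target):
    """Old approach: time of the recorded sample closest to the target."""
    return min(samples, key=lambda s: abs(s[1] - target))[0]


def interpolated_time(samples, target):
    """New approach: time of the closest point on the chain of samples."""
    best = None
    for (t0, x0), (t1, x1) in zip(samples, samples[1:]):
        if x1 == x0:
            f = 0.0
        else:
            f = max(0.0, min(1.0, (target - x0) / (x1 - x0)))
        x = x0 + f * (x1 - x0)
        candidate = (abs(x - target), t0 + f * (t1 - t0))
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best[1]


# (time in s, recorded position in m); the middle sample was truly at 0 m.
recorded = [(0.0, -25.0), (5.0, -8.0), (10.0, 25.0)]
true_crossing = 5.0

old = nearest_sample_time(recorded, 0.0)
new = interpolated_time(recorded, 0.0)
print(old - true_crossing, new - true_crossing)
```

Here the old approach happens to hit the true crossing time, while the interpolated time is about 1.2 s late, because the interpolation trusts the inaccurate recorded position (problem B).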

Thanks a lot for this very good discussion. ❤️ Based on it, I've improved the explanations in my article. ✍️

Silentvoyager
Superuser

An important reason to use interpolation is not only to improve the accuracy of the match but also to improve the quality of the match. With the current algorithm there are many false positives, where segments get matched although the person hasn't truly completed the segment. Most often I see that on uphill running segments. When running it is too easy to turn around before reaching a segment finish. The current algorithm uses a fairly large matching radius because it has to accommodate devices that record points less often or devices with smart recording (Garmin uses smart recording by default). From what I have read and observed, the matching radius is rather large: 50-75 meters. That means that on an uphill someone can easily shave up to 30 seconds by turning around early and still have their segment matched. That is how some people have gotten their KOMs. Similarly, it is quite easy to start a segment late without going through the start.

Interpolation would make it possible to tighten the matching radius and better detect incomplete segment attempts, which would help keep leaderboards fairer.
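As a rough back-of-the-envelope check of the numbers above: the time saved by stopping one matching radius short of the finish is simply radius divided by speed. The 2.5 m/s uphill running speed is an assumed value for illustration, and the radii are the observed estimates from this post, not confirmed Strava parameters.

```python
def seconds_saved(matching_radius_m, speed_m_per_s):
    """Time not spent covering the last `matching_radius_m` metres."""
    return matching_radius_m / speed_m_per_s


for radius in (50.0, 75.0):
    print(radius, seconds_saved(radius, 2.5))
```

At that assumed speed, 50 m corresponds to 20 s and 75 m to 30 s, which is consistent with the "up to 30 seconds" estimate.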

TobiasHermann
Pico de Orizaba

@Silentvoyager Very good point. Thanks!

AdamVD
Pico de Orizaba

I'd suggest leaving accuracy as it is for most segments but enforcing some "standard" for the top 10. I put my thoughts on this here: https://communityhub.strava.com/t5/ideas/re-calculate-old-segment-times-so-leaderboards-are-fair/idi...

Basically, because of the algorithm used to calculate speed, it is more common to see older activities at the top of the leaderboard because they typically have less accurate GPS. Very often, I'm looking at the KOM for a segment and it is pre-2018. If I look at the activity, it usually has GPS pings 5+ seconds apart. This makes a huge difference to the accuracy of the speed calculation, giving an unfair advantage to those using less accurate GPS.

I'd say, to be in the top 10, you need a GPS ping at least every 2 seconds. To be in the top 50, every 5. Otherwise, a top-50 time should not count as "valid".
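The rule suggested above could be sketched as a simple check on the largest gap between consecutive GPS timestamps. The function name, the tier labels, and the use of the worst gap (rather than, say, an average) are assumptions made for this illustration.

```python
def leaderboard_tier(timestamps_s):
    """Classify an effort by its largest gap between GPS pings (seconds)."""
    gaps = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    if not gaps:
        return "outside_top50"  # fewer than two pings: nothing to judge
    worst_gap = max(gaps)
    if worst_gap <= 2.0:
        return "top10"          # may appear anywhere on the leaderboard
    if worst_gap <= 5.0:
        return "top50"          # may appear in places 11 to 50 at best
    return "outside_top50"


print(leaderboard_tier([0, 1, 2, 3]))
print(leaderboard_tier([0, 1, 5, 6]))
print(leaderboard_tier([0, 10]))
```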

AdamVD
Pico de Orizaba

For this to work best, Strava needs to use the best GPS data available for segments. If the segment is recorded using "bad" GPS, the start point and finish point will always be inaccurate, regardless of whether the rider is using good or bad GPS. Strava could, for all current segments and for newly created segments, replace the creator's GPS pings with better pings if better pings are available.

For example, if I use bad GPS that pings every 10 seconds and make a segment with that, the leaderboard will always be inaccurate. Strava could, when I create the segment, instead look at a "bank" of "good" GPS data from riders who have been over that route with "good" GPS and use their position data.

TobiasHermann
Pico de Orizaba

@AdamVD 

While it would be nice to have a GPS point each second (or every 2) for leaderboard efforts, enforcing this would
- make the user experience more complicated, i.e., many users would need to change the settings in their recording devices, e.g., switching from "Smart Recording" to "Every Second Recording" on Garmin devices.
- completely remove many old efforts with low GPS-recording frequency from the leaderboards, resulting in potential frustration for many users.
- still leave an average error of 1 second.

So I think it would be better not to enforce such a GPS-point frequency, but instead to replace the current (nearest-neighbor-based) algorithm with one using interpolation (as described in the article (with source code) in my original post).

This would:
- keep the user experience simple, i.e., the users would not need to change the settings on their devices.
- preserve old efforts on the leaderboards. (The Strava product team would have the option to decide to re-calculate the old times, i.e., increase the accuracy.)
- be more accurate thanks to the sub-second-capable calculations. This works reasonably well even for GPS points with a few seconds of gap in between, due to the "virtual GPS points" from the interpolation.
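To illustrate the last point, here is a small idealised comparison. It assumes an athlete moving at constant speed along a straight line with noise-free positions, so it only covers the sampling-frequency effect and not GPS position errors; the function name and all numbers are invented for this sketch.

```python
def timing_errors(dt, speed=5.0, steps=100):
    """Mean absolute error of one crossing time for sampling interval dt.

    Returns (nearest-sample error, interpolated error) in seconds,
    averaged over `steps` evenly spread true crossing times.
    """
    sample_times = [k * dt for k in range(int(200 / dt) + 2)]
    err_nearest = err_interp = 0.0
    for i in range(steps):
        true_t = 100.0 + dt * i / steps
        start_x = speed * true_t  # position of the segment start
        # Old approach: recorded sample closest to the segment start.
        nearest = min(sample_times, key=lambda t: abs(speed * t - start_x))
        err_nearest += abs(nearest - true_t)
        # New approach: interpolate between the two bracketing samples.
        k = max(j for j, t in enumerate(sample_times) if speed * t <= start_x)
        t0, t1 = sample_times[k], sample_times[k + 1]
        f = (start_x - speed * t0) / (speed * t1 - speed * t0)
        err_interp += abs(t0 + f * (t1 - t0) - true_t)
    return err_nearest / steps, err_interp / steps


for dt in (1.0, 5.0):
    print(dt, timing_errors(dt))
```

Under these idealised assumptions the nearest-sample error averages a quarter of the sampling interval at each end of the segment, while the interpolated error is essentially zero. With real recordings the gain will be smaller, because speed varies between samples and positions are noisy.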