Your app generates a lot of telemetry. You want to drop redundant data to save money, but you still want live aggregations over that data to be accurate. The answer is sampling: keep enough events for accurate aggregations and drop the rest. But how much can you sample before you can't trust your graphs? It depends on the situation: on your data volume and on the probability distribution of the numbers you're aggregating. This tool can give you a sense for it.
Adjust the volume, sample rate, and distribution parameters below to see real-time simulations of how sampling affects aggregation accuracy.
This was coded with a LOT of AI assistance. We haven't checked the math. The simulations and aggregations seem right.
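For intuition, here's a rough sketch (not this tool's actual code) of what a single simulation does: generate events from some distribution, keep roughly 1 in R of them, and re-weight the kept events when aggregating a sum. The distribution and the `simulateOnce` name are illustrative assumptions.

```ts
// A minimal sketch of one simulation run, under assumed parameters.
function simulateOnce(numEvents: number, sampleRate: number): { trueSum: number; estSum: number } {
  // Generate "telemetry" values from a skewed, log-uniform distribution (purely illustrative).
  const events: number[] = [];
  for (let i = 0; i < numEvents; i++) {
    events.push(Math.exp(Math.random() * 2));
  }

  // Keep roughly 1 in sampleRate events.
  const sampled = events.filter(() => Math.random() < 1 / sampleRate);

  // Re-weight: each kept event stands in for sampleRate dropped ones.
  const estSum = sampled.reduce((a, b) => a + b, 0) * sampleRate;
  const trueSum = events.reduce((a, b) => a + b, 0);
  return { trueSum, estSum };
}
```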
{number of events} events go in
to {number of simulations} simulations
Sampled at a rate of 1:{sample rate}
Saving you {percentage reduction}
About {number of events / sample rate} events come out
Then we aggregate the sampled events.
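Here's a hedged sketch of what that repeated-simulation loop might look like, reusing the hypothetical `simulateOnce` helper above: run many independent simulations, then look at the spread of the relative error between the sampled aggregate and the true one.

```ts
// A sketch of the simulation loop, assuming the simulateOnce() helper above.
function relativeErrors(numSimulations: number, numEvents: number, sampleRate: number): number[] {
  const errors: number[] = [];
  for (let i = 0; i < numSimulations; i++) {
    const { trueSum, estSum } = simulateOnce(numEvents, sampleRate);
    errors.push(Math.abs(estSum - trueSum) / trueSum);
  }
  return errors.sort((a, b) => a - b);
}

// Example: 100 simulations of 1,000,000 events sampled at 1:100.
// If the p95 relative error is small, the graphs are still trustworthy.
const errs = relativeErrors(100, 1_000_000, 100);
console.log("median error:", errs[Math.floor(errs.length * 0.5)]);
console.log("p95 error:", errs[Math.floor(errs.length * 0.95)]);
```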