Filename: 276-lower-bw-granularity.txt
Title: Report bandwidth with lower granularity in consensus documents
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Dead
Target: 0.3.1.x-alpha
[NOTE: We're calling this proposal dead for now: the benefits are small
compared to the possible loss in routing correctness. If/when proposal
300 is built, it will have even less benefit. (2020 July 31)]
1. Overview
This document proposes that, in order to limit the bandwidth needed for
networkstatus diffs, we lower the granularity with which bandwidth is
reported in consensus documents.
Making this change will reduce the total compressed ed diff download
volume by around 10%.
2. Motivation
Consensus documents currently report bandwidth values as the median
of the measured bandwidth values in the votes. (Or as the median of
all votes' values if there are not enough measurements.) And when
voting, in turn, authorities simply report whatever measured value
they most recently encountered, clipped to 3 significant base-10
figures.
This means that, from one consensus to the next, these weights change
very often and with little significance: a large fraction of bandwidth
transitions are under 2% in magnitude.
As we begin to use consensus diffs, each change will take space to
transmit. So lowering the number of changes will lower client
bandwidth requirements significantly.
3. Proposal
I propose that we round the bandwidth values, as they are placed in votes,
to no more than two significant digits. In addition, for
values beginning with decimal "2" through "4", we should round the
first two digits to the nearest multiple of 2. For values beginning
with decimal "5" through "9", we should round the first two digits to
the nearest multiple of 5.
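The rounding rule above can be sketched as follows. This is only an
illustration, not code from any Tor release: the function name
round_bandwidth is hypothetical, ties are assumed to round upward, the
leading digit is taken from the value before rounding, and values below
10 are assumed to pass through unchanged. The proposal does not specify
any of these details.

```python
def round_bandwidth(value: int) -> int:
    """Round a non-negative bandwidth value as sketched in section 3.

    Assumptions (not stated in the proposal): ties round upward, and
    values with fewer than two digits are returned unchanged.
    """
    if value < 10:
        return value

    digits = str(value)
    leading = int(digits[0])
    # Place value of the second significant digit.
    scale = 10 ** (len(digits) - 2)

    # Step applied to the first two digits, chosen by the leading digit.
    if leading == 1:
        step = 1   # plain two-significant-digit rounding
    elif leading <= 4:
        step = 2   # "2" through "4": nearest multiple of 2
    else:
        step = 5   # "5" through "9": nearest multiple of 5

    unit = step * scale
    # Integer round-half-up to the nearest multiple of unit.
    return ((value + unit // 2) // unit) * unit


if __name__ == "__main__":
    for v in (7, 123, 234, 2350, 570, 5800, 999):
        print(v, "->", round_bandwidth(v))
```

Note that a value can round up into the next band (for example, 999
becomes 1000); the sketch does not re-round in that case, since the
result is already a valid output of the rule.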
The change will take effect progressively as authorities upgrade: since
the median value is used, when one authority upgrades, 1/5 of the
bandwidths will be rounded (on average).
Once all authorities upgrade, all bandwidths will be rounded like this.
4. Analysis
The rounding proposed above will not round any value by more than 5% more
than current rounding, so the overall impact on bandwidth balancing should
be small.
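As a rough check on the size of the rounding error, the worst case in
each band can be computed directly: the largest absolute error is half
the step, and it is largest in relative terms at the smallest two-digit
prefix of the band. The band table below is my reading of section 3
(a step of 1 for a leading "1" is an assumption), and the names are
illustrative only.

```python
from fractions import Fraction

# (smallest two-digit prefix in the band, step applied to the prefix)
BANDS = [
    (10, 1),  # leading "1": two significant digits (assumed step of 1)
    (20, 2),  # leading "2" through "4": multiples of 2
    (50, 5),  # leading "5" through "9": multiples of 5
]


def worst_case_relative_error(low: int, step: int) -> Fraction:
    """Upper bound on relative error: half a step over the smallest prefix."""
    return Fraction(step, 2 * low)


if __name__ == "__main__":
    for low, step in BANDS:
        bound = worst_case_relative_error(low, step)
        print(f"prefix >= {low}, step {step}: at most {float(bound):.1%}")
```

Under these assumptions every band has the same bound of 5%, which
suggests the steps were chosen to keep the relative error roughly
uniform across leading digits.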
In order to assess the bandwidth savings of this approach, I
smoothed the January 2017 consensus documents' Bandwidth fields,
using scripts from [1]. I found that if clients download
consensus diffs once an hour, they can expect 11-13% mean savings
after xz or gz compression. For two-hour intervals, the savings
are 8-10%; for three-hour or four-hour intervals, the savings are
only 6-8%. After that point, we start seeing diminishing returns,
with only 1-2% savings on a 72-hour interval's diff.
[1] https://github.com/nmathewson/consensus-diff-analysis
5. Open questions:
Is there a greedier smoothing algorithm that would produce better
results?
Is there any reason to think this amount of smoothing would not
be safe?
Would a time-aware smoothing mechanism work better?