Filename: 213-remove-stream-sendmes.txt
Title: Remove stream-level sendmes from the design
Author: Roger Dingledine
Created: 4-Nov-2012
Status: Dead
1. Motivation
Tor uses circuit-level sendme cells to handle congestion / flow
fairness at the circuit level, but it has a second stream-level
flow/congestion/fairness layer under that to share a given circuit
between multiple streams.
The circuit-level flow control, or something like it, is needed
because different users are competing for the same resources. But the
stream-level flow control has a different threat model, since all the
streams belong to the same user.
When the circuit has only one active stream, the downsides are a)
that we waste 2% of our bandwidth sending stream-level sendmes, and b)
because of the circuit-level and stream-level window parameters we
picked, we end up sending only half the cells we might otherwise send.
When the circuit has two active streams, they each get to send 500
cells for their window, because the circuit window is 1000. We still
spend the 2% overhead.
When the circuit has three or more active streams, they're all typically
limited by the circuit window, since the stream-level window won't
kick in. We still spend the 2% overhead though. And depending on their
sending pattern, we could experience cases where a given stream might
be able to send more data on the circuit, but it chooses not to because
its stream-level window is empty.
More generally, we don't have a good handle on the interactions between
all the layers of congestion control in Tor. It would behoove us to
simplify in the case where we're not clear on what it buys us.
2. Design
We should strip all aspects of this stream-level flow control from
the Tor design and code.
2.1. But doesn't having a lower stream window than circuit window save
room for new streams?
It could be that a feature of the stream window is that there's always
space in the circuit window for another begin cell, so new streams
will open faster than otherwise. But first, if there are two or more
active streams going, there won't be any extra space. Second, since
begin cells are client-to-exit, and typical circuits don't fill their
outbound circuit windows very often anyway, and also since we're hoping
to move to a world where we isolate more activities between circuits,
I'm not inclined to worry much about losing this maybe-feature.
See also proposal 168, "reduce default circuit window" -- it's
interesting to note that proposal 168 was unknowingly dabbling in
exactly this question, since reducing the default circuit window to
500 or less made stream windows moot. It might be worth resurrecting
the proposal 168 experiments once this proposal is implemented.
2.2. If we dump stream windows, we're effectively doubling them.
Right now the circuit window starts at 1000, and the stream window
starts at 500. So if we just rip out stream windows, we'll effectively
change the stream window default to 1000, doubling the amount of data
in flight and potentially clogging up the network more.
We could either live with that, or we could change the default circuit
window to 500 (which is easy to do even in a backward compatible way,
since the edge connection can simply choose to not send as many cells).
3. Evaluation
It would be wise to have some plan for making sure we didn't screw
up the network too much with this change. The main trouble there is
that torperf et al only do one stream at a time, so we really have no
good baseline, or measurement tools, to capture network performance
for multiple parallel streams.
Maybe we should resolve task 7168 before the transition, so we're
more prepared.
4. Transition
Option one is to do a two-phase transition. In the first phase,
edges stop enforcing the deliver window (i.e. stop closing circuits
when the stream deliver goes negative, but otherwise they send and
receive stream-level sendmes as now). In the second phase (once all
old versions are gone), we can start disobeying the deliver window,
and also stop sending stream-level sendmes back.
That approach takes a while before it will matter. As an optimization,
since clients can know which relay versions support the new behavior,
we could have relays interpret violating the deliver window as signaling
support for removed stream-level sendmes: the relay would then stop
sending or expecting sendmes. That optimization is somewhat klunky
though, first because web-browsing clients don't generally finish out
a stream window in the upstream direction (so the klunky trick will
probably never happen by accident), and second because if we lower
the circuit window to 500 (see Sec 2.2), there's now no way to violate
stream deliver windows.
Option two is to introduce another relay cell type, which the client
sends before opening any streams to let the other side know that
it shouldn't use or expect stream-level sendmes. A variation here
is to extend either the create cell or the begin cell (ha -- and they
thought I was crazy when I included the explicit \0 at the end of the
current begin cell payload), so we can specify our circuit preferences
without any extra overhead.
Option three is to wait until we switch to a new circuit protocol
(e.g. when we move to ntor or ace), and use that as the signal to
drop stream-level sendmes from the design. And hey, if we're lucky,
by then we'll have sorted out the n23 questions (see ticket 4506)
and we might be dumping circuit-level sendmes at that point too.
Options two or three seem way better than option one.
And since it's not super-urgent, I suggest we hold off on option two
to see if option three makes sense.
5. Discussion
Based on feedback from Andreas Krey on tor-dev, I believe this proposal
is flawed, and should likely move to Status: Dead.
Looking at it from the exit relay's perspective (which is where it matters
most, since most use of Tor is sending a little bit and receiving a lot):
when a create cell shows up to establish a circuit, that circuit is
allowed to send back at most 1000 cells. When a begin relay cell shows
up to ask that circuit to open a new stream, that stream is allowed to
send back at most 500 cells.
Whenever the Tor client has received 100 cells on that circuit, she
immediately sends a circuit-level sendme back towards the exit, to let
it know to increment its "number of cells it's allowed to send on the
circuit" by 100.
However, a stream-level sendme is only sent when both a) the Tor client
has received 50 cells on a particular stream, *and* b) the application
that initiated the stream is willing to accept more data.
If we ripped out stream-level sendmes, then as you say, we'd have to
choose between "queue all the data for the stream, no matter how big it
gets" and "tell the whole circuit to shut up".
I believe you have just poked a hole in the n23 ("defenestrator") design
as well: http://freehaven.net/anonbib/#pets2011-defenestrator
since it lacks any stream-level pushback for streams that are blocking
on writes. Nicely done!