Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
Implemented-In: 0.2.1.x
1. Overview.
Tor has many steps to bootstrapping directory information and
initial circuits, but from the controller's perspective we just have
a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
slow connections or with connectivity problems can wait a long time
staring at the yellow onion, wondering if it will ever change color.
This proposal describes a new client status event so Tor can give
more details to the controller. Section 2 describes the changes to the
controller protocol; Section 3 describes Tor's internal bootstrapping
phases when everything is going correctly; Section 4 describes when
Tor detects a problem and issues a bootstrap warning; Section 5 covers
suggestions for how controllers should display the results.
2. Controller event syntax.
The generic status event is:
"650" SP StatusType SP StatusSeverity SP StatusAction
[SP StatusArguments] CRLF
So in this case we send
650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
PROGRESS=num TAG=Keyword SUMMARY=String \
[WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]
The arguments MAY appear in any order. Controllers MUST accept unrecognized
arguments.
"Progress" gives a number between 0 and 100 for how far through
the bootstrapping process we are. "Summary" is a string that can be
displayed to the user to describe the *next* task that Tor will tackle,
i.e., the task it is working on after sending the status event. "Tag"
is an optional string that controllers can use to recognize bootstrap
phases from Section 3, if they want to do something smarter than just
blindly displaying the summary string.
The severity describes whether this is a normal bootstrap phase
(severity notice) or an indication of a bootstrapping problem
(severity warn). If severity warn, it should also include a "warning"
argument string with any hints Tor has to offer about why it's having
troubles bootstrapping, a "reason" string that lists one of the reasons
allowed in the ORConn event, a "count" number that tells how many
bootstrap problems there have been so far at this phase, and a
"recommendation" keyword to indicate how the controller ought to react.
3. The bootstrap phases.
This section describes the various phases currently reported by
Tor. Controllers should not assume that the percentages and tags listed
here will continue to match up, or even that the tags will stay in
the same order. Some phases might also be skipped (not reported) if the
associated bootstrap step is already complete, or if the phase no longer
is necessary. Only "starting" and "done" are guaranteed to exist in all
future versions.
Current Tor versions enter these phases in order, monotonically;
future Tors MAY revisit earlier stages.
Phase 0:
tag=starting summary="starting"
Tor starts out in this phase.
Phase 5:
tag=conn_dir summary="Connecting to directory mirror"
Tor sends this event as soon as Tor has chosen a directory mirror ---
one of the authorities if bootstrapping for the first time or after
a long downtime, or one of the relays listed in its cached directory
information otherwise.
Tor will stay at this phase until it has successfully established
a TCP connection with some directory mirror. Problems in this phase
generally happen because Tor doesn't have a network connection, or
because the local firewall is dropping SYN packets.
Phase 10
tag=handshake_dir summary="Finishing handshake with directory mirror"
This event occurs when Tor establishes a TCP connection with a relay used
as a directory mirror (or its https proxy if it's using one). Tor remains
in this phase until the TLS handshake with the relay is finished.
Problems in this phase generally happen because Tor's firewall is
doing more sophisticated MITM attacks on it, or doing packet-level
keyword recognition of Tor's handshake.
Phase 15:
tag=onehop_create summary="Establishing one-hop circuit for dir info"
Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
to establish a one-hop circuit for retrieving directory information.
It will remain in this phase until it receives the CREATED_FAST cell
back, indicating that the circuit is ready.
Phase 20:
tag=requesting_status summary="Asking for networkstatus consensus"
Once we've finished our one-hop circuit, we will start a new stream
for fetching the networkstatus consensus. We'll stay in this phase
until we get the 'connected' relay cell back, indicating that we've
established a directory connection.
Phase 25:
tag=loading_status summary="Loading networkstatus consensus"
Once we've established a directory connection, we will start fetching
the networkstatus consensus document. This could take a while; this
phase is a good opportunity for using the "progress" keyword to indicate
partial progress.
This phase could stall if the directory mirror we picked doesn't
have a copy of the networkstatus consensus so we have to ask another,
or it does give us a copy but we don't find it valid.
Phase 40:
tag=loading_keys summary="Loading authority key certs"
Sometimes when we've finished loading the networkstatus consensus,
we find that we don't have all the authority key certificates for the
keys that signed the consensus. At that point we put the consensus we
fetched on hold and fetch the keys so we can verify the signatures.
Phase 45
tag=requesting_descriptors summary="Asking for relay descriptors"
Once we have a valid networkstatus consensus and we've checked all
its signatures, we start asking for relay descriptors. We stay in this
phase until we have received a 'connected' relay cell in response to
a request for descriptors.
Phase 50:
tag=loading_descriptors summary="Loading relay descriptors"
We will ask for relay descriptors from several different locations,
so this step will probably make up the bulk of the bootstrapping,
especially for users with slow connections. We stay in this phase until
we have descriptors for at least 1/4 of the usable relays listed in
the networkstatus consensus. This phase is also a good opportunity to
use the "progress" keyword to indicate partial steps.
Phase 80:
tag=conn_or summary="Connecting to entry guard"
Once we have a valid consensus and enough relay descriptors, we choose
some entry guards and start trying to build some circuits. This step
is similar to the "conn_dir" phase above; the only difference is
the context.
If a Tor starts with enough recent cached directory information,
its first bootstrap status event will be for the conn_or phase.
Phase 85:
tag=handshake_or summary="Finishing handshake with entry guard"
This phase is similar to the "handshake_dir" phase, but it gets reached
if we finish a TCP connection to a Tor relay and we have already reached
the "conn_or" phase. We'll stay in this phase until we complete a TLS
handshake with a Tor relay.
Phase 90:
tag=circuit_create "Establishing circuits"
Once we've finished our TLS handshake with an entry guard, we will
set about trying to make some 3-hop circuits in case we need them soon.
Phase 100:
tag=done summary="Done"
A full 3-hop circuit has been established. Tor is ready to handle
application connections now.
4. Bootstrap problem events.
When an OR Conn fails, we send a "bootstrap problem" status event, which
is like the standard bootstrap status event except with severity warn.
We include the same progress, tag, and summary values as we would for
a normal bootstrap event, but we also include "warning", "reason",
"count", and "recommendation" key/value combos.
The "reason" values are long-term-stable controller-facing tags to
identify particular issues in a bootstrapping step. The warning
strings, on the other hand, are human-readable. Controllers SHOULD
NOT rely on the format of any warning string. Currently the possible
values for "recommendation" are either "ignore" or "warn" -- if ignore,
the controller can accumulate the string in a pile of problems to show
the user if the user asks; if warn, the controller should alert the
user that Tor is pretty sure there's a bootstrapping problem.
Currently Tor uses recommendation=ignore for the first nine bootstrap
problem reports for a given phase, and then uses recommendation=warn
for subsequent problems at that phase. Hopefully this is a good
balance between tolerating occasional errors and reporting serious
problems quickly.
5. Suggested controller behavior.
Controllers should start out with a yellow onion or the equivalent
("starting"), and then watch for either a bootstrap status event
(meaning the Tor they're using is sufficiently new to produce them,
and they should load up the progress bar or whatever they plan to use
to indicate progress) or a circuit_established status event (meaning
bootstrapping is finished).
In addition to a progress bar in the display, controllers should also
have some way to indicate progress even when no controller window is
open. For example, folks using Tor Browser Bundle in hostile Internet
cafes don't want a big splashy screen up. One way to let the user keep
informed of progress in a more subtle way is to change the task tray
icon and/or tooltip string as more bootstrap events come in.
Controllers should also have some mechanism to alert their user when
bootstrapping problems are reported. Perhaps we should gather a set of
help texts and the controller can send the user to the right anchor in a
"bootstrapping problems" page in the controller's help subsystem?
6. Getting up to speed when the controller connects.
There's a new "GETINFO /status/bootstrap-phase" option, which returns
the most recent bootstrap phase status event sent. Specifically,
it returns a string starting with either "NOTICE BOOTSTRAP ..." or
"WARN BOOTSTRAP ...".
Controllers should use this getinfo when they connect or attach to
Tor to learn its current state.