Filename: 246-merge-hsdir-and-intro.txt
Title: Merging Hidden Service Directories and Introduction Points
Author: John Brooks, George Kadianakis
Created: 2015-07-12
Status: Rejected
Change history:
18-Jan-2016 Changed status to "Needs-Research" after discussion in email
thread [1].
1. Overview and Motivation
This document describes a modification to proposal 224 ("Next-Generation
Hidden Services in Tor"), which simplifies and improves the architecture by
combining hidden service directories and introduction points at the same
relays.
A reader will want to be familiar with the existing hidden service design,
and with the changes in proposal 224. If accepted, this proposal should be
combined with proposal 224 to make a superseding specification.
1.1. Overview
In the existing hidden service design and proposal 224, there are three
distinct steps in building a connection: fetching the descriptor from a
directory, contacting an introduction point listed in the descriptor, and
rendezvous as specified during the introduction. The hidden service
directories are selected algorithmically, and introduction points are
selected at random by the service.
We propose to combine the responsibilities of the introduction point and
hidden service directory. The list of introduction points responsible for a
service will be selected using the algorithm specified for HSDirs [proposal
224, section 2.2.3]. The service builds a long-term introduction circuit to
each of these, identified by its blinded public key. Clients can calculate
the same set of relays, build an introduction circuit, retrieve the
ephemeral keys, and proceed with sending an introduction to the service in
the same way as before.
1.2. Benefits over proposal 224
With this change, client connections become more efficient: they need only
two circuits (for introduction and rendezvous) instead of the three needed
previously, and they contact fewer relays. Clients also no longer cache
descriptors, which substantially simplifies code and removes a common source
of bugs and reliability issues.
Hidden services are able to stay online by simply maintaining their
introduction circuits; there is no longer a need to periodically update
descriptors. This reduces network load and traffic fingerprinting
opportunities for a hidden service.
The number and churn of relays a hidden service depends on is also reduced.
In particular, prior hidden service designs may frequently choose new
introduction points, and each of these has an opportunity to observe the
popularity or connection behavior of clients.
1.3. Other effects on proposal 224
An adversarial introduction point is not significantly more capable than a
hidden service directory under proposal 224. The differences are:
1. The introduction point maintains a long-lived circuit with the service
2. The introduction point can break that circuit and cause the service to
rebuild it
See section 4 ("Discussion") for other impacts and open discussion
questions.
2. Specification
2.1. Picking introduction points for a service
Instead of picking HSDirs, hidden services pick their introduction points
using the same algorithm as defined in proposal 224 section 2.2 [HASHRING].
To be used as an introduction point, a relay must have the Stable flag in
the consensus and an uptime of at least twice the shared random period
defined in proposal 224 section 2.3.
This also specifies the lifetime of introduction points, since they will be
rotated with the change of time period and shared randomness.
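The eligibility rule above can be sketched as a simple filter. This is an
illustration only: the Relay type, its field names, and the 24-hour period
used in the example are assumptions, not taken from proposal 224, and the
hash ring selection itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relay:
    """Hypothetical view of a consensus entry; field names are illustrative."""
    nickname: str
    flags: frozenset
    uptime_seconds: int


def eligible_intro_points(relays, shared_random_period_seconds):
    """Keep relays with the Stable flag and uptime of at least twice the
    shared random period."""
    min_uptime = 2 * shared_random_period_seconds
    return [
        r for r in relays
        if "Stable" in r.flags and r.uptime_seconds >= min_uptime
    ]


# Assumed period length, for illustration only.
PERIOD = 24 * 60 * 60
relays = [
    Relay("a", frozenset({"Stable", "Fast"}), 3 * PERIOD),
    Relay("b", frozenset({"Fast"}), 5 * PERIOD),   # lacks the Stable flag
    Relay("c", frozenset({"Stable"}), PERIOD),     # uptime too short
]
eligible = eligible_intro_points(relays, PERIOD)
```

Only relays that pass this filter would then be placed on the hash ring.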
2.2. Hidden service sets up introduction points
After a hidden service has picked its intro points, it needs to establish
long-term introduction circuits to them and also send them an encrypted
descriptor that should be forwarded to potential clients. The descriptor
contains a service key that should be used by clients to encrypt the
INTRODUCE1 cell that will be sent to the hidden service. The encrypted parts
of the descriptor are encrypted with the symmetric keys specified in prop224
section [ENCRYPTED-DATA].
2.2.1. Hidden service uploads a descriptor
Services post a descriptor by opening a directory stream with BEGIN_DIR, and
sending an HTTP POST request as described in proposal 224, section 2.2.4.
The relay must verify the signatures of the descriptor, and check whether it
is responsible for that blinded public key in the hash ring. Relays should
connect the descriptor to the circuit used to upload it, which will be
repurposed as the service introduction circuit. The descriptor does not need
to be cached by the introduction point after that introduction circuit has
closed.
It is unexpected and invalid to send more than one descriptor on the same
introduction circuit.
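A minimal sketch of the relay-side checks described in this section follows.
The IntroCircuit class, the dictionary-shaped descriptor, and the
verify_signatures / is_responsible_for callables are hypothetical stand-ins
for the real signature and hash ring checks.

```python
class DescriptorRejected(Exception):
    """Raised when an uploaded descriptor must be refused."""


class IntroCircuit:
    """Hypothetical relay-side state for one service circuit."""

    def __init__(self):
        self.descriptor = None


def handle_descriptor_upload(circuit, descriptor,
                             verify_signatures, is_responsible_for):
    # More than one descriptor on the same circuit is invalid.
    if circuit.descriptor is not None:
        raise DescriptorRejected("second descriptor on the same circuit")
    if not verify_signatures(descriptor):
        raise DescriptorRejected("bad signature")
    if not is_responsible_for(descriptor["blinded_key"]):
        raise DescriptorRejected("not responsible for this blinded key")
    # Connect the descriptor to the circuit that uploaded it.
    circuit.descriptor = descriptor


def handle_circuit_closed(circuit):
    # The descriptor need not be kept once the circuit has closed.
    circuit.descriptor = None


circuit = IntroCircuit()
desc = {"blinded_key": "placeholder-blinded-key"}
handle_descriptor_upload(circuit, desc, lambda d: True, lambda k: True)
try:
    handle_descriptor_upload(circuit, desc, lambda d: True, lambda k: True)
    second_rejected = False
except DescriptorRejected:
    second_rejected = True
```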
2.2.2. Descriptor format
The format for the hidden service descriptor is as described in proposal 224
sections 2.4 and 2.5, with the following modifications:
* The "revision-counter" field is removed
* The introduction-point section is removed
* The "auth-key" field is removed
* The "enc-key legacy" field is removed
* The "enc-key ntor" field must be specified exactly once per descriptor
Unlike previous versions, the descriptor does not encode the entire list of
introduction points. The descriptor only contains a key for the particular
introduction point it was sent to.
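The modifications listed above could be checked roughly as follows. This is
a simplified sketch: it assumes the descriptor has already been decrypted
and split into keyword lines, treats the outer and encrypted portions as one
list, and uses placeholder values rather than real descriptor contents.

```python
# Fields that this proposal removes from the proposal 224 format.
REMOVED_PREFIXES = (
    "revision-counter",
    "introduction-point",
    "auth-key",
    "enc-key legacy",
)


def check_merged_descriptor(lines):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for prefix in REMOVED_PREFIXES:
        if any(l == prefix or l.startswith(prefix + " ") for l in lines):
            problems.append(f"unexpected field: {prefix}")
    ntor_count = sum(1 for l in lines if l.startswith("enc-key ntor "))
    if ntor_count != 1:
        problems.append(
            f"expected exactly one enc-key ntor, found {ntor_count}"
        )
    return problems


ok = check_merged_descriptor(["enc-key ntor PLACEHOLDER"])
bad = check_merged_descriptor(
    ["revision-counter 7", "enc-key ntor A", "enc-key ntor B"]
)
```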
2.2.3. ESTABLISH_INTRO cell
When a hidden service is establishing a new introduction point, it sends the
ESTABLISH_INTRO cell, which is formatted as described by proposal 224
section 3.1.1, except for the following:
The AUTH_KEY_TYPE value 02 is changed to:
[02] -- Signing key certificate cross-certified with the blinded key, in
the same format as in the hidden service descriptor.
In this case, SIG is a signature of the cell with the signing key specified
in AUTH_KEY. The relay must verify this signature, as well as the
certification with the blinded key. The relay should also verify that it has
received a valid descriptor with this blinded key.
[XXX: Other options include putting only the blinded key, or only the
signing key in this cell. In either of these cases, we must look up the
descriptor to fully validate the contents, but we require the descriptor
to be present anyway. -special]
[XXX: What happens with the MAINT_INTRO process defined in proposal 224
section 3.1.3? -special]
2.3. Client connection to a service
A client that wants to connect to a hidden service should first calculate
the responsible introduction points for the onion address as described in
section 2.1 above.
The client chooses one introduction point at random, builds a circuit, and
fetches the descriptor. Once it has received, verified, and decrypted the
descriptor, the client can use the same circuit to send the INTRODUCE1 cell.
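The client-side sequence can be sketched as below. The circuit object, its
fetch_descriptor and send_introduce1 methods, and the verify_and_decrypt
callable are hypothetical; the point is only that both the descriptor fetch
and the INTRODUCE1 cell travel over the same circuit.

```python
import random


def connect_via_intro_point(intro_points, build_circuit,
                            verify_and_decrypt, rng=random):
    """Pick one intro point at random, fetch its descriptor, then send
    INTRODUCE1 on the same circuit."""
    relay = rng.choice(intro_points)
    circuit = build_circuit(relay)
    raw = circuit.fetch_descriptor()
    enc_key = verify_and_decrypt(raw)  # expected to raise on failure
    circuit.send_introduce1(enc_key)
    return relay


class FakeCircuit:
    """Stand-in circuit that records what was sent over it."""

    def __init__(self, relay, log):
        self.relay = relay
        self.log = log

    def fetch_descriptor(self):
        self.log.append(("fetch", self.relay))
        return b"descriptor-bytes"

    def send_introduce1(self, enc_key):
        self.log.append(("introduce1", self.relay, enc_key))


log = []
candidates = ["relayA", "relayB", "relayC"]
chosen = connect_via_intro_point(
    candidates,
    build_circuit=lambda relay: FakeCircuit(relay, log),
    verify_and_decrypt=lambda raw: "enc-key-placeholder",
    rng=random.Random(0),
)
```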
2.3.1. Client requests a descriptor
Clients can request a descriptor by opening a directory stream with
BEGIN_DIR, and sending an HTTP GET request as described in proposal 224,
section 2.2.4.
The client must verify the signatures of the descriptor, and decrypt the
encrypted portion to access the "enc-key". This key is used to encrypt the
contents of the INTRODUCE1 cell to the service.
Because the descriptor is specific to each introduction point, client-side
descriptor caching changes significantly. There is little point in caching
these descriptors, because they are inexpensive to request and will always
be available when a service-side introduction circuit is available. A client
that does caching must be prepared to handle INTRODUCE1 failures due to
rotated keys.
2.3.2. Client sends INTRODUCE1
After requesting the descriptor, the client can use the same circuit to send
an INTRODUCE1 cell, which is forwarded to the service and begins the
rendezvous process.
The INTRODUCE1 cell is the same as proposal 224 section 3.2.1, except that
the AUTH_KEYID is the blinded public key, instead of the now-removed
introduction point authentication key.
The relay must permit this circuit to change purpose from the directory
request to a client or server introduction.
3. Other changes to proposal 224
3.1. Removing proposal 224 legacy relay support
Proposal 224 defines a process for using legacy relays as introduction
points; see section 3.1.2 [LEGACY_EST_INTRO], and 3.2.3 [LEGACY-INTRODUCE1].
With the changes to the introduction point in this proposal, it's no longer
possible to maintain support for legacy introduction points.
These sections of proposal 224 are removed, along with other references to
legacy introduction points and RSA introduction point keys. We will need to
handle the migration process to ensure that sufficient relays are available
as introduction points. See the discussion in section 4.1 for more details.
3.2. Removing the "introduction point authentication key"
The "introduction point authentication key" defined in proposal 224 is
removed. The "descriptor signing key" is used to sign descriptors and the
ESTABLISH_INTRO2 cell. Descriptors are unique for each introduction point,
and there is no point in generating a new key used only to sign the
ESTABLISH_INTRO2 cell.
4. Discussion
4.1. No backwards compatibility with legacy relays
By changing the introduction procedure in this way, we are unable to
maintain backwards compatibility. That is, hidden services will be unable to
use old relays as their introduction points, and similarly clients will be
unable to introduce through old relays.
To maintain an adequate anonymity set of intro points, clients and hidden
services should perform this introduction method only after most relays have
upgraded. For this reason we introduce the consensus parameter
HSMergedIntroduction which controls whether hidden services should perform
this merged introduction or fall back to the old one.
[XXX: Do we? This sounds like we have to implement both in the client, which
I thought we wanted to avoid. An alternative is to make sure that the intro
point side is done early enough, and that clients know not to rely on the
security of 224 services until enough relays are upgraded and the
implementation is done. -special]
4.2. Restriction on the number of intro points and impact on load balancing
One drawback of this proposal is that the number of introduction points of a
hidden service is now a constant global parameter. Hence, a hidden service
can no longer adjust how many introduction points it uses, or select the
nodes that will serve as its introduction points.
While this is a limitation, we don't consider it a major drawback, since we
don't believe that introduction points are a significant bottleneck on
hidden service performance.
However, our system significantly impacts the way some load balancing
schemes for hidden services work. For example, onionbalance is a third-party
application that manages the introduction points of a hidden service in a
way that allows traffic load-balancing. This is achieved by compiling a
master descriptor that mixes and matches the introduction points of
underlying hidden service instances.
With our system there are no descriptors that onionbalance can use to mix
and match introduction points. A variant of the onionbalance idea that could
work with our system would involve onionbalance starting a hidden service,
not establishing any intro points, and then ordering the underlying hidden
service load-balancing instances to establish introduction circuits to all
the right introduction points.
4.3. Behavior when introduction points go offline or misbehave
In this new system, it's the Tor network that decides which relays should be
used as the intro points of a hidden service for every time period. This
means that a hidden service is forced to use those relays as intro points
if it wants clients to connect to it.
This brings up the topic of what should happen when the designated relays go
offline or refuse connections. Our behavior here should block guard
discovery attacks (as in #8239) while allowing maximum reachability for
clients.
We should also make sure that an adversary cannot manipulate the hash ring
in such a way that forces us to rotate introduction points quickly. This is
enforced by the uptime check that is necessary for acquiring the HSDir flag
(#8243).
For this reason we propose the following rules:
- After every consensus and when the blinded public key changes as a result
of the time period, hidden services need to recalculate their
introduction points and adjust themselves by establishing intro points to
the new relays.
- When an introduction point goes offline or drops connections, we attempt
to re-establish to it INTRO_RETRIES times per consensus. If the intro
point failed more than INTRO_RETRIES times for a consensus period, we
abandon it and stay with one fewer intro point.
If a new consensus is released and that relay is still listed as online,
then we reset our retry counter and start trying again.
[XXX: Is this crazy? -asn]
[XXX: INTRO_RETRIES = 3? -asn]
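The retry rule above can be sketched as a small per-relay state machine.
INTRO_RETRIES = 3 is only the value floated in the open question above, not
a settled constant, and the class and method names are hypothetical.

```python
INTRO_RETRIES = 3  # tentative value from the open question; an assumption


class IntroPointRetryState:
    """Failure bookkeeping for one designated intro point."""

    def __init__(self):
        self.failures = 0
        self.abandoned = False

    def on_failure(self):
        """Record a failure; return True if another attempt should be made."""
        self.failures += 1
        if self.failures > INTRO_RETRIES:
            # More than INTRO_RETRIES failures in this consensus period:
            # give up and run with one fewer intro point.
            self.abandoned = True
        return not self.abandoned

    def on_new_consensus(self, relay_listed_online):
        """Reset the counter if the relay is still listed as online."""
        if relay_listed_online:
            self.failures = 0
            self.abandoned = False


state = IntroPointRetryState()
retry_decisions = [state.on_failure() for _ in range(4)]
```

Under this reading, the initial failure is followed by up to INTRO_RETRIES
re-establishment attempts before the relay is abandoned until the next
consensus.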
4.4. Defining constants; how many introduction points for a service?
We keep the same intro point configuration as in proposal 224. That is, each
hidden service uses 6 relays and keeps them for a whole time period.
[XXX: Are these good constants? We don't have a chance to change them
in the future!! -asn]
[XXX: 224 makes them consensus parameters, which we can keep, but they
can still only be changed on a network-wide basis. -special]
References:
[1] : https://lists.torproject.org/pipermail/tor-dev/2016-January/010203.html