Opening streams and transmitting data
Opening a new stream: The begin/connected handshake
To open a new anonymized TCP connection, the OP chooses an open circuit to an exit that may be able to connect to the destination address, selects an arbitrary StreamID not yet used on that circuit, and constructs a RELAY_BEGIN message with a body encoding the address and port of the destination host. The body format is:
ADDRPORT [nul-terminated string]
FLAGS [4 bytes, optional]
ADDRPORT is made of ADDRESS | ':' | PORT | [00]
where ADDRESS can be a DNS hostname, or an IPv4 address in dotted-quad format, or an IPv6 address surrounded by square brackets; and where PORT is a decimal integer between 1 and 65535, inclusive.
The ADDRPORT string SHOULD be sent in lower case, to avoid fingerprinting. Implementations MUST accept strings in any case.
The FLAGS value has one or more of the following bits set, where "bit 1" is the LSB of the 32-bit value, and "bit 32" is the MSB. (Remember that all integers in Tor are big-endian, so the MSB of a 4-byte value is the MSB of the first byte, and the LSB of a 4-byte value is the LSB of its last byte.)
If FLAGS is absent, its value is 0. Whenever 0 would be sent for FLAGS, FLAGS is omitted from the message body.
    bit     meaning
    1    -- IPv6 okay. We support learning about IPv6 addresses and
            connecting to IPv6 addresses.
    2    -- IPv4 not okay. We don't want to learn about IPv4 addresses
            or connect to them.
    3    -- IPv6 preferred. If there are both IPv4 and IPv6 addresses,
            we want to connect to the IPv6 one. (By default, we connect
            to the IPv4 address.)
    4..32 -- Reserved. Current clients MUST NOT set these. Servers
            MUST ignore them.
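The body layout and flag bits above can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names `build_begin_body` and `parse_begin_body` and the `BEGIN_FLAG_*` constants are hypothetical, and the sketch assumes that a bare IPv6 address passed by the caller should be wrapped in square brackets.

```python
import struct

# Hypothetical constant names; "bit 1" is the LSB of the 32-bit FLAGS value.
BEGIN_FLAG_IPV6_OK = 1 << 0         # bit 1
BEGIN_FLAG_IPV4_NOT_OK = 1 << 1     # bit 2
BEGIN_FLAG_IPV6_PREFERRED = 1 << 2  # bit 3
KNOWN_FLAGS = 0b111                 # bits 4..32 are reserved


def build_begin_body(address: str, port: int, flags: int = 0) -> bytes:
    """Encode ADDRPORT (NUL-terminated) followed by optional FLAGS."""
    if not 1 <= port <= 65535:
        raise ValueError("PORT must be between 1 and 65535")
    if flags & ~KNOWN_FLAGS:
        raise ValueError("clients MUST NOT set reserved bits 4..32")
    # Assumption: a bare IPv6 literal is given without brackets.
    if ":" in address and not address.startswith("["):
        address = "[" + address + "]"
    # Lower case is sent to avoid fingerprinting.
    addrport = f"{address.lower()}:{port}".encode("ascii") + b"\x00"
    if flags == 0:
        return addrport  # FLAGS is omitted whenever its value would be 0
    return addrport + struct.pack(">I", flags)  # big-endian, 4 bytes


def parse_begin_body(body: bytes):
    """Decode a RELAY_BEGIN body into (address, port, flags)."""
    addrport, sep, rest = body.partition(b"\x00")
    if not sep:
        raise ValueError("ADDRPORT is not NUL-terminated")
    address, _, port = addrport.decode("ascii").rpartition(":")
    # An absent FLAGS field means 0; reserved bits are ignored.
    flags = struct.unpack(">I", rest[:4])[0] if len(rest) >= 4 else 0
    return address.lower(), int(port), flags & KNOWN_FLAGS
```

For example, `build_begin_body("Example.com", 80)` yields `b"example.com:80\x00"` with no FLAGS field, because the flags value is 0.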
Upon receiving this message, the exit node first checks whether it is the first node in the circuit (i.e. whether someone is attempting to create a one-hop circuit). It does so by inspecting the accompanying channel: it checks whether the channel initiator has authenticated itself and whether its fingerprint is part of the current consensus. (See "Negotiating and initializing channels".)
Any attempt to create a one-hop circuit using a RELAY_BEGIN cell SHOULD be declined by sending an appropriate DESTROY cell with a protocol violation as its reason.
Afterwards, the exit node resolves the address as necessary, and opens a new TCP connection to the target port. If the address cannot be resolved, or a connection can't be established, the exit node replies with a RELAY_END message. (See "Closing streams".) Otherwise, the exit node replies with a RELAY_CONNECTED message, whose body is in one of the following formats:
The IPv4 address to which the connection was made [4 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
or
Four zero-valued octets [4 octets]
An address type (6) [1 octet]
The IPv6 address to which the connection was made [16 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
Implementations MUST accept either of these formats, and MUST also accept an empty RELAY_CONNECTED message body.
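A parser that accepts all three body shapes might look like the sketch below. The function name `parse_connected_body` is hypothetical, and the rule used to tell the formats apart (four leading zero octets select the IPv6 form) is an assumption drawn from the layout above rather than a statement of how any particular implementation does it.

```python
import ipaddress
import struct


def parse_connected_body(body: bytes):
    """Return (address, ttl) for a RELAY_CONNECTED body, or None if empty."""
    if len(body) == 0:
        return None  # an empty body MUST be accepted

    if len(body) < 8:
        raise ValueError("RELAY_CONNECTED body too short")

    if body[:4] != b"\x00\x00\x00\x00":
        # IPv4 form: 4 address octets, then a 4-octet big-endian TTL.
        address = ipaddress.IPv4Address(body[:4])
        (ttl,) = struct.unpack(">I", body[4:8])
        return address, ttl

    # IPv6 form: 4 zero octets, type octet (6), 16 address octets, TTL.
    if len(body) < 25 or body[4] != 6:
        raise ValueError("malformed IPv6 RELAY_CONNECTED body")
    address = ipaddress.IPv6Address(body[5:21])
    (ttl,) = struct.unpack(">I", body[21:25])
    return address, ttl
```

A caller remains free to discard the returned address and TTL entirely.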
Implementations MAY ignore the address value, and MAY choose not to cache it. If an implementation chooses to cache the address, it SHOULD NOT reuse that address with any other circuit.
The reason not to cache an address is that the exit might have lied about the actual address of the host, or might have given us a unique address to identify us in the future.
[Tor exit nodes before 0.1.2.0 set the TTL field to a fixed value. Later versions set the TTL to the last value seen from a DNS server, and expire their own cached entries after a fixed interval. This prevents certain attacks.]
Transmitting data
Once a connection has been established, the OP and exit node package stream data in RELAY_DATA messages, and upon receiving such messages, echo their contents to the corresponding TCP stream.
If the exit node does not support optimistic data (i.e. its version number is before 0.2.3.1-alpha), then the OP MUST wait for a RELAY_CONNECTED message before sending any data. If the exit node supports optimistic data (i.e. its version number is 0.2.3.1-alpha or later), then the OP MAY send RELAY_DATA messages immediately after sending the RELAY_BEGIN message (and before receiving either a RELAY_CONNECTED or RELAY_END message).
RELAY_DATA messages sent to unrecognized streams are dropped. If the exit node supports optimistic data, then RELAY_DATA messages it receives on streams which have seen RELAY_BEGIN but have not yet been replied to with a RELAY_CONNECTED or RELAY_END are queued. If the stream creation succeeds with a RELAY_CONNECTED, the queue is processed immediately afterwards; if the stream creation fails with a RELAY_END, the contents of the queue are deleted.
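The exit-side queueing behaviour can be illustrated with a small state machine. This is only a sketch under stated assumptions: the `ExitStream` class, its state names, and the `deliver` helper are hypothetical, and the `delivered` list stands in for writes to the real TCP socket.

```python
from collections import deque


class ExitStream:
    """Hypothetical model of one stream at an exit supporting optimistic data."""

    def __init__(self):
        self.state = "pending"  # RELAY_BEGIN seen, no reply sent yet
        self.queue = deque()    # optimistic RELAY_DATA awaiting the outcome
        self.delivered = []     # stand-in for bytes written to the TCP socket

    def on_relay_data(self, payload: bytes) -> None:
        if self.state == "pending":
            self.queue.append(payload)      # queue until the outcome is known
        elif self.state == "connected":
            self.delivered.append(payload)  # echo to the TCP stream
        # In the "closed" state the data is silently dropped.

    def on_connected(self) -> None:
        """Stream creation succeeded: flush the queue in arrival order."""
        self.state = "connected"
        while self.queue:
            self.delivered.append(self.queue.popleft())

    def on_end(self) -> None:
        """Stream creation failed: discard anything that was queued."""
        self.state = "closed"
        self.queue.clear()


def deliver(streams: dict, stream_id: int, payload: bytes) -> bool:
    """Route RELAY_DATA by StreamID; data for unrecognized streams is dropped."""
    stream = streams.get(stream_id)
    if stream is None:
        return False
    stream.on_relay_data(payload)
    return True
```

Keeping the queue per stream means a failed RELAY_BEGIN only discards data for that one stream and leaves other streams on the circuit untouched.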
RELAY_DROP messages are long-range dummies; upon receiving such a message, the OR or OP must drop it.
Opening a directory stream
If a Tor relay is a directory server, it should respond to a RELAY_BEGIN_DIR message as if it had received a RELAY_BEGIN message requesting a connection to its directory port. RELAY_BEGIN_DIR messages ignore exit policy, since the stream is local to the Tor process.
Directory servers may be:
- authoritative directories (RELAY_BEGIN_DIR, usually non-anonymous),
- bridge authoritative directories (RELAY_BEGIN_DIR, anonymous),
- directory mirrors (RELAY_BEGIN_DIR, usually non-anonymous),
- onion service directories (RELAY_BEGIN_DIR, anonymous).
If the Tor relay is not running a directory service, it should respond with a REASON_NOTDIRECTORY RELAY_END message.
Clients MUST generate an empty body for a RELAY_BEGIN_DIR message; relays MUST ignore the body of a RELAY_BEGIN_DIR message.
In response to a RELAY_BEGIN_DIR message, relays respond either with a RELAY_CONNECTED message on success, or a RELAY_END message on failure. Relays MUST send the RELAY_CONNECTED message with an empty body; clients MUST ignore the body.
[RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients SHOULD NOT send it to routers running earlier versions of Tor.]