Message format

Description format

The message formats listed below use ABNF as described in RFC 2234. The protocol itself is loosely based on SMTP (see RFC 2821).

We use the following nonterminals from RFC 2822: atom, qcontent

We define the following general-use nonterminals:

    QuotedString = DQUOTE *qcontent DQUOTE

There are explicitly no limits on line length. All 8-bit characters are permitted unless explicitly disallowed. In QuotedStrings, backslashes and quotes must be escaped; other characters need not be escaped.
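As an illustration of the escaping rule above, here is a minimal sketch of how a controller might produce a QuotedString. The function name `quote_string` is hypothetical and is not part of this specification. The sketch escapes only backslashes and double quotes, and it assumes the caller never passes a value containing CR or LF, since those would break line framing.

```python
def quote_string(value: str) -> str:
    """Wrap value in double quotes, escaping backslashes and quotes.

    Hypothetical helper: only the two characters that the text above
    says must be escaped are touched; everything else passes through.
    """
    # Escape backslashes first so the backslashes added for quotes
    # are not escaped a second time.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

For example, the five-character value `a"b\c` becomes `"a\"b\\c"` on the wire.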

Wherever CRLF is specified to be accepted from the controller, Tor MAY also accept LF. Tor, however, MUST NOT generate LF instead of CRLF. Controllers SHOULD always send CRLF.

Notes on an escaping bug

    CString = DQUOTE *qcontent DQUOTE

Note that although these nonterminals have the same grammar, they are interpreted differently. In a QuotedString, a backslash followed by any character represents that character. But in a CString, the escapes "\n", "\t", "\r", and the octal escapes "\0" ... "\377" represent newline, tab, carriage return, and the 256 possible octet values respectively.

The uses of CString in this document reflect a bug in Tor; they should have been QuotedStrings. In the future, they may migrate to QuotedString. If they do, the QuotedString implementation will never place a backslash before an "n", "t", "r", or digit, to ensure that old controllers don't get confused.

For future-proofing, controller implementors MAY use the following rules to be compatible with buggy Tor implementations and with future ones that implement the spec as intended:

    Read \n \t \r and \0 ... \377 as C escapes.
    Treat a backslash followed by any other character as that character.
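The two rules above can be sketched as a single decoder. This is an illustration only: the name `unescape_tolerant` is hypothetical, the input is assumed to be the octets between the surrounding double quotes, and the handling of octal escapes (read up to three octal digits, stopping before the value would exceed 255) is an assumption of this sketch rather than something the rules spell out.

```python
def unescape_tolerant(body: bytes) -> bytes:
    """Decode the inside of a QuotedString or CString (hypothetical helper)."""
    simple = {ord("n"): 0x0A, ord("t"): 0x09, ord("r"): 0x0D}
    out = bytearray()
    i = 0
    while i < len(body):
        c = body[i]
        if c != 0x5C:  # anything other than a backslash is literal
            out.append(c)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash at end of string")
        c = body[i]
        if c in simple:
            # Rule 1: \n, \t and \r are C escapes.
            out.append(simple[c])
            i += 1
        elif 0x30 <= c <= 0x37:
            # Rule 1: octal escape. Assumption: consume at most three
            # octal digits and stop before the value exceeds one octet.
            value = 0
            j = i
            while j < len(body) and j < i + 3 and 0x30 <= body[j] <= 0x37:
                candidate = value * 8 + (body[j] - 0x30)
                if candidate > 0xFF:
                    break
                value = candidate
                j += 1
            out.append(value)
            i = j
        else:
            # Rule 2: a backslash before any other character yields it.
            out.append(c)
            i += 1
    return bytes(out)
```

Working on octets rather than decoded text keeps the octal escapes meaningful, since they denote octet values and not characters.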

Currently, many of the QuotedString instances below that Tor outputs are in fact CStrings. We intend to fix this in future versions of Tor, and document which ones were broken. (See bugtracker ticket #14555 for a bit more information.)

Note that this bug exists only in strings generated by Tor for the Tor controller; Tor should parse input QuotedStrings from the controller correctly.

Commands from controller to Tor

    Command = Keyword OptArguments CRLF / "+" Keyword OptArguments CRLF CmdData
    Keyword = 1*ALPHA
    OptArguments = [ SP *(SP / VCHAR) ]

A command is either a single line containing a Keyword and arguments, or a multiline command whose initial keyword begins with +, and whose data section ends with a single "." on a line of its own. (We use a special character to distinguish multiline commands so that Tor can correctly parse multiline commands that it does not recognize.) Specific commands and their arguments are described below in section 3.
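A minimal sketch of how a controller might assemble the first line of a command follows. The helper name `command_line` is hypothetical. It follows the grammar above literally (letters only in the Keyword, SP and VCHAR only in the arguments), and for a multiline command it produces only the initial line; the caller is still responsible for sending the CmdData that follows.

```python
import re

# Keyword = 1*ALPHA
_KEYWORD_RE = re.compile(r"[A-Za-z]+\Z")
# The argument text after the separating SP: *(SP / VCHAR)
_ARGS_RE = re.compile(r"[\x20-\x7e]*\Z")


def command_line(keyword: str, arguments: str = "", multiline: bool = False) -> bytes:
    """Build the first line of a command (hypothetical helper)."""
    if not _KEYWORD_RE.match(keyword):
        raise ValueError("keyword must consist of one or more letters")
    if not _ARGS_RE.match(arguments):
        raise ValueError("arguments must be printable ASCII")
    # A leading "+" marks a command that is followed by a data section.
    line = ("+" if multiline else "") + keyword
    if arguments:
        line += " " + arguments
    # Controllers SHOULD always send CRLF.
    return line.encode("ascii") + b"\r\n"
```

For instance, `command_line("GETINFO", "version")` yields `GETINFO version` followed by CRLF, and `command_line("EXAMPLE", multiline=True)` (with a made-up keyword) yields `+EXAMPLE` followed by CRLF.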

Replies from Tor to the controller

    Reply = SyncReply / AsyncReply
    SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
    AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine

    MidReplyLine = StatusCode "-" ReplyLine
    DataReplyLine = StatusCode "+" ReplyLine CmdData
    EndReplyLine = StatusCode SP ReplyLine
    ReplyLine = [ReplyText] CRLF
    ReplyText = XXXX
    StatusCode = 3DIGIT

Unless specified otherwise, multiple lines in a single reply from Tor to the controller are guaranteed to share the same status code. Specific replies are mentioned below in section 3, and described more fully in section 4.
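The reply grammar above can be read as a small state machine: "-" continues the reply, "+" introduces a data section, and SP ends the reply. The following sketch illustrates that reading. The name `parse_reply` and the tuple layout are hypothetical, and the sketch assumes the caller has already split the input on CRLF and stripped the line endings.

```python
import re

# StatusCode, then one of "-", "+" or SP, then the optional ReplyText.
_REPLY_LINE_RE = re.compile(r"([0-9]{3})([-+ ])(.*)", re.DOTALL)


def parse_reply(lines):
    """Consume one Reply from an iterable of lines (hypothetical helper).

    Returns a list of (status_code, text, data) tuples, where data is
    None except for a DataReplyLine, for which it is a list of lines.
    """
    it = iter(lines)
    entries = []
    for line in it:
        m = _REPLY_LINE_RE.fullmatch(line)
        if m is None:
            raise ValueError("malformed reply line: %r" % (line,))
        code, sep, text = m.groups()
        data = None
        if sep == "+":
            # DataReplyLine: collect lines up to the lone "." terminator.
            data = []
            for data_line in it:
                if data_line == ".":
                    break
                # Undo the escaping of a leading period.
                data.append(data_line[1:] if data_line.startswith(".") else data_line)
            else:
                raise ValueError("data section was not terminated")
        entries.append((code, text, data))
        if sep == " ":
            # EndReplyLine: the reply is complete.
            return entries
    raise ValueError("reply ended without an EndReplyLine")
```

A controller that must interoperate with the older versions mentioned in the compatibility note would need to relax the final check for asynchronous replies.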

[Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes generate AsyncReplies of the form *(MidReplyLine / DataReplyLine). This is incorrect, but controllers that need to work with these versions of Tor should be prepared to get multi-line AsyncReplies with the final line (usually 650 OK) omitted.]

General-use tokens

CRLF means, "the ASCII Carriage Return character (decimal value 13) followed by the ASCII Linefeed character (decimal value 10)."

    CRLF = CR LF

The following describes how a controller tells Tor about a particular OR. There are four possible formats:

  • $Fingerprint -- The router whose identity key hashes to the fingerprint. This is the preferred way to refer to an OR.
  • $Fingerprint~Nickname -- The router whose identity key hashes to the given fingerprint, but only if the router has the given nickname.
  • $Fingerprint=Nickname -- The router whose identity key hashes to the given fingerprint, but only if the router is Named and has the given nickname.
  • Nickname -- The Named router with the given nickname, or, if no such router exists, any router whose nickname matches the one given. This is not a safe way to refer to routers, since Named status could under some circumstances change over time.

The tokens that implement the above follow:

    ServerSpec = LongName / Nickname
    LongName   = Fingerprint [ "~" Nickname ]

For Tor versions older than 0.3.1.3-alpha, LongName may have included an equal sign ("=") in lieu of a tilde ("~"). The presence of an equal sign denoted that the OR possessed the "Named" flag:

    LongName   = Fingerprint [ ( "=" / "~" ) Nickname ]

    Fingerprint = "$" 40HEXDIG
    NicknameChar = "a"-"z" / "A"-"Z" / "0"-"9"
    Nickname = 1*19 NicknameChar
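As an illustration, the ServerSpec tokens above translate fairly directly into a regular expression. The sketch below is not normative: the name `parse_server_spec` and its return shape are hypothetical, it assumes a fingerprint is exactly 40 hexadecimal digits in either case, and it accepts both the "~" and the older "=" separator.

```python
import re

_FINGERPRINT = r"\$(?P<fingerprint>[0-9A-Fa-f]{40})"
_NICKNAME = r"[A-Za-z0-9]{1,19}"
_SERVER_SPEC_RE = re.compile(
    # LongName: fingerprint with an optional separator and nickname
    "(?:" + _FINGERPRINT + "(?:(?P<sep>[~=])(?P<long_nick>" + _NICKNAME + "))?"
    # or a bare Nickname
    + "|(?P<bare_nick>" + _NICKNAME + "))"
)


def parse_server_spec(spec: str):
    """Split a ServerSpec into (fingerprint, separator, nickname).

    Hypothetical helper; parts that are absent are returned as None.
    """
    m = _SERVER_SPEC_RE.fullmatch(spec)
    if m is None:
        raise ValueError("not a valid ServerSpec: %r" % (spec,))
    nickname = m["long_nick"] or m["bare_nick"]
    return m["fingerprint"], m["sep"], nickname
```

Returning the separator lets the caller decide how to treat the older "=" form.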

What follows is an outdated way to refer to ORs. Feature VERBOSE_NAMES replaces ServerID with LongName in events and GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later.

    ServerID = Nickname / Fingerprint

Unique identifiers for streams or circuits. Currently, Tor only uses digits, but this may change:

    StreamID = 1*16 IDChar
    CircuitID = 1*16 IDChar
    ConnID = 1*16 IDChar
    QueueID = 1*16 IDChar
    IDChar = ALPHA / DIGIT
    Address = ip4-address / ip6-address / hostname   (XXXX Define these)

A "CmdData" section is a sequence of octets concluded by the terminating sequence CRLF "." CRLF. The terminating sequence may not appear in the body of the data. Leading periods on lines in the data are escaped with an additional leading period as in RFC 2821 section 4.5.2.

    CmdData = *DataLine "." CRLF
    DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF
    LineItem = NonCR / 1*CR NonCRLF
    NonDotItem = NonDotCR / 1*CR NonCRLF
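The period escaping described above can be sketched as a pair of functions. The names `encode_cmd_data` and `decode_cmd_data` are hypothetical. The sketch assumes the payload is organized as CRLF-terminated lines; a payload that does not end in CRLF gains one when encoded, so the round trip is exact only for payloads that already end in CRLF.

```python
def encode_cmd_data(payload: bytes) -> bytes:
    """Escape leading periods and append the terminator (hypothetical helper)."""
    lines = payload.split(b"\r\n")
    if lines and lines[-1] == b"":
        # The payload ended in CRLF (or was empty): no trailing partial line.
        lines.pop()
    out = bytearray()
    for line in lines:
        if line.startswith(b"."):
            out += b"."  # double the leading period
        out += line + b"\r\n"
    out += b".\r\n"  # terminating "." CRLF
    return bytes(out)


def decode_cmd_data(encoded: bytes) -> bytes:
    """Reverse encode_cmd_data for exactly one CmdData section."""
    if not encoded.endswith(b".\r\n"):
        raise ValueError("missing terminating sequence")
    body = encoded[:-3]
    if body and not body.endswith(b"\r\n"):
        raise ValueError("terminator is not on a line of its own")
    lines = body.split(b"\r\n")[:-1] if body else []
    out = bytearray()
    for line in lines:
        if line == b".":
            raise ValueError("terminator appears inside the data")
        # Strip the extra period added by the sender.
        out += (line[1:] if line.startswith(b".") else line) + b"\r\n"
    return bytes(out)
```

Because every data line that begins with a period is sent with two, a line consisting of a single period can only ever be the terminator.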

ISOTime, ISOTime2, and ISOTime2Frac are time formats as specified in ISO8601.

  • example ISOTime: "2012-01-11 12:15:33"
  • example ISOTime2: "2012-01-11T12:15:33"
  • example ISOTime2Frac: "2012-01-11T12:15:33.51"

    IsoDatePart = 4DIGIT "-" 2DIGIT "-" 2DIGIT
    IsoTimePart = 2DIGIT ":" 2DIGIT ":" 2DIGIT
    ISOTime  = IsoDatePart " " IsoTimePart
    ISOTime2 = IsoDatePart "T" IsoTimePart
    ISOTime2Frac = ISOTime2 [ "." 1*DIGIT ]
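For illustration, the three time formats can be read with Python's standard `datetime.strptime`. The function names below are hypothetical. This is a sketch rather than a strict validator: `strptime` is more lenient than the grammar (for example, it accepts a one-digit month), its `%f` directive accepts at most six fractional digits, and the returned values carry no time zone information.

```python
from datetime import datetime


def parse_iso_time(text: str) -> datetime:
    """Parse an ISOTime value such as "2012-01-11 12:15:33"."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def parse_iso_time2(text: str) -> datetime:
    """Parse an ISOTime2 value such as "2012-01-11T12:15:33"."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


def parse_iso_time2_frac(text: str) -> datetime:
    """Parse an ISOTime2Frac value; the fractional part is optional."""
    if "." in text:
        return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")
    return parse_iso_time2(text)
```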

Numbers

    LeadingDigit = "1" / ... / "9"
    UInt = LeadingDigit *DIGIT