Difference between revisions of "EC Protocol HOWTO"

From AMule Project FAQ

Revision as of 20:55, 31 January 2009

Work in progress, this site is under heavy construction.

Basic Protocol Structure

The EC protocol consists of two layers: a low-level transmission layer and a high-level application layer.

The transmission layer consists of two uint32 values.

  • A uint32 flags value specifies the format of the message, e.g. whether the packet uses UTF-8 encoded numbers or is compressed by zlib.
  • The next uint32 determines the size of the application layer data.

The application layer consists of an op-code and a tag counter, followed by a tag structure.
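
As a rough illustration of this two-value prefix, here is a minimal sketch that splits a raw message into flags, body length and body. It assumes the fixed eight-byte, MSB-first layout that the hex dumps in the Examples section show; the function name `parse_header` is hypothetical and not taken from the aMule sources.

```python
import struct


def parse_header(data: bytes) -> tuple[int, int, bytes]:
    """Split a raw message into (flags, body_length, body).

    Assumption: flags and length are both sent as 4-byte big-endian
    values, as in the hex dumps of the Examples section.
    """
    flags, length = struct.unpack(">II", data[:8])
    body = data[8:8 + length]
    if len(body) != length:
        raise ValueError("truncated packet body")
    return flags, length, body


# Header taken from the search request example, followed by a dummy body.
message = bytes.fromhex("00000020" "00000021") + bytes(33)
flags, length, body = parse_header(message)
print(hex(flags), length, len(body))
```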

Transmission layer

The transmission layer is completely independent of the application layer, and holds only transport-related information.

The transmission layer actually consists of a uint32 number, referred to below as flags, which describes the flags for the current transmission session (send/receive operation).

This four-byte value is the only one in the whole protocol that is transmitted LSB first, with zero bytes omitted (therefore an empty transmission flags value is sent as 0x20, not 0x20 0x00 0x00 0x00).
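
The "LSB first, zero bytes omitted" rule can be sketched as follows. This is only an illustration under stated assumptions: `encode_flags` is a hypothetical helper, and it expects the caller to have already set any extension bits that announce additional flag bytes.

```python
def encode_flags(flags: int) -> bytes:
    """Serialise a uint32 flags value LSB first, dropping trailing zero bytes.

    Assumption: the caller has already set the extension bits needed for
    any byte beyond the first one.
    """
    raw = flags.to_bytes(4, "little").rstrip(b"\x00")
    return raw or b"\x00"


# An empty flags value (only the mandatory bit 5 set) fits in one byte.
print(encode_flags(0x20).hex())
# Reading the bytes back as a little-endian integer restores the value.
print(int.from_bytes(encode_flags(0x20), "little") == 0x20)
```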

Bit description

  • BIT 0: Compression flag. When set, zlib compression is applied to the application layer's data.
  • BIT 1: Compressed numbers. When set (presumably on small packets that are not worth compressing with zlib), all numbers used in the protocol are encoded as a wide character converted to UTF-8, so that some zero bytes need not be sent over the network.
  • BIT 2: Has ID. When this flag is set, a uint32 number follows the flags, which is the ID of this packet. The response to this packet must carry the same ID. The only requirement for ID values is that they be unique within one session (or at least not repeat for a reasonably long time).
  • BIT 3: Reserved for later use.
  • BIT 4: Accepts value present. A client sets this flag and sends another uint32 value (encoded as above, LSB first, zero bytes omitted), which is a fully constructed flags value; each bit set means that the client can accept that extension. No extension can be used until the other side has sent an accepts value for it. It is not defined when this value should be sent; the first transfer is best, but it can be sent at any later time, even changing the previously announced flags.
  • BIT 5: Always set to 1, to distinguish from older (pre-rc8) clients.
  • BIT 6: Always set to 0, to distinguish from older (pre-rc8) clients.
  • BITS 7, 15, 23: Extension flag. When set, the next byte of the flags is present.
  • BITS 8-14, 16-22, 24-31: Reserved for later use.
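
The bit list above maps naturally onto a flag enumeration. The sketch below is illustrative only: the class and member names are hypothetical and do not come from the aMule sources; only the bit positions are taken from the list.

```python
from enum import IntFlag


class ECFlag(IntFlag):
    """Bits of the first flags byte (names are hypothetical)."""
    ZLIB = 1 << 0          # bit 0: application data is zlib compressed
    UTF8_NUMBERS = 1 << 1  # bit 1: numbers are UTF-8 encoded
    HAS_ID = 1 << 2        # bit 2: a packet ID follows the flags
    ACCEPTS = 1 << 4       # bit 4: an accepts value follows
    ALWAYS_ONE = 1 << 5    # bit 5: always set
    ALWAYS_ZERO = 1 << 6   # bit 6: always clear
    EXTENSION = 1 << 7     # bit 7: next flags byte is present


def flag_names(value: int) -> list[str]:
    """Return the names of all known bits set in a flags value."""
    return [flag.name for flag in ECFlag if value & flag]


print(flag_names(0x22))
print(flag_names(0x30))
```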

Transmission layer example

0x30 0x23 <appdata> - Client uses no extensions on this packet, and indicates that it can accept zlib compression and compressed numbers.
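
To make the example concrete, the following sketch reads the flags and, if bit 4 is set, the accepts value that follows. It is a simplified reading of the rules above: the helper names are hypothetical, the packet ID (bit 2) is ignored, and the position of the accepts value directly after the flags is assumed from this example.

```python
ACCEPTS = 0x10    # bit 4: an accepts value follows the flags
EXTENSION = 0x80  # bit 7 of a flags byte: another flags byte follows


def read_flags(data: bytes, pos: int) -> tuple[int, int]:
    """Read a variable-length, LSB-first flags value starting at pos."""
    value = 0
    for shift in (0, 8, 16, 24):
        byte = data[pos]
        pos += 1
        value |= byte << shift
        if not byte & EXTENSION:
            break
    return value, pos


def parse_prefix(data: bytes):
    """Return (flags, accepts or None, remaining bytes)."""
    flags, pos = read_flags(data, 0)
    accepts = None
    if flags & ACCEPTS:
        accepts, pos = read_flags(data, pos)
    return flags, accepts, data[pos:]


flags, accepts, rest = parse_prefix(bytes([0x30, 0x23]) + b"appdata")
# Bits 0 and 1 of the accepts value: zlib and compressed numbers.
print(hex(flags), hex(accepts), bool(accepts & 0x01), bool(accepts & 0x02))
```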

Notes

  • Note 1: On the "accepts" value, the predefined flags must be set to their predefined values, because this can be used as a sort of a sanity check.
  • Note 2: Bits marked as "reserved" should always be set to 0.
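
The sanity check mentioned in Note 1, combined with Note 2, could look like the sketch below. The function name and the mask constant are hypothetical; the mask is simply built from the bits the list marks as reserved (3, 8-14, 16-22 and 24-31).

```python
# Reserved bits: 3, 8-14, 16-22, 24-31.
RESERVED_MASK = 0x08 | 0x7F00 | 0x7F0000 | 0xFF000000


def is_sane_flags(value: int) -> bool:
    """Check the predefined bits: bit 5 set, bit 6 clear, reserved bits zero."""
    return bool(value & 0x20) and not value & 0x40 and not value & RESERVED_MASK


print(is_sane_flags(0x23))  # accepts value from the transmission layer example
print(is_sane_flags(0x03))  # bit 5 missing
```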

Application layer

Data transmission is done in packets. A packet can be considered as a special tag - with no data, no tagLen field, and with the tagCount field always present. All numbers in the application layer are transmitted in network byte order, i.e. MSB first.

A packet contains the following:

[ec_opcode_t] OPCODE
[uint16] TAGCOUNT
<tags>

In detail: The opcode indicates what the data fields contain. Its type is ec_opcode_t, which is currently a uint8. TagCount is the number of first-level tags this packet has. The tags themselves follow.
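
A minimal sketch of reading these two fields, assuming plain (not UTF-8 compressed) numbers and the current field sizes named above; `parse_packet_header` is a hypothetical name.

```python
import struct


def parse_packet_header(body: bytes) -> tuple[int, int, bytes]:
    """Return (opcode, tag_count, remaining tag bytes).

    Assumes a uint8 opcode and a uint16 tag count in network byte order.
    """
    opcode, tag_count = struct.unpack_from(">BH", body, 0)
    return opcode, tag_count, body[3:]


# First three bytes of the search request body shown in the Examples section.
opcode, tag_count, rest = parse_packet_header(bytes.fromhex("260001") + b"...")
print(hex(opcode), tag_count)
```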

A tag consists of:

[ec_tagname_t] TAGNAME
[ec_tagtype_t] TAGTYPE
[ec_taglen_t] TAGLEN
<[uint16] TAGCOUNT>?
  <sub-tags>
  <tag data>

ec_tagname_t is currently defined as a uint16, ec_taglen_t as a uint32, and ec_tagtype_t as a uint8. TagName tells what the tag contains (see ECCodes.h for details). TagType gives the type of this tag (see ECPacket.h for types). TagLen contains the whole length of the tag, including the lengths of any sub-tags, but excluding the size of the tagName, tagType and tagLen fields. Note that the lowest bit of the tagName field does not belong to the tag name itself, so it has to be cleared before checking the name.

Tags may contain sub-tags to store information, and a tagCount field is present only for such tags. The presence of the tagCount field can be tested by checking the lowest bit of the tagName field: when it is set, the tagCount field is present.

When a tag contains sub-tags, the sub-tags are sent before the tag's own data. The tag data length can therefore be calculated by subtracting the length of all sub-tags from the tagLen value; the remainder, if non-zero, is the data length.
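
The tag layout lends itself to a small recursive parser. The sketch below is an illustration, not aMule's implementation: it assumes plain (not UTF-8 compressed) numbers, treats tagLen as covering everything after the tagLen field (tagCount, sub-tags and data, which matches the search request in the Examples section), and stores the tag name shifted right by one bit, as the Authorization example does. The `Tag` class and `parse_tag` function are hypothetical names.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Tag:
    name: int                 # tag name without the sub-tag indicator bit
    type: int
    data: bytes = b""
    children: list = field(default_factory=list)


def parse_tag(buf: bytes, pos: int = 0):
    """Parse one tag at pos and return (tag, position after the tag)."""
    raw_name, tag_type, tag_len = struct.unpack_from(">HBI", buf, pos)
    pos += 7                  # uint16 name + uint8 type + uint32 length
    end = pos + tag_len
    tag = Tag(raw_name >> 1, tag_type)
    if raw_name & 1:          # lowest bit set: a tagCount field is present
        (count,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        for _ in range(count):
            child, pos = parse_tag(buf, pos)
            tag.children.append(child)
    tag.data = buf[pos:end]   # whatever remains is the tag's own data
    return tag, end


raw = bytes.fromhex(
    "0e03 02 00000017 0002"
    "0e04 06 00000005 7465737400"
    "0e0a 06 00000001 00"
    "00"
)
tag, end = parse_tag(raw)
print(hex(tag.name), [hex(c.name) for c in tag.children], tag.data)
```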

Future Changes

Future changes of the EC protocol (probably after 2.2.0) may be:

  • No more \0 for string termination.
  • Last bit of flag byte indicates a following flag byte, and so on.

Resources

You can find the definitions of OP- and Tag-Codes at these locations in the source:

  • ./src/lib/ec/[c#|cpp|java]/ECCodes.[cs|h|java]
  • ./docs/EC_Protocol.txt (outdated, but contains much useful information)

Examples

Notes

  • aMule sends EC packets in two flavours (although it would understand other flag options as well), depending on the packet size:
    • zlib-compressed application data that doesn't use UTF-8 compressed numbers when decompressed.
    • UTF-8 compressed numbers in the application data.
  • The tag size doesn't take into account the size of UTF-8 compressed numbers in subtags. When parsing, you may want to drop the length completely and get it by the size of the subtags + size of the value field (determined by the value type flag).
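
The "UTF-8 compressed numbers" mentioned here treat a number as a character code and send its UTF-8 encoding. A minimal sketch under that reading follows; the function names are hypothetical, and Python's codec limits the sketch to values up to 0x10FFFF, which may differ from what aMule itself accepts.

```python
def encode_number(n: int) -> bytes:
    """Encode a number as the UTF-8 form of the character with that code."""
    return chr(n).encode("utf-8", "surrogatepass")


def decode_number(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one UTF-8 encoded number; return (value, next position)."""
    first = data[pos]
    if first < 0x80:
        size = 1
    elif first >> 5 == 0b110:
        size = 2
    elif first >> 4 == 0b1110:
        size = 3
    elif first >> 3 == 0b11110:
        size = 4
    else:
        raise ValueError("invalid UTF-8 lead byte")
    char = data[pos:pos + size].decode("utf-8", "surrogatepass")
    return ord(char), pos + size


print(encode_number(512).hex())
print(decode_number(bytes.fromhex("c880")))
```

Small values such as a tag count of 4 fit in a single byte, which is where the saving over fixed-width fields comes from.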

Authorization

This is a packet in hex values that is sent to aMule for authorization:

00 00 00 22             // flag
00 00 00 36             // packet body length 54
02                      // EC_OP_AUTH_REQ
04                      // tag count

c8 80                   // EC_TAG_CLIENT_NAME
06                      // EC_TAGTYPE_STRING
0d                      // value length 13
61 6d 75 6c 65 2d 72 65 6d 6f 74 65 00 // "amule-remote\0"

c8 82                   // EC_TAG_CLIENT_VERSION
06                      // EC_TAGTYPE_STRING
07                      // value length 7
30 78 30 30 30 31 00    // "0x0001\0"

04                      // EC_TAG_PROTOCOL_VERSION
03                      // EC_TAGTYPE_UINT16
02                      // value length 2
02 00                   // value is defined by EC_CURRENT_PROTOCOL_VERSION

02                      // EC_TAG_PASSWD_HASH
09                      // EC_TAGTYPE_HASH16
10                      // value length 16
47 bc e5 c7 4f 58 9f 48 // md5 hashed password string
67 db d5 7e 9c a9 f8 08 // password "aaa" was used
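
The password hash in the dump above can be cross-checked with a few lines of Python. This sketch assumes the password is hashed as plain ASCII bytes without a salt, which is what the comments in the dump suggest.

```python
import hashlib

# The 16 hash bytes from the dump above.
dump = bytes.fromhex("47bce5c74f589f48" "67dbd57e9ca9f808")

# MD5 of the password "aaa" used in the example.
digest = hashlib.md5(b"aaa").digest()
print(digest == dump)
```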

c8 80 is in fact a UTF-8 encoded number. It decodes to 02 00 (or 512 in decimal). Like every tag code, it is shifted one bit to the left to make room for a bit that indicates the presence of subtags. The lowest bit of 02 00 is 0, so this tag doesn't have subtags. When we shift the value one bit to the right (or divide by 2), we get 01 00. That's the value that can be found in ECCodes.h.
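
The same steps can be followed in code. In this short sketch the helper `split_tag_name` is a hypothetical name; the byte values are taken from the dumps on this page.

```python
def split_tag_name(raw: int) -> tuple[int, bool]:
    """Split a transmitted tag name into (tag code, has_subtags)."""
    return raw >> 1, bool(raw & 1)


# c8 80 is the UTF-8 encoding of the number 0x0200.
raw = ord(bytes.fromhex("c880").decode("utf-8"))
code, has_subtags = split_tag_name(raw)
print(hex(raw), hex(code), has_subtags)

# A tag name from the search request, sent as a plain uint16 with the lowest bit set.
print(split_tag_name(0x0E03))
```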

Search request

This is a simple search request that is sent without UTF-8 compressed numbers.

00 00 00 20             // plain format, no compression
00 00 00 21             // message length: 33

26                      // EC_OP_SEARCH_START
00 01                   // tag count
    0e 03               // EC_TAG_SEARCH_TYPE
    02                  // EC_TAGTYPE_UINT8
    00 00 00 17         // tag length: 23
    00 02               // subtag count

        0e 04           // EC_TAG_SEARCH_NAME
        06              // EC_TAGTYPE_STRING
        00 00 00 05     // tag length
        74 65 73 74 00  // "test\0"

        0e 0a           // EC_TAG_SEARCH_FILE_TYPE
        06              // EC_TAGTYPE_STRING
        00 00 00 01     // tag length
        00              // "\0"

    00                  // uint8 search type (local)
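
As a cross-check of the length fields, the request above can be rebuilt from its parts. This is a sketch, not aMule's own serialiser: the helper names `tag` and `packet` are hypothetical, the numeric tag codes are the transmitted names from the dump shifted right by one bit, and tagLen is assumed to include the subtag count field, as the dump suggests.

```python
import struct


def tag(name: int, tag_type: int, data: bytes = b"", children=()) -> bytes:
    """Serialise one tag with plain (not UTF-8 compressed) numbers."""
    raw_name = (name << 1) | (1 if children else 0)
    payload = b""
    if children:
        payload += struct.pack(">H", len(children)) + b"".join(children)
    payload += data           # the tag's own data follows its sub-tags
    return struct.pack(">HBI", raw_name, tag_type, len(payload)) + payload


def packet(opcode: int, tags, flags: int = 0x20) -> bytes:
    """Prefix opcode, tag count and tags with flags and body length."""
    body = struct.pack(">BH", opcode, len(tags)) + b"".join(tags)
    return struct.pack(">II", flags, len(body)) + body


request = packet(0x26, [                     # EC_OP_SEARCH_START
    tag(0x0701, 0x02, b"\x00", children=[    # EC_TAG_SEARCH_TYPE, local
        tag(0x0702, 0x06, b"test\x00"),      # EC_TAG_SEARCH_NAME
        tag(0x0705, 0x06, b"\x00"),          # EC_TAG_SEARCH_FILE_TYPE
    ]),
])
print(request.hex(" "))
```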