Difference between revisions of "Nodes.dat file"

From AMule Project FAQ
Latest revision as of 01:05, 12 January 2011


File

Name: nodes.dat

Location: ~/.aMule/

Description

This file stores details about known Kademlia clients (also known as Kad nodes).

It is used to bootstrap the Kad network when aMule starts.

Format (version 0)

This format was used by aMule up to version 2.1.3 and is no longer used.

All fields are stored back to back, without any separator character. No separator is needed because every field has a fixed size:

  • Number of contacts: Number of contacts that will be listed (4 bytes)

After the number of contacts, the file lists the contacts themselves. Each contact takes 25 bytes, split into the following fields:

  • ClientID: The contact's ClientID (16 bytes)
  • IP: The contact's IP (4 bytes)
  • UDP Port: The UDP port to connect to when trying to reach the contact (2 bytes)
  • TCP Port: The TCP port to connect to when trying to reach the contact (2 bytes)
  • Type: The type of the contact, that is, how much confidence can be placed in it, on a scale from 0 (best) to 4 (worst) (1 byte)

Each multi-byte numeric field is stored in little-endian byte order.
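
As a rough illustration of the layout above, one version 0 contact can be written as a fixed-size record with Python's struct module. The format string and the sample values are choices made for this sketch; they are not taken from aMule's source.

```python
import struct

# One version 0 contact: ClientID (16s), IP (I), UDP port (H), TCP port (H), type (B).
# "<" selects little-endian byte order with no padding between fields.
CONTACT_V0 = struct.Struct("<16sIHHB")

print(CONTACT_V0.size)                 # 25 bytes per contact

# Little-endian means the least significant byte is written first.
print(struct.pack("<H", 4672).hex())   # 4012
print(struct.pack("<I", 2).hex())      # 02000000
```

Because the record size is constant, the n-th contact always starts at byte offset 4 + 25 * n, which is why no separators are required.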

Example of nodes.dat version 0 file

The following is a hex dump of a hypothetical nodes.dat file:

0200000012257425DBA4EDDBD097150757404486E55E04DE40123612021F64632587A31EC2FC8566C4A9BAB184E6E9B7D44012361202

In the above example, the following data can be seen:

  • Number of contacts: 2 (In hex: 02000000, remember it's little endian)
  • Contact #1:
    • ClientID: 12257425DBA4EDDBD097150757404486
    • IP: 222.4.94.229 (Hex: E55E04DE)
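    • UDP Port: 4672 (Hex: 4012, which is 0x1240 once the little-endian bytes are swapped)
    • TCP Port: 4662 (Hex: 3612, which is 0x1236 once the little-endian bytes are swapped)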
    • Type: 2 (Hex: 02)
  • Contact #2:
    • ClientID: 1F64632587A31EC2FC8566C4A9BAB184
    • IP: 212.183.233.230 (Hex: E6E9B7D4)
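    • UDP Port: 4672 (Hex: 4012, which is 0x1240 once the little-endian bytes are swapped)
    • TCP Port: 4662 (Hex: 3612, which is 0x1236 once the little-endian bytes are swapped)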
    • Type: 2 (Hex: 02)
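
The breakdown above can be checked mechanically. The following is a minimal sketch, assuming Python 3; `parse_v0` and the dictionary keys are names invented for this example. It decodes the hex dump shown above, treating the IP as a little-endian 32-bit value. Note that the two port fields decode to 4672 and 4662 in decimal (0x1240 and 0x1236).

```python
import struct

# The example hex dump from above, split in two only to keep the line short.
DUMP = ("0200000012257425DBA4EDDBD097150757404486E55E04DE4012361202"
        "1F64632587A31EC2FC8566C4A9BAB184E6E9B7D44012361202")

def parse_v0(data):
    """Return the contacts of a version 0 nodes.dat image as dicts."""
    (count,) = struct.unpack_from("<I", data, 0)
    contacts = []
    for i in range(count):
        client_id, ip, udp, tcp, ctype = struct.unpack_from(
            "<16sIHHB", data, 4 + 25 * i)
        # The IP is a little-endian 32-bit value; print its most significant byte first.
        dotted = ".".join(str((ip >> shift) & 0xFF) for shift in (24, 16, 8, 0))
        contacts.append({"id": client_id.hex().upper(), "ip": dotted,
                         "udp": udp, "tcp": tcp, "type": ctype})
    return contacts

contacts = parse_v0(bytes.fromhex(DUMP))
for c in contacts:
    print(c)
```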

Format (version 2)

This format is used in aMule version 2.2.0 and later.

Here again, all fields are stored back to back, without any separator character, because every field has a fixed size.

The file begins with 4 bytes holding the value zero (hex 0x00000000). These bytes sit where a version 0 file keeps its contact count, which lets a reader tell the two formats apart; the script at the end of this page relies on this. The following fields come next:

  • Version number: the value 2, stored in little endian (bytes 02 00 00 00, 4 bytes)
  • Number of contacts: Number of contacts that will be listed (4 bytes)

After the number of contacts, the file lists the contacts themselves. Each contact takes 34 bytes, split into the following fields:

  • ClientID (16 bytes): The contact's ClientID
  • IP (4 bytes): The contact's IP
  • UDP Port (2 bytes): The UDP port to connect to when trying to reach the contact
  • TCP Port (2 bytes): The TCP port to connect to when trying to reach the contact
  • Version (1 byte): Kademlia protocol version. 0 means a Kad v1 node; any value > 0 means a Kad v2 node and determines what kind of packets can be sent to the node, what features it supports, etc.
  • KadUDPKey (8 bytes): In Kad v2, the sender's 32-bit key (node version > 5), bound to the receiver's (local) IP. It is used in encrypted communication to verify node validity.
  • Verified (1 byte): Any value other than 0 means the contact has been verified.

Each multi-byte numeric field is stored in little-endian byte order.
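
To make the version 2 layout concrete, here is a hedged sketch that writes and reads a one-contact image in memory, assuming Python 3. The names `build_v2` and `parse_v2` and the sample contact values (including the Kad version 8) are made up for the example. Treating KadUDPKey as a single 64-bit integer simply mirrors the dump script on this page; it is a convenience, not a statement about how aMule structures the field internally.

```python
import struct

HEADER_V2 = struct.Struct("<III")          # zero marker, version, contact count
CONTACT_V2 = struct.Struct("<16sIHHBQB")   # 16 + 4 + 2 + 2 + 1 + 8 + 1 = 34 bytes

def build_v2(contacts):
    """Serialize (id, ip, udp, tcp, kad_version, udp_key, verified) tuples."""
    out = HEADER_V2.pack(0, 2, len(contacts))
    for contact in contacts:
        out += CONTACT_V2.pack(*contact)
    return out

def parse_v2(data):
    """Return the contact tuples stored in a version 2 image."""
    zero, version, count = HEADER_V2.unpack_from(data, 0)
    if zero != 0 or version != 2:
        raise ValueError("not a version 2 nodes.dat image")
    return [CONTACT_V2.unpack_from(data, HEADER_V2.size + CONTACT_V2.size * i)
            for i in range(count)]

# Hypothetical contact: sequential ClientID bytes, IP 222.4.94.229, ports 4672/4662.
sample = [(bytes(range(16)), 0xDE045EE5, 4672, 4662, 8, 0, 1)]
image = build_v2(sample)

print(len(image))                 # 12 header bytes + 34 contact bytes
print(image[:12].hex())           # the three header fields
print(parse_v2(image) == sample)  # round trip
```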

Format v2 Details as a Table

Bytes  Value               Notes

Header information
4      NULL                Hex: 0x00000000
4      Version             Hex: 0x02000000
4      Number of contacts  Count of nodes in file

Each contact's information
16     ClientID            Contact's Client ID
4      IP address          Not enough bytes for IPv6 addresses
2      UDP Port            UDP port to connect to when trying to reach the contact
2      TCP Port            TCP port to connect to when trying to reach the contact
1      Version             Kademlia protocol version: '0' means a Kad v1 node, any value > 0 means a Kad v2 node and determines what kind of packets can be sent to a node, what features it supports, etc.
8      KadUDPKey           Sender's 32-bit key (node version >5), bound to the receiver (local) IP, which is used in encrypted communication to verify node validity (Kad v2)
1      Verified            Any value different from 0 indicates the contact has been verified

  • Total bytes for a given contact: 16 + 4 + 2 + 2 + 1 + 8 + 1 = 34 bytes
  • All multi-byte numeric fields are stored as little-endian binary values.

Extra

Since the number of contacts field is 4 bytes long, the maximum number of nodes that could be stored in this file is 4294967295 (2^32 - 1, roughly 4300M), which is far more than enough. Because this number is so large, aMule, eMule and all other clients impose a hard limit on the number of contacts that can be stored (aMule's hard limit is 5000).

Since Type 4 contacts are those marked for deletion, there should never be a Type 4 contact in the nodes.dat file. If there were one, it would simply be ignored when the file is read.
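
The arithmetic behind these limits is easy to reproduce. In this small sketch the constant names are invented for the example, and the file size assumes the version 2 layout (12 header bytes plus 34 bytes per contact):

```python
MAX_COUNT = 2**32 - 1    # largest value a 4-byte unsigned field can hold
AMULE_LIMIT = 5000       # aMule's hard limit, as stated above

print(MAX_COUNT)                  # 4294967295
print(12 + 34 * AMULE_LIMIT)      # 170012 bytes for a full version 2 file
```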

Script for dumping nodes.dat

Here is a Python 3 script that dumps the contents of nodes.dat v0 and v2 files:

#!/usr/bin/env python3
# this code belongs to public domain
# requires nodes.dat filename passed as argument

import struct
import sys

# check number of command line arguments
if len(sys.argv) != 2:
    sys.exit("Please supply a nodes.dat file!")

# nodes.dat is a binary file, so it has to be opened in 'rb' mode
with open(sys.argv[1], 'rb') as nodefile:
    version = 0
    # a version 0 file starts with its contact count; a version 2 file starts
    # with a zero dword, followed by the version number and then the count
    (count,) = struct.unpack("<I", nodefile.read(4))
    if count == 0:
        (version,) = struct.unpack("<I", nodefile.read(4))
        (count,) = struct.unpack("<I", nodefile.read(4))

    # only the two layouts described on this page are understood
    if version not in (0, 2):
        sys.exit('Cannot handle nodes.dat version %d !' % version)

    print('Nodes.dat file version = %d' % version)
    print('Node count = %d' % count)
    print(' ')
    if version == 0:
        print(' idx type IP address        udp   tcp')
    else:
        print(' idx Ver IP address        udp   tcp kadUDPKey        verified')

    for i in range(count):
        if version == 0:
            (clientid, ip1, ip2, ip3, ip4, udpport, tcpport,
             ctype) = struct.unpack("<16s4BHHB", nodefile.read(25))
            # the IP is little-endian (see the example above), so the bytes
            # are printed in reverse order
            ipaddr = '%d.%d.%d.%d' % (ip4, ip3, ip2, ip1)
            print('%4d %4d %-15s %5d %5d' % (i, ctype, ipaddr,
                                             udpport, tcpport))
        else:
            (clientid, ip1, ip2, ip3, ip4, udpport, tcpport, kadver, kadUDPkey,
             verified) = struct.unpack("<16s4BHHBQB", nodefile.read(34))
            ipaddr = '%d.%d.%d.%d' % (ip4, ip3, ip2, ip1)
            verf = 'Y' if verified else 'N'
            print('%4d %3d %-15s %5d %5d %16x %s' % (i, kadver, ipaddr, udpport,
                                                     tcpport, kadUDPkey, verf))