Difference between revisions of "Search regexp"

From AMule Project FAQ
(unfinished. gonna take some coffee and come back to finish it ;))
Line 1: Line 1:
related::<md4hash>
+
== Description ==
  
So someone could try to search something like
+
This article explains how to tweak [[search]]es and some handy tips and tricks when searching with [[aMule]].
  
related::<md4hash> AND Video AND SIZE > 1000000
+
Note that these tricks may not always work, since they depend on the [[server]]'s running software. However, most [[ed2k]] servers run the latest version of [[lugdunum]], so these tricks will work for 99% of your servers.
  
The server checks if the file is known.
+
== Find files similar or related to some other file ==
  
If not -> End of request, 0 result.
+
In the search box enter ''related::<hash>'' where ''<hash>'' is the [[hash]] value of some [[file]]. The results will be files that are related or similar to that file.
  
If yes : It scans the list of clients that share this file.
+
Actually, what the server does is read an index of all the files that all [[client]]s are sharing and see which files are most popular among the clients sharing the file with hash value ''<hash>''. Low-[[availability]] files aren't listed.
  
A temporary 'working set' is inited to empty.
+
== Notes ==
  
For each client in the list, it scans the list of its shares, adding each file found to the working set:
+
You can combine the above tricks, so someone could try to search something like ''related::<some_hash> AND Video AND SIZE > 1000000''
  
If the file has a small availability (like 1: only one person shares it), ignore it.
+
The server checks if the file is known.
  
If the file (md4 hash) is already in the working set, add 1 to its 'related count'.
+
== Search for file except extension ... ==
  
If not, check if the file meets the search criteria (if any was specified in the search req)
+
You can also use the ''not <query>'' and ''!<query>'' forms in the "extension" field of the [[Usage_Search|search window]].
  
If the file meets the criteria, add it to the working set with the 'related count' set to 1.
+
This way you can search for files that do not have the given extension, which is often very useful.
  
Some logic could be added to make sure the working set could not use too much ram (if a threshold is hit, just do a garbage collection to free half of the entries for example)
+
== The special 'Ñ' character ==
  
Then sort the working set by the 'related count' key, and return the 300 files having the highest 'related count'. The working set is then freed (no 'more' request can be sent to the server to get the next 300 files, because keeping the working set in memory would be too expensive).
+
Current server and client software support [[unicode]], so this is no longer an issue, but older versions did not support non-English characters, such as the Spanish ''ñ'' character.
  
---
+
As a solution, the ''ñ'' character was aliased to ''n''. So, searching for ''españa'' or ''espana'' would give the same results.
  
Searches for multiple file extensions are supported, and the not<space> or !<no_space> operator works in the file extension field too (like "zip,rar,cbz,cbr" or "!wme,!wma" or "not wme")
+
This aliasing also applies to unicode-supporting clients and servers. The only thing to notice is that in this case, since ''ñ'' is a different character from ''n'' and unicode distinguishes them, searching for words containing ''n'' will display results containing ''ñ'', but not the other way round.
  
---
+
== Search for hashes or exact file ==
  
Exact searches: clients can enclose words in single quotes ('). Examples: 'blank & john' OR 'the the'
+
If you want to search for any file whose hash value is ''<hash>'' (where ''<hash>'' is any MD4 hash value), you can search for ''ed2k::<hash>''.
  
---
+
As an extension, if you want to find an exact file (maybe to see its availability or its [[rate|rating]]) and searching for it by hash value returns several different files, you can narrow the results by searching for both the file's hash value ''<hash>'' and its size ''<size>'': ''ed2k:<size>:<hash>''
  
Double quotes can be used to treat ( ) (brackets) as normal characters instead of boolean modifiers
+
Or even simply the file's [[ed2k link]] (anything after the file's hash in the link will be ignored): ''ed2k://|file|<name>|<size>|<hash>''
  
---
+
== Boolean search ==
  
words separators: , ; . : - _ ' / ! (space)
+
------------------------------------
  
---
+
Default operation: AND. Word separators: , ; . : - _ ' / ! (space)
  
 
and & or not
 
and & or not
  
---
+
Exact searches: clients can enclose words in single quotes ('). Examples: 'blank & john' OR 'the the'
  
Support for searches by files hashes. It accepts several links (ed2k://|file|name|size|Hash (anything after the hash will be ignored)), (ed2k:size:Hash), (magnet:?xt=ed2k:Hash), ...
+
Double quotes can be used to treat ( ) (brackets) as normal characters instead of boolean modifiers
 
+
ed2k::<md4hash> or ed2k:<size>:<md4hash>
+
 
+
---
+
 
+
The letter ñ is aliased to the letter n in searches. A search for 'espana' now matches 'españa'
+

Revision as of 05:54, 23 January 2006

Description

This article explains how to tweak searches and some handy tips and tricks when searching with aMule.

Note that these tricks may not always work, since they depend on the server's running software. However, most ed2k servers run the latest version of lugdunum, so these tricks will work for 99% of your servers.

Find files similar or related to some other file

In the search box enter related::<hash> where <hash> is the hash value of some file. The results will be files that are related or similar to that file.

Actually, what the server does is read an index of all the files that all clients are sharing and see which files are most popular among the clients sharing the file with hash value <hash>. Low-availability files aren't listed.
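
The ranking described above can be sketched in a few lines of Python. This is only an illustration, not the server's actual code: the function name related_files, the shares_by_client mapping, and the min_availability and criteria parameters are made up for the example, and skipping the queried file itself is an assumption. The cut-off of 300 results follows the earlier revision of this article.

```python
from collections import Counter

def related_files(target_hash, shares_by_client, min_availability=2,
                  limit=300, criteria=None):
    """Rank files that are shared alongside target_hash (illustrative sketch).

    shares_by_client maps a client id to the set of file hashes it shares.
    criteria is an optional predicate on a file hash (extra search terms).
    """
    # Clients that share the requested file.
    sharers = [c for c, files in shares_by_client.items() if target_hash in files]
    if not sharers:
        return []  # unknown file: the request ends with no results

    # Availability = number of clients sharing each file.
    availability = Counter(h for files in shares_by_client.values() for h in set(files))

    related = Counter()  # temporary working set: hash -> related count
    for client in sharers:
        for file_hash in set(shares_by_client[client]):
            if file_hash == target_hash:
                continue  # assumption: the queried file itself is not listed
            if availability[file_hash] < min_availability:
                continue  # low-availability files are ignored
            if file_hash not in related and criteria is not None \
                    and not criteria(file_hash):
                continue  # does not meet the extra search criteria
            related[file_hash] += 1

    # Highest related count first, capped at `limit` results.
    return [h for h, _ in related.most_common(limit)]
```

For example, with three clients sharing file A, files that two or three of them also share are returned, most popular first, while a file shared by a single client is dropped.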

Notes

You can combine the tricks above with other search terms, so you could search for something like related::<some_hash> AND Video AND SIZE > 1000000

The server first checks whether the file is known. If it is not, the request ends with no results.

Search for files excluding an extension

You can also use the not <query> and !<query> forms in the "extension" field of the search window.

This way you can search for files that do not have the given extension, which is often very useful.
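
As a rough illustration of how such an extension field could be interpreted, here is a small Python sketch. The real filtering is done by the server, so the function names parse_extension_field and extension_matches are hypothetical, and the comma-separated list syntax is taken from the earlier revision of this article ("zip,rar,cbz,cbr", "!wme,!wma", "not wme").

```python
def parse_extension_field(field):
    """Split an extension field into (wanted, excluded) sets of extensions."""
    wanted, excluded = set(), set()
    for term in field.split(","):
        term = term.strip().lower()
        if not term:
            continue
        if term.startswith("!"):
            excluded.add(term[1:].strip())   # "!wma" form
        elif term.startswith("not "):
            excluded.add(term[4:].strip())   # "not wma" form
        else:
            wanted.add(term)
    return wanted, excluded

def extension_matches(filename, field):
    """Return True if filename passes the extension field (sketch only)."""
    wanted, excluded = parse_extension_field(field)
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext in excluded:
        return False
    # With no positive extensions listed, anything not excluded passes.
    return not wanted or ext in wanted
```

With this sketch, extension_matches("song.wma", "!wme,!wma") is rejected, while extension_matches("song.mp3", "!wme,!wma") is accepted.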

The special 'Ñ' character

Current server and client software support unicode, so this is no longer an issue, but older versions did not support non-English characters, such as the Spanish ñ character.

As a solution, the ñ character was aliased to n. So, searching for españa or espana would give the same results.

This aliasing also applies to unicode-supporting clients and servers. The only thing to notice is that in this case, since ñ is a different character from n and unicode distinguishes them, searching for words containing n will display results containing ñ, but not the other way round.
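
The one-way behaviour on unicode-aware software can be pictured with this minimal Python sketch. It is not the server's code; word_matches is a hypothetical helper that compares a search word against a word from a file name, assuming ñ is stored as a single character.

```python
def word_matches(query_word, file_word):
    """Compare a query word with a file-name word, treating n as an alias for ñ.

    An n in the query also matches ñ in the file name, but an ñ in the
    query only matches ñ (the alias works in one direction only).
    """
    query_word, file_word = query_word.lower(), file_word.lower()
    if len(query_word) != len(file_word):
        return False
    for q, f in zip(query_word, file_word):
        if q == f:
            continue
        if q == "n" and f == "ñ":
            continue  # alias: n in the query matches ñ in the file name
        return False
    return True
```

So word_matches("espana", "españa") succeeds, but word_matches("españa", "espana") does not.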

Search for hashes or exact file

If you want to search for any file whose hash value is <hash> (where <hash> is any MD4 hash value), you can search for ed2k::<hash>.

As an extension, if you want to find an exact file (maybe to see its availability or its rating) and searching for it by hash value returns several different files, you can narrow the results by searching for both the file's hash value <hash> and its size <size>: ed2k:<size>:<hash>

Or even simply the file's ed2k link (anything after the file's hash in the link will be ignored): ed2k://|file|<name>|<size>|<hash>
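
If you have a link and want the shorter size-and-hash form, the conversion is simple string handling. The following Python sketch shows one way to do it; query_from_link is a hypothetical helper, not part of aMule, and it assumes the hash is written as 32 hexadecimal digits (an MD4 hash is 128 bits long).

```python
import re

def query_from_link(link):
    """Build an ed2k:<size>:<hash> search term from an ed2k file link."""
    parts = link.split("|")
    # Expected layout: ed2k://|file|<name>|<size>|<hash>|...
    if len(parts) < 5 or parts[0] != "ed2k://" or parts[1] != "file":
        raise ValueError("not an ed2k file link")
    size, file_hash = parts[3], parts[4]
    if not size.isdigit() or not re.fullmatch(r"[0-9a-fA-F]{32}", file_hash):
        raise ValueError("malformed size or hash")
    # Anything after the hash is ignored, as in the search itself.
    return f"ed2k:{size}:{file_hash}"
```

The file name and the hash used in any test of this sketch are placeholders, not real files.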

Boolean search


Default operation: AND. Word separators: , ; . : - _ ' / ! (space)

Supported operators: and & or not

Exact searches: clients can enclose words in single quotes ('). Examples: 'blank & john' OR 'the the'

Double quotes can be used to treat ( ) (brackets) as normal characters instead of boolean modifiers
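
To make the splitting rules a bit more concrete, here is a small Python sketch of how a query might be broken into words, operators and exact phrases. It is only a guess at the behaviour based on the rules listed in this section, not the server's parser: tokenize is a hypothetical function, and double quotes and brackets are not handled.

```python
import re

SEPARATORS = ",;.:-_'/! "
OPERATORS = {"and", "&", "or", "not"}

# Either a phrase in single quotes, or a run of non-separator characters.
_TOKEN_RE = re.compile(r"'([^']*)'|([^" + re.escape(SEPARATORS) + r"]+)")

def tokenize(query):
    """Split a query into ("phrase" | "op" | "word", text) pairs (sketch only)."""
    tokens = []
    for phrase, word in _TOKEN_RE.findall(query):
        if phrase:
            tokens.append(("phrase", phrase))    # exact search, kept whole
        elif word.lower() in OPERATORS:
            tokens.append(("op", word.lower()))  # boolean operator
        elif word:
            tokens.append(("word", word))        # plain word, ANDed by default
    return tokens
```

For instance, the query 'blank & john' OR 'the the' yields two phrases joined by an or operator, while blank_john-live yields three plain words.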