Difference between revisions of "Search regexp"
Revision as of 00:35, 15 December 2005
related::<md4hash>
So someone could search for something like:
related::<md4hash> AND Video AND SIZE > 1000000
The server checks whether the file is known.
If not: end of request, 0 results.
If yes: it scans the list of clients that share this file.
A temporary 'working set' is initialized to empty.
For each client in the list, the server scans that client's shared files and adds the files found to the working set:
If the file has low availability (for example 1: only one client shares it), ignore it.
If the file (md4 hash) is already in the working set, add 1 to its 'related count'.
If not, check whether the file meets the search criteria (if any were specified in the search request).
If the file meets the criteria, add it to the working set with its 'related count' set to 1.
Some logic could be added to keep the working set from using too much RAM (for example, if a threshold is hit, run a garbage collection that frees half of the entries).
Then sort the working set by the 'related count' key and return the 300 files with the highest 'related count'. The working set is then freed, so no 'more' request can be sent to the server to get the next 300 files, because keeping the working set in memory would be too expensive.
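The steps above can be sketched roughly as follows. This is only an illustration, not the server's actual code: the names related_search, sharers_by_file, shares_by_client and matches are hypothetical, the availability threshold of 2 is an assumed value, leaving the queried file itself out of the results is an assumption, and the RAM cap on the working set is omitted.

```python
from collections import Counter


def related_search(target_hash, sharers_by_file, shares_by_client,
                   matches=lambda file_hash: True,
                   min_availability=2, max_results=300):
    """Return up to max_results (file_hash, related_count) pairs, best first."""
    sharers = sharers_by_file.get(target_hash)
    if not sharers:
        return []  # unknown file: end of request, 0 results

    working_set = Counter()  # temporary working set, starts empty
    for client in sharers:
        for file_hash in shares_by_client.get(client, ()):
            if file_hash == target_hash:
                continue  # assumption: the queried file itself is not reported
            if len(sharers_by_file.get(file_hash, ())) < min_availability:
                continue  # low availability: ignore the file
            if file_hash in working_set:
                working_set[file_hash] += 1   # already seen: bump related count
            elif matches(file_hash):
                working_set[file_hash] = 1    # first sighting that meets the criteria

    # Sort by related count and keep only the top entries; the working set
    # is a local variable, so it is freed when the function returns.
    return working_set.most_common(max_results)
```

The optional matches callback stands in for the extra search criteria (such as "AND Video AND SIZE > 1000000"); it is only called the first time a file is seen, as in the description.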
---
Searches for multiple file extensions are supported, including the not<space> or !<no_space> operator on file extensions (like "zip,rar,cbz,cbr" or "!wme,!wma" or "not wme")
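One possible way to interpret such an extension list is sketched below. The function names parse_extension_filter and extension_matches are hypothetical, and the rule for combining allowed and excluded extensions (an excluded extension always loses; an empty allow list accepts anything) is an assumption, since the note above does not specify it.

```python
def parse_extension_filter(spec):
    """Split a spec like "zip,rar" or "!wme,!wma" into (allowed, excluded) sets."""
    allowed, excluded = set(), set()
    for term in spec.split(","):
        term = term.strip().lower()
        if not term:
            continue
        if term.startswith("!"):          # !<no_space> form
            excluded.add(term[1:])
        elif term.startswith("not "):     # not<space> form
            excluded.add(term[4:].strip())
        else:
            allowed.add(term)
    return allowed, excluded


def extension_matches(filename, spec):
    """Check a file name against an extension spec (assumed semantics)."""
    allowed, excluded = parse_extension_filter(spec)
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext in excluded:
        return False
    return not allowed or ext in allowed
```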
---
Ability to perform exact searches: clients can enclose words in single quotes ('). Examples: 'blank & john' OR 'the the'
---
Double quotes can be used to interpret ( ) (brackets) as normal characters instead of boolean modifiers
---
Word separators: , ; . : - _ ' / !
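As an illustration, a file name could be split into words on these separators as shown below. The name split_words is hypothetical, and treating whitespace as a separator as well is an assumption not stated in the list.

```python
import re

# The separator characters listed above.
SEPARATORS = ",;.:-_'/!"

# One or more separators (or whitespace, by assumption) between words.
_SPLIT = re.compile("[" + re.escape(SEPARATORS) + r"\s]+")


def split_words(name):
    """Return the non-empty words of a file name or query string."""
    return [word for word in _SPLIT.split(name) if word]
```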
---
Boolean operators: and & or not
---
Support for searches by file hashes. It accepts several link formats: (ed2k://|file|name|size|Hash (anything after the hash will be ignored)), (ed2k:size:Hash), (magnet:?xt=ed2k:Hash), ...
ed2k::<md4hash> or ed2k:<size>:<md4hash>
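A minimal sketch of recognizing the hash forms listed above follows. The function name parse_hash_query is hypothetical, and the sketch assumes the md4 hash is written as 32 hexadecimal digits; it covers only the forms spelled out here, not whatever the trailing "..." stands for.

```python
import re

_HASH = r"(?P<hash>[0-9a-fA-F]{32})"


def parse_hash_query(text):
    """Return (size or None, lowercase md4 hash), or None if not a hash query."""
    text = text.strip()
    # Full ed2k file link; anything after the hash is ignored.
    m = re.match(r"ed2k://\|file\|[^|]*\|(?P<size>\d+)\|" + _HASH, text)
    if m is None:
        # Short forms: ed2k:<size>:<md4hash> and ed2k::<md4hash>
        m = re.fullmatch(r"ed2k:(?P<size>\d*):" + _HASH, text)
    if m is None:
        # Magnet form as written above.
        m = re.match(r"magnet:\?xt=ed2k:" + _HASH, text)
    if m is None:
        return None
    size = m.groupdict().get("size")  # absent for magnet, empty for ed2k::
    return (int(size) if size else None, m.group("hash").lower())
```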
---
The letter ñ is an alias for the letter n in searches. A search for 'espana' now matches 'españa'
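One simple way to get this behaviour is to fold ñ to n on both the query and the indexed words before comparing them. The sketch below assumes that approach; the names fold_n_tilde and word_matches are hypothetical, and the case-insensitive comparison is an assumption.

```python
import unicodedata


def fold_n_tilde(text):
    """Replace ñ/Ñ with n/N so both spellings compare equal."""
    # NFC first, so "n" + combining tilde is handled like a precomposed ñ.
    text = unicodedata.normalize("NFC", text)
    return text.replace("ñ", "n").replace("Ñ", "N")


def word_matches(query_word, file_word):
    """Compare two words with ñ treated as an alias of n (case-insensitive)."""
    return fold_n_tilde(query_word).lower() == fold_n_tilde(file_word).lower()
```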