Format specifiers

From AMule Project FAQ
Revision as of 09:57, 20 November 2004 by Jacobo221 (Talk | contribs)

Introduction

aMule is developed in the C++ programming language. This language makes it easy to handle strings, but sometimes those strings need a little tweaking.

When translating aMule, you might encounter strange-looking things. Sometimes these are just typos, but sometimes they are there on purpose.

This document is a must-read for anyone who translates aMule or plans to. It describes every group of characters that must never be modified, and explains what each of them actually means.

Escape codes

Non-representable ASCII codes

These are codes which represent characters of the ASCII character set that cannot be typed directly on the keyboard.

  • \a -> Normally causes an audible alert (sometimes a visual one), such as a beep
  • \b -> Goes back one character
  • \f -> Form feed: on some terminals this clears the screen
  • \n -> Ends the current line and starts a new one, placing the cursor at the beginning
  • \r -> Goes to the beginning of the current line
  • \t -> Horizontal tabulation
  • \v -> Vertical tabulation
  • \<octal digits> -> Displays the character whose ASCII code is the given octal value
  • \x<hex digits> -> Displays the character whose ASCII code is the given hexadecimal value
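As a minimal sketch of how these codes behave inside a program, the following C++ fragment shows that each escape code is typed as two characters but stored as a single one. The function names sample_line and ascii_code are made up for this illustration and are not part of aMule; the numeric values assume an ASCII system.

```cpp
#include <string>

// A literal as it could appear in the source code. "\t" and "\n" are each
// typed as two characters, but each one is stored as a single character.
std::string sample_line() {
    return "Name:\taMule\n";
}

// Numeric code of a character, e.g. 9 for '\t' and 10 for '\n' in ASCII.
int ascii_code(char c) {
    return static_cast<unsigned char>(c);
}
```

Here sample_line() holds 12 characters, not 14, which is why a translator must keep the backslash and the letter after it together.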

Disambiguation escape codes

The following characters can be typed on the keyboard, but they have a special meaning inside C++ strings, so they have to be written this way:

  • \? -> Displays a question mark ( ? ). It is used to avoid trigraph translation (not all compilers translate trigraphs, so it is not always necessary)
  • \\ -> Displays a backslash ( \ )
  • \' -> Displays a single quote ( ' )
  • \" -> Displays a double quote ( " )
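The following sketch shows these four codes in a C++ string. The function names quoted_text and no_trigraph are hypothetical, chosen only for this example. A trigraph is a group of two question marks plus a third character (for instance `??!`) that some compilers replace with another character, which is what the escaped question marks prevent.

```cpp
#include <string>

// Typed with escape codes, displayed as: Isn't "aMule" in C:\apps?
std::string quoted_text() {
    return "Isn\'t \"aMule\" in C:\\apps\?";
}

// Escaping each question mark keeps the compiler from ever reading two of
// them plus the following character as a trigraph.
std::string no_trigraph() {
    return "Really\?\?!";
}
```

In both cases the backslash itself is never displayed; only the character after it is.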

Examples

The following are some examples of the above escape codes. They are listed as pairs of code line + output: the first line shows the text as it is written in the C++ code, and the second line (or group of lines, if it needs more than one) shows how that text is displayed on execution:

    • Code line: I am an angel\aOh, true, I am not.
    • Output: I am an angelOh, true, I am not. (A beep will be heard right after the word angel is displayed, and on some systems the next word (Oh) will not be displayed until the beep finishes.)
    • Code line: I have 6\b5 fingers in my right hand
    • Output: I have 5 fingers in my right hand
    • Code line: Where is the <RETURN> key???\nAh, here it is!
    • Output: Where is the <RETURN> key???
      Ah, here it is!
    • Code line: I am a BIG liar\rI'm married to Marilyn Monroe
    • Output: I'm married to Marilyn Monroe
    • Code line: \141\115\165\x6C\x65
    • Output: aMule (Notice that in the ASCII character set the octal value of a is 141, the octal value of M is 115, the octal value of u is 165, the hexadecimal value of l is 6C and the hexadecimal value of e is 65)
    • Code line: Isn\'t it complicated to use the \" and \\ characters\?
    • Output: Isn't it complicated to use the " and \ characters?
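The numeric example can be checked with a short C++ sketch. The two function names are hypothetical, and the equality only holds on systems that use ASCII codes.

```cpp
#include <string>

// Octal escapes for a, M and u, followed by hexadecimal escapes for l and e.
std::string amule_from_codes() {
    return "\141\115\165\x6C\x65";
}

// The same word typed normally, for comparison.
std::string amule_plain() {
    return "aMule";
}
```

Both functions return the same five characters. Note that a hexadecimal code takes every hexadecimal digit that follows it, so changing or adding characters right after such a code can alter its value; this is one more reason to copy these groups literally.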

Format specifiers

Basic format specifiers

Format specifiers are groups of characters which will be substituted with something else. The format specifier itself specifies which type of data it will be substituted with:

  • %d -> Decimal value (signed integer type). Equivalent to %i
  • %i -> Decimal value (signed integer type). Equivalent to %d
  • %u -> Natural number (unsigned integer type).
  • %x -> Hexadecimal value represented with lowercase characters (unsigned integer type)
  • %X -> Hexadecimal value represented with uppercase characters (unsigned integer type)
  • %o -> Octal value (unsigned integer type)
  • %f -> Floating-point number in normal notation, showing all digits (both float and double types)
  • %e -> Floating-point number in exponential notation using a lowercase e (both float and double types)
  • %E -> Floating-point number in exponential notation using an uppercase E (both float and double types)
  • %g -> Floating-point number in normal or exponential notation, depending on the value. If exponential, a lowercase e will be used (both float and double types)
  • %G -> Floating-point number in normal or exponential notation, depending on the value. If exponential, an uppercase E will be used (both float and double types)
  • %c -> A single character (integer type)
  • %s -> A string (array of characters, or pointer to characters)
  • %p -> Displays a memory address (pointer type)
  • %n -> Displays nothing. Instead, the variable assigned to this format specifier is given the number of characters displayed up to now (pointer to integer type)
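To see the substitution at work, here is a minimal sketch that uses the standard std::snprintf function. The helper names (describe_user, show_bases, show_float) and the 128-character buffer are assumptions made for this illustration; they are not taken from the aMule source.

```cpp
#include <cstdio>
#include <string>

// %s is replaced by a string and %d by a signed integer.
std::string describe_user(const char* name, int age) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "I am %s and I am %d years old.", name, age);
    return buf;
}

// The same unsigned value as decimal, hexadecimal (both cases) and octal.
std::string show_bases(unsigned value) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%u %x %X %o", value, value, value, value);
    return buf;
}

// The same floating-point value in normal, exponential and automatic notation.
std::string show_float(double value) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%f | %e | %g", value, value, value);
    return buf;
}
```

For instance, show_bases(255) gives "255 ff FF 377", and show_float(0.25) gives "0.250000 | 2.500000e-01 | 0.25". If a translator changed %d into %s in the first string, the program would try to read the age as a string and would most likely crash.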

Type extensions

Sometimes, some characters can be inserted between % and the character representing the type of data. These inserted characters extend the information about the type of data the format specifier is going to be substituted with:

  • h -> The data is of short integer type. Valid for d, i, o, u, x, X and n.
  • l -> The data is of long integer type. Valid for d, i, o, u, x, X and n.
  • L -> The data is of long double type. Valid for e, E, f, F, g and G.
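A small sketch of the three type extensions, again with std::snprintf. The function name show_sizes and its parameter names are hypothetical.

```cpp
#include <cstdio>
#include <string>

// %hd expects a short, %ld a long and %Lf a long double.
std::string show_sizes(short short_value, long long_value,
                       long double long_double_value) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%hd %ld %Lf",
                  short_value, long_value, long_double_value);
    return buf;
}
```

Calling show_sizes(-5, 1234567890L, 0.5L) gives "-5 1234567890 0.500000". The extension letter belongs to the format specifier, so %ld must stay %ld in the translation and never become %d.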

Output tweaks

Also, some of the format specifiers allow tweaking how the data is output. These tweaking codes must be inserted between % and the type character:

  • - -> Aligns to the left
  • + -> Prints plus ( + ) sign even when the number is positive. Valid for d, i, e, E, f, g and G.
  • 0 -> Fill the blank spaces with zeros ( 0 ) instead of spaces. Valid for d, i, u, x, X, o, e, E, f, g and G.
  • # -> It will act in different ways depending on the type of data:
    • o: A zero ( 0 ) will be prepended when the data is non-zero.
    • x and X: Prepends 0x (for x) or 0X (for X) when the data is non-zero.
    • f, e, E, g and G: Displays the decimal point even when the data is an integer (no decimals).
    • g and G: The trailing zeros are not removed.
  • <non-zero decimal value> -> Specifies the minimum width the data must occupy (if not all is occupied, it will be padded). Can be used together with 0.
  • .<decimal value> -> It will act in different ways depending on the type of data:
    • f, e and E: Specify the amount of decimals it is allowed to display (if .0, no decimals will be displayed).
    • s: Maximum amount of characters to display (if .0, no characters will be displayed).
  • %% -> This is not a format specifier. Instead, it is only meant to avoid ambiguity: it will display a single % character.
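The tweaks above can be tried out with the following sketch. All function names are made up for this example, and the values were picked arbitrarily.

```cpp
#include <cstdio>
#include <string>

// Left-aligned, right-aligned and zero-filled, each with a minimum width of 5.
std::string padding(int n) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "[%-5d][%5d][%05d]", n, n, n);
    return buf;
}

// Always shows the sign, even for positive numbers.
std::string with_sign(int n) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%+d", n);
    return buf;
}

// # prepends 0 to octal values and 0x or 0X to hexadecimal values.
std::string alternate(unsigned n) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%#o %#x %#X", n, n, n);
    return buf;
}

// Limits the decimals of a number and the characters taken from a string.
std::string precision(double d, const char* s) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.2f %.0f %.4s", d, d, s);
    return buf;
}

// %% displays one literal percent sign.
std::string percent(int n) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%d%%", n);
    return buf;
}
```

For example, padding(42) gives "[42   ][   42][00042]", alternate(229) gives "0345 0xe5 0XE5" and percent(50) gives "50%".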

Examples

The following are some examples of the above format specifiers. They are listed as pairs of code line + output: the first line shows the text as it is written in the C++ code, and the second line (or group of lines, if it needs more than one) shows how that text is displayed on execution. The data substituted for the format specifiers is arbitrary (I've just picked something meaningful for each example):

    • Code line: I am %s and I am %d years old.
    • Output: I am Jacobo221 and I am 19 years old.
    • Code line: The first letter in the English alphabet is %c
    • Output: The first letter in the English alphabet is A
    • Code line: There exists a format specifier which is %%%c
    • Output: There exists a format specifier which is %E
    • Code line: There exists a format specifier which is %%c
    • Output: There exists a format specifier which is %c
    • Code line: %E and %e are the same number
    • Output: 9.186329E+00 and 9.186329e+00 are the same number
    • Code line: %g is in normal notation while %g is in exponential notation
    • Output: 0.25 is in normal notation while 3.23423e+34 is in exponential notation
    • Code line: %+d is a positive number
    • Output: +5 is a positive number
    • Code line: %010f says: Am I not 0-plenty?
    • Output: 000.250000 says: Am I not 0-plenty?
    • Code line: Both %#o and %#X start with a zero
    • Output: Both 0345 and 0X65FC start with a zero
    • Code line: %010x must be plenty of zeros
    • Output: 00000065fc must be plenty of zeros
    • Code line: Pi number is %.2f
    • Output: Pi number is 3.14
    • Code line: Look what happens when you read the first four letters in Firefox: %.4s
    • Output: Look what happens when you read the four letters in Firefox: Fire

Other stuff

Character cases

Leading and ending blank spaces

Examples

Overall example

Example:

Untranslated:
msgid "I am %s, %d years old\nand &I\'m a happy \"rabitty-aMule\" user "
msgstr ""
Would become (translation to Spanish):
msgid "I am %s, %d years old\nand &I\'m a happy \"rabitty-aMule\" user "
msgstr "Soy %s y tengo %d años\ny soy un fel&iz usuario del \"conejillo-aMule\" "
Explanation:
  1. %s and %d must be copied literally since they will be substituted in the program with some string or number. General rule: anything between a % character and the next letter (that is, a, b, c, etc.) or percentage character (%) must be copied literally.
  2. \n must be copied literally too since it breaks the line into a new line. The general rule is: \ and its very next character (which can be \, a, n, t, f, ", ', etc.) must be copied literally.
  3. & must be placed before the very same letter in the translation when possible, since it indicates the combination ALT+letter that will select that option.
  4. The ending space is left since it was there in the original message. Never remove starting or ending spaces, even if they look ugly. They are there for some reason. Normally this will be because another string comes either before or after that string. For example: "Opening file " has a blank space at the end, so you can expect that the name of a file will be displayed right after it. Something like Opening file server.met
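Rules 1 and 4 can be checked mechanically. The following is only a sketch under stated assumptions: the functions extract_specifiers, same_specifiers and same_edge_spaces are hypothetical helpers written for this page, not part of aMule or of any translation tool, and they only know the tweaks and type extensions described in this document. They also assume that the translation keeps the format specifiers in the same order as the original.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects, in order, every group that starts with '%'
// and ends at its type character, plus every "%%" pair.
std::vector<std::string> extract_specifiers(const std::string& text) {
    // Tweaks and type extensions that may sit between '%' and the type character.
    const std::string modifiers = "-+#.0123456789hlL";
    std::vector<std::string> found;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] != '%') continue;
        std::size_t j = i + 1;
        while (j < text.size() && modifiers.find(text[j]) != std::string::npos) ++j;
        if (j == text.size()) break;  // unfinished group at the very end
        found.push_back(text.substr(i, j - i + 1));
        i = j;  // resume scanning right after this group
    }
    return found;
}

// True when both messages contain the same groups in the same order.
bool same_specifiers(const std::string& msgid, const std::string& msgstr) {
    return extract_specifiers(msgid) == extract_specifiers(msgstr);
}

// True when starting and ending blank spaces were kept (rule 4).
bool same_edge_spaces(const std::string& msgid, const std::string& msgstr) {
    if (msgid.empty() || msgstr.empty()) return msgid.empty() == msgstr.empty();
    return (msgid.front() == ' ') == (msgstr.front() == ' ') &&
           (msgid.back() == ' ') == (msgstr.back() == ' ');
}
```

With these helpers, same_specifiers("I am %s, %d years old", "Soy %s y tengo %d") is true, while a translation that drops %d or the ending space of "Opening file " would be reported as different.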