Difference between revisions of "Format specifiers"

From AMule Project FAQ

Revision as of 08:57, 20 November 2004

Introduction

aMule is developed in the C++ programming language. This language makes it easy to handle strings, but sometimes those strings need a little tweaking.

When translating aMule, you might run into strange-looking things. Sometimes these will just be typos, but sometimes they are there on purpose.

This document is a must-read for anyone translating aMule or willing to do so. It describes all the groups of characters which should never be modified.

What follows is a description of each of those groups of characters and what they actually mean.

Escape codes

Non-representable ASCII codes

These are codes which represent characters of the ASCII codeset which can't be typed directly on the keyboard.

  • \a -> This will normally cause an audible alert (sometimes a visual one), like a beep
  • \b -> Goes back one character (backspace)
  • \f -> Form feed. On many systems this will clear the screen
  • \n -> Ends the current line and starts a new one, placing the cursor at the beginning
  • \r -> Goes to the beginning of the current line
  • \t -> Horizontal tabulation
  • \v -> Vertical tabulation
  • \<octal digits> -> Displays the character whose code is the given octal value
  • \x<hex digits> -> Displays the character whose code is the given hexadecimal value
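
As a minimal sketch of the list above (the function names are made up for this illustration, and an ASCII-based system is assumed), the following C++ code shows that each escape code is written with several characters in the source but is stored as a single character in the program:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a string only from octal and hexadecimal
// escape codes. Each escape code becomes exactly one character.
std::string escapedName() {
    // \141 = a, \115 = M, \165 = u (octal); \x6C = l, \x65 = e (hexadecimal)
    std::string name = "\141\115\165\x6C\x65";
    assert(name.size() == 5);  // five escape codes, five characters
    return name;
}

// The numeric codes behind some escape codes (values assume ASCII).
int newlineCode() { return '\n'; }  // 10 in ASCII
int tabCode()     { return '\t'; }  // 9 in ASCII
int alertCode()   { return '\a'; }  // 7 in ASCII
```

Here escapedName() returns the text aMule. A translator never has to work out these values; the point is only that the backslash and the characters after it belong together.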

Disambiguation escape codes

The following characters can be typed on the keyboard, but because they have a special meaning in the C++ programming language, they must be written this way:

  • \? -> Displays a question mark ( ? ). It is used to avoid trigraph translation (not all compilers translate trigraphs, so it's not always necessary)
  • \\ -> Displays a backslash ( \ )
  • \' -> Displays a single quote ( ' )
  • \" -> Displays a double quote ( " )
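
A small sketch of these four codes in C++ follows. The function name and the sample sentence are invented for this illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a string literal that needs all four
// disambiguation escape codes.
std::string quotedPath() {
    std::string text = "She said \"it\'s in C:\\aMule\", right\?";
    // The two-character code \\ is stored as one backslash,
    // so the first and the last backslash are the same character.
    assert(text.find('\\') == text.rfind('\\'));
    return text;
}
```

When displayed, the result reads: She said "it's in C:\aMule", right?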

Examples

The following are some examples for the above escape codes. They are listed as pairs of code line + output: the first line shows the text as it is written in the C++ code, and the second line (or group of lines, if it needs more than one) shows how that C++ code line is displayed on execution:

    • Code line: I am an angel\aOh, true, I am not.
    • Output: I am an angelOh, true, I am not. (A beep will be heard right after the word angel is displayed, and the next word (Oh) will not be displayed until the beep finishes.)
    • Code line: I have 6\b5 fingers in my right hand
    • Output: I have 5 fingers in my right hand
    • Code line: Where is the <RETURN> key???\nAh, here it is!
    • Output: Where is the <RETURN> key???
      Ah, here it is!
    • Code line: I am a BIG liar\rI'm married to Marilyn Monroe
    • Output: I'm married to Marilyn Monroe
    • Code line: \141\115\165\x6C\x65
    • Output: aMule (Notice that the octal value of a in the ASCII codeset is 141, the octal value of M is 115, the octal value of u is 165, the hexadecimal value of l is 6C and the hexadecimal value of e is 65)
    • Code line: Isn\'t it complicated to use the \" and \\ characters\?
    • Output: Isn't it complicated to use the " and \ characters?

Format specifiers

Basic format specifiers

Format specifiers are groups of characters which will be substituted with something else. The format specifier itself specifies which type of data it will be substituted with:

  • %d -> Decimal value (signed integer type). Equivalent to %i
  • %i -> Decimal value (signed integer type). Equivalent to %d
  • %u -> Natural number in decimal (unsigned integer type)
  • %x -> Hexadecimal value represented with lowercase characters (unsigned integer type)
  • %X -> Hexadecimal value represented with uppercase characters (unsigned integer type)
  • %o -> Octal value (unsigned integer type)
  • %f -> Floating-point number (a number with a decimal point) in normal notation, showing all digits (both float and double types)
  • %e -> Floating-point number in exponential notation using lowercase e (both float and double types)
  • %E -> Floating-point number in exponential notation using uppercase E (both float and double types)
  • %g -> Floating-point number in normal or exponential notation depending on the value. If exponential, a lowercase e will be used (both float and double types)
  • %G -> Floating-point number in normal or exponential notation depending on the value. If exponential, an uppercase E will be used (both float and double types)
  • %c -> A single character, displayed as text (passed as an integer type)
  • %s -> A string (a pointer to an array of characters)
  • %p -> Displays a memory address (pointer type)
  • %n -> Displays nothing. Instead, the variable assigned to this format specifier (a pointer to an integer) will be given the number of characters displayed up to now
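
To see the substitution at work, here is a minimal sketch using the standard std::snprintf function. The helper names are invented for this illustration, and the exponent in the %e and %E output is assumed to have two digits, as on most current systems:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: %s is replaced with a string, %d with a signed integer.
std::string describeUser(const char* name, int age) {
    char buffer[128];
    int written = std::snprintf(buffer, sizeof buffer,
                                "I am %s and I am %d years old.", name, age);
    // snprintf returns the length it needed; make sure it fitted.
    assert(written >= 0 && static_cast<std::size_t>(written) < sizeof buffer);
    return buffer;
}

// The same unsigned value shown in decimal, hexadecimal and octal.
std::string showNumber(unsigned int value) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, "%u = %x = %X = %o",
                  value, value, value, value);
    return buffer;
}

// The same floating-point value in normal, exponential and automatic notation.
std::string showReal(double value) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, "%f | %e | %E | %g",
                  value, value, value, value);
    return buffer;
}
```

For instance, showNumber(255) gives 255 = ff = FF = 377. If a translation dropped or altered one of the specifiers, the program would substitute the wrong data or none at all.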

Type extensions

Sometimes, some characters can be inserted between % and the character representing the type of data. These inserted characters extend the information about the type of data the format specifier is going to be substituted with:

  • h -> Will turn into short integer type. Valid for d, i, o, u, x, X and n.
  • l -> Will turn into long integer type. Valid for d, i, o, u, x, X and n.
  • L -> Will turn into long double type. Valid for e, E, f, F, g and G.
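
A short sketch of the three type extensions, again with std::snprintf (the function name is invented for this illustration, and a standard-conforming C library is assumed for %Lf):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: %hd expects a short, %ld a long, %Lf a long double.
std::string showSizes(short small, long big, long double precise) {
    char buffer[128];
    int written = std::snprintf(buffer, sizeof buffer, "%hd %ld %Lf",
                                small, big, precise);
    assert(written > 0);  // something was formatted
    return buffer;
}
```

Note that the extension letter is part of the format specifier: %ld must be copied as a whole, not shortened to %d.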

Output tweaks

Also, some of the format specifiers allow a bit of tweaking of how they are output. These tweaking codes must be inserted between % and the type character:

  • - -> Aligns to the left within the field width (the default is to align to the right)
  • + -> Prints plus ( + ) sign even when the number is positive. Valid for d, i, e, E, f, g and G.
  • 0 -> Fill the blank spaces with zeros ( 0 ) instead of spaces. Valid for d, i, u, x, X, o, e, E, f, g and G.
  • # -> It will act in different ways depending on the type of data:
    • o: A zero ( 0 ) will be prepended when the data is non-zero.
    • x and X: Prepends 0x or 0X, respectively, when the data is non-zero.
    • f, e, E, g and G: Displays the decimal point even when the data is an integer (no decimals).
    • g and G: The trailing zeros are not removed.
  • <non-zero decimal value> -> Specifies the minimum width the data must occupy (if the data is shorter, it will be padded). Can be used together with 0.
  • .<decimal value> -> It will act in different ways depending on the type of data:
    • f, e and E: Specifies the number of decimals to display (if .0, no decimals will be displayed).
    • s: Maximum number of characters to display (if .0, no characters will be displayed).
  • %% -> This is not really a format specifier; it is only meant to avoid ambiguity. It will display a single % character.
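
The tweaks above can be tried out with a sketch like the following. The function names are invented for this illustration, and the square brackets are only there to make the padding visible:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Width, left alignment, forced sign and zero padding on an integer.
std::string tweakInteger(int value) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, "[%5d] [%-5d] [%+d] [%05d]",
                  value, value, value, value);
    return buffer;
}

// The # flag on octal and hexadecimal output, plus zero padding to width 10.
std::string tweakHex(unsigned int value) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, "%#o %#x %#X %010x",
                  value, value, value, value);
    return buffer;
}

// Precision on a floating-point number and on a string, then a literal %.
std::string tweakPrecision(double number, const char* text) {
    char buffer[128];
    int written = std::snprintf(buffer, sizeof buffer, "%.2f %.0f %.4s %%",
                                number, number, text);
    assert(written > 0);  // something was formatted
    return buffer;
}
```

For example, tweakInteger(42) gives [   42] [42   ] [+42] [00042].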

Examples

The following are some examples for the above format specifiers. They are listed as pairs of code line + output: the first line shows the text as it is written in the C++ code, and the second line (or group of lines, if it needs more than one) shows how it is displayed on execution. The data the format specifiers are substituted with is random (well, not really random, I've just picked something meaningful for each example):

    • Code line: I am %s and I am %d years old.
    • Output: I am Jacobo221 and I am 19 years old.
    • Code line: The first letter in the English alphabet is %c
    • Output: The first letter in the English alphabet is A
    • Code line: There exists a format specifier which is %%%c
    • Output: There exists a format specifier which is %E
    • Code line: There exists a format specifier which is %%c
    • Output: There exists a format specifier which is %c
    • Code line: %E and %e are the same number
    • Output: 9.186329E+00 and 9.186329e+00 are the same number
    • Code line: %g is in normal notation while %g is in exponential notation
    • Output: 0.25 is in normal notation while 3.23423e+34 is in exponential notation
    • Code line: %+d is a positive number
    • Output: +5 is a positive number
    • Code line: %010f says: Am I not 0-plenty?
    • Output: 000.250000 says: Am I not 0-plenty?
    • Code line: Both %#o and %#X start with a zero
    • Output: Both 0345 and 0X65FC start with a zero
    • Code line: %010x must be plenty of zeros
    • Output: 00000065fc must be plenty of zeros
    • Code line: Pi number is %.2f
    • Output: Pi number is 3.14
    • Code line: Look what happens when you read the first four letters in Firefox: %.4s
    • Output: Look what happens when you read the first four letters in Firefox: Fire

Other stuff

Character cases

Leading and ending blank spaces

Examples

Overall example

Example:

Untranslated:
msgid="I am %s, %d years old\nand &I\'m a happy \"rabitty-aMule\" user "
msgstr=""
Would become (translation to Spanish):
msgid="I am %s, %d years old\nand &I\'m a happy \"rabitty-aMule\" user "
msgstr="Soy %s y tengo %d años\ny soy un fel&iz usuario del \"conejillo-aMule\" "
Explanation:
  1. %s and %d must be copied literally since they will be substituted in the program with some string or number. General rule: everything from a % character up to and including the letter that ends the format specifier (d, s, f, etc., possibly preceded by h, l or L) or the next percent character (%) must be copied literally.
  2. \n must be copied literally too since it breaks the line into a new line. The general rule is: \ and its very next character (which can be \, a, n, t, f, ", ', etc.) must be copied literally, together with the digits that follow in the case of octal and hexadecimal codes.
  3. & must be placed before the very same letter in the translation since it indicates the combination ALT+letter that will select that option.
  4. The ending space is left since it was there in the original message. Never remove starting or ending spaces, even if they look ugly. They are there for some reason. Normally this will be because another string comes either before or after that string. For example: "Opening file " has a blank space at the end, so you can expect that the name of a file will be displayed right after it. Something like Opening file server.met
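
Rule 1 can be checked mechanically. The following is only a simplified sketch, not a tool that aMule ships: the function names are invented for this illustration, and the scanner assumes that every % in the text starts a format specifier. It collects the format specifiers of the original and of the translation and compares them:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: collects every format specifier in `text`, from a '%'
// up to and including the character that ends it (a type character, or a
// second '%' for the "%%" case).
std::vector<std::string> formatSpecifiers(const std::string& text) {
    const std::string endings = "diuxXofFeEgGcspn%";
    std::vector<std::string> found;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] != '%') continue;
        std::size_t end = text.find_first_of(endings, i + 1);
        if (end == std::string::npos) break;  // stray '%' with nothing after it
        assert(end > i);
        found.push_back(text.substr(i, end - i + 1));
        i = end;  // continue scanning after this specifier
    }
    return found;
}

// True when both texts contain the same specifiers in the same order.
bool sameSpecifiers(const std::string& msgid, const std::string& msgstr) {
    return formatSpecifiers(msgid) == formatSpecifiers(msgstr);
}
```

With this sketch, the pair "I am %s, %d years old" and "Soy %s y tengo %d" passes, while a translation that swaps the order of %s and %d is reported as different.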