structure
> loop ?skip-first-bytes="<integer>"
[since version 1.2]
optional integer specifying the number of bytes at the beginning of the loop
structure that must be ignored, because they are not dumped in the hi file.
it allows a loop of X entries to be defined when the first entry is only
partially dumped, instead of defining a loop of 1 entry containing only part of
the loop columns, followed by an additional loop of X-1 entries.
ex: ket.xml
<loop count="10" skip-first-bytes="2">
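The effect of 'skip-first-bytes' can be pictured with a short Python sketch. It is only an illustration under assumptions, not the hi2txt implementation: the helper name `read_loop`, the fixed `entry_size` and the use of `None` for bytes absent from the file are all hypothetical.

```python
def read_loop(data, count, entry_size, skip_first_bytes=0):
    """Split `data` into `count` entries of `entry_size` bytes.

    The first `skip_first_bytes` bytes of the first entry are assumed to be
    missing from the dump, so they are reported as None (hypothetical model).
    """
    entries = []
    pos = 0
    for i in range(count):
        missing = skip_first_bytes if i == 0 else 0
        present = entry_size - missing
        chunk = list(data[pos:pos + present])
        entries.append([None] * missing + chunk)
        pos += present
    return entries


# 3 entries of 4 bytes, but the file lacks the first 2 bytes of entry 0
dump = bytes(range(10))
entries = read_loop(dump, count=3, entry_size=4, skip_first_bytes=2)
```

With a single loop declaration, the first entry is only partially filled and the following entries stay aligned on their own boundaries.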
output
output fields are used to display an element as a single key/value pair.
output table and embedded columns are used to display all occurrences/indexes of
a set of elements.
[since version 1.4]
output identifier allowing it to be linked with a structure. see the
'structure' object for more information.
ex: fixeight.xml
<structure output="broken_topscore">...</structure>
<output id="broken_topscore">...</output>
field
output > field
?id="<field_identifier>"
optional identifier allowing to associate an output field to a specific input
element sharing the same id, giving the value to be displayed
if no identifier is provided, the attribute 'src' or a non-empty content is
expected to define the value to be displayed
ex: ddonpach.xml
<field id="TOP SCORE" format="*10" display="extra"/>
output > field
?src="<elt_id>|index"
optional identifier specifying which element must be used as input to the
field
if 'src' is specified, the input value of the field is taken from the related
element, whether or not a field id is set, even when that id names another
element.
'src' value can also be the keyword 'index', meaning that the current output
table index is used as input to this field.
'src' value can also be the keyword 'unsorted_index', meaning that the current
output table index, before any sorting, is used as input to this field.
ex: -
output > field
?format="<format_identifier>|<direct_implicit_format>"
output > field ?format="<identifier1>;<identifier2>;..."
optional reference to a format identifier describing how to format the field to
display it.
if the format identifier is simple enough, this format definition can be
skipped as the program will automatically create it, computing its content from
the identifier itself.
(note: the special implicit format identifier "0x" means that "0x" will be
added at the beginning of the value itself, to emphasize the hexadecimal
representation of the output if needed)
see column format for more information
more than one format identifier can be specified, separated by ';'
ex: ddonpach.xml
<field id="TOP SCORE" format="*10" display="extra"/>
output > field
?display="always|extra|debug"
optional keyword specifying in which context this field must be displayed
(default is always).
always: the field is always displayed
extra: the field is displayed only if extra information is requested, using the
-ra command-line argument.
debug: the field is displayed only if debug information is requested, using the
-rd command-line argument.
ex: ddonpach.xml
<field id="TOP SCORE" format="*10" display="extra"/>
the field content can be non-empty, to specify a hard-coded string to be
displayed.
see column content for more information.
ex: -
table
output > table
@?id="<table_id>"
[since version 1.6]
optional attribute specifying the table identifier
it is used only when extracting data in xml
ex: sonicwi3.xml
<table id="2 PLAYERS" line-ignore="2P SCORE:0">
<column id="RANK" src="index" format="+1"/>
<column id="2P SCORE"/>
...
output > table
@?line-ignore="<column_id>:<value>"
optional attribute specifying which lines must not be displayed
for example, line-ignore="SCORE:0" means that all lines of the output table
whose SCORE value equals 0 must be skipped
ex: tempest.xml
<table line-ignore="SCORE:0">
<column id="RANK" format="+1" src="index"/>
<column id="NAME"/>
<column id="SCORE"/>
</table>
output > table
?line-ignore-operator="<operator>"
optional operator applied to decide whether a specific table line must be
skipped, according to the line-ignore attribute (default is '==')
supported operators: <, ==, >, >=, <=, !=
ex: dino.xml
<table line-ignore-operator=">" line-ignore="SCORE:99000000">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE"/>
<column id="NAME"/>
<column id="CHARACTER" format="character"/>
<column id="STAGE"/>
<column id="SPACE" display="debug"/>
<column id="UNKNOWN" format="0x" display="debug"/>
</table>
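The way 'line-ignore' and 'line-ignore-operator' combine can be sketched in Python. This is a minimal illustration, not the hi2txt code: the function name `keep_lines`, the dictionary-based table lines and the integer comparison are assumptions made for the example.

```python
import operator

# operators listed as supported by 'line-ignore-operator'
OPERATORS = {
    "<": operator.lt, "==": operator.eq, ">": operator.gt,
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
}


def keep_lines(lines, line_ignore, line_ignore_operator="=="):
    """Return the lines that are NOT matched by the line-ignore rule.

    `line_ignore` has the form "<column_id>:<value>". Values are compared as
    integers here, which is an assumption made for this sketch.
    """
    column_id, _, ref = line_ignore.partition(":")
    matches = OPERATORS[line_ignore_operator]
    return [ln for ln in lines if not matches(int(ln[column_id]), int(ref))]


table = [
    {"SCORE": 120000, "NAME": "AAA"},
    {"SCORE": 99999999, "NAME": "???"},  # out-of-range entry
    {"SCORE": 50000, "NAME": "BBB"},
]
visible = keep_lines(table, "SCORE:99000000", ">")
```

Here the line whose SCORE is greater than 99000000 is skipped and the two other lines are kept.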
output > table
?sort="<column_id>"
optional identifier specifying the column used to sort the output table.
by default, the column values are converted to decimal or, failing that, to
string before sorting (in this priority order). this attribute is rarely needed:
in almost all cases, sorting can be done while decoding the input elements,
using the 'table-index' attribute, which is the recommended way to sort elements
as it best fits the original display algorithm
ex: srumbler.xml
<table sort-order="asc" sort="SCORE RANK">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE"/>
<column id="SCORE RANK" display="debug"/>
<column id="NAME"/>
<column id="NAME RANK" display="debug"/>
</table>
output > table
@?sort-order="asc|desc"
optional keyword specifying the sort order to be used, if the 'sort'
attribute has been defined (default is 'asc')
ex: srumbler.xml
<table sort-order="asc" sort="SCORE RANK">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE"/>
<column id="SCORE RANK" display="debug"/>
<column id="NAME"/>
<column id="NAME RANK" display="debug"/>
</table>
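The default sorting rule (decimal first, then string) together with 'sort-order' can be sketched as follows. The names `sort_key` and `sort_table` are hypothetical, and placing numeric values before non-numeric ones in a mixed column is an assumption of this sketch, not documented behavior.

```python
def sort_key(value):
    """Compare as a decimal number when possible, else as a string."""
    try:
        return (0, float(value), "")
    except ValueError:
        return (1, 0.0, str(value))


def sort_table(lines, sort, sort_order="asc"):
    """Sort table lines on the column named by `sort`."""
    return sorted(lines, key=lambda ln: sort_key(ln[sort]),
                  reverse=(sort_order == "desc"))


rows = [{"SCORE RANK": "10"}, {"SCORE RANK": "2"}, {"SCORE RANK": "1"}]
ascending = [r["SCORE RANK"] for r in sort_table(rows, "SCORE RANK")]
descending = [r["SCORE RANK"] for r in sort_table(rows, "SCORE RANK", "desc")]
```

Converting to a number first is what puts "10" after "2", where a plain string comparison would put it before.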
output > table
@?sort-format="<format_identifier>|<direct_implicit_format>"
output > table
@?sort-format="<identifier1>;<identifier2>;..."
[since version 1.7]
optional identifier specifying how to format the sort column values.
if the format identifier is simple enough, this format definition can be
skipped as the program will automatically create it, computing its content from
the identifier itself.
more than one format identifier can be specified, separated by ';'
ex: kof2001.xml
<table id="WINS" sort="WIN %" sort-format="TrimR%" sort-order="desc">
<column id="CHARACTER" src="unsorted_index" format="character"/>
<column id="WIN %" src="WIN CHARACTER" format="win;Suffix%"/>
<column id="TOTAL" display="debug"/>
</table>
output > table
?lines-max="<integer>"
optional integer specifying the maximum number of table lines to be displayed,
for the very rare cases where the 'line-ignore' attribute is not suitable
ex: carnival.xml
<table line-ignore="SCORE:0" lines-max="3">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE" format="*10"/>
<column id="NAME"/>
</table>
output > table
?display="always|extra|debug"
[since version 1.6]
optional keyword specifying in which context this table must be displayed
(default is always).
always: the table is always displayed
extra: the table is displayed only if extra information is requested, using the
-ra command-line argument.
debug: the table is displayed only if debug information is requested, using the
-rd command-line argument
ex: srumbler.xml
<table sort-order="asc" sort="SCORE RANK">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE"/>
<column id="SCORE RANK" display="debug"/>
<column id="NAME"/>
<column id="NAME RANK" display="debug"/>
</table>
column
output > table > column
?id="<column_identifier>"
optional identifier allowing to associate an output column to a specific input
element sharing the same id, giving the value to be displayed
if no identifier is provided, the attribute 'src' or a non-empty content is
expected to define the value to be displayed
ex: ddonpach.xml
<column id="SCORE" format="score"/>
output > table > column
?src="<elt_id>|index"
optional identifier specifying which element must be used as input to the
column
if 'src' is specified, the input value of the column is taken from the related
element, whether or not a column id is set, even when that id names another
element.
'src' value can also be the keyword 'index', meaning that the current output
table index is used as input to this column.
'src' value can also be the keyword 'unsorted_index', meaning that the current
output table index, before any sorting, is used as input to this column.
ex: ddonpach.xml
<column id="RANK" src="index" format="+1"/>
output > table > column
?format="<format_identifier>|<direct_implicit_format>"
output > table > column
?format="<identifier1>;<identifier2>;..."
optional reference to a format identifier describing how to format the column
to display it.
see format object definition below to understand the formatting
possibilities.
if the format identifier is simple enough, this format definition can be
skipped as the program will automatically create it, computing its content from
the identifier itself.
operator | definition | example | introduction
*<number> | multiply | |
/<number> | divide | 5800 / 60 => 96.66666 |
d<number> | divide and trunc as integer | 5800 d 60 => 96 |
D<number> | divide and round to the nearest integer | 5800 D 60 => 97 | [since version 1.3]
-<number> | substract | |
+<number> | add | |
%<number> | remainder | 5800 % 60 => 66666 |
><integer> | shift | | [since version 1.2]
LC or Lowercase | lowercase | | [since version 1.3]
UC or Uppercase | uppercase | | [since version 1.2]
Capitalize | capitalize | | [since version 1.3]
Round or R | round | | [since version 1.3]
Trunc or T | trunc | | [since version 1.3]
TrimL<character> | trim <character> from left side | TrimL0 | [since version 1.3]
TrimR<character> | trim <character> from right side | TrimR0 | [since version 1.3]
Trim<character> | trim <character> from both sides | Trim0 | [since version 1.3]
PadL<integer><character> | pad <character> from left side, until a maximum of <integer> characters is reached | PadL60 | [since version 1.3]
PadR<integer><character> | pad <character> from right side, until a maximum of <integer> characters is reached | PadR80 | [since version 1.3]
Suffix<string> | concatenate <string> at the end, if not empty | Suffix% | [since version 1.3]
Prefix<string> | concatenate <string> at the beginning, if not empty | PrefixSTG | [since version 1.3]
LoopIndex | assign loop index to the current element | | [since version 1.3]
more than one format identifier can be specified, separated by ';'.
(note: the special implicit format identifier "0x" means that "0x" will be
added at the beginning of the value itself, to emphasize the hexadecimal
representation of the output if needed).
see related format operations definition for more information.
ex: ddonpach.xml
<table>
<column id="RANK" src="index" format="+1"/>
...
</table>
<format id="+1"><add>1</add></format> <!-- not strictly necessary //-->
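How a chain of implicit format identifiers such as "d60;%60;PadL20" could be evaluated is sketched below in Python. This is not the hi2txt parser: `apply_implicit` and `apply_chain` are hypothetical names, only a subset of the operators is handled, '%' is assumed to be an ordinary modulo, 'D' is assumed to round halves up, and in 'PadL'/'PadR' the last character is assumed to be the padding character.

```python
import math
import re


def _tidy(x):
    """Return an int when the number has no fractional part."""
    return int(x) if float(x).is_integer() else x


def apply_implicit(value, identifier):
    """Apply one implicit format identifier to `value` (subset only)."""
    m = re.fullmatch(r"([*/dD+%-])(\d+(?:\.\d+)?)", identifier)
    if m:
        op, n, x = m.group(1), float(m.group(2)), float(value)
        result = {
            "*": lambda: x * n,
            "/": lambda: x / n,
            "d": lambda: math.trunc(x / n),
            "D": lambda: math.floor(x / n + 0.5),  # round half up (assumed)
            "-": lambda: x - n,
            "+": lambda: x + n,
            "%": lambda: x % n,                    # plain modulo (assumed)
        }[op]()
        return _tidy(result)
    text = str(value)
    m = re.fullmatch(r"Pad([LR])(\d+)(.)", identifier)
    if m:
        width, char = int(m.group(2)), m.group(3)
        if m.group(1) == "L":
            return text.rjust(width, char)
        return text.ljust(width, char)
    m = re.fullmatch(r"Trim([LR]?)(.)", identifier)
    if m:
        strip = {"L": text.lstrip, "R": text.rstrip, "": text.strip}
        return strip[m.group(1)](m.group(2))
    m = re.fullmatch(r"(Prefix|Suffix)(.+)", identifier)
    if m and text:  # nothing is added to an empty value
        if m.group(1) == "Prefix":
            return m.group(2) + text
        return text + m.group(2)
    if identifier in ("UC", "Uppercase"):
        return text.upper()
    if identifier in ("LC", "Lowercase"):
        return text.lower()
    return value  # anything else would need an explicit <format> definition


def apply_chain(value, fmt):
    """Apply identifiers separated by ';' from left to right."""
    for identifier in fmt.split(";"):
        value = apply_implicit(value, identifier)
    return value


rank = apply_chain(0, "+1")
seconds = apply_chain(5800, "d60;%60;PadL20")
```

Each identifier receives the output of the previous one, which is why the padding is applied to the already reduced value.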
output > table > column
?display="always|extra|debug"
optional keyword specifying in which context this column must be displayed
(default is always).
always: the column is always displayed
extra: the column is displayed only if extra information is requested, using the
-ra command-line argument.
debug: the column is displayed only if debug information is requested, using the
-rd command-line argument
ex: srumbler.xml
<table sort-order="asc" sort="SCORE RANK">
<column id="RANK" format="+1" src="index"/>
<column id="SCORE"/>
<column id="SCORE RANK" display="debug"/>
<column id="NAME"/>
<column id="NAME RANK" display="debug"/>
</table>
output > table > column
?content
the column content can be non-empty, to specify a hard-coded string to be
displayed.
in this case, the column id is not necessary.
ex: spnchoutj.xml
<format id="percentage" input-as-subcolumns-input="true">
<concat>
<column format="integer"/>
<column>.</column>
<column format="decimal"/>
<column>%</column>
</concat>
</format>
<hi2txt>
<structure>...</structure>
<bitmask>...</bitmask>
<output>...</output>
<format>...</format>
<charset>...</charset>
</hi2txt>
format
a format transforms a value or a set of values into another value, by embedding
a set of operations for this purpose.
format
@id="<identifier>"
identifier of the format, allowing it to be referenced (called) by a field or
column.
ex: spnchoutj.xml
<format id="percentage"
input-as-subcolumns-input="true">
...
</format>
format
?formatter="<regex>"
format string applied to the output string, after all embedded operations.
the formatter string follows the Java format string syntax described in the
Javadoc of the class 'Formatter'.
ex: trackfld.xml
<format id="time" formatter="%.2fsec">
<divide>100</divide>
</format>
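The 'time' format above can be mimicked in Python, whose '%' string formatting accepts the same printf-style specifier for this simple case. The raw value 1234 is an arbitrary illustration, and Python only stands in here for the Java 'Formatter' that the attribute actually relies on.

```python
raw_value = 1234               # e.g. a time stored in hundredths of a second
seconds = raw_value / 100      # <divide>100</divide>
text = "%.2fsec" % seconds     # formatter="%.2fsec"
```

The division is applied first and the formatter last, so the formatter sees the float produced by the embedded operation.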
format
?apply-to="char|value"
keyword specifying whether the format applies to the input value as a whole or
to each 'character' of the input value (default is value).
it targets 'text' input elements.
supported keywords:
char => format is applied on each character of the input value
value => format is applied one time for the whole value
ex: klax.xml, where the format appends '.' after each character of the input
value.
<format id="+dot" apply-to="char">
<suffix>.</suffix>
</format>
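The difference between the two 'apply-to' keywords can be shown with a few lines of Python. `apply_format` and `add_dot` are hypothetical helpers written for this illustration only.

```python
def add_dot(text):
    """Equivalent of <suffix>.</suffix> on one input."""
    return text + "."


def apply_format(value, operation, apply_to="value"):
    """Run `operation` once on the whole value, or once per character."""
    if apply_to == "char":
        return "".join(operation(ch) for ch in value)
    return operation(value)


per_char = apply_format("ABC", add_dot, apply_to="char")
whole = apply_format("ABC", add_dot)
```

With apply-to="char" every character gets its own suffix, while the default adds a single suffix at the end of the value.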
format
?input-as-subcolumns-input="yes|no"
boolean specifying that the input value of the format must not be used by the
embedded operations directly, but as input value of the columns/fields
contained inside these operations (default is no).
in this case, sub-columns/fields can skip the 'id' or 'src' attribute, as the
input value for them is implicitly the value of the input calling the
embedding format.
it allows the same input to be re-used multiple times in different parts of a
format, without having to define intermediate columns.
note that sub-columns/fields embedded in an operation of this format that
define a non-empty content are not impacted, as this content itself hard-codes
the column input value.
note that operations of this format that don't use any sub-column/field also
ignore this attribute, allowing the results of operations to be chained inside a
format, even if the format uses 'input-as-subcolumns-input' (see tgm2.xml
format@id="medal"/trim). [since version 1.2]
ex: spnchoutj.xml
<format id="percentage" input-as-subcolumns-input="true">
<concat>
<column format="integer"/>
<column>.</column>
<column format="decimal"/>
<column>%</column>
</concat>
</format>
supported format operations
format > ?add <integer>
integer added to the value of the column/field calling this format
ex: klax.xml
<format id="+1">
<add>1</add>
</format>
format > ?prefix
<string> @empty <string> @consume <yes|no>
string to concatenate in front of the value of the column/field calling this
format.
if the input value is empty, nothing is appended.
'empty' attribute allows to define what is considered as an empty input value
(default is no value at all, but it can be set to "0" for example).
[since version 1.2]
'consume' attribute indicates that an empty input value must be consumed by
this operation, which will not return any output value (only useful when
'empty' attribute is set to something different than no value).
[since version 1.2]
this operation can be implicitly defined when calling the format, using
"Prefix<string>" (see column/field format attribute).
[since version 1.3]
ex: turfmast.xml
<format id="+"><prefix>+</prefix></format>
ex: tgm2.xml => for medal value 0, nothing is appended and this '0' value is
not returned/displayed
[since version 1.2]
<format id="medal_ac">
<prefix empty="0" consume="yes"> AC</prefix>
</format>
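One possible reading of the 'empty' and 'consume' attributes is sketched below. The function `prefix` is hypothetical and the behavior when the value is empty but not consumed (returned unchanged) is an interpretation of the description above, not confirmed hi2txt behavior.

```python
def prefix(value, string, empty="", consume=False):
    """Put `string` in front of `value`, unless `value` counts as empty."""
    if value == empty:
        # nothing is added; the empty value is dropped only if consumed
        return "" if consume else value
    return string + value


plain = prefix("3", "+")                                 # <format id="+">
no_medal = prefix("0", " AC", empty="0", consume=True)   # tgm2 'medal_ac'
medal = prefix("1", " AC", empty="0", consume=True)
```

For the tgm2.xml example, a medal value of "0" produces no output at all, while any other value is returned with " AC" in front of it.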
format > ?suffix
<string> @empty <string> @consume <yes|no>
string to concatenate at the end of the value of the column/field calling this
format.
if the input value is empty, nothing is appended.
'empty' attribute allows to define what is considered as an empty input value
(default is no value at all, but it can be set to "0" for example).
[since version 1.2]
'consume' attribute indicates that an empty input value must be consumed by
this operation, which will not return any output value (only useful when
'empty' attribute is set to something different than no value).
[since version 1.2]
this operation can be implicitly defined when calling the format, using
"Suffix<string>" (see column/field format attribute).
[since version 1.3]
ex: klax.xml
<format id="+dot" apply-to="char">
<suffix>.</suffix>
</format>
format > ?multiply <integer>
integer by which the value of the column/field calling this format is
multiplied.
this operation can be implicitly defined when calling the format, using
"*<number>" (see column/field format attribute).
ex: turfmast.xml
<format id="-">
<multiply>-1</multiply>
<add>256</add>
<prefix>-</prefix>
</format>
format > ?divide <integer>
integer dividing the value of the column/field calling this format: the result
can be a float.
if the result must keep only the decimal part, see the 'remainder' operation.
if the result must keep only the integer part, see the 'divide_trunc' operation.
if the result must be rounded to the nearest integer, see the 'divide_round'
operation.
this operation can be implicitly defined when calling the format, using
"/<number>" (see column/field format attribute).
ex: trackfld.xml
<format id="time" formatter="%.2fsec">
<divide>100</divide>
</format>
format > ?sum
(<field>|<column>)+
list of columns/fields to sum all together.
these columns/fields must be numeric.
this operation can be implicitly defined when calling the format, using
"+<number>" (see column/field format attribute).
ex: ddonpach.xml
<format id="score">
<sum>
<column id="SCORE1" format="*10"/>
<column id="SCORE2"/>
</sum>
</format>
format > ?concat
(<field>|<column>|<txt>)+
list of columns/fields to be concatenated together.
note: a specific 'txt' element can also be used to store hard-coded text with
improved readability [since version 1.1.20140809].
ex: ddonpach.xml
<format id="area">
<concat>
<column id="LOOP" format="default_loop;-"/>
<column id="STAGE" format="default_stage"/>
</concat>
</format>
ex: kindmgp.xml
<format id="time" input-as-subcolumns-input="yes">
<concat>
<column format="MN"/>
<txt>'</txt>
<column format="SEC"/>
<txt>''</txt>
<column format="CS"/>
</concat>
</format>
format > ?min
(<field>|<column>)+
list of columns/fields from where the minimum value will be selected
ex: -
<format id="score">
<min>
<field id="VALUE_1"/>
<field id="VALUE_2"/>
<field id="VALUE_3"/>
</min>
</format>
format > ?max
(<field>|<column>)+
list of columns/fields from where the maximum value will be selected
ex: phoenix.xml
<format id="score">
<max>
<field id="TOP SCORE ALT"/>
<field id="SCORE 1 ALT"/>
<field id="SCORE 2 ALT"/>
</max>
</format>
format > ?pad
<string> @direction <left|right> @max <integer>
string to add before or after (attribute 'direction') the value of the
column/field calling this format, so that it reaches the maximum number of
characters requested ('max' attribute), but no more.
if the length of the initial value is already >= the maximum specified, nothing
is added.
this operation can be implicitly defined when calling the format, using
"PadL<integer_for_max><character>" or "PadR<integer_for_max><character>" (see
column/field format attribute).
[since version 1.3]
ex: spnchoutj.xml
<format id="ms">
<remainder>100</remainder>
<pad direction="left" max="2">0</pad>
</format>
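The 'ms' format above can be followed step by step with a small Python sketch. `pad` and `ms` are hypothetical helpers; the remainder is taken as an ordinary modulo and the filler is cut so that the result never exceeds 'max', both of which are assumptions of this illustration.

```python
def pad(value, string, direction, max_len):
    """Add `string` on one side until `max_len` characters are reached."""
    text = str(value)
    missing = max_len - len(text)
    if missing <= 0:
        return text                          # already long enough
    filler = (string * missing)[:missing]    # never exceed max_len
    return filler + text if direction == "left" else text + filler


def ms(value):
    """<remainder>100</remainder> then <pad direction="left" max="2">0</pad>."""
    return pad(value % 100, "0", "left", 2)


short = ms(1205)
full = ms(1234)
```

A value that already has the requested length goes through unchanged, so only single-digit remainders receive a leading zero.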
format > ?trim
<string> @direction <left|right|both>
all consecutive occurrences of the specified string are removed, starting
from the start (direction="left"), the end (direction="right"), or both
(direction="both" [since version 1.3]) of the value of the column/field calling
this format.
this operation can be implicitly defined when calling the format, using
"TrimL<character>", "TrimR<character>" or "Trim<character>" (see column/field
format attribute).
ex: twincobr.xml
<format id="trim">
<trim direction="left">0</trim>
</format>
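Removing consecutive occurrences of a string, as opposed to a set of characters, can be sketched as follows. The function `trim` is a hypothetical illustration and not the hi2txt implementation.

```python
def trim(value, string, direction):
    """Remove consecutive occurrences of `string` from one or both sides."""
    if not string:
        return value
    if direction in ("left", "both"):
        while value.startswith(string):
            value = value[len(string):]
    if direction in ("right", "both"):
        while value.endswith(string):
            value = value[:-len(string)]
    return value


score = trim("000120", "0", "left")
```

Only leading zeros are removed here: the trailing zero of "120" is untouched because the direction is "left".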
format > ?substract <integer>
integer subtracted from the value of the column/field calling this format.
this operation can be implicitly defined when calling the format, using
"-<number>" (see column/field format attribute).
ex: gng.xml
<format id="rp1">
<substract>44</substract>
</format>
format > ?remainder <integer>
integer dividing the value of the column/field calling this format: only the
remainder part is kept.
if the result must keep integer and decimal parts, see the 'divide' operation.
if the result must keep only the integer part, see the 'divide_trunc' operation.
if the result must be rounded to the nearest integer, see the 'divide_round'
operation.
this operation can be implicitly defined when calling the format, using
"%<number>" (see column/field format attribute).
ex: spnchoutj.xml
<format id="ms">
<remainder>100</remainder>
<pad direction="left" max="2">0</pad>
</format>
format > ?trunc
convert a number into an integer, by dropping the decimal part.
this operation can be implicitly defined when calling the format, using "Trunc"
(see column/field format attribute).
[since version 1.3]
ex: -
<format id="integer">
<trunc/>
</format>
format > ?round
convert a number into the nearest integer.
this operation can be implicitly defined when calling the format, using "Round"
(see column/field format attribute).
[since version 1.3]
ex: -
<format id="rounded">
<round/>
</format>
format > ?divide_trunc (<value>|field|column)*>
integer dividing the value of the column/field calling this format: only the
integer part is kept.
if the result must keep integer and decimal parts, see the 'divide' operation.
if the result must keep only the decimal part, see the 'remainder' operation.
if the result must be rounded to the nearest integer, see the 'divide_round'
operation.
this operation can be implicitly defined when calling the format, using
"d<number>" (see column/field format attribute).
ex: spnchoutj.xml
<format id="sec">
<remainder>10000</remainder>
<divide_trunc>100</divide_trunc>
<pad direction="left" max="2">0</pad>
</format>
format > ?divide_round (<value>|field|column)*>
[since version 1.3]
integer dividing the value of the column/field calling this format: the result
is rounded to the nearest integer.
if the result must keep integer and decimal parts, see the 'divide' operation.
if the result must keep only the decimal part, see the 'remainder' operation.
if the result must keep only the integer part, see the 'divide_trunc' operation.
this operation can be implicitly defined when calling the format, using
"D<number>" (see column/field format attribute).
ex: mushitam.xml
<format id="time" input-as-subcolumns-input="yes">
<concat>
<column format="d60;d60"/>
<txt>:</txt>
<column format="d60;%60;PadL20"/>
<txt>:</txt>
<column format="%60;*100;D60;PadL20"/>
</concat>
</format>
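The three sub-columns of the 'time' format above can be reproduced with direct arithmetic in Python. This sketch assumes that the input counts 1/60ths of a second, that '%' is an ordinary modulo and that 'D' rounds halves up; `frames_to_time` is a hypothetical name.

```python
import math


def frames_to_time(frames):
    """Mimic the three sub-columns of the 'time' format."""
    minutes = frames // 60 // 60                              # "d60;d60"
    seconds = (frames // 60) % 60                             # "d60;%60;PadL20"
    hundredths = math.floor((frames % 60) * 100 / 60 + 0.5)   # "%60;*100;D60;PadL20"
    return f"{minutes}:{seconds:02d}:{hundredths:02d}"


display = frames_to_time(3661)
```

The same input feeds the three columns, which is what 'input-as-subcolumns-input' provides; each column then applies its own chain.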
format > ?replace
@src <string> @dst <string>
all occurrences of the 'src' string are replaced by the 'dst' string
ex: suprmrio.xml
<format id="world">
<replace src="0" dst="W-"/>
</format>
format > ?shift <integer>
[since version 1.2]
the input number is shifted [7] by the specified value.
if the bit sequence 0001 0111 (decimal 23) were subjected to a logical shift of
one bit position to the left, it would yield 0010 1110 (decimal 46).
this operation can be implicitly defined when calling the format, using
"><number>" (see column/field format attribute).
ex:
<format id="medal">
<shift>2</shift>
</format>
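The worked example of a logical shift can be checked with Python's shift operators. This only illustrates the arithmetic: which direction hi2txt applies for a given value is not stated in this section, so both are shown.

```python
value = 0b00010111           # decimal 23
left = value << 1            # 0b00101110
right = value >> 1           # 0b00001011
```

Shifting left by one position doubles the value, while shifting right by one position halves it and drops the lowest bit.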
format > ?lowercase
[since version 1.3]
all characters of the input data are put in lower case.
this operation can be implicitly defined when calling the format, using
"Lowercase" or "LC" (see column/field format attribute).
ex:
<format id="trial">
<lowercase/>
</format>
format > ?uppercase
[since version 1.2, behavior modified in 1.3]
all characters of the input data are put in upper case.
this operation can be implicitly defined when calling the format, using
"Uppercase" or "UC" (see column/field format attribute).
ex: guwange.xml
<column id="NAME H" format="transliteration;UC;parenthesis"/>
ex:
<format id="trial">
<uppercase/>
</format>
format > ?capitalize
[since version 1.3]
the first character of the input data is put in upper case.
this operation can be implicitly defined when calling the format, using
"Capitalize" (see column/field format attribute).
ex:
<format id="trial">
<capitalize/>
</format>
[since version 1.3]
the formatted element takes the value of the loop index, if it is inside a
loop.
this operation is done after the "table-index" attribute is handled, which
allows the real data to be used as the index inside the table
(@table-index="itself") and then replaced by the loop index.
the main usage is to handle a list of pointers to the real data, allowing them
to be sorted.
this operation can be implicitly defined when calling the format, using
"LoopIndex" (see column/field format attribute).
ex: inthehunt.xml
<structure file=".hi">
...
<loop count="12">
<elt size="2" type="int" id="POINTER" endianness="little_endian"
table-index="itself" table-index-format="-384;/16" format="LoopIndex"/>
</loop>
</structure>
<output>
<table line-ignore="NAME:" sort="POINTER">
...
</table>
</output>
operation converting an input value into a specific value.
multiple cases can be defined.
inside the same format, multiple groups of 'case' can be defined, separated by
other operations, allowing to chain multiple operations, including 'case'.
[since version 1.2]
format > ?case
@src="?" @dst="?" @default="yes|no"
'src' attribute specifies which input value matches the case (the src value can
be a value in base 10 (nn) or in base 16 (0xnn)).
'dst' attribute specifies the string into which the input is converted, if the
'src' matches the input.
'default' attribute specifies which 'case' among the list of cases is used
if the input value doesn't match any case 'src'.
ex: 1941.xml
<format id="grade_mapping">
<case src="0" dst="SECOND LIEUTENANT" default="yes"/>
<case src="1" dst="FIRST LIEUTENANT"/>
<case src="2" dst="CAPTAIN"/>
<case src="3" dst="MAJOR"/>
<case src="4" dst="LIEUTENANT COLONEL"/>
<case src="5" dst="COLONEL"/>
<case src="6" dst="6"/>
<case src="7" dst="7"/>
</format>
ex: ad2083.xml
<format id="ad2083" apply-to="char">
<case src="0x40" dst=" "/>
<case src="0x5C" dst="."/>
</format>
ex: multiple and different groups of 'case'
<format id="medal">
<case src="A" dst="10"/>
<case src="B" dst="100"/>
<pad direction="left" max="3">0</pad>
<case src="000" dst=""/>
</format>
format > ?case @operator <operator>
operator defining how to check if the input value matches the 'src' (default is
'==')
supported operators: <, ==, >, >=, <=, !=
ex: turfmast.xml
<format id="score">
<case src="0" dst="EVEN"/>
<case src="240" operator="&lt;" format="+"/>
<case src="240" operator=">=" format="-"/>
</format>
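The turfmast.xml 'score' format can be traced with a short Python sketch. It assumes that cases are tried in document order and that the first match wins, and that an unmatched value without a default case is returned unchanged; neither point is stated here. `apply_cases`, `plus` and `minus` are hypothetical helpers, the last two standing in for the '+' and '-' formats shown in the prefix and multiply examples.

```python
import operator

OPERATORS = {"<": operator.lt, "==": operator.eq, ">": operator.gt,
             ">=": operator.ge, "<=": operator.le, "!=": operator.ne}


def plus(value):
    """<format id="+">: <prefix>+</prefix>"""
    return "+" + str(value)


def minus(value):
    """<format id="-">: multiply by -1, add 256, prefix '-'"""
    return "-" + str(value * -1 + 256)


# (src, operator, dst, format) in document order
SCORE_CASES = [
    (0, "==", "EVEN", None),
    (240, "<", None, plus),
    (240, ">=", None, minus),
]


def apply_cases(value, cases):
    """Return the result of the first case whose test 'input <op> src' holds."""
    for src, op, dst, fmt in cases:
        if OPERATORS[op](value, src):
            return dst if fmt is None else fmt(value)
    return str(value)  # no match and no default case (assumed behavior)


even = apply_cases(0, SCORE_CASES)
over = apply_cases(3, SCORE_CASES)
under = apply_cases(253, SCORE_CASES)
```

The value 0 also satisfies the second case, so the order of the cases matters under the first-match assumption.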
format > ?case @operator-format
"<format_identifier>|<direct_implicit_format>"
format > ?case @operator-format
"<identifier1>;<identifier2>;..."
identifier allowing the input value to be formatted before the match is applied.
input value + operator-format = formatted input value
formatted input value + operator + src value = 'case' matches or not
more than one format identifier can be specified, separated by ';'
ex: dariusg.xml
<case src="16" operator-format="-1792;%256" operator="&lt;" dst="ABDGKQW"/>
format > ?case @format="<format_identifier>|<direct_implicit_format>"
format > ?case @format="<identifier1>;<identifier2>;..."
identifier allowing to format the input value to produce the result.
more than one format identifier can be specified, separated by ';'
ex: turfmast.xml
<format id="score">
<case src="0" dst="EVEN"/>
<case src="240" operator="&lt;" format="+"/>
<case src="240" operator=">=" format="-"/>
</format>
charset
charset is a set of translations from a raw value into a character (char).
charset
id="<identifier>"
identifier of the charset, allowing it to be referenced, called by a field or
column.
ex: ddonpach.j
<structure>
...
<loop count="5"><elt size="6" type="text" id="NAME" ... charset="ddonpach"/></loop>
...
</structure>
...
<charset id="ddonpach">
<char src="0x00" dst=" "/>
<char src="0x38" dst="."/>
</charset>
pre-defined charsets are available: see the elt@charset section.
[since version 1.2]
operation converting an input character into an output character or string
multiple chars can be defined.
charset > char
@src="" @dst="" ?@default="yes|no"
'src' attribute specifies which input value matches the char (the src value can
be a value in base 10 (nn) or in base 16 (0xnn)).
'dst' attribute specifies the string into which the input is converted, if the
'src' matches the input: it can be a single letter, an empty string, a string
with multiple characters, including special characters.
'default' attribute specifies which 'char' among the list of chars is used
if the input value doesn't match any char 'src'.
ex: ddonpach.j
<charset id="ddonpach">
<char src="0x00" dst=" "/>
<char src="0x38" dst="."/>
</charset>
ex: mooncrst.xml
<char src="0xFF" dst=""/>
<!-- 'end of name' indicator specific to mooncrstg //-->
ex: gigawing.xml
<char src="0x063E" dst="Dr."/>
<char src="0x0642" dst=".Jr"/>
<char src="0x0646" dst="St."/>
<char src="0x07B6" dst="&black-heart;"/>
<char src="0x07BA" dst="&black-diamond;"/>
<char src="0x07BE" dst="&black-club;"/>
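A charset lookup can be pictured as a simple table translation. The sketch below handles single-byte 'src' values only, and its fallback to the ASCII character for bytes without a 'char' entry is an assumption; `decode` and the dictionary names are hypothetical.

```python
DDONPACH = {0x00: " ", 0x38: "."}   # from the ddonpach charset above
MOONCRST = {0xFF: ""}               # 'end of name' indicator mapped to nothing


def decode(raw, charset):
    """Translate each raw byte through `charset`, else keep it as ASCII."""
    return "".join(charset.get(byte, chr(byte)) for byte in raw)


name = decode(bytes([0x41, 0x00, 0x42, 0x38]), DDONPACH)
short_name = decode(b"AB\xff", MOONCRST)
```

An empty 'dst' simply removes the matching byte from the output, as in the mooncrst.xml example.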
bitmask
bitmask is a set of data selections at bit level (character), mainly targeting
character extraction.
bitmask
@id="<identifier>"
identifier of the bitmask, allowing it to be referenced, called by a field or
column.
ex: rtype.xml
<elt size="3" type="int" id="SCORE" bitmask="score" base="16"/>
...
<bitmask id="score"> <!-- bytes re-ordering: 1 - 0 - 2 //-->
<character mask="00000000 11111111 00000000"/>
<character mask="11111111 00000000 00000000"/>
<character mask="00000000 00000000 11111111"/>
</bitmask>
bitmask >
?@byte-completion="yes|no"
default is yes.
enable or disable completion up to 8 bits (1 byte) for each individual
'character/mask' result.
when each character mask defines a single character and fewer than 8 bits are
selected, completion up to a full byte must be requested for each of these
characters.
when a character mask selects fewer than 8 bits, but multiple characters are
defined only to re-order bytes (ex: take the 4 last bits, then the 4 first bits,
using 2 characters), this completion is inappropriate.
note that the final output is always stored as a byte array, meaning that the
final result is automatically completed.
ex: slapfigh.xml
<bitmask id="score" byte-completion="no"> <!-- low nibble 3rd byte + 2nd byte + 1st byte + high nibble 3rd byte //-->
<character mask="00000000 00000000 00001111"/>
<character mask="00000000 11111111 00000000"/>
<character mask="11111111 00000000 00000000"/>
<character mask="00000000 00000000 11110000"/>
</bitmask>
a character can be used to
- skip unused bits
- select individual bits to build each output character
- re-order bits to be able to decode the full string
- etc.
bitmask > character
@mask="<mask>"
sequence of 0 and 1, allowing specific bits of the full input value to be
skipped or selected
each mask takes the full input data and creates an individual 'character' output
(set of bits, with completion or not).
[IMPORTANT NOTE] each character works on the full input: the mask must therefore define all bits of the full entry. So, for a 3-character name (on 3 bytes), each character mask must define 3*8 bits and set enough bits to 1 to specify how to extract one character.
the outputs of all masks are concatenated together, at bit level.
note that the final output is always stored as a byte array, meaning that the
final result is automatically completed.
ex: rtype.xml
<bitmask id="score"> <!-- bytes re-ordering: 1 - 0 - 2 //-->
<character mask="00000000 11111111 00000000"/>
<character mask="11111111 00000000 00000000"/>
<character mask="00000000 00000000 11111111"/>
</bitmask>
ex: solarq.xml
<bitmask id="name">
<character mask="00000000 00111111 00000000 00000000"/>
<character mask="00111111 11000000 00000000 00000000"/>
<character mask="00000000 00000000 00000000 00111111"/>
</bitmask>
example: byte-completion = 'yes'
input 00000000 11111111 10101010
input + mask 1 "00001111 1111" => 00001111 (already 8 bits, so no need for completion)
input + mask 2 "00000000 00000000 1111" => 1010 + completion => 00001010
output "0000111100001010" already on 2 bytes
example: byte-completion = 'no'
input 00000000 11111111 10101010
input + mask 1 "00001111 1111" => 00001111
input + mask 2 "00000000 00000000 1111" => 1010 (no completion)
output "000011111010" meaning "00000000 11111010" on 2 bytes
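The two worked examples can be reproduced with a small Python sketch. It is an illustration under assumptions, not the hi2txt code: `apply_bitmask` is a hypothetical name, mask positions beyond the end of a short mask are treated as 0, and completion adds zero bits on the left.

```python
def _complete(bits):
    """Left-pad a bit string with zeros up to a whole number of bytes."""
    extra = -len(bits) % 8
    return "0" * extra + bits


def apply_bitmask(data, masks, byte_completion=True):
    """Apply each character mask to the full input and pack the result."""
    bits = "".join(f"{byte:08b}" for byte in data)
    out = ""
    for mask in masks:
        mask = mask.replace(" ", "")
        selected = "".join(b for b, m in zip(bits, mask) if m == "1")
        out += _complete(selected) if byte_completion else selected
    out = _complete(out)  # the final output is always a byte array
    return bytes(int(out[i:i + 8], 2) for i in range(0, len(out), 8))


data = bytes([0b00000000, 0b11111111, 0b10101010])
masks = ["00001111 1111", "00000000 00000000 1111"]
completed = apply_bitmask(data, masks, byte_completion=True)
packed = apply_bitmask(data, masks, byte_completion=False)
```

With completion each mask result occupies its own byte, while without it the selected bits are packed together and only the final result is completed.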
References
1. "A Tutorial on Data Representation",
section 3.9 'Big Endian vs Little Endian', Chua Hock-Chuan
http://www3.ntu.edu.sg/home/ehchua/programming/java/DataRepresentation.html
2. "Slimming Strings With Custom Base-40 Packing", Al Williams
http://www.drdobbs.com/embedded-systems/slimming-strings-with-custom-base-40-pac/229400732
3. "ASCII", Wikipedia
http://en.wikipedia.org/wiki/US-ASCII
4. "Unicode entities", Wikipedia
http://en.wikibooks.org/wiki/Unicode
http://en.wikibooks.org/wiki/Unicode/Character_reference
http://en.wikibooks.org/wiki/Unicode/List_of_useful_symbols
http://unicode-table.com
5. "Binary-to-text encoding", Wikipedia
http://en.wikipedia.org/wiki/Binary-to-text_encoding
6. "Base 32", Wikipedia
http://en.wikipedia.org/wiki/Base32
7. "Shift operation", Wikipedia
http://en.wikipedia.org/wiki/Logical_shift
History
2020-03-13 v1.9: add decoding parameter 'nibble-trim', introduced in hi2txt v1.11
2019-09-01 v1.8: add decoding parameter 'byte-trunc', introduced in hi2txt v1.10
2017-01-23 more details about "bitmask character" behavior
2016-04-09 aligned with hi2txt v1.7
2015-11-28 aligned with hi2txt v1.6
...
2014-09-01 v1.2