Ordering

Ordering by the binary values of character codes produces unsatisfactory results, because:

The binary values of character codes do not follow any ordering sequence.
For example, in Japanese, the binary values of SBCS Katakana characters fall in the middle of those of DBCS characters.
Each DBCS language has several ways to order DBCS characters. Which one applies depends on what is required.
For example, the following sequences are used for Japanese:
- Radical stroke count sequence
- Total stroke count sequence
- Phonetic reading sequence
- Representative phonetic reading sequence
- Combination of the above
- User-defined sequence
Sorting usually needs more information than the character code points alone, such as the phonetic reading.
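
The first point can be illustrated with a short C sketch. It assumes the Japanese code page 932 (Shift-JIS) encoding, which the text above does not name, and the byte values and the function name binary_compare are chosen for illustration only. Under that assumption, a plain byte comparison places an SBCS Katakana character between two DBCS characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sample byte sequences, assuming code page 932 (Shift-JIS). */
static const unsigned char hiragana_a[]   = { 0x82, 0xA0 }; /* DBCS Hiragana "a"            */
static const unsigned char hw_kata_a[]    = { 0xB1 };       /* SBCS Katakana "a"            */
static const unsigned char kanji_sample[] = { 0xE0, 0x40 }; /* DBCS, upper lead-byte range  */

/* Compare two byte strings by binary value only.
   Returns <0, 0 or >0, like memcmp; a shorter string that is a
   prefix of the longer one sorts first. */
static int binary_compare(const unsigned char *a, size_t alen,
                          const unsigned char *b, size_t blen)
{
    size_t n = (alen < blen) ? alen : blen;
    int r;

    assert(a != NULL && b != NULL);
    r = memcmp(a, b, n);
    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);
}
```

With these values, binary_compare orders hiragana_a before hw_kata_a and hw_kata_a before kanji_sample, so the single-byte Katakana character lands in the middle of the double-byte characters.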

To satisfy the minimum requirements, you can use the collating sequence table that OS/2 provides for the current process code page (DosGetCollate) and re-align the order as follows:

For each byte, check whether it is an SBCS character or the first byte of a DBCS character.
If so, translate it using the OS/2 collating sequence table.
If not (that is, if the byte is the second byte of a DBCS character), leave it as it is.
Order the strings by using the resulting values.
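
The steps above can be sketched in portable C as follows. This is a minimal illustration, not the OS/2 implementation: the 256-byte collating table and the DBCS lead-byte ranges are passed in as parameters and are assumed to have been obtained from the system beforehand (the table, for example, through DosGetCollate). The names LeadRange, is_lead_byte and build_sort_key are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One inclusive range of DBCS first-byte values (hypothetical type). */
typedef struct {
    unsigned char lo;
    unsigned char hi;
} LeadRange;

/* Return 1 if c is the first byte of a DBCS character. */
static int is_lead_byte(unsigned char c, const LeadRange *ranges, size_t nranges)
{
    size_t i;
    for (i = 0; i < nranges; i++) {
        if (c >= ranges[i].lo && c <= ranges[i].hi)
            return 1;
    }
    return 0;
}

/* Build a sort key of the same length as src.
   SBCS bytes and DBCS first bytes are translated through the
   collating table; DBCS second bytes are copied unchanged.
   key must have room for len bytes. */
static void build_sort_key(const unsigned char *src, size_t len,
                           const unsigned char collate[256],
                           const LeadRange *ranges, size_t nranges,
                           unsigned char *key)
{
    size_t i = 0;

    assert(src != NULL && collate != NULL && key != NULL);
    while (i < len) {
        unsigned char c = src[i];

        key[i] = collate[c];          /* SBCS byte or DBCS first byte */
        i++;
        if (is_lead_byte(c, ranges, nranges) && i < len) {
            key[i] = src[i];          /* DBCS second byte: leave as is */
            i++;
        }
    }
}
```

Two strings can then be ordered by comparing their keys byte by byte, for example with memcmp followed by a length comparison. The scan tracks the first byte of each DBCS character because a second byte cannot be recognized by its value alone.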

You should also provide one or more user exits for ordering that is unique to a national language.
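
One possible shape for such a user exit, sketched in C, is a comparison callback that replaces a built-in default when it is registered. All names here (CollateExit, set_collate_exit, compare_strings, reverse_compare) are hypothetical and do not correspond to any OS/2 interface. The reverse_compare function only stands in for a real national-language ordering routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature of a user-supplied ordering routine (hypothetical).
   Returns <0, 0 or >0, like memcmp. */
typedef int (*CollateExit)(const unsigned char *a, size_t alen,
                           const unsigned char *b, size_t blen);

static CollateExit user_exit = NULL;   /* no exit registered by default */

/* Register a user exit; pass NULL to restore the default ordering. */
static void set_collate_exit(CollateExit fn)
{
    user_exit = fn;
}

/* Default ordering: plain byte comparison, shorter prefix first. */
static int default_compare(const unsigned char *a, size_t alen,
                           const unsigned char *b, size_t blen)
{
    size_t n = (alen < blen) ? alen : blen;
    int r = memcmp(a, b, n);
    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);
}

/* Example exit used only for illustration: reverses the default order. */
static int reverse_compare(const unsigned char *a, size_t alen,
                           const unsigned char *b, size_t blen)
{
    return default_compare(b, blen, a, alen);
}

/* Entry point used by the application: call the exit if one is set. */
static int compare_strings(const unsigned char *a, size_t alen,
                           const unsigned char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    if (user_exit != NULL)
        return user_exit(a, alen, b, blen);
    return default_compare(a, alen, b, blen);
}
```

Routing every comparison through compare_strings keeps the language-specific ordering logic out of the application, so a phonetic or stroke-count ordering can be supplied later without changing the callers.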
