LCMapString

Functional Difference from WIN95

T.B.D.

Functional Difference from SBCS Open32

New

Implementation

This function maps one character string to another, performing a specified locale-dependent transformation. The function can also be used to generate a sort key for the input string. Instead of calling UniTransLower or UniTransUpper, the same result can be obtained by calling UniCreateTransformObject and UniTransformStr.

───────────────────────────────────────────────────────────────
 WIN95 flag           Mapping in OS/2      

 LCMAP_FULLWIDTH      Mapping table in Open32 *1

 LCMAP_HALFWIDTH      Mapping table in Open32 *2

 LCMAP_HIRAGANA       Mapping table in Open32 *3

 LCMAP_KATAKANA       Mapping table in Open32 *4

 LCMAP_LOWERCASE      UniTransLower

 LCMAP_UPPERCASE      UniTransUpper

 LCMAP_SORTKEY        UniStrxfrm      *5
───────────────────────────────────────────────────────────────

SORT_STRINGSORT
 NORM_IGNORECASE     *6
 NORM_IGNOREKANATYPE *7
 NORM_IGNORENONSPACE *8
 NORM_IGNORESYMBOLS
 NORM_IGNOREWIDTH    *9

Note:

*1 Maps half-width characters to full-width characters by using ToFullTBL[] and FullHalfTbl[], which are internal hard-coded tables in Open32.
*2 Maps full-width characters to half-width characters. Hiragana and Katakana characters are mapped by using FullHalfTbl[], which is an internal hard-coded table in Open32.
*3 Maps Katakana characters to Hiragana characters by using ToHiraTBL[], which is an internal hard-coded range table in Open32.
*4 Maps Hiragana characters to Katakana characters by using ToKataTBL[], which is an internal hard-coded range table in Open32.
*5 When LCMAP_SORTKEY is set, this function in most cases maps characters to the appropriate characters, as described below, before getting the sort key. In three cases (NORM_IGNORESYMBOLS, NORM_IGNOREKANATYPE combined with NORM_IGNOREWIDTH, and SORT_STRINGSORT), the function operates on the resulting sort key instead.
*6 Before getting the sort key, this function maps uppercase characters to lowercase characters.
*7 Before getting the sort key, this function maps Hiragana characters to Katakana characters.
*8 Before getting the sort key, this function maps each character combined with a nonspacing character (VOICE_SOUND or SEMIVOICE_SOUND) to two characters: the base character and the nonspacing character. ToBaseTBL[], an internal table in Open32 that maps characters combined with a nonspacing character, is used for this.
*9 If both NORM_IGNOREWIDTH and NORM_IGNOREKANATYPE are set, Hiragana is mapped to Katakana.
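
The range-table mappings described in the notes above can be sketched as follows. This is only an illustration: the actual layout and contents of ToKataTBL[], ToHiraTBL[], and ToFullTBL[] are not given here, so the MapRange structure, the two tables, and the functions map_char and map_string are hypothetical names. The sketch also assumes the characters are handled as 16-bit Unicode code units, where Hiragana U+3041..U+3096 lies exactly 0x60 below the corresponding Katakana.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical range table: every code unit in
 * [first, last] is shifted by delta. The layout of the real Open32
 * tables is not documented here, so this structure is an assumption. */
typedef struct {
    uint16_t first;
    uint16_t last;
    int32_t  delta;
} MapRange;

/* Hiragana U+3041..U+3096 maps to Katakana U+30A1..U+30F6. */
static const MapRange hira_to_kata[] = {
    { 0x3041, 0x3096, 0x60 },
};

/* Half-width ASCII to full-width forms: the space maps to U+3000,
 * and U+0021..U+007E maps to U+FF01..U+FF5E. */
static const MapRange half_to_full[] = {
    { 0x0020, 0x0020, 0x3000 - 0x0020 },
    { 0x0021, 0x007E, 0xFEE0 },
};

/* Map one code unit through a range table. Input that falls in no
 * range is returned unchanged. */
static uint16_t map_char(const MapRange *tbl, size_t n, uint16_t ch)
{
    for (size_t i = 0; i < n; i++) {
        if (ch >= tbl[i].first && ch <= tbl[i].last)
            return (uint16_t)(ch + tbl[i].delta);
    }
    return ch;
}

/* Map a whole buffer in place and return how many units changed. */
static size_t map_string(const MapRange *tbl, size_t n,
                         uint16_t *s, size_t len)
{
    size_t changed = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t mapped = map_char(tbl, n, s[i]);
        if (mapped != s[i]) {
            s[i] = mapped;
            changed++;
        }
    }
    return changed;
}
```

A range table keeps the hard-coded data small, because one entry covers a whole contiguous block of characters. Characters outside every range pass through untouched, so the same routine can be applied to mixed strings.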

Open32 generates the sort key so that it matches the sort key returned in WIN95, except for the Unicode character weight.

Here is the layout of the sort key array.

According to MSDN:

[Unicode sort weights] 0x01
[Diacritic weights] 0x01
[Case weights] 0x01
[Special weights] 0x00

As found from the actual result:

[Unicode weights] 0x01
[Diacritic weights] 0x01
[Case weights] 0x01
[Special weights] 0x01
[Other weights] 0x00

The separator 0x01 always exists, even if the weight is empty.
The sort key array of the string is grouped by kind of weight.
The Unicode character weight is shared between characters whose other weights differ. For example, Katakana 'a' is equal to Hiragana 'a' in Unicode character weight.
Some weights can be omitted. If no weight follows, a weight of that kind does not appear.
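
As a minimal sketch of the layout above, the following C code splits a sort key into its weight groups at each 0x01 separator and stops at the 0x00 terminator. The names WeightGroup, split_sort_key, and the SORTKEY_* constants are hypothetical and are not part of Open32 or WIN95. The sketch assumes that no weight byte is itself 0x00 or 0x01.

```c
#include <stddef.h>
#include <stdint.h>

#define SORTKEY_SEPARATOR  0x01
#define SORTKEY_TERMINATOR 0x00
/* Unicode, diacritic, case, special, and other weights. */
#define SORTKEY_MAX_GROUPS 5

/* One weight group inside a sort key (hypothetical helper type). */
typedef struct {
    const uint8_t *start;  /* first byte of the group            */
    size_t         len;    /* 0 when the group is empty          */
} WeightGroup;

/* Split a sort key into weight groups.
 * Returns the number of groups found, or 0 if the key has no
 * terminator within key_len bytes or has more groups than the
 * layout allows. */
static size_t split_sort_key(const uint8_t *key, size_t key_len,
                             WeightGroup groups[SORTKEY_MAX_GROUPS])
{
    size_t count = 0;
    size_t begin = 0;

    for (size_t i = 0; i < key_len; i++) {
        uint8_t b = key[i];
        if (b != SORTKEY_SEPARATOR && b != SORTKEY_TERMINATOR)
            continue;
        if (count == SORTKEY_MAX_GROUPS)
            return 0;
        groups[count].start = key + begin;
        groups[count].len   = i - begin;
        count++;
        if (b == SORTKEY_TERMINATOR)
            return count;
        begin = i + 1;
    }
    return 0;
}
```

For a made-up key such as 0e 02 01 01 0c 01 01 00, the function reports five groups: a two-byte Unicode weight, an empty diacritic group, a one-byte case weight, and empty special and other groups. A key in the four-group MSDN layout is reported with four groups.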
Here are the weight values, which were found from the actual result.

Alpha-numeric Weight (AW)
- 1st byte of the Unicode character weight.
  - 0x00-0x08, etc.
  - Special character 0x09-0x0d !"#$%&()*,./:;?@<[>]^_`{|}
  - Math symbol +<<=<;>
  - 0a \
  - 0c Numeric character 0123456789
  - 0e Alphabet character ABC-Z abc-z
  - 22 Katakana, Hiragana
  - 8x Kanji character
  - fe 0xfd 0xfe
  - ff
- The character weight is shared between characters whose other weights differ.
  - eg1
    - Half-width lower
    - Half-width upper
    - Full-width lower
    - Full-width upper
  - eg2
    - Small Half-width Katakana
    - Large Half-width Katakana
    - Small Full-width Katakana
    - Large Full-width Katakana
    - Small Full-width Hiragana
    - Large Full-width Hiragana

Diacritic Weight (DW)
- 02 Single-byte lower character ( Omitted if nothing follows.)
- 03 Katakana voice sound nonspacing character.
- 04 Katakana semi-voice sound nonspacing character.

Case Weight (CW)
- 02 Single-byte lower character ( Omitted if nothing follows.)
- 03 Double-byte lower character
- 0c Single-byte upper character
- 0d Double-byte upper character

Special Weight
This field is divided into subfields by the unique separator 0xFF.
- Field1
  - c4 small Katakana or small Hiragana
  - c6 Single-byte Katakana ( Omitted if nothing follows.)
- Field2
  Katakana or Hiragana ( Only one in the field regardless the n
  - c4 Katakana
  - e4 Hiragana ( Omitted if nothing follows.)
- Field3
  - c4 Single-byte Katakana
  - c5 small Hiragana ( Omitted if nothing follows.)

Other Weight
MSDN does not describe this field. The weight in this field is 4 bytes per character. It begins with 0x80 0x70 0x06. This weight is ignored when the NORM_IGNORESYMBOLS flag or the SORT_STRINGSORT flag is set. When the SORT_STRINGSORT flag is set, a character weight is generated for the characters that have a weight in this field. In that character weight, the 1st byte is 0x07, 0x08, or 0x0A as listed above, and the 2nd byte is the original 4th byte of this field.
        