Character Set and Code Page

Graphic characters are printable or displayable symbols, such as letters, numbers and punctuation marks. A collection of graphic characters is called a graphic character set, often simply a character set. Each language requires its own graphic character set in order to be printed or displayed properly. Characters are encoded according to a code page, which is a matrix of rows and columns that assigns graphic and control characters to specific hexadecimal values, called code points. Code pages are classified into two types according to their encoding scheme.

To identify a graphic character set and its corresponding code page for a given national language, a coded character set is defined as the combination of a graphic character set and a code page.
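The pairing described above can be sketched as a small data structure. This is only an illustration, not an IBM-defined interface: the class name CodedCharacterSet, the label() helper and the hyphenated display form are assumptions made for this example. The two numbers are the SBCS identifiers of the JAPANESE HOST (KATAKANA) entry in this section's table.

```python
from typing import NamedTuple


class CodedCharacterSet(NamedTuple):
    """Hypothetical model of a coded character set (CGCSGID):
    one graphic character set paired with one code page."""
    gcsgid: int  # Graphic Character Set Global Identifier
    cpgid: int   # Code Page Global Identifier

    def label(self) -> str:
        # Five-digit, zero-padded form as used in the table.
        # The hyphen separator is an assumption for readability.
        return f"{self.gcsgid:05d}-{self.cpgid:05d}"


# SBCS part of JAPANESE HOST (KATAKANA), taken from the table.
katakana_sbcs = CodedCharacterSet(gcsgid=1172, cpgid=290)
print(katakana_sbcs.label())
```

Neither number identifies an encoding on its own: the same graphic character set can be laid out in several code pages, so both identifiers are kept together.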

The following identifiers are defined and registered in IBM standards.

The table "GCSGIDs and CPGIDs for each DBCS National Language" lists the coded character sets used in each DBCS country. The type "HOST" indicates coded character sets supported on IBM host systems, and the type "PC" indicates those supported in the DBCS OS/2 NLS version.

 ┌────────────────────────────────────────────────────────────────────────────┐
 │ Table m. Coded Character Set Requirements for Mixed HOST/PC                │
 ├────────────────────────────┬───────┬───────┬───────────────┬───────────────┤
 │ COUNTRY/LANGUAGE           │ CCSID │ Mixed │     SBCS      │     DBCS      │
 │                            │ (Deci │ code  ├───────────────┼───────────────┤
 │                            │  mal) │ page  │    CGCSGID    │    CGCSGID    │
 │                            │       │       ├───────┬───────┼───────┬───────┤
 │                            │       │       │GCSGID │ CPGID │GCSGID │ CPGID │
 ├────────────────────────────┼───────┼───────┼───────┼───────┼───────┼───────┤
 │JAPANESE HOST (KATAKANA)    │ 05026 │ 00930 │ 01172 │ 00290 │ 00370 │ 00300 │
 │JAPANESE HOST (LATIN)       │ 05035 │ 00939 │ 01172 │ 01027 │ 00370 │ 00300 │
 │JAPANESE PC                 │ 00942 │ 00942 │ 01172 │ 01041 │ 00370 │ 00301 │
 │JAPANESE PC (Extended)      │ 00932 │ 00932 │ 01122 │ 00897 │ 00370 │ 00301 │
 ├────────────────────────────┼───────┼───────┼───────┼───────┼───────┼───────┤
 │KOREAN HOST                 │ 00933 │ 00933 │ 01173 │ 00833 │ 00934 │ 00834 │
 │KOREAN PC                   │ 00934 │ 00934 │ 01224 │ 00891 │ 00934 │ 00926 │
 │KOREAN PC (Extended)        │ 00944 │ 00944 │ 01173 │ 01040 │ 00934 │ 00926 │
 │KOREAN PC (IBM KS)          │ 00949 │ 00949 │ 01278 │ 01088 │ 01050 │ 00951 │
 ├────────────────────────────┼───────┼───────┼───────┼───────┼───────┼───────┤
 │TRADITIONAL CHINESE HOST    │ 00937 │ 00937 │ 01175 │ 00037 │ 00935 │ 00835 │
 │TRADITIONAL CHINESE PC      │ 00948 │ 00948 │ 01175 │ 01043 │ 00935 │ 00927 │
 │TRADITIONAL CHINESE PC      │ 00938 │ 00938 │ 00103 │ 00904 │ 00935 │ 00927 │
 │           (Extended)       │       │       │       │       │       │       │
 │TRADITIONAL CHINESE PC      │ 00950 │ 00950 │ 00103 │ 01114 │ 00935 │ 00947 │
 │           (IBM BIG-5)      │       │       │       │       │       │       │
 ├────────────────────────────┼───────┼───────┼───────┼───────┼───────┼───────┤
 │SIMPLIFIED CHINESE HOST     │ 00935 │ 00935 │ 01174 │ 00836 │ 00937 │ 00837 │
 │SIMPLIFIED CHINESE PC       │ 01381 │ 01381 │ 01174 │ 01115 │ 00937 │ 01380 │
 │             (IBM GB)       │       │       │       │       │       │       │
 └────────────────────────────┴───────┴───────┴───────┴───────┴───────┴───────┘
GCSGIDs and CPGIDs for each DBCS National Language

Note: The PC CGCSGIDs listed are for code pages that do not contain display-only characters.
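As a rough sketch of how the table can be consulted, the following lookup maps a mixed CCSID to its SBCS and DBCS coded character sets. Only a few rows are transcribed here, and the names MixedEntry, MIXED_CCSIDS and describe are hypothetical, introduced only for this example.

```python
from typing import Dict, NamedTuple, Tuple


class MixedEntry(NamedTuple):
    """One table row; the field names are assumptions for this sketch."""
    name: str
    mixed_code_page: int
    sbcs: Tuple[int, int]  # (GCSGID, CPGID) of the single-byte part
    dbcs: Tuple[int, int]  # (GCSGID, CPGID) of the double-byte part


# A subset of the rows above, keyed by decimal CCSID.
MIXED_CCSIDS: Dict[int, MixedEntry] = {
    5026: MixedEntry("JAPANESE HOST (KATAKANA)", 930, (1172, 290), (370, 300)),
    5035: MixedEntry("JAPANESE HOST (LATIN)", 939, (1172, 1027), (370, 300)),
    942: MixedEntry("JAPANESE PC", 942, (1172, 1041), (370, 301)),
    933: MixedEntry("KOREAN HOST", 933, (1173, 833), (934, 834)),
    937: MixedEntry("TRADITIONAL CHINESE HOST", 937, (1175, 37), (935, 835)),
    935: MixedEntry("SIMPLIFIED CHINESE HOST", 935, (1174, 836), (937, 837)),
}


def describe(ccsid: int) -> str:
    """Return a one-line summary of the row for the given CCSID.

    Raises KeyError if the CCSID is not in the transcribed subset.
    """
    row = MIXED_CCSIDS[ccsid]
    return (
        f"{row.name}: mixed code page {row.mixed_code_page:05d}, "
        f"SBCS code page {row.sbcs[1]:05d}, DBCS code page {row.dbcs[1]:05d}"
    )


print(describe(5026))
print(describe(933))
```

The two Japanese host entries in this subset differ only in their SBCS code page (00290 and 01027) and share the same DBCS coded character set, which is why the SBCS and DBCS parts are stored separately.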

