2024/11/25
.EBCDIC translation config file.
In addition to the specific internal conversion table,
EBCDIC file support allows the use of external converters such as ICU/WindowsApi/iconv.
When using the internal conversion table, you can also select a specific EBCDIC conversion
with a command line parameter, without a cfg file.
The location of the cfg file is specified in the ini file. The default is ::xeebc.map
EBCDIC_cfg ="" #("::xeebc.map")# EBCDIC translation config filename
In the XCV utility, this is specified by the /MF:mapfile parameter.
****************************************************************************
There are two types of conversion patterns.
(a) Japanese conversion using internal tables.
In addition to specifying command line parameters, you can specify conversion options.
Options such as SJIS_OPT, SOSI_A2E, and SOSI_E2A are provided.
(b) Conversion using ICU or a system converter (Windows API, Linux iconv).
You can use the uconv command for ICU and the iconv command for iconv.
A list of converters can be obtained with "uconv -l" (ICU) or "iconv -l".
In Windows 11, ICU seems to be installed by default, but uconv.exe is not available.
If uconv is not available, try "xcv -list -ICU".
The Windows code page and ICU code page can be listed by "xcv -List".
Note: Windows code page 20290 is the Japanese Katakana extension,
but it does not seem to support Japanese Kanji. If DBCS is required, use ICU.
Optional parameters.
#####################################################################################
CONVERTER : Converter selection
0 : Convert using internal tables only.
MAP_E2A and MAP_A2E allow character-by-character adjustment.
1 : Convert using ICU (requires ICU to be installed)
2 : Convert using the iconv API on Linux, or WideCharToMultiByte/MultiByteToWideChar on Windows.
e.g.
CONVERTER 1 # use ICU converter
## The following ICU_DLL_SUFFIX and ICU_API_SUFFIX are required for "CONVERTER 1",
but xe checks the system directory, so you probably don't need to specify them.
If an error occurs, please specify them.
(On Windows, check c:\Windows\System32, and on Linux, use "uconv --version" to check.)
ICU_DLL_SUFFIX : Suffix for ".dll" or ".so" library name.
e.g.
ICU_DLL_SUFFIX 44
ICU_API_SUFFIX : Suffix for the ICU API name.
e.g.
ICU_API_SUFFIX _44
## The next ICU_DATA does not need to be specified if c:\Windows\Globalization\ICU is OK on Windows 11.
It is not necessary to specify it on Linux if "uconv --version" is OK.
If opening the converter fails, search for libicudata.so.xx and specify its directory.
When you have created your own ".cnv" file and placed it somewhere other than the default folder,
you will need to specify its location.
You can also set it in the environment variable ICU_DATA.
ICU_DATA
e.g. (Linux) ICU_DATA /system/usr/icu:/data/data/yourcnvs
(Windows) ICU_DATA w:\icu\icu562\icu\bin;w:\icu\icu481\icu\bin
DBCS_CHARSET : Specifies the converter that supports DBCS in UCS<-->EBCDIC conversion.
If this is not specified, all conversions will be SBCS mode.
When "CONVERTER 0" is used, specify it in SJIS_OPT instead of DBCS_CHARSET.
e.g.
DBCS_CHARSET cp939 #Japanese
DBCS_CHARSET cp933 #Korean Mixed EBCDIC
DBCS_CHARSET cp935 #Chinese(Simplified) Mixed EBCDIC
DBCS_CHARSET cp937 #Chinese(Traditional)Mixed EBCDIC
SBCS_CHARSET : Specify the UCS2<-->EBCDIC SBCS converter.
Not required if DBCS_CHARSET is specified.
If "DefaultMap" is specified, the internal table equivalent to ISO8859-1<-->EBCDIC of CP037 is used.
"DefaultMapEuro" is equivalent to CP1140, and converts ebc-9f to u-20ac (Euro Sign) instead of u-00a4 (Currency Sign).
If "CONVERTER 0" is specified, only DefaultMap/DefaultMapEuro can be specified, and DBCS_CHARSET cannot be specified.
If "CONVERTER 1" is specified, some converters define 9f-->20ac (Euro Sign) and others do not,
for example cp1140 versus cp037 and cp1141 versus cp273, so choose the one that fits your data.
e.g.
SBCS_CHARSET DefaultMap
SBCS_CHARSET CP1140 #ICU; EBCDIC 037+Euro
SBCS_CHARSET 500 #Windows codepage;IBM EBCDIC International
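The difference between the plain and Euro variants can be checked outside xe. The following is a minimal sketch that uses Python's built-in cp037 and cp1140 codecs as stand-ins for the converters named above; it assumes those codecs match the ICU/Windows tables at this code point and does not involve xe itself.

```python
# Decode EBCDIC 0x9f with Python's built-in cp037 and cp1140 codecs.
# These codecs are stand-ins for the converters discussed above.
ebc = bytes([0x9F])

plain = ebc.decode("cp037")    # currency sign variant
euro = ebc.decode("cp1140")    # Euro sign variant

print(f"cp037 : U+{ord(plain):04X}")
print(f"cp1140: U+{ord(euro):04X}")
```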
LOCAL_CHARSET : Name of the Unicode<-->PC code page converter.
If not specified, the code page will be determined from the environment variables, etc.
e.g.
LOCAL_CHARSET 437 # Windows codepage
LOCAL_CHARSET ISO-8859-1 # Linux, iconv converter
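To see what a program might derive from the environment when LOCAL_CHARSET is omitted, the sketch below prints the usual locale variables and the encoding Python's standard library picks. This only illustrates the general idea; how xe actually determines the code page is not described here, so the variable list is an assumption.

```python
import locale
import os

# Show the locale-related environment variables (assumed to be the relevant ones).
for name in ("LC_ALL", "LC_CTYPE", "LANG"):
    print(name, "=", os.environ.get(name, "(unset)"))

# Encoding that Python derives from the environment or system settings.
detected = locale.getpreferredencoding(False)
print("preferred encoding:", detected)
```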
MAP_E2A/MAP_A2E: Specifies the SBCS conversion adjustment for each code point.
Only effective when "CONVERTER 0" is used.
E2A is effective for "CV b2m" but has no effect on "CV m2b".
The inverse of E2A must be specified separately by A2E.
e.g.
MAP_E2A 0xa0:0xaf
MAP_A2E 0xaf:0xa0
MAP_E2A 0xa1:~ # EBCDIC 0xa1 -> ASCII tilde
MAP_E2A 0xa1:u0101 # Uxxxx format is target of MAP_E2A only
MAP_A2E ~:0xa1 # EBCDIC 0xa1 <- ASCII tilde
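The effect of MAP_E2A can be pictured as patching single entries of a 256-byte translation table. The sketch below is an illustration only: it builds the base table from Python's cp037 codec (assumed to approximate DefaultMap), and parse_value and apply_map are hypothetical helpers that handle the 0xNN and single-character forms but not the uxxxx form.

```python
# Base table: EBCDIC byte -> ISO 8859-1 byte, derived from Python's cp037 codec.
base_e2a = bytearray(bytes(range(256)).decode("cp037").encode("latin-1"))

def parse_value(token):
    """Hypothetical helper: accept 0xNN or one literal character."""
    token = token.strip()
    if token.lower().startswith("0x"):
        return int(token, 16)
    if len(token) == 1:
        return ord(token)
    raise ValueError(f"unsupported value: {token!r}")

def apply_map(table, spec):
    """Return a copy of table with one 'source:target' adjustment applied."""
    src, dst = spec.split(":", 1)
    patched = bytearray(table)
    patched[parse_value(src)] = parse_value(dst)
    return patched

# Equivalent of "MAP_E2A 0xa0:0xaf" followed by "MAP_E2A 0xa1:~".
e2a = apply_map(base_e2a, "0xa0:0xaf")
e2a = apply_map(e2a, "0xa1:~")

ebcdic = bytes([0xA0, 0xA1])
converted = ebcdic.translate(bytes(e2a))
print(converted.hex())
```

Only the E2A direction is patched here, which mirrors the rule that the inverse must be given separately with MAP_A2E.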
If you use SJIS_OPT KANA_EXT/ENG_EXT, take care when an E2A target is an ASCII-side value greater than 0x80.
In Shift JIS, 0x81-0x9f and 0xe0-0xfc, and in EUC-JP, 0xa1-0xfe, are defined
as the first byte of Japanese DBCS, so if you run M2B on the output of B2M, an SBCS byte may be mistaken for DBCS.
For example, the following settings are presumably intended to swap ebc-15 and ebc-25.
MAP_E2A 0x15:0x0a
MAP_E2A 0x25:0x85
MAP_A2E 0x0a:0x15
MAP_A2E 0x85:0x25
However, 0x85 is a first byte of SJIS DBCS,
so the B2M output 0x85xx is treated as Japanese DBCS when combined with the byte following 0x85.
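A quick way to spot such targets before putting them in a cfg file is to test them against the first-byte ranges quoted above. This is a minimal sketch; the function names are hypothetical and the ranges are taken directly from the text.

```python
def is_sjis_lead_byte(b):
    """True if b is in the Shift JIS first-byte ranges 0x81-0x9f or 0xe0-0xfc."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def is_eucjp_lead_byte(b):
    """True if b is in the EUC-JP first-byte range 0xa1-0xfe."""
    return 0xA1 <= b <= 0xFE

def risky_targets(targets, check=is_sjis_lead_byte):
    """Return the MAP_E2A targets that could start a DBCS character."""
    return [hex(t) for t in targets if check(t)]

# Targets from the swap example: 0x0a is safe, 0x85 is not.
flagged = risky_targets([0x0A, 0x85])
print(flagged)  # ['0x85']
```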
SUBCHAR_0a : Controls the output of 0x0a (line feed code) when converting to a PC code page.
This parameter is only valid for pattern (b)-Use external converter.
1 : Replace 0x0a with converted SBCS alternative character.
0 : Output 0x0a as it is. (Default value)
SUBCHAR_S2D : When converting to a PC code page, is EBCDIC SBCS->multibyte conversion allowed?
This parameter is only valid for pattern (b)-Use external converter.
For example, in cp037, ebc-b5==>u-00a7 (Section Sign), but in cp932, "CV b2m" results in u-00a7==> 0x8198 (Double-Byte).
If you set "SUBCHAR_S2D 1", it will be u-00a7==> '?'.
1 : Replace with SBCS alternative character.
0 : Allow multibyte output. (Default value)
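The two settings can be imitated with Python's cp932 codec. In this sketch, encode_sbcs_only is a hypothetical helper that mimics "SUBCHAR_S2D 1" by substituting '?' whenever a character does not encode to exactly one byte; it is an approximation of the behavior described above, not xe's code.

```python
def encode_sbcs_only(ch, codec="cp932", sub=b"?"):
    """Encode one character; substitute if the result is not a single byte."""
    encoded = ch.encode(codec, errors="replace")
    return encoded if len(encoded) == 1 else sub

section = "\u00a7"  # SECTION SIGN

# Like SUBCHAR_S2D 0: the multibyte result is kept.
multibyte = section.encode("cp932")
print(multibyte.hex())

# Like SUBCHAR_S2D 1: the multibyte result is replaced by the substitute.
print(encode_sbcs_only(section))
```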
SJIS_OPT : Specifies SJIS conversion options.
This parameter is only valid for pattern (a)-Use internal table.
The XCV/CV commands have an optional parameter with the same effect,
and the command parameter takes precedence.
ENG_EXT: Japanese English Lowercase Extended (CP939=CP300+CP1027)
KANA_EXT: Japanese Katakana Extended (CP930=CP300+CP290)
IBM: Maps EBCDIC kanji to the SJIS-IBM area (default value)
NEC: Maps EBCDIC kanji to the SJIS-NEC area
JIS78: SJIS 1978 version
JIS83: SJIS 1983 version (default value)
e.g.
SJIS_OPT NEC
SJIS_OPT JIS78
SJIS_OPT KANA_EXT
SOSI_A2E : SO/SI setting option when converting DBCS to EBCDIC.
The XCV/CV/SAVe/REPlace/COPy/... commands have optional parameters with the same effect which take precedence.
The default value is INS.
INS : Inserts SO(0x0e), SI(0x0f). The output is expanded.
REP : Replaces spaces before and after the DBCS string if there are any, inserts if not.
SHIFT: In addition to REP, absorbs the expansion caused by the insertion by deleting the trailing spaces.
e.g.
SOSI_A2E REP
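The expansion caused by INS can be seen in a small model. The sketch below assumes the converted record is available as a list of (is_dbcs, bytes) segments, which is a made-up representation for illustration; the double-byte value is a placeholder, and REP and SHIFT are not modeled.

```python
SO, SI = 0x0E, 0x0F

def insert_so_si(segments):
    """Wrap every DBCS segment in SO/SI, as the INS option is described."""
    out = bytearray()
    for is_dbcs, data in segments:
        if is_dbcs:
            out += bytes([SO]) + data + bytes([SI])
        else:
            out += data
    return bytes(out)

# EBCDIC "AB", one placeholder double-byte character, then EBCDIC "C".
segments = [(False, b"\xc1\xc2"), (True, b"\x45\x46"), (False, b"\xc3")]
record = insert_so_si(segments)

# 5 input bytes become 7 output bytes because SO and SI are added.
print(len(record), record.hex())
```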
SOSI_E2A : How SO/SI is handled when converting DBCS from EBCDIC.
Commands such as XCV/CV/SAVe/REPlace/COPy/... have optional parameters with the same effect which take precedence.
DEL : Delete SO/SI. This shortens the output line length.
REP : Replace SO/SI with ASCII space (default).
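The difference between DEL and REP is easiest to see in the resulting line length. This is a simplified sketch: handle_so_si is a hypothetical function that scans a byte string for the SO/SI control codes, and the two bytes between them are a placeholder for one converted double-byte character.

```python
SO, SI = 0x0E, 0x0F

def handle_so_si(data, mode="REP"):
    """Delete SO/SI (DEL) or replace each with an ASCII space (REP)."""
    if mode == "DEL":
        return bytes(b for b in data if b not in (SO, SI))
    if mode == "REP":
        return bytes(0x20 if b in (SO, SI) else b for b in data)
    raise ValueError(f"unknown mode: {mode}")

line = b"AB\x0e\x88\xa0\x0fC"

deleted = handle_so_si(line, "DEL")    # two bytes shorter than the input
replaced = handle_so_si(line, "REP")   # same length as the input

print(len(line), len(deleted), len(replaced))
```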
## Sample file ##
xeebc.map
###########################################################################
# CONVERTER 1 # 0:Internal Table, 1:ICU, 2:iconv/WindowsAPI
# ICU_DLL_SUFFIX 44 # ICU dllname suffix
# ICU_API_SUFFIX _44 # ICU apiname suffix
# DBCS_CHARSET cp939 #(Linux)EBCDIC Japanese English lower-case letter extension.
# SBCS_CHARSET cp037 #(ICU)EBCDIC-US
# SBCS_CHARSET 37 #(Windows)EBCDIC-US
# SBCS_CHARSET DefaultMapEuro # for "CONVERTER 0"
# LOCAL_CHARSET ISO-8859-1 #(Linux)Latin-1
# LOCAL_CHARSET 28591 #(Windows Codepage) for ISO-8859-1
#
# SJIS_OPT ENG_EXT # ENG_EXT/KANA_EXT
# SJIS_OPT NEC # IBM/NEC/JIS78/JIS83
#
# MAP_E2A 0xa2: 0x5c # Yen sign and backslash
# MAP_A2E 0x5c: 0xa2 #
# MAP_E2A 0xa1: ~ # tilde and upper bar
# MAP_A2E ~: 0xa1 #
# MAP_E2A 0xa0: ? # tilde and upper bar
#
# SOSI_A2E INS # INS/REP/SHIFT
# SOSI_E2A DEL # DEL/REP
# SUBCHAR_0a 1 #1/0 replace by SBCS substitution char.
# SUBCHAR_S2D 1 #1/0 replace converter output by sub-char when SBCS is translated to pc-DBCS.
##################################################################################
#
## Use Internal mapping table
#
# CONVERTER 0
# SJIS_OPT KANA_EXT
# MAP_E2A 0x15:0x0a
#
## Use ICU , DBCS codepage
#
# CONVERTER 1
# DBCS_CHARSET cp930
#
## Use ICU , SBCS codepage
#
# CONVERTER 1
# SBCS_CHARSET cp037
#
## Use iconv (Linux)
#
# CONVERTER 2
# SBCS_CHARSET cp1047
#
##################################################################################
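For scripts that need to inspect a map file, the line layout used in the samples (keyword, value, optional '#' comment) is simple to read. The following is a minimal sketch with a hypothetical parse_cfg function; it assumes that '#' always starts a comment, so it would mishandle a MAP entry whose literal character is '#', and it does not validate keywords.

```python
def parse_cfg(text):
    """Return (keyword, value) pairs, ignoring comments and blank lines."""
    entries = []
    for raw in text.splitlines():
        # Assumption: everything from '#' to the end of the line is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries

sample = """\
# Use ICU, DBCS codepage
CONVERTER 1          # use ICU converter
DBCS_CHARSET cp930
SOSI_E2A DEL         # DEL/REP
"""

entries = parse_cfg(sample)
for keyword, value in entries:
    print(keyword, "=", value)
```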