**2024/11/30
= Code Conversion
CV [e2s|s2e|e2e|s2s|s2j|j2s|e2j|j2e] [z] [NX] [.lab1 .lab2]
CV {s2u|u2s|s2f|f2s|u2f|f2u|e2f|f2e|m2f|f2m} [bom] [ucs4] [be|le] [ibm|nec]
[\u] ['err-rep-char']
CV {b2s|s2b|b2e|e2b|b2a|a2b} [CRLF|ANK] [CP290|CP1027|KANA-EXT|ENG-EXT]
[JIS78] [-S[a|d|r|m|s]|-SBCS] ['err-rep-char']
CV {K2L|L2K}
CV x2c [EBC] [i[-m[:{n|max|rep}]]] [mult | -Hpre[XXpost]]
[CRLF|ANK] [CP290|CP1027|KANA-EXT|ENG-EXT] [JIS78] [-S[bcs|a|d|r]
CV {M2B|B2M} [CP290|CP1027|KANA-EXT|ENG-EXT] ['err-rep-char']
CV {M2B|B2M|B2F|F2B} [CPEB:ebcdic-codepage] [-S{a|d|r}]
CV B2B [CPEB:ebcdic-codepage] [-S{a|d|r}] [SETCP]
CV M2M -F:charset -T:charset [-SBCS] [-ANK] [ICU] ['err-rep-char']
DOS version support 1st format only.
[NX] and [.lab1 .lab2] is common to all format.
Utility XCV is contained in the zip file.
e2s :EUC -->SJIS
s2e :SJIS-->EUC
s2s :SJIS hankaku-katakana-->zenkaku-katakana
e2e :EUC hankaku-katakana-->zenkaku-katakana
s2j/j2s :SJIS<-->JIS
e2j/j2e :EUC <-->JIS
s2u/u2s :SJIS<-->UCS2(2byte Unicode)
s2f/f2s :SJIS<-->UTF8(8 bit UTF)
#if defined(W32)||defined(LNX)
e2f/f2e :JapaneseEUC<-->UTF8(8bit UTF)
m2f/f2m :File's CodePage<-->UTF8(8bit UTF)
Between UTF8 and Codepage specified when open the file.
Invalid code is replaced by "?", but there is non reversible conversion
even if not replaced to "?". I those case reverse conversion cannot return
to the source code.
#endif
u2f/f2u :UCS2<-->UTF8
b2s/s2b :IBM-EBCDIC<-->SJIS(Japanese DBCS conversion).
b2e/e2b :IBM-EBCDIC<-->EUC-JP conversion.
b2a/a2b :IBM-EBCDIC<-->ASCII SBCS(SO/SI(0x0e/0x0f) is ignored).
k2L/L2k :HankakuKatakana<-->EnglishLowerCase letter.
When EBCDIC-->ASCII translated lowercase letter to
hankaku-katakana,re-translate the file to recover
lowercase letter.
k2L:re-conv by CP1027 the result of CP290
L2k:re-conv by CP290 the result of CP1027
x2c :convert hex notation string to char(ignore space).
#if (defined(W32)||defined(LNX))
M2B/B2M :translation Local Codepage<->EBCDIC according to a mapfile.
mapfile name is specified on INI file.
Define parameter in the mapfile such as EBCDIC/Local and SBCS mapping to adjust translation.
'err-rep-char' option override the SUBCHAR parameter of the mapfile.
See "EBCDIC translation".
CPEB:codepage pattern allows to specify the EBCDIC codepage other than specified on cfg file.
F2B/B2F :UTF8<->EBCDIC("M" of M2B/B2M is UTF8)
Th file opened as CPU8 dose not accept F2B, open as CPLC then apply F2B.
EOL code is not changed unless /Me option is specified.
ex) "END /Me", "SAVe xx /Me".
Even after saved, you can change EOL code by "EDIt xx /Mue" (change LF to BCDIC EOL:0x15)
Also for B2F, it will be required to change EOL code to (CR)LF.
B2B :translation between different EBCDIC codepage.
Specify target codepage by CPEB:tgtcp( source codepage is of the file)
Translation is done using ICU or iconv(for Linux) according to ::xeebc.map spcification
SETCP :change current file's codepage.
Translated file is displayed using the codepage.
Without SETCP, after translation to the codepage, code is displayed using original codepage.
Use "EBC SETCP:codepage" cmd to change codepage without translation.
M2M :translation between any codepage.
Specify codepage codepage in -f:charset -t:charset option.
Available codepage/charset are list by
"xe -c?" (Windows)
"iconv -l" (Linux)
"uconv -l" (ICU).
Default of /f: and /t: is "0"(=CP_ACP) for Windows,
is determined by current locale and iconv list for Linux.
For Windows, "UTF8" is acceptable as alternative of "6500".
Use not M2M but B2M/M2B/B2B for EBCDIC translation.
#endif
z :hankaku-katakana-->zenkaku-katakana
#ifdef UNX
z of e2e if conversion type missing.
#else
z of s2s if conversion type missing.
#endif
BOM :Check/Write BOM(UTF-8 type mark) on top of file.
effective for u2f,s2f,f2u,f2s.
UCS4 :unicode by 4 byte. for U2F and F2u.
If missing "UCS4" for f2u,xe outputs surrogate pair for ucs4.
BE/LE :unicode endian type. Default is BE for UNIX else LE.
use with u2s,u2f,s2u,f2u.
ICU :use ICU for M2M
IBM/NEC :for the case multiple SJIS code point for a unicode,
specify target mapping region.
use with u2s,f2s. default is IBM.
JIS78 :DBCS conversion by OLD-JIS sequence.
default is JIS83(NEW-JIS) sequence.
ANK :replace control char by 'err-rep-char'(e.g. '?' for B2M),
\u :\uxxxx format unicode input conversion(u2s,u2f).
:\uxxxx format unicode output conversion(s2u,f2u).
CRLF :replace CR,LF by 'err-rep-char'.
EBC :After x2c,do IBM-EBCDIC-->ASCII translation.
SO effect(ShiftOut of Japanese Kanji) continues to next line
except the case SBCS option or rep option is specified.
CP290 :use IBM-CP290(katakana extension) for SBCS conversion.
/CP1027 default table for SBCS conversion is IBM-CP1027(english lowercase letter extension).
KANA-EXT/ :KANA-EXT:CP290
ENG-EXT ENG-EXT:CP1027
i[-m[:{n|max|rep}]]
:Input column. default is TopOfLine. Up to EndOfLine
when m is missing.
n:Output column. default is TopOfLine.
max:Output to the right of longest line.
rep:convert hex digit in the place.
mult :continue hex conversion process even if invalid hex digit
detected.
-Hpre[XXpost]:convert between prefix and postfix.
XX is delimitter of prefix and postfix.
ex). -H0x' -Hx'xx'
-S[a | d | i | r | m | s | bcs]
a :assume SO on the top of line.
bcs :SBCS mode(no DBCS consideration)
d :delete SO(0x0e)/SI(0x0f) for B2x.
SO/SI is not set for x2B.
i :for x2B, insert SO(0x0e)/SI(0x0f).
r :for x2B;replace adjacent space char by SO/SI
or insert SO/SI if no adjacent space char.
for b2x, replace SO/SI to space.
m :for B2c;effect of SO continues to the next line.
s :same as -Sr, but following spaces are reduced by the expanded length
by SO/SI insertion if those are not single space.
default is replace SO/SI by space for B2x, insert SO/SI for x2B.
xeebc.map defines default for B2M/M2B/B2F/F2B.
err-rep-char:substitution char for conversion error.
Char and HEX notation. for SBCS and DBCS.
ex) xcv f2s "."
xcv m2b "6f" "4040"
xcv b2m "?-"
use with u2s,f2s. default is '?'.
NX :limit the line to be converted to not excluded line.
.lab1/2 :limit the line to be converted to that label range.