2025/08/25



   .NLS support.

      Internal error messages are in English only, except in a Japanese environment.
      This section concerns file encoding and screen display language.

      See "Japanese DBCS, Code conversion" for Japanese.

      For the Linux console version,
      set "OPT LINECH OFF" so that ACS (line-drawing characters) is not used
      if your language uses SBCS code points 0x80 to 0xff.

      (Note) On Windows, it is better not to use UTF8-encoded filenames, which may not be
             translated properly to the UTF-16 that Windows uses internally,
             because Windows assumes the input is in the locale code.

      Windows:
        To use a codepage different from the system default,
        you have to set the DOS prompt codepage for the console version.
        For example, enter "chcp 28591" or "chcp 1252" for German.
        Both 28591 and 1252 cover ISO-8859-1; 1252 additionally assigns characters to 0x80-0x9f.
        For the GUI version, use the /C cmdline parameter, such as "xe /c1252".
        Then select the CharSet from the "Other" combobox (ANSI for ISO-8859-1).
        Beforehand, you have to add the language through the Windows Control Panel
        and change the language selection through the Language bar.
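
        The difference between the two codepages can be checked outside xe. The
        following is a minimal sketch, assuming Python's "latin_1" and "cp1252"
        codecs as stand-ins for codepages 28591 and 1252; decode_both is a
        hypothetical helper name used only for this illustration.

```python
# Compare how codepage 28591 (ISO-8859-1) and codepage 1252 decode the same bytes.
# Assumption: Python's "latin_1" and "cp1252" codecs stand in for the two codepages.

def decode_both(raw: bytes) -> tuple[str, str]:
    """Return (ISO-8859-1 text, Windows-1252 text) for the same byte string."""
    return raw.decode("latin_1"), raw.decode("cp1252")

# 0xE4 (a with diaeresis) is the same character in both codepages.
iso_text, win_text = decode_both(b"\xe4")
print(iso_text == win_text)

# 0x80 is a C1 control code in ISO-8859-1 but the euro sign in 1252.
iso_text, win_text = decode_both(b"\x80")
print(repr(iso_text), repr(win_text))
```

        If a file uses characters in 0x80-0x9f such as the euro sign, 1252 is the
        codepage to try first.
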
      Linux:
        For the console version, you have to set the codepage of the terminal emulator, such as gnome-terminal.
        You also have to set the LANG environment variable to match the terminal emulator encoding
        (UTF8 or not).
        The keyboard layout also has to be considered.
        Use the SCIM setup or the "setxkbmap" cmd, such as "setxkbmap de" for German.
        On FC5, SCIM is operated through System->Management->Preference->SCIM.

        For gxe/wxe, the selected font may support ligatures.
        A ligature combines two glyphs into one glyph for some combinations such as "fi" and "ff".
        If the ligature chkbox is Off, mono-spacing is kept.
        If On, the cursor position may not match the display position.
        gxe/wxe apply ligatures to both UTF8 and locale code files;
        xe displays the character at the byte offset of the cursor position.
        The "OPT LIGATURE" cmd or the LIG cmd (A+";" key) is available.
          For the console version, ligatures are applied to UTF8 files only.
          Ligatures are not applied to files opened as binary.
        I have heard that in some languages, split glyphs are unreadable without ligatures.
        Try combining this with the Unicode combining character option,
        which is set by the "OPT UNICOMB" cmd or the CMB cmd (A+":" key).

      The A+u key ("UTF SWKB" as a cmd) switches the treatment of kbd input between UTF8 and locale code.
      See "UTF8 support" for details.

      -----------------------------------------------------------------------

      The following describes how to display Simplified Chinese (GB18030) in a Japanese environment.
        GB18030 4-byte DBCS and EUC 3-byte supplementary Kanji characters are displayed with tab padding characters.
        The display of padding characters can be toggled on/off with the "TAB" command.
        If an IME is not available, you can enter characters in Hex input mode (toggle with C+F11).

      For Windows,
        For the Console version (xe), set the command prompt codepage, e.g. chcp 54936.
        For the GUI version (wxe), specify the code page parameter, e.g. wxe /c54936.
        In the Setup dialog, set CharSet to ANSI or select GB2312 from Others.
        GB18030 is an extension of GB2312; 4-byte DBCS is also supported with the GB2312 selection.
        You may also need to change the FontStyle.
        For codepages other than GB18030, see "Windows CodePage & Font" below and the command line parameter "-C".
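
        The relation between GB2312 and GB18030 byte lengths can be seen with a
        short experiment. This is only a sketch, assuming Python's "gb2312" and
        "gb18030" codecs; gb_lengths is a hypothetical helper name and is not
        part of xe.

```python
# Show that GB18030 keeps the 2-byte GB2312 codes and adds 4-byte codes.
# Assumption: Python's "gb2312" and "gb18030" codecs model the two encodings.

def gb_lengths(text: str) -> list[int]:
    """Return the GB18030 byte length of each character in text."""
    return [len(ch.encode("gb18030")) for ch in text]

# ASCII letter, a Hanzi that exists in GB2312, a supplementary-plane Hanzi.
sample = "A\u4e2d\U00020000"
print(gb_lengths(sample))

# A GB2312 character keeps the same two bytes in GB18030.
print("\u4e2d".encode("gb2312") == "\u4e2d".encode("gb18030"))
```

        The 4-byte characters are the ones displayed with padding characters as
        described above.
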

      For Linux,
        For the Console version (xe), specify the -Czh_CN.GB18030 command line parameter.
        If you get the error "setlocale failed", run "locale-gen" once.
          sudo locale-gen "zh_CN.GB18030"
        In the terminal emulator settings, select a font that can display kanji.

        For the GUI version (gxe), specify -Czh_CN.GB18030.
        If you also get the error "setlocale failed", run "locale-gen".
        Also select a font that can display kanji with the "Font Change" button in the Setup menu.

      Windows CodePage & Font

           Windows: wingdi.h defines the following.

                #define ANSI_CHARSET            0
                #define DEFAULT_CHARSET         1
                #define SYMBOL_CHARSET          2
                #define SHIFTJIS_CHARSET        128
                #define HANGEUL_CHARSET         129
                #define HANGUL_CHARSET          129
                #define GB2312_CHARSET          134
                #define CHINESEBIG5_CHARSET     136
                #define OEM_CHARSET             255

                #define JOHAB_CHARSET           130
                #define HEBREW_CHARSET          177
                #define ARABIC_CHARSET          178
                #define GREEK_CHARSET           161
                #define TURKISH_CHARSET         162
                #define VIETNAMESE_CHARSET      163
                #define THAI_CHARSET            222
                #define EASTEUROPE_CHARSET      238
                #define RUSSIAN_CHARSET         204

                #define MAC_CHARSET             77
                #define BALTIC_CHARSET          186
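
           Each CharSet value above is a font character set that is usually
           paired with one Windows codepage. The table below is a sketch of
           that commonly documented pairing and is not taken from xe itself.
           The two ALIASES entries follow the advice in this section (ANSI for
           ISO-8859-1, GB2312 for 54936), and charset_for_codepage is a
           hypothetical helper name.

```python
# Font CharSet value (wingdi.h) -> Windows codepage usually paired with it.
CHARSET_TO_CODEPAGE = {
    0: 1252,    # ANSI_CHARSET
    128: 932,   # SHIFTJIS_CHARSET
    129: 949,   # HANGUL_CHARSET
    130: 1361,  # JOHAB_CHARSET
    134: 936,   # GB2312_CHARSET
    136: 950,   # CHINESEBIG5_CHARSET
    161: 1253,  # GREEK_CHARSET
    162: 1254,  # TURKISH_CHARSET
    163: 1258,  # VIETNAMESE_CHARSET
    177: 1255,  # HEBREW_CHARSET
    178: 1256,  # ARABIC_CHARSET
    186: 1257,  # BALTIC_CHARSET
    204: 1251,  # RUSSIAN_CHARSET
    222: 874,   # THAI_CHARSET
    238: 1250,  # EASTEUROPE_CHARSET
}

# Extra codepages mentioned in this section (assumed aliases for this sketch).
ALIASES = {28591: 0, 54936: 134}

DEFAULT_CHARSET = 1  # fallback when no pairing is known


def charset_for_codepage(codepage: int) -> int:
    """Return the CharSet value to try in the setup dialog for a codepage."""
    if codepage in ALIASES:
        return ALIASES[codepage]
    for charset, paired_codepage in CHARSET_TO_CODEPAGE.items():
        if paired_codepage == codepage:
            return charset
    return DEFAULT_CHARSET


print(charset_for_codepage(54936))  # GB2312_CHARSET value
```
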

      Command-line parameter.
        -C  : change locale charset.

              Windows : Codepage.    ex) -c949  -cGerman_Germany.1252
                Use the xcv cmd ("xcv -List") to list the available codepages.
                For the xe console version, the font is determined by the charset
                property of the "command prompt". You may see strange glyphs.
                For wxe, you also have to set the charset in the setup dialog.

              Linux  : Charset       ex) -cGBK, -ciso88591 -czh_CN.GB18030
                The available charsets are displayed by the xcv cmd or "iconv --list".

                The default charset is taken from the LANG environment variable if its charset is not UTF8.
                ex) iso88591 when LANG is "de_DE.iso88591".
                If the charset is UTF8, the charset is selected as follows
                (the first available charset from the left is selected).
                    Locale     Charset
                    ------     -------
                     zh_CN     GB18030,GBK,GB2312
                     ko_KR     UHC,EUC_KR
                     ja_JP     eucjp
                On a fullscreen console, "ISO88591" is selected if iconv supports it,
                otherwise "C".

                Axe uses ICU converters as follows.
                    zh_CN :"GB18030","GBK","GB2312"
                    ko_KR :"korean","EUC-KR"
                    ja_JP :"EUC-JP"
                    zh_TW :"Big5-HKSCS","Big5"
                    else  :"ISO-8859-1"

                For other locales, the charset is obtained by nl_langinfo after setlocale with the locale code only,
                such as "setlocale(LC_ALL,"de_DE")".
                If setlocale fails (chk with the "locale -a" cmd), iso88591 is selected.
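
                The Linux default charset selection described above can be sketched as
                follows. This is an illustration under stated assumptions, not xe's
                actual code: default_charset and UTF8_FALLBACKS are hypothetical names,
                the set of available charsets is passed in by the caller, and the
                nl_langinfo lookup for other locales is replaced by the iso88591
                fallback.

```python
# Candidate charsets per locale when LANG is UTF8, taken from the table above.
UTF8_FALLBACKS = {
    "zh_CN": ["GB18030", "GBK", "GB2312"],
    "ko_KR": ["UHC", "EUC_KR"],
    "ja_JP": ["eucjp"],
}


def default_charset(lang: str, available: set[str]) -> str:
    """Pick a default locale charset from a LANG value (sketch only)."""
    locale_code, _, charset = lang.partition(".")
    charset = charset.split("@")[0]  # drop a modifier such as "@euro"

    # A non-UTF8 charset in LANG is used as is.
    if charset and charset.replace("-", "").lower() != "utf8":
        return charset

    # UTF8: take the first available candidate from the left.
    for candidate in UTF8_FALLBACKS.get(locale_code, []):
        if candidate in available:
            return candidate

    # Assumption: stands in for the nl_langinfo lookup and its failure case.
    return "iso88591"


print(default_charset("de_DE.iso88591", set()))
print(default_charset("zh_CN.UTF-8", {"GBK", "GB2312"}))
```

                Specifying -C on the command line overrides this default.
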

                For gxe, input from GTK is UTF8; gxe translates it to this charset
                and translates it back to UTF8 when displaying to the screen.
                For the xe console version, input from the terminal emulator is translated
                to this charset. If -c is not specified, the default charset is selected
                using the LANG environment variable. (If LANG is UTF8, a proper charset is determined.)
                It is then translated to ucs for display using ncursesw.
                Ex) 0xa4a2 is the Japanese Hiragana pronounced "a", and the same glyph is 0xaaa2 in EUC-KR.
                    When the "a" key and then the Enter key are pressed in the IME window:
                                           Input from IME      glyph
                   ----------------------  ----------------    ------
                   EUC-JP.UTF8 + -cEUC-KR  aaa2( by KR)     "yy"
                   EUC-JP      + -cEUC-KR  a4a2( by JP)     "xx"
                     yy (Japanese Hiragana) and xx (Hangul) cannot be shown on an ASCII screen.
                     (The xe console version may display a space, depending on the terminal emulator's font selection.)
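
                The two rows of the example can be reproduced with ordinary codecs.
                This is a sketch assuming Python's "euc_jp" and "euc_kr" codecs;
                ime_to_glyph is a hypothetical helper that models only the translation
                step, not xe's input handling.

```python
def ime_to_glyph(ime_bytes: bytes, ime_encoding: str, charset: str) -> tuple[bytes, str]:
    """Return (bytes stored under charset, glyph shown) for one IME input."""
    if ime_encoding.replace("-", "").replace("_", "").lower() == "utf8":
        # UTF8 locale: the input is translated to the -c charset.
        stored = ime_bytes.decode("utf-8").encode(charset)
    else:
        # Non-UTF8 locale: the bytes are taken as they are.
        stored = ime_bytes
    return stored, stored.decode(charset)


hiragana_a = "\u3042"  # Hiragana "a"

# UTF8 locale + -cEUC-KR: stored as aaa2, still shown as Hiragana.
stored, glyph = ime_to_glyph(hiragana_a.encode("utf-8"), "utf8", "euc_kr")
print(stored.hex(), hex(ord(glyph)))

# EUC-JP locale + -cEUC-KR: a4a2 is kept and read as a Hangul letter.
stored, glyph = ime_to_glyph(hiragana_a.encode("euc_jp"), "euc_jp", "euc_kr")
print(stored.hex(), hex(ord(glyph)))
```
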


        -Nm : Accept the UTF8 byte sequence itself.
              When /Nm is specified and UTF8 code is input to a CPLC (non-UTF8) file,
              the UTF8 code itself is set if Alt+u is ON (indicated by =u=> on the command input line);
              the translated locale code is set if Alt+u is OFF (===>), or "?" if a translation error occurred.
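
              The three outcomes of -Nm can be sketched as below. This is a minimal
              model under assumptions, not xe's implementation: store_key_input is a
              hypothetical name and Python codec names stand in for the file's locale
              charset.

```python
def store_key_input(utf8_input: bytes, file_charset: str, alt_u_on: bool) -> bytes:
    """Return the bytes written to a non-UTF8 file for one UTF8 key input."""
    if alt_u_on:
        # Alt+u ON (=u=>): keep the UTF8 byte sequence itself.
        return utf8_input
    try:
        # Alt+u OFF (===>): translate to the locale code of the file.
        return utf8_input.decode("utf-8").encode(file_charset)
    except UnicodeError:
        # Translation error: a "?" is set instead.
        return b"?"


a_umlaut = "\u00e4".encode("utf-8")
print(store_key_input(a_umlaut, "latin_1", alt_u_on=True).hex())
print(store_key_input(a_umlaut, "latin_1", alt_u_on=False).hex())
print(store_key_input("\u3042".encode("utf-8"), "latin_1", alt_u_on=False))
```
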