Unicode Support

The supported file formats for text files and scripts and their notation in popular editors are shown in this table:

AutoIt Notation | Notepad | Notepad++ | SciTE (AutoIt Default Editor)
------------------------------------------------------------------------
UTF8 without BOM | ANSI or UTF-8 depending on content (will force a BOM if saved) | UTF-8 without BOM | UTF-8
UTF8 with BOM | UTF-8 | UTF-8 | UTF-8 with BOM
ANSI | ANSI | ANSI | 8 bit / Code Page Property
UTF16 Little Endian | Unicode | UCS-2 Little Endian | UCS-2 Little Endian
UTF16 Little Endian without BOM | Unicode (will force a BOM if saved) | UCS-2 Little Endian without BOM (can't be selected) | Unsupported
UTF16 Big Endian | Unicode big endian | UCS-2 Big Endian | UCS-2 Big Endian
UTF16 Big Endian without BOM | Unsupported | UCS-2 Big Endian without BOM (can't be selected) | Unsupported

The recommended script format is UTF-8 with BOM, as that works best with notepad.exe and the AutoIt editor SciTE, and it guards against scripts unintentionally being saved in a particular code page.

ANSI formats are not recommended, as they can cause problems when scripts are run on machines with different locales.

UTF16 BE and LE without a BOM are not recommended because few programs support them. Even with a BOM, UTF16 is not particularly common, and for mostly ASCII text it takes up about twice as much disk space as UTF8.

  • File operations on text files that are not opened with FileOpen() and explicit Unicode flags auto-detect the encoding, much as most modern editors do. This includes all file functions that take a filename, for example FileRead("filename.txt"). Specifically:
    • Files containing a BOM will be opened in the relevant mode as per that BOM. UTF-8 and UTF-16 BOMs are checked.
    • UTF-8 and UTF-16 files without a BOM will be automatically detected and opened in the relevant mode.
    • Files containing nulls are opened in Binary ($FO_BINARY) mode by default (unless they are detected as valid UTF-16). Previously they would be opened in ANSI mode. Use the $FO_ANSI flag to override.
    • Files containing only characters 1-127 are opened in UTF-8 with no BOM ($FO_UTF8_NOBOM) mode by default. Previously they would be opened in ANSI mode. Use the $FO_ANSI flag to override.
    • Files containing only characters 1-255 are opened in ANSI ($FO_ANSI) mode by default.
    • Due to the above, FileGetEncoding() now returns 512 ($FO_ANSI) or 256 ($FO_UTF8_NOBOM) instead of 0, which was undocumented but indicated ANSI.
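As a rough illustration of the detection rules above, the following sketch asks AutoIt which encoding it detects for a file, reads the file by filename (auto-detected), and then reads it again with an explicit $FO_ANSI override. The file name is hypothetical and the file is assumed to exist; the $FO_ constants come from FileConstants.au3.

```autoit
#include <FileConstants.au3>
#include <MsgBoxConstants.au3>

; Hypothetical path, used for illustration only.
Local Const $sFile = @ScriptDir & "\example.txt"

; Ask which encoding is detected, e.g. 256 ($FO_UTF8_NOBOM) or 512 ($FO_ANSI).
Local $iEncoding = FileGetEncoding($sFile)
If $iEncoding = -1 Then
    MsgBox($MB_ICONERROR, "Error", "Could not determine the encoding of " & $sFile)
    Exit 1
EndIf

; Reading by filename relies on the auto-detected encoding.
Local $sAuto = FileRead($sFile)

; Opening with an explicit flag bypasses auto-detection.
Local $hFile = FileOpen($sFile, $FO_READ + $FO_ANSI)
If $hFile = -1 Then
    MsgBox($MB_ICONERROR, "Error", "Could not open " & $sFile)
    Exit 1
EndIf
Local $sAnsi = FileRead($hFile)
FileClose($hFile)

; Show the detected mode and the length of the text read each way.
Local $sMsg = "Detected mode: " & $iEncoding & @CRLF
$sMsg &= "Auto-detected read: " & StringLen($sAuto) & " characters" & @CRLF
$sMsg &= "Forced ANSI read: " & StringLen($sAnsi) & " characters"
MsgBox($MB_OK, "Encoding", $sMsg)
```

For a UTF-8 file that contains non-ASCII characters, the two character counts would normally differ, because the forced ANSI read treats each byte as one character.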

Current Limitations

There are a few parts of AutoIt that don't yet have full Unicode support. These are:

  • Send and ControlSend: use ControlSetText or the clipboard functions instead.
  • Console operations are converted to ANSI.

These limitations will be addressed in future versions if possible.
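One possible way to work around the Send limitation is sketched below: the text is written directly into a control with ControlSetText, and if that fails it is placed on the clipboard and pasted with Ctrl+V. The target window (classic Notepad), the control name "Edit1", and the sample text are assumptions chosen for illustration; other applications will need different values.

```autoit
; Build the sample text with ChrW() so the example does not depend on
; the encoding the script itself is saved in.
Local Const $sText = "Unicode test: " & ChrW(0x03A9) & " " & ChrW(0x20AC)

; Assumed target: a classic Notepad window with an "Edit1" control.
Run("notepad.exe")
Local $hWnd = WinWaitActive("[CLASS:Notepad]", "", 10)
If $hWnd = 0 Then Exit 1 ; Window did not appear within 10 seconds.

; First choice: set the control's text directly.
If Not ControlSetText($hWnd, "", "Edit1", $sText) Then
    ; Fallback: put the text on the clipboard and paste it.
    ; "^v" contains only ASCII keystrokes, so Send handles it safely.
    If ClipPut($sText) Then Send("^v")
EndIf
```

Note that ControlSetText replaces the control's existing text, and the clipboard fallback overwrites whatever the user had on the clipboard.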