- What is the UNICODE format
- Managing UNICODE
- Implicit conversions
Handling Unicode in character strings
What is the UNICODE format
The UNICODE format is an encoding system that assigns a unique number to each character. This encoding is performed on 16 bits. This number can be read regardless of the platform, software or language used. The Unicode format can support all the character sets of the planet.
The management of the Unicode format is taken into account:
- when managing the character strings
- when managing the data files
- when managing the controls that display data coming from character strings or data files.
Managing UNICODE
To manage UNICODE, WINDEV offers:
- an option to define the string format at runtime in the project configuration description.
  To change the string format in the current configuration:
  - Open the project description window: on the "Project" tab, in the "Project configuration" group, click "Current configuration".
  - In the window that appears, go to the "Unicode" tab and select the desired mode: "Use ANSI strings at runtime" or "Use UNICODE strings at runtime".
- the UNICODE String type.
- conversion functions:
  These functions are used to convert from ANSI to UNICODE and from UNICODE to ANSI. These conversions are also performed automatically when handling character strings (see "Implicit conversions").
| AnsiToUnicode | Converts an ANSI string (Windows) to a UNICODE string, or a buffer containing an ANSI string (Windows) to a buffer containing a UNICODE string. |
| StringToUTF8 | Converts an ANSI or UNICODE string to UTF-8. |
| UnicodeToAnsi | Converts a UNICODE string to ANSI (Windows), or a buffer containing a UNICODE string to a buffer containing an ANSI string (Windows). |
| UTF8ToString | Converts a UTF-8 string to ANSI or UNICODE. |
- the use of functions for handling the character strings:
| Complete | Returns a character string of a specified length. |
| CompleteDir | Adds a backslash to the end of a string, if necessary. |
| ExtractString | Allows you to extract a substring from a string based on a specified string separator, or to search for substrings in a string based on a specified string separator. |
| Left | Extracts the left part (i.e., the first characters) from a string or buffer. |
| Length | Returns the length of a string, i.e., the number of characters in the string (including spaces and binary zeros), or the size of a buffer, i.e., the number of bytes in the buffer. |
| Middle | Extracts a substring from a string starting at a specified position, or part of a buffer starting at a specified position. |
| NoAccent | Converts accented characters in a string to non-accented characters. |
| NoSpace | Returns a string after removing the spaces from the left and right side of the initial string, or within the string. |
| Position | Finds the position of a specified string within another string. |
| PositionOccurrence | Finds the Xth position of a string within another string. |
| RepeatString | Concatenates N number of copies of the same specified string or buffer. |
| Replace | Replaces all occurrences of a specified substring in a string with another specified substring. |
| Reverse | Returns the character that corresponds to the difference between the ASCII code of a specific character in a string and 255. |
| Right | Extracts the right part (i.e., the last characters) from a string or buffer. |
| StringCompare | Compares two strings character by character, either according to the sequence of ASCII characters or according to the alphabetical order. |
| StringCount | Calculates the number of occurrences of a specific character string (by respecting the search criteria) in another character string, or the number of occurrences of a set of strings in an array. |
| StringFormat | Formats a character string according to the selected options. |
| TypeVar | Identifies the type of an expression, a variable (during a call to a procedure for example) or a control. |
| Val | Returns the numeric value of a character string. |
- the use of operators for handling the character strings:
- the use of operators for browsing the character strings:
- the use of functions for handling the text files:
| fWrite | Writes a character string into an external file (ANSI or UNICODE format). |
| fWriteLine | Writes a line into a text file (in ANSI or UNICODE format). |
| fRead | Reads a block of bytes (characters) in an external file (ANSI or UNICODE). |
| fReadLine | Reads a line in an external file (in ANSI or UNICODE format). |
| fOpen | Opens an external file (ANSI or UNICODE) to handle it by programming. |
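As a rough illustration of the conversion and string functions listed above, the following WLanguage sketch converts a string explicitly and then handles the UNICODE result. The variable names and the sample text are illustrative assumptions, and the `ANSI string` type is used only to make the source format explicit regardless of the project configuration.

```wlanguage
// Illustrative variables (names are assumptions, not part of the original page)
sAnsi is ANSI string = "Test of a string"
sUni is UNICODE string
sUTF8 is ANSI string

// Explicit ANSI -> UNICODE conversion
sUni = AnsiToUnicode(sAnsi)

// The string functions accept the UNICODE string directly
Trace("Length: " + Length(sUni))
Trace("First word: " + Left(sUni, 4))
Trace("Position of 'string': " + Position(sUni, "string"))

// UNICODE -> UTF-8, then back to a string
sUTF8 = StringToUTF8(sUni)
sUni = UTF8ToString(sUTF8)

// Explicit UNICODE -> ANSI conversion
sAnsi = UnicodeToAnsi(sUni)
```

The explicit calls are shown here only to make each conversion step visible; in many cases the same assignments can be left to the implicit conversions.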
Implicit conversions
From version 12, conversions between ANSI character strings and UNICODE character strings are performed implicitly. These conversions use the current character set (defined by ChangeCharset).
From version 15, UNICODE character strings are automatically converted when assigned to an HFSQL item of one of the following types: boolean, integer (any size, signed or unsigned), currency, numeric or real.
Use cases:
// With the conversion functions
ResU is UNICODE string = AnsiToUnicode("Test of a string")
// Implicit conversion
ResU is UNICODE string = "Test of a string"

Res is string
ResU is UNICODE string
// With the conversion functions
ResU = AnsiToUnicode(Res)
Res = UnicodeToAnsi(ResU)
// Implicit conversion
ResU = Res
Res = ResU

Res is string
ResU is UNICODE string
// With the conversion functions
IF ResU <> AnsiToUnicode("") THEN ...
IF ResU <> AnsiToUnicode(Res) THEN ...
// Implicit conversion
IF ResU <> "" THEN ...
IF ResU <> Res THEN ...
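// Reading a UNICODE text file line by line
nFile is int = fOpen("E:\temp\Unicode.txt", foRead + foUnicode)
ResU is UNICODE string
ResU = fReadLine(nFile)
WHILE ResU <> EOT
	Trace(ResU)
	ResU = fReadLine(nFile)
END
// Close the file once the end has been reached
fClose(nFile)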
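The read loop above has a natural counterpart for writing. The sketch below is a minimal example, assuming a hypothetical output path and that the `foCreate` constant is combined with `foUnicode` to create the file in UNICODE format; it is meant as an illustration rather than a reference implementation.

```wlanguage
// Hypothetical output path, chosen for illustration only
sPath is string = "E:\temp\UnicodeOut.txt"
ResU is UNICODE string = "Test of a string"

// Create the file and open it in UNICODE format
nFile is int = fOpen(sPath, foCreate + foUnicode)
IF nFile = -1 THEN
	// fOpen returns -1 when the file cannot be opened
	Trace(ErrorInfo(errMessage))
ELSE
	// Write one line, then release the file
	fWriteLine(nFile, ResU)
	fClose(nFile)
END
```

Checking the identifier returned by fOpen before writing avoids calling fWriteLine on a file that was never opened.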