SET LEXICAL ON, effects on string comparison and character sorting in indexes


Pat France

Aug 9, 2022
SET LEXICAL ON activates character weighting and matching based on lexical rules, which the application can change and extend.
This feature is particularly useful for language-specific characters, such as accented letters or the German umlauts. For example, a non-native user may enter the name "Kovaç" as "Kovac", and "König" may be entered as "Koenig" or "Konig", depending on the user's keyboard or habits. By defining suitable lexical rules with SetLexRule(), the application can have these strings treated as identical, so that the search term "Koenig" also finds "König". The scheme even extends to character combinations, so that "McDonald's" can be treated the same as "Mac. Donalds".
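
The idea of treating differently spelled strings as identical can be sketched as a normalization step applied before comparing. The block below is a conceptual model in Python, not Xbase++ code: it does not call SetLexRule(), and the rule table `LEX_RULES` and the helpers `lex_key()` and `lex_equal()` are hypothetical names chosen for illustration. How the Xbase++ runtime applies its rules internally is not described here and may differ.

```python
# Conceptual model only: this is Python, not Xbase++, and does not use SetLexRule().
# LEX_RULES is a hypothetical rule set: each pattern is treated like its replacement.
LEX_RULES = {
    "\u00f6": "o",   # o with umlaut is treated like "o"
    "oe": "o",       # the two-character spelling "oe" is treated like "o"
    "\u00e7": "c",   # c with cedilla is treated like "c"
}

def lex_key(text, rules=LEX_RULES):
    """Build a comparison key, trying longer patterns before shorter ones."""
    patterns = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for pat in patterns:
            if text.startswith(pat, i):
                out.append(rules[pat])
                i += len(pat)
                break
        else:
            # No rule matched at this position: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

def lex_equal(a, b, rules=LEX_RULES):
    """Rule-based equality: two strings match if their comparison keys match."""
    return lex_key(a, rules) == lex_key(b, rules)

print(lex_equal("Koenig", "K\u00f6nig"))   # True
print(lex_equal("Kovac", "Kova\u00e7"))    # True
print("Koenig" == "K\u00f6nig")            # False: plain comparison sees different characters
```

In this model, Python's plain `==` plays the role of an exact (binary) comparison, while `lex_equal()` plays the role of a comparison performed under lexical rules. Note that a broad rule such as "oe" also affects words where the combination is not a substitute for an umlaut, so a rule set needs to be chosen with the actual data in mind.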

Activating lexical character comparison with SET LEXICAL ON changes the system's behavior in several areas.

- SET LEXICAL affects the simple equality operator (=), the ordering operators (<, <=, >, >=) and the not-equal operators (<>, !=, #)
The exact equal operator (==) must be used when checking for exact (binary) equality.

- SET LEXICAL takes precedence over SET EXACT
Code relying on specific SET EXACT behavior regarding trailing blanks in strings must be tested carefully with SET LEXICAL ON.

- SET LEXICAL affects sorting when creating indexes with the native Xbase++ indexing engines (CDXDBE, NTXDBE). An index created with SET LEXICAL ON establishes an order with respect to the lexical rules defined by the application. However, the lexical rules used when creating the index are not stored in the index file.
For this reason, searching and maintaining an index, including a complete reindex, should always be performed with the same SET LEXICAL setting and rule set that were in effect when the index was created.

- DbSeek() respects the lexical rules activated with SET LEXICAL ON. Searches for "Kovaç" and for "Kovac" match the same record if rules were specified accordingly.
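
Why an index must be searched with the same setting and rules it was built with can be illustrated with a small model. The following Python sketch is not Xbase++ code and says nothing about how CDXDBE or NTXDBE store keys; `lexical_key()`, `binary_key()`, `build_index()` and `seek()` are hypothetical helpers, and the rule set is an assumption made for the example. The model stores the comparison key in a sorted list and looks it up by binary search.

```python
import bisect

def lexical_key(text):
    """Hypothetical rule set: treat o-umlaut and "oe" like "o", c-cedilla like "c"."""
    for pattern, replacement in (("\u00f6", "o"), ("oe", "o"), ("\u00e7", "c")):
        text = text.replace(pattern, replacement)
    return text

def binary_key(text):
    """No rules: compare the characters exactly as stored."""
    return text

def build_index(names, key_fn):
    """Return (key, name) pairs sorted by key, like an index built under one rule set."""
    return sorted((key_fn(name), name) for name in names)

def seek(index, term, key_fn):
    """Binary-search the index for the first entry whose key equals key_fn(term)."""
    keys = [key for key, _ in index]
    wanted = key_fn(term)
    pos = bisect.bisect_left(keys, wanted)
    if pos < len(keys) and keys[pos] == wanted:
        return index[pos][1]
    return None

names = ["Kova\u00e7", "K\u00f6nig", "Kramer", "Kohl"]
lex_index = build_index(names, lexical_key)

# Same rules for building and searching: alternative spellings find the record.
print(seek(lex_index, "Koenig", lexical_key))
print(seek(lex_index, "Kovac", lexical_key))

# Index built with rules, searched without them: the record exists but is missed.
print(seek(lex_index, "K\u00f6nig", binary_key))
```

In this model the last lookup returns `None` even though the searched name is in the data, because the search key is computed differently from the keys the order was built on. Since the rules are not stored with the index, nothing in the model (or, per the description above, in the index file) can detect the mismatch.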

Examples for using SET LEXICAL and SetLexRule() can be found here: https://doc.alaska-software.com/content/cmd_xppfcref_set_lexical.html
 