This article is for developers who run Xbase++ applications on Windows systems with more than 32 logical processors. It explains how the `SetLogicalProcessor()` function works in general and what changes once a machine has more than 32 logical processors. The function manages CPU affinity for Xbase++ applications and services and is based on the process affinity mask concept of the Windows API.
Overview of CPU Affinity and SetLogicalProcessor()
In Xbase++ applications running on Windows, `SetLogicalProcessor()` assigns a process to a specific CPU core, which is known as setting the CPU affinity. CPU affinity is an important tool for performance optimization because it lets an application restrict its execution to a particular logical processor.
The function is tied directly to the Windows affinity mask, which an Xbase++ process sees as a 32-bit integer on both 32-bit and 64-bit Windows. Each bit represents one logical processor, so a single mask can address at most 32 of them. On systems with more than 32 logical processors, this is a significant constraint.
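To make the mask idea concrete, here is a minimal sketch in plain Xbase++ arithmetic of how a logical processor number maps to a bit in a 32-bit affinity mask. The helper names `CpuBit()` and `IsCpuInMask()` are hypothetical and exist only for this illustration. The sketch does not call `SetLogicalProcessor()` and makes no claim about how that function builds its mask internally.

```xbase
// Illustrative sketch only: CpuBit() and IsCpuInMask() are hypothetical
// helpers, not part of the Xbase++ runtime.
PROCEDURE Main()
   LOCAL nCpu, nMask

   // A mask that allows logical processors 0 and 2: bits 0 and 2 are set.
   nMask := CpuBit( 0 ) + CpuBit( 2 )
   ? "Mask for CPUs 0 and 2:", nMask

   // Check the first four logical processors against the mask.
   FOR nCpu := 0 TO 3
      ? "CPU", nCpu, "allowed:", IsCpuInMask( nMask, nCpu )
   NEXT

   // Bit 31 is the highest bit a 32-bit mask can hold.
   ? "Highest bit value:", CpuBit( 31 )
RETURN

// Value of the mask bit that represents logical processor nCpu (0..31).
FUNCTION CpuBit( nCpu )
RETURN 2 ^ nCpu

// .T. if the bit for nCpu is set in nMask.
FUNCTION IsCpuInMask( nMask, nCpu )
RETURN Int( nMask / CpuBit( nCpu ) ) % 2 == 1
```

Because the mask has only 32 bits, a processor number above 31 has no bit to map to, which is exactly the constraint described in this section.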
Limitations Due to Windows Architectural Choices
The primary limitation of `SetLogicalProcessor()` arises from its dependence on the 32-bit affinity mask. In both 32-bit and 64-bit Windows environments, the mask seen by the process covers at most 32 logical processors. The limit goes back to a design decision Microsoft made in the early days of multi-core processing, when machines with more than 32 cores were uncommon.
Microsoft's Workaround: Processor Groups
To address the limitation of the traditional affinity mask and to better support high-performance computing systems, Microsoft introduced the concept of processor groups. Processor groups segment the CPU cores into clusters of up to 32 or 64 logical processors, and Windows manages machines with a high processor count by handling several such groups. Technically, processor groups are closely tied to NUMA nodes, which is why Microsoft sometimes uses the two terms inconsistently.
In practice, when an Xbase++ application starts, it may be assigned to a specific processor group, and `SetLogicalProcessor()` then selects which processors within that group (numbered 0 to 31) the process uses. Note that Windows offers no easy or efficient way to move a process to another group once it has been assigned.
Practical Considerations for Xbase++ Developers
In practice, nodes and groups are treated as logical units for workloads with similar dynamics. Beyond that, the subject is largely theoretical: except for the "number cruncher" use case, the CPUs of a large host system are almost always distributed across many VMs, so the problem does not arise at the application level.
However, the Windows `start` command has a `/node` option that selects the node on which a process starts. For details, run `start /?` in a command shell.
Special Case: Windows Services
Windows services are a notable exception in the management of processor groups: a processor group can be pre-selected before the service is launched. This makes it possible to allocate services to processor groups deliberately, which can improve performance and response times.
Conclusion
For Xbase++ developers who use `SetLogicalProcessor()` to manage CPU allocation and who focus on applications, the limit of 32 logical processors is rarely relevant: because of the isolation concept, applications are typically distributed across multiple VMs once a host system has that many logical CPUs. Although processor groups offer a way around the limitations of the affinity mask, they also require developers to consider new strategies for process management and optimization in multi-core environments. As core counts continue to grow, using these features effectively will be key to achieving optimal application performance.
TL;DR
Use the code below to distribute your applications effectively across the logical processors of your host or VM, and forget about processor groups and nodes from your application's perspective.
Xbase++:
SetLogicalProcessor( RandomInt(GetLogicalProcessorCount()) )