PGI Compiler's Processor Dispatch Code
The PGI compiler inserts the following routines, which are invoked before the user program's main:
    __pgi_ctrl_init
    __pgi_is_gh_b
    __pgi_is_gh
    __pgi_is_amd
    __pgi_align_fault
If the user program is OpenMP-enabled, the __pgi_unified_version_select7 routine is called as well.
Anatomy of __pgi_ctrl_init
This function calls __pgi_is_gh_b to check whether the CPU supports the Misaligned Exception Mask bit in MXCSR (the SSE control/status register). If it does, __pgi_ctrl_init calls __pgi_align_fault to set that bit to 1, so that the misaligned-access exception is not raised. ("gh" is PGI's internal code name for the AMD K10/Barcelona microarchitecture; to confirm, search for the "gh"="barcelona" line in the bin/x86rc file.)
The Misaligned Exception Mask bit is an AMD-only feature, so __pgi_is_gh_b calls __pgi_is_gh, which in turn calls __pgi_is_amd to check whether the CPU vendor ID string is AMD's ("AuthenticAMD"). If so, it executes the cpuid instruction with EAX=0x80000001 and examines bit 7 of ECX. If this bit is 1, the Misaligned Exception Mask bit is supported.
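The checks described above can be sketched in C as pure functions over register values, so the logic can be followed without running cpuid. This is a minimal sketch and not PGI's actual code: the function names are hypothetical, and the position of the mask bit in MXCSR (bit 17) is taken from AMD's documentation, not from the PGI runtime.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild the 12-character vendor string returned by cpuid leaf 0.
   The characters are stored in the order EBX, EDX, ECX, low byte first. */
void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Hypothetical stand-in for __pgi_is_amd: compare the vendor string. */
int is_amd(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char v[13];
    vendor_string(ebx, edx, ecx, v);
    return strcmp(v, "AuthenticAMD") == 0;
}

/* Bit 7 of ECX from cpuid leaf 0x80000001: misaligned SSE mode supported. */
int has_misalign_sse(uint32_t ext_ecx)
{
    return (int)((ext_ecx >> 7) & 1u);
}

/* Return the MXCSR value with the Misaligned Exception Mask bit set
   (bit 17 per AMD's documentation; an assumption of this sketch). */
uint32_t mxcsr_mask_misaligned(uint32_t mxcsr)
{
    return mxcsr | (1u << 17);
}
```

On real hardware the register values would come from the cpuid instruction, and the new MXCSR value would be written back with ldmxcsr. Here they are plain arguments so that each step can be checked in isolation.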
Anatomy of __pgi_unified_version_select7
The pseudocode of __pgi_unified_version_select7 (the PGI Unified Binary version selector) is as follows:
    if (vendor ID string is "GenuineIntel") {
        Call cpuid with EAX=0x80000000
        if (EAX > 0x80000000)
            Call cpuid with EAX=0x80000001
        else
            Set EAX=EBX=ECX=EDX=0
        if (bit 29 of EDX is set) {    /* the CPU supports 64-bit long mode */
            Check for SSE3, SSSE3, SSE4.1 capabilities in sequence
            Save the result in __pgi_uni_ver7
            return
        } else
            Call __pgi_abort
    } else if (vendor ID string is "AuthenticAMD") {
        Call cpuid with EAX=0x80000000
        if (EAX > 0x80000000)
            Call cpuid with EAX=0x80000001
        else
            Set EAX=EBX=ECX=EDX=0
        if (bit 29 of EDX is set) {    /* the CPU supports 64-bit long mode */
            Check for SSE3 and SSE4a capabilities in sequence
            Save the result in __pgi_uni_ver7
            return
        } else
            Call __pgi_abort
    }
    Call __pgi_abort
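The selection logic above can be expressed as a small C function over cpuid results. This is a sketch under stated assumptions, not a reconstruction of PGI's code. The enum names and level codes are hypothetical, since the values actually stored in __pgi_uni_ver7 are not given here. The feature-bit positions are the standard cpuid ones (leaf 1 ECX bit 0 for SSE3, bit 9 for SSSE3, bit 19 for SSE4.1; leaf 0x80000001 ECX bit 6 for SSE4a). "In sequence" is read as "stop at the first missing capability", and a return value of LEVEL_ABORT stands in for the call to __pgi_abort.

```c
#include <stdint.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Hypothetical level codes, invented for this sketch. */
enum level {
    LEVEL_ABORT = -1,
    LEVEL_BASE  = 0,
    LEVEL_SSE3,
    LEVEL_SSSE3,
    LEVEL_SSE41,
    LEVEL_SSE4A
};

/* leaf1_ecx: ECX from cpuid leaf 1.
   ext_ecx, ext_edx: ECX and EDX from cpuid leaf 0x80000001
   (both zero when that leaf is not available). */
int select_version(enum vendor v, uint32_t leaf1_ecx,
                   uint32_t ext_ecx, uint32_t ext_edx)
{
    if (v == VENDOR_OTHER)
        return LEVEL_ABORT;             /* unknown vendor */
    if (((ext_edx >> 29) & 1u) == 0)
        return LEVEL_ABORT;             /* no long mode support */

    int level = LEVEL_BASE;
    if (leaf1_ecx & 1u)
        level = LEVEL_SSE3;

    if (v == VENDOR_INTEL) {
        if (level == LEVEL_SSE3 && ((leaf1_ecx >> 9) & 1u))
            level = LEVEL_SSSE3;
        if (level == LEVEL_SSSE3 && ((leaf1_ecx >> 19) & 1u))
            level = LEVEL_SSE41;
    } else {
        if (level == LEVEL_SSE3 && ((ext_ecx >> 6) & 1u))
            level = LEVEL_SSE4A;
    }
    return level;
}
```

Passing the register values as arguments keeps the decision logic separate from the cpuid calls, which makes it easy to try feature combinations that the host CPU does not have.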
The global variable __pgi_uni_ver7 is referenced by the AMD Core Math Library (ACML) libraries, such as libacml*, and by the CUDA accelerator libraries, such as libacc*.