Analysis of PGI compiler's processor dispatch code

PGI Compiler's Processor Dispatch Code

The PGI compiler inserts the following routines, which are invoked before the user program's main:
__pgi_ctrl_init
__pgi_is_gh_b
__pgi_is_gh
__pgi_is_amd
__pgi_align_fault
If the user program is OpenMP-enabled, the __pgi_unified_version_select7 routine is called as well.

Anatomy of __pgi_ctrl_init

This function calls __pgi_is_gh_b to check whether the CPU supports the Misaligned Exception Mask (MM) bit in MXCSR (the SSE control/status register). If it does, __pgi_ctrl_init calls __pgi_align_fault to set that bit to 1, so that misaligned SSE memory accesses do not raise an exception. ("gh" is PGI's internal code name for the AMD K10/Barcelona microarchitecture; to confirm, search for the "gh"="barcelona" line in the bin/x86rc file.)
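
As a minimal sketch of what "setting that bit" amounts to, the helper below ORs the mask bit into an MXCSR value. It assumes the bit position documented by AMD (bit 17 of MXCSR); the names MXCSR_MM_BIT and mxcsr_with_mm are hypothetical and do not come from the PGI runtime.

```c
#include <stdint.h>

/* Assumption: the Misaligned Exception Mask (MM) is bit 17 of MXCSR,
   as documented for AMD CPUs that report MisAlignSse support. */
#define MXCSR_MM_BIT (1u << 17)

/* Hypothetical helper: return the given MXCSR value with the MM bit set,
   leaving every other control/status bit unchanged. */
static uint32_t mxcsr_with_mm(uint32_t mxcsr)
{
    return mxcsr | MXCSR_MM_BIT;
}
```

On real hardware the value would be read and written with the stmxcsr/ldmxcsr instructions (for example through _mm_getcsr and _mm_setcsr from <xmmintrin.h>). The support check has to come first, because writing a reserved MXCSR bit on a CPU without this feature raises a general-protection fault.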

The Misaligned Exception Mask bit is an AMD-only feature, so __pgi_is_gh_b calls __pgi_is_gh, which in turn calls __pgi_is_amd to check whether the CPU vendor ID string is "AuthenticAMD". If so, it executes the cpuid instruction with EAX=0x80000001 and examines bit 7 of ECX. If this bit is 1, the Misaligned Exception Mask bit is supported.
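
The detection logic described above can be sketched in C as follows. This is an illustration, not the PGI implementation: the function names are hypothetical, and the functions take already-fetched register values as arguments so that the logic stays independent of how cpuid is executed (with GCC or Clang, __get_cpuid from <cpuid.h> is one way to obtain them).

```c
#include <stdint.h>
#include <string.h>

/* CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX
   (in that order), four ASCII characters per register, low byte first. */
static void cpuid_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx,
                         char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
    out[12] = '\0';
}

/* Hypothetical stand-in for the vendor check done by __pgi_is_amd. */
static int is_amd(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char vendor[13];
    cpuid_vendor(ebx, ecx, edx, vendor);
    return strcmp(vendor, "AuthenticAMD") == 0;
}

/* Bit 7 of ECX from CPUID leaf 0x80000001: misaligned SSE mode support. */
static int has_misalign_sse(uint32_t ext_ecx)
{
    return (int)((ext_ecx >> 7) & 1u);
}
```

A caller would pass the EBX/ECX/EDX values from leaf 0 to is_amd and, only if that succeeds, pass ECX from leaf 0x80000001 to has_misalign_sse.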

Anatomy of __pgi_unified_version_select7

The pseudocode of __pgi_unified_version_select7 (the PGI Unified Binary version selector) is as follows:
   if (vendor ID string is "GenuineIntel") {
      Call cpuid with EAX=0x80000000.
      If (EAX > 0x80000000)
         Call cpuid with EAX=0x80000001
      else
         Set EAX=EBX=ECX=EDX=0

      If (bit 29 of EDX is true) {
         /* LM bit: this means the CPU supports 64-bit long mode */
         Check for SSE3, SSSE3, SSE4.1 capabilities in sequence
         Save the result in __pgi_uni_ver7
         return;
      }
      else
         Call __pgi_abort
   }
   else if (vendor ID string is "AuthenticAMD") {
      Call cpuid with EAX=0x80000000.
      If (EAX > 0x80000000)
         Call cpuid with EAX=0x80000001
      else
         Set EAX=EBX=ECX=EDX=0

      If (bit 29 of EDX is true) {
         /* LM bit: this means the CPU supports 64-bit long mode */
         Check for SSE3 and SSE4a capabilities in sequence
         Save the result in __pgi_uni_ver7
         return;
      }
      else
         Call __pgi_abort
   }

   Call __pgi_abort
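
The pseudocode above can be restated as a small C sketch. Several things here are assumptions rather than facts about the PGI runtime: the encoding of the value stored in __pgi_uni_ver7 is not known, so the VER_* flags are hypothetical; the feature bits are read from their standard CPUID locations (leaf 1 ECX bit 0 for SSE3, bit 9 for SSSE3, bit 19 for SSE4.1; leaf 0x80000001 ECX bit 6 for SSE4a); and a return value of -1 stands in for the call to __pgi_abort. The register values are passed in through a hypothetical struct so the logic can be exercised without executing cpuid.

```c
#include <stdint.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Hypothetical result flags; the real __pgi_uni_ver7 encoding is unknown. */
enum {
    VER_SSE3  = 1 << 0,
    VER_SSSE3 = 1 << 1,
    VER_SSE41 = 1 << 2,
    VER_SSE4A = 1 << 3
};

/* Hypothetical container for CPUID results gathered by the caller. */
struct cpu_regs {
    enum vendor vendor;     /* decoded from the leaf 0 vendor string */
    uint32_t max_ext_leaf;  /* EAX from leaf 0x80000000 */
    uint32_t leaf1_ecx;     /* ECX from leaf 1 */
    uint32_t ext1_ecx;      /* ECX from leaf 0x80000001 */
    uint32_t ext1_edx;      /* EDX from leaf 0x80000001 */
};

/* Returns a set of VER_* flags, or -1 where the pseudocode aborts. */
static int select_version(const struct cpu_regs *r)
{
    uint32_t ext_ecx = 0, ext_edx = 0;
    int ver = 0;

    if (r->vendor == VENDOR_OTHER)
        return -1;                      /* neither Intel nor AMD */

    /* Leaf 0x80000001 is only valid if the CPU reports it exists;
       otherwise the registers are treated as zero. */
    if (r->max_ext_leaf > 0x80000000u) {
        ext_ecx = r->ext1_ecx;
        ext_edx = r->ext1_edx;
    }

    if (!((ext_edx >> 29) & 1u))
        return -1;                      /* no 64-bit long mode */

    if (r->leaf1_ecx & 1u)
        ver |= VER_SSE3;
    if (r->vendor == VENDOR_INTEL) {
        if ((r->leaf1_ecx >> 9) & 1u)  ver |= VER_SSSE3;
        if ((r->leaf1_ecx >> 19) & 1u) ver |= VER_SSE41;
    } else {
        if ((ext_ecx >> 6) & 1u)       ver |= VER_SSE4A;
    }
    return ver;
}
```

Returning a flag set is a design choice made for the sketch; the original routine may instead record only the highest level found when it checks the capabilities "in sequence".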

The global variable __pgi_uni_ver7 is referenced by the AMD Core Math Library (ACML) libraries, such as libacml*, and by the CUDA accelerator libraries, such as libacc*.