A brief history of Linux Libc
Libc 4 | A fork of GNU Libc version 1: a.out executable binaries. |
Libc 5 | Still a fork of GNU Libc version 1: ELF executable binaries. |
Libc 6 | The same as GNU Libc version 2: POSIX compliance, 64-bit support, IPv6, i18n. Moving to Glibc2 was heralded by Red Hat in December 1997, with their release of Red Hat Linux 5.0. |
GNU Libc availability
GNU libc releases and development trunk
What's contained in GNU libc? (libdl, libpthread, librt, and all that)
General tips for building GNU libc
GNU libc's "INSTALL" and "FAQ" documents
Alternatives to GNU Libc
Development trunk of FreeBSD's libc, Development trunk of OpenBSD's libc
Stripped-down versions for embedded systems: uClibc, EGLIBC (Embedded GLibc), Newlib, Android Bionic
C version of C++'s STL: GLib (GTK+/GNOME Lib), APR (Apache Portable Runtime), NSPR (Netscape Portable Runtime)
Feature test macros
Certain functions/features will be available if the following macros are defined:
_LARGEFILE_SOURCE _LARGEFILE64_SOURCE |
Enables 32 bit systems to use files of sizes beyond the usual limit of 2GB.
fseeko and ftello will be available. |
_FILE_OFFSET_BITS | If this macro is defined to be 64, then the large file interface replaces the old interface, i.e. the functions are not made available under different names as in _LARGEFILE64_SOURCE. |
_GNU_SOURCE | If this macro is defined, then all GNU libc features are enabled: everything covered by the standard feature sets (ISO C89, ISO C99, POSIX, _LARGEFILE_SOURCE, etc.) plus the GNU extensions. |
_REENTRANT _THREAD_SAFE | If this macro is defined, reentrant versions of several functions get declared. |
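As an illustration of the large file macros above, here is a minimal sketch that defines _FILE_OFFSET_BITS as 64 and uses fseeko and ftello to measure a stream. The function name stream_size is hypothetical and chosen only for this example; the macro must appear before the first system header is included.

```c
/* Must be defined before any system header is included. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Return the size of an open stream in bytes, or -1 on error.
   The current file position is restored before returning.
   stream_size is a hypothetical helper name. */
off_t stream_size(FILE *fp)
{
    assert(fp != NULL);

    off_t saved = ftello(fp);           /* remember where we are */
    if (saved == (off_t)-1)
        return -1;
    if (fseeko(fp, 0, SEEK_END) != 0)   /* jump to the end of the file */
        return -1;

    off_t size = ftello(fp);            /* offset at the end == file size */

    if (fseeko(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Because off_t is 64 bits wide under this macro, the same code should work for files larger than 2GB on 32-bit systems.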
Run-time environment
program_invocation_name program_invocation_short_name |
program_invocation_name is the same as argv[0], and program_invocation_short_name
is the same name with the directory part removed.
To use, #include <errno.h> and #define _GNU_SOURCE. On Linux, one can obtain similar information with readlink("/proc/self/exe", buffer, BUFFER_SIZE), which gives the path of the executable, or by reading the contents of /proc/self/cmdline. |
atexit on_exit |
Register a callback function that is invoked at normal program termination.
on_exit allows the callback to take a pointer argument. |
gnu_get_libc_version gnu_get_libc_release | Get GNU libc version/release information. |
confstr | Get GNU libc's pthread version information (use _CS_GNU_LIBPTHREAD_VERSION argument) |
get_nprocs_conf get_nprocs |
Get number of processors in the system (configured and actually on-line ones).
Alternatively, one can use sysconf(_SC_NPROCESSORS_CONF) and sysconf(_SC_NPROCESSORS_ONLN). |
getpagesize sysconf(_SC_PAGESIZE) | Get the page size of the process. |
get_phys_pages sysconf(_SC_PHYS_PAGES) get_avphys_pages sysconf(_SC_AVPHYS_PAGES) | Get the total number of physical pages and the number of currently available physical pages. |
backtrace backtrace_symbols | Produce backtrace of call stack. |
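The sysconf queries listed above can be combined as in the following sketch. The struct system_summary type and the get_system_summary function are hypothetical names introduced for this example; only the sysconf calls and their _SC_ constants come from the table.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical container for the values queried below. */
struct system_summary {
    long page_size;        /* bytes per page */
    long cpus_configured;  /* processors configured */
    long cpus_online;      /* processors currently online */
    long long phys_bytes;  /* total physical memory in bytes */
};

/* Fill *out using sysconf. Returns 0 on success, -1 if any query fails. */
int get_system_summary(struct system_summary *out)
{
    assert(out != NULL);

    long pages = sysconf(_SC_PHYS_PAGES);
    out->page_size = sysconf(_SC_PAGESIZE);
    out->cpus_configured = sysconf(_SC_NPROCESSORS_CONF);
    out->cpus_online = sysconf(_SC_NPROCESSORS_ONLN);

    if (pages < 0 || out->page_size < 0 ||
        out->cpus_configured < 0 || out->cpus_online < 0)
        return -1;

    /* Widen before multiplying so the product cannot overflow a 32-bit long. */
    out->phys_bytes = (long long)pages * out->page_size;
    return 0;
}
```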
Dynamic link libraries
Note: The LD_-prefix environmental variables are recognized by the runtime linker ld.so. A complete explanation can be found in ld.so's man page.
dlopen dlsym dlvsym dlclose dlerror dladdr dladdr1 dlinfo |
Load dynamic link libraries and functions in them.
dlvsym does the same as dlsym but takes a version string as an additional argument. |
dladdr | Inquire at runtime to find out which shared object contains the given function pointer |
dl_iterate_phdr | Inquire at runtime to find out which shared objects it has loaded; it walks through list of shared objects. |
LD_LIBRARY_PATH | A colon-separated list of directories to search for DLLs |
LD_RUN_PATH |
A colon-separated list of directories to search for DLLs used during link time.
This environmental variable is used by the linker, not the dynamic linker. |
LD_PRELOAD | A whitespace-separated list of DLL file names to be loaded before all other DLLs. |
LD_DEBUG |
Set to a non-empty string to display debugging information during runtime. The string
can be libs (display library search paths),
reloc (display relocation processing),
files (display progress for input file),
symbols (display symbol table processing),
bindings (display symbol binding information),
versions (display version dependencies),
statistics (display relocation statistics),
all (display everything),
help (display available options)
Use LD_DEBUG_OUTPUT to set the output file name. |
LD_TRACE_LOADED_OBJECTS | Set to a non-empty string to display DLL loading during runtime but do not execute the executable binary. The effect is like running the ldd command. |
LD_VERBOSE | Display DLL versions when LD_TRACE_LOADED_OBJECTS is in effect. |
LD_BIND_NOW | Set to a non-empty string to enable the dynamic linker ld.so to resolve all names at program startup instead of lazy linking. |
LD_AUDIT |
Similar to LD_PRELOAD, but load auditing libraries instead.
These auditing libraries can be used to monitor dynamic loading during runtime.
See rtld-audit's man page for details. |
LD_DYNAMIC_WEAK | Set to 1 to allow weak symbols in the DLL to be overridden by the user's code. This variable is ignored for set-user-ID/set-group-ID programs. |
LD_PROFILE |
Set to DLL file name to enable profiling of that DLL.
Use LD_PROFILE_OUTPUT to set the output file name. |
LD_SHOW_AUXV | Set to 1 to show the content of auxiliary array passed up from the kernel. |
LD_USE_LOAD_BIAS | Set to 0 to disable prelinking. |
LD_ORIGIN_PATH | The $ORIGIN in RPATH or RUNPATH will be expanded to the value of LD_ORIGIN_PATH. |
/etc/ld.so.conf /etc/ld.so.cache |
Cache of DLL paths.
/etc/ld.so.conf is a text file. /etc/ld.so.cache is a binary file. To see its contents, use the following command: ldconfig -p. For details, see ldconfig's man page |
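The dlopen/dlsym/dlclose/dlerror calls listed at the top of this section are typically used together. The following is a minimal sketch, assuming the math library is installed under the soname libm.so.6 (an assumption that holds on typical glibc systems, not something the table states). The function name cos_via_dlopen is hypothetical. On older glibc versions the program may need to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Load the math library at run time, look up cos, and call it.
   Returns 0 on success and stores the value in *result, or -1 on failure. */
int cos_via_dlopen(double x, double *result)
{
    assert(result != NULL);

    void *handle = dlopen("libm.so.6", RTLD_NOW);  /* assumed soname */
    if (handle == NULL)
        return -1;                 /* dlerror() would describe the failure */

    dlerror();                     /* clear any stale error state */
    void *sym = dlsym(handle, "cos");
    if (dlerror() != NULL || sym == NULL) {
        dlclose(handle);
        return -1;
    }

    /* Convert the object pointer to a function pointer the way the
       POSIX dlsym documentation suggests. */
    double (*cos_fn)(double);
    *(void **)&cos_fn = sym;

    *result = cos_fn(x);
    dlclose(handle);
    return 0;
}
```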
Memory management
brk sbrk | Low-level system calls for memory allocation (actually, resize the data segment). |
mmap |
Memory (must be at least the size of a page)
allocated via mmap (either with the MAP_ANONYMOUS
flag or by mapping /dev/zero) will be returned
to the OS immediately on the munmap call.
This is not true for the traditional malloc/calloc calls. |
xmalloc xcalloc xrealloc xstrdup xmemdup | [libiberty] Allocate memory without fail. If malloc fails, this will print a message to stderr (using the name set by xmalloc_set_program_name, if any) and then call xexit. |
obstack | The memory is organized as a stack of objects. The allocation/deallocation is faster than malloc/free. |
alloca strdupa |
Allocate memory from the stack instead of from the heap. Alternatively, one can use GNU C's variable-length arrays. strdupa is the same as strdup, except it uses alloca instead of malloc. |
mallopt | Set tunable parameters for malloc. For example, one can specify that all chunks larger than a certain size are allocated using mmap (M_MMAP_THRESHOLD). |
memalign posix_memalign valloc | Allocate aligned memory blocks. |
posix_madvise madvise |
Announce an intention to access
memory in a specific pattern (e.g. sequential, random, don't need, etc)
in the future, thus allowing the kernel
to perform appropriate optimizations.
madvise is actually a Linux system call and is not part of GNU libc. |
alloc_hugepages free_hugepages |
Allocate or free huge pages.
This is actually a Linux system call and is not part of GNU libc. The Linux kernel must be configured with the hugeTLB feature ("CONFIG_HUGETLB_PAGE"). |
__malloc_hook __realloc_hook __free_hook __memalign_hook __malloc_initialize_hook |
Register callback functions, which are invoked on calls to malloc, realloc, free, etc. |
mcheck mprobe |
This causes malloc to perform occasional consistency checks
that catch errors such as writing past the end of an allocated memory block.
This is implemented via the memory allocation hooks above. |
MALLOC_CHECK_ |
Set this environmental variable to 1 to enable run-time memory
check. Set it to 2 to enable program termination on any memory check failure. |
mtrace muntrace |
Enable/disable malloc tracing facility. When these functions are used, one can set environmental variable MALLOC_TRACE to point to a file name for output. |
memusage.sh | This is a shell script in the glibc distribution (under the malloc/ directory in the glibc source tree). It uses LD_PRELOAD to load libmemusage.so, which will provide a multi-colored table of heap memory usage when the user program terminates normally. |
mallinfo | Get the statistics for malloc. |
getpagesize sysconf(_SC_PAGESIZE) | Get the page size of the process. |
get_phys_pages sysconf(_SC_PHYS_PAGES) get_avphys_pages sysconf(_SC_AVPHYS_PAGES) | Get the total number of physical pages and the number of currently available physical pages. |
mlock munlock mlockall munlockall mprotect getrlimit(RLIMIT_MEMLOCK,...) |
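Lock/unlock a (virtual) memory range. Locked pages will not be swapped out. mprotect changes the access protection of a memory range. Usually only the superuser can lock pages without limit; getrlimit(RLIMIT_MEMLOCK, ...) returns the maximum number of bytes that an unprivileged process may lock. |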
String processing
dyn_string | [libiberty] C++ like strings |
strdupa | strdupa is the same as strdup, except it uses alloca instead of malloc |
strsep | strsep is the preferred way to tokenize a string. It can
replace the thread-safe strtok_r. For example:
    const char string[] = "words separated by spaces -- and, punctuation!";
    const char delimiters[] = " .,;:!-";
    char *running, *token;
    running = strdupa (string);
    token = strsep (&running, delimiters);    /* token => "words" */
    token = strsep (&running, delimiters);    /* token => "separated" */
    token = strsep (&running, delimiters);    /* token => "by" */ |
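mempcpy stpcpy | These are like memcpy and strcpy, except that the return value points to the end of the copied data: mempcpy returns a pointer to the byte following the last written byte, and stpcpy returns a pointer to the terminating null byte of the destination string. |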
memccpy | Like memcpy, except it stops if a byte matching a specified character is met. |
strverscmp | Compare two strings as if they hold indices/version numbers. |
memmem | Like strstr, except that it works on arbitrary byte arrays with explicit lengths. |
fnmatch glob | Filename style wildcard/pattern matching. |
regcomp regexec regfree | POSIX regular expression matching. |
strspn strcspn |
Find the length of the first segment (of a string) consisting entirely of characters contained
within a given mask ("skip set").
strcspn will find length of initial segment (of a string) not matching mask ("stop set") |
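The skip-set and stop-set behaviour of strspn and strcspn can be sketched as follows. The function first_word is a hypothetical helper written for this example; it skips leading delimiters with strspn and then measures the word with strcspn.

```c
#include <assert.h>
#include <string.h>

/* Locate the first word in s. Stores a pointer to its first character
   in *start and returns its length (0 if s has no word). */
size_t first_word(const char *s, const char *delims, const char **start)
{
    assert(s != NULL && delims != NULL && start != NULL);

    s += strspn(s, delims);      /* skip leading delimiters ("skip set") */
    *start = s;
    return strcspn(s, delims);   /* length up to next delimiter ("stop set") */
}
```

Unlike strsep, this approach does not modify the input string, so it also works on string literals.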
strfry | Randomize the input string (actually, pseudorandom anagram) using rand function. |
memfrob | Convert (frobnicate) an array of data to something unrecognizable and back again. It is like ROT13 encryption, except it works on arbitrary binary data. |
l64a a64l | Convert between 32-bit integer and base-64 strings. |
strtod strtol strtoul | Convert strings (in hex, decimal, etc) to numbers. |
human_readable | [Gnulib] Convert numbers to human readable format. |
strvis
strunvis |
[BSD] Convert binary stream into a printable format. |
crc32 | [libiberty] Compute the 32-bit CRC of a buffer. |
argz envz |
Create Unix-style argument vector argv.
Envz vectors are just argz vectors, except that each element in an envz vector is a name-value pair, separated by "=" character. |
dupargv | [libiberty] Duplicate an argument vector. |
freeargv | [libiberty] Free an argument vector. |
Error handling
To use the following functions, #include <errno.h>
strerror strerror_r | Map the error code (usually stored in the global variable errno) to the corresponding error message. |
perror | Print the error message corresponding to the global variable errno to stderr. If called with a non-NULL, non-empty string, that string and a colon are printed before the error message. |
error error_at_line | A printf-like alternative to perror. It can also
optionally call exit or continue execution. This is the preferred generic error reporting function. |
err verr | Very much like error, except that the errnum argument
is automatically supplied from errno.
verr is the variant that takes a va_list argument instead of a variable-length argument list. |
error_message_count | This variable is incremented each time error or error_at_line is called. |
fmtmsg addseverity | Print formatted (error/warn/info) messages. This is System V extension. |
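As a small illustration of the errno and strerror entries above, the following sketch reports a failed fopen in the same "prefix: message" form that perror produces. The function name try_open is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open path for reading. Returns 0 on success; on failure,
   prints a message to stderr and returns the errno value. */
int try_open(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int saved = errno;   /* save it before another call can change it */
        fprintf(stderr, "%s: %s\n", path, strerror(saved));
        return saved;
    }
    fclose(fp);
    return 0;
}
```

Saving errno first matters because fprintf itself may set errno before strerror gets to read it.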
Data structures & algorithms
lsearch lfind |
Perform linear search. lsearch will append the element if it is not found, while lfind will not. |
bsearch | Perform binary search on a sorted array. |
hcreate hsearch hdestroy | Create and manage a hash search table (which has a predefined fixed size). There are also thread-safe reentrant versions of these functions (hcreate_r, hsearch_r, hdestroy_r). |
Hash | [Gnulib] Create and manage a hash search table. |
uthash | Macros for creating and managing a hash search table. |
tsearch tfind twalk tdelete |
Create and manage a binary search tree. See examples here |
insque remque |
Manipulate queues built from doubly linked lists |
fibheap | [libiberty] Fibonacci heap |
splay_tree | [libiberty] Splay tree |
SPLAY_XXXX | [OpenBSD] Macros for manipulating splay trees |
RB_XXXX | [OpenBSD] Macros for manipulating red-black trees |
SLIST_XXXX | Macros for manipulating singly-linked lists. Source code here |
STAILQ_XXXX | Macros for manipulating singly-linked tail queues |
LIST_XXXX | Macros for manipulating doubly-linked lists |
TAILQ_XXXX | Macros for manipulating doubly-linked tail queues |
CIRCLEQ_XXXX | Macros for manipulating circular doubly-linked tail queues |
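For the tsearch/tfind entry in this section, here is a minimal sketch of a set of integers stored in a binary search tree. The helper names compare_ints, insert_unique and contains are hypothetical. The sketch assumes every key pointer passed in is distinct and stays valid while the tree exists, since the tree stores pointers rather than copies.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>

/* Three-way comparison of the ints that two key pointers refer to. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Insert *key into the tree at *root.
   Returns 1 if it was newly inserted, 0 if an equal key was already
   present, -1 if tsearch could not allocate a node. */
int insert_unique(void **root, const int *key)
{
    assert(root != NULL && key != NULL);

    void *node = tsearch(key, root, compare_ints);
    if (node == NULL)
        return -1;
    /* A node starts with the stored key pointer; if it is ours,
       the key was not in the tree before. */
    return *(const int *const *)node == key;
}

/* Return 1 if value is in the tree, 0 otherwise. */
int contains(void *const *root, int value)
{
    assert(root != NULL);
    return tfind(&value, root, compare_ints) != NULL;
}
```

Cleanup is omitted here; nodes can be removed one at a time with tdelete.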
Formatted I/O
asprintf | This is like sprintf, except that it allocates a buffer large enough to hold the output; the caller must free it. |
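scanf | A GNU extension allows scanf to automatically allocate a big enough buffer, by specifying the 'a' flag character. Newer code should use the equivalent 'm' modifier from POSIX.1-2008, since 'a' conflicts with the C99 %a conversion. |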
parse_printf_format | Get information about the number and types of arguments expected by the printf template string. |
register_printf_function |
Register a callback function to handle a user-defined
conversion specifier character used in printf. A pair of predefined callback functions are printf_size and printf_size_info, which print floating-point numbers with a k, m, g, ... suffix, e.g. 1024 will be printed as 1k. |
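The asprintf entry above can be illustrated with the following sketch. The function format_point is a hypothetical helper. Since asprintf is a GNU extension, _GNU_SOURCE is defined before any header is included.

```c
#define _GNU_SOURCE   /* asprintf is a GNU extension */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>   /* free, which the caller needs for *out */

/* Format a 2-D point as "(x, y)" into a newly allocated string.
   Returns the string length and stores the string in *out, or returns
   -1 and stores NULL on failure. The caller frees *out. */
int format_point(char **out, int x, int y)
{
    assert(out != NULL);

    int n = asprintf(out, "(%d, %d)", x, y);
    if (n < 0)
        *out = NULL;   /* the buffer contents are undefined on failure */
    return n;
}
```

There is no need to guess a buffer size up front, which is the main advantage over sprintf.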
Files & I/O
freopen |
A combination of fclose and fopen. This is useful for redirecting
predefined streams like stdin, stdout and stderr to specific files:
    freopen ("myfile.txt", "w", stdout);
    printf ("This sentence is redirected to a file.");
    fclose (stdout); |
fpathconf pathconf | Get system configuration values (e.g. maximum length of a filename) for files. |
fstatfs statfs fstatvfs statvfs |
Get file system statistics. |
__freadable __fwritable | Test if a file stream is read/writeable. |
fdmatch | [libiberty] Test if two file descriptors refer to the same file. |
fcloseall | Close all file streams of a process. |
_flushlbf | Flush all file streams of a process. |
__fpurge fclean |
Empty the buffer of a file stream.
fclean will force output for output streams and give the data in the buffer back to system for input streams. |
setvbuf setbuf setbuffer |
Set the buffering mode (e.g. _IOFBF for full buffering, _IOLBF for line buffering, or _IONBF for unbuffered) of a file stream.
BUFSIZ is an integer constant that is used by setbuf. |
__flbf __fbufsize __fpending | Query the buffering mode, buffer size, and pending number of bytes of a file stream. |
flockfile ftrylockfile funlockfile |
Lock a file stream (in a multithreading program).
Note: fcntl can do the same for file descriptors. |
fputc_unlocked fgetc_unlocked fputs_unlocked fgets_unlocked fread_unlocked fwrite_unlocked |
These are like the fputc, fgetc..., except they do not
implicitly lock the file streams (in a multithreading program).
The implicit locking of fputc, fgetc... can be controlled by the __fsetlocking function. |
getline getdelim |
Read an entire line from a file stream and store text
in a buffer. getline is preferred over fgets because it will enlarge the buffer if the buffer is too small to hold the entire line. getdelim is like getline, except that the character which tells it to stop reading is not necessarily a newline. To use, #include <stdio.h>, #include <stdlib.h>, #include <string.h>, and #define _GNU_SOURCE. |
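A typical getline loop looks like the following sketch. The function longest_line is a hypothetical helper. The example assumes a glibc recent enough to declare getline under the default feature macros; on older versions, _GNU_SOURCE would be needed as noted above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Return the length of the longest line in fp, not counting the
   trailing newline, or -1 on a read error. */
long longest_line(FILE *fp)
{
    assert(fp != NULL);

    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;
    ssize_t len;
    long longest = 0;

    while ((len = getline(&line, &cap, fp)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            len--;                      /* do not count the newline */
        if (len > longest)
            longest = len;
    }

    int failed = ferror(fp);            /* distinguish error from EOF */
    free(line);
    return failed ? -1 : longest;
}
```

The same buffer is reused for every line, so it only needs to be freed once after the loop.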
ferror feof clearerr | Check for (and clear) errors or EOF related to file streams. |
fmemopen open_memstream open_obstack_stream |
Open a stream which is a buffer in the memory instead of a file.
open_memstream is the same as fmemopen, except that it automatically allocates & grows the buffer. open_obstack_stream is the same as open_memstream, but it uses object stack instead of malloc. |
fopencookie |
Create a custom implementation
for a standard I/O stream. fmemopen and open_memstream are implemented using fopencookie. |
pread pwrite |
Like read/write, except one can read from/write to a specific offset (counting from the
file starting position) instead of current position. It is like doing a lseek and then a read/write. |
readv writev | Scatter/gather I/O. |
sync fsync fdatasync | Make sure buffered data is written to disk: fsync and fdatasync flush the data associated with the given file descriptor, while sync flushes all file systems. These are blocking calls. |
lseek |
lseek can be used to create a "sparse" file by: lseek beyond the end of a file, and then do a write.
ftruncate can do the same when the new size is larger than the current size. |
fcntl | Control operations on files: Duplicate file descriptor (i.e. dup/dup2), get/set the file status flags (e.g. O_APPEND, O_NONBLOCK; the access mode such as O_RDONLY can be read but not changed) associated with the file descriptor, get/set file locks, etc. |
aio_read aio_write ... lio_listio |
Asynchronous I/O. Must be linked to the real-time library librt. For example code, see AIO man page. |
sendfile splice |
High performance copy from one file descriptor to another.
This is actually a Linux system call and is not part of GNU libc. |
tee |
Duplicate the content from one file descriptor to another.
This is actually a Linux system call and is not part of GNU libc. |
vmsplice | High performance copy from user space buffer to a file descriptor (pipe). |
posix_fadvise |
Announce an intention to access
file data in a specific pattern (e.g. sequential, random, no reuse, etc.)
in the future, thus allowing the kernel
to perform appropriate optimizations.
glibc implements it as a thin wrapper around a Linux system call. |
posix_fallocate
fallocate |
Pre-allocate file space.
fallocate is actually a Linux system call and is not part of GNU libc. |
readahead |
Perform file readahead into page cache.
This is actually a Linux system call and is not part of GNU libc. |
select poll pselect ppoll epoll |
Synchronous I/O multiplexing: monitor multiple file descriptors and wait until one
or more of the file descriptors become "ready".
The difference between select and poll is that for poll, the user must allocate an array of pollfd structures and pass the number of entries in this array. The p-prefix versions of select/poll take a nanosecond-resolution timeout (struct timespec) and a signal mask. epoll is a high-performance O(1) poll and is available in Linux since kernel version 2.5.44. BSD has similar functionality called kqueue. |
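The pollfd usage described above can be sketched for the single-descriptor case as follows. The function read_with_timeout and its return convention (-2 for a timeout) are hypothetical choices made for this example.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd to become readable, then
   read up to len bytes into buf.
   Returns the byte count from read (0 at end of file), -1 on error,
   or -2 if the timeout expired first. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);

    /* One-element pollfd array: we only care about input on fd. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;               /* nothing became ready in time */

    return read(fd, buf, len);
}
```

With more descriptors, the array simply grows and the caller inspects the revents field of each entry after poll returns.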
getcwd get_current_dir_name chdir fchdir |
Get/set the current working directory.
fchdir can set the current working directory to the directory associated with the given file descriptor. |
opendir fdopendir dirfd readdir seekdir rewinddir closedir | Get names of files in a directory. |
scandir alphasort versionsort | Get names of files in a directory in a sorted order. |
ftw nftw |
File tree walk: For each name of file in a directory, invoke the user-specified
callback.
This is like Perl's File::Find module. |
fnmatch glob | Filename style wildcard/pattern matching. |
wordexp | Shell style word expansion, e.g. expand ~foo to the home directory of foo. It will also do variable substitution, command substitution, wildcard expansion, etc. |
basename | [libiberty] Get the last component of path name. |
canonicalize_file_name realpath |
Get the absolute name of the file name. |
access | Test the access permission of a pathname. |
truncate ftruncate |
Change (not just reduce) the size of a file.
ftruncate can be used to create a "sparse" file (lseek can do the same, on Linux) |
tmpfile tmpnam tempnam mktemp mkstemp mkostemp mkdtemp |
Create a temporary binary file (for update mode) or just get a temporary file name.
tempnam returns a unique file name with a user-specified directory (under which the temporary file is created) and file name prefix. It is achieved via a mktemp call. mkstemp and mkdtemp are like mktemp, but the actual temporary file or directory is created: mkstemp returns a file descriptor, and mkdtemp returns the name of the new directory. The differences between these calls can be readily distinguished by their function prototypes:
    FILE *tmpfile (void);                                        Returns a stream
    char *tmpnam (char *s);                                      --+
    char *tempnam (const char *dir, const char *pfx);              +--- Returns a file name
    char *mktemp (char *template);                               --+
    int mkstemp (char *template);                                --+
    int mkstemps (char *template, int suffixlen);                  |
    int mkostemp (char *template, int flags);                      +--- Returns a file descriptor
    int mkostemps (char *template, int suffixlen, int flags);    --+
    char *mkdtemp (char *template);                              Returns a directory name
Caveat: When using mktemp or mkstemp, the template argument cannot be a string literal, since mktemp or mkstemp will modify the template argument. For example, the following will lead to a segmentation fault:
    char *s = mktemp("/tmp/foo.XXXXXX");
Instead, one should use:
    char s[] = "/tmp/foo.XXXXXX";
    mktemp(s);
In general, the directory under which temporary files are created is determined by the following precedence rules (see choose_tmpdir function in libiberty):
|
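The template caveat for mkstemp can be sketched as follows. The helper names make_temp_file and remove_temp_file and the /tmp/example.XXXXXX pattern are hypothetical; the example assumes /tmp exists and is writable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a temporary file under /tmp. The generated name is written
   into name (a writable buffer of the given size).
   Returns an open file descriptor, or -1 on failure. */
int make_temp_file(char *name, size_t size)
{
    static const char pattern[] = "/tmp/example.XXXXXX";

    assert(name != NULL);
    if (size < sizeof pattern)
        return -1;                       /* buffer too small for the template */

    /* Copy the template into writable memory; mkstemp overwrites the X's. */
    snprintf(name, size, "%s", pattern);
    return mkstemp(name);
}

/* Close the descriptor and delete the file. Returns 0 on success. */
int remove_temp_file(int fd, const char *name)
{
    assert(name != NULL);

    int rc = close(fd);
    if (unlink(name) != 0)
        rc = -1;
    return rc;
}
```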
inotify |
Monitor filesystem/file/directory changes.
inotify APIs are actually Linux system calls. For a command-line example, see here. |
Processes & threads
vfork | This is similar to fork but the child process created via vfork shares its parent's address space until it calls _exit or one of the exec functions. In the meantime, the parent process suspends execution. |
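The vfork pattern described above, where the child calls only an exec function or _exit, can be sketched like this. The function run_and_wait is a hypothetical helper, and the exit status 127 for a failed exec follows the shell convention rather than anything the table prescribes.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] with arguments argv and wait for it.
   Returns its exit status, 127 if it could not be executed,
   or -1 on other failures. */
int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it shares the parent's address space, so it must do
           nothing except exec or _exit. */
        execv(argv[0], argv);
        _exit(127);              /* only reached if execv failed */
    }

    /* Parent: resumes once the child has called exec or _exit. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```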
clone | Create kernel threads. |
nice getpriority setpriority |
Get/set scheduling priority for the process. |
sched_getscheduler sched_setscheduler |
Get/set scheduling policy and the associated parameters for the process. |
sched_getaffinity sched_setaffinity pthread_setaffinity_np pthread_getaffinity_np |
Get/set a process's CPU affinity. |
sched_yield pthread_yield |
Cause the currently executing thread to relinquish the CPU.
sched_yield is actually a Linux system call and is not part of GNU libc. |
futex | Fast userspace mutex. This is a building block for fast userspace locking and semaphores and is used by GCC's LibGOMP in OpenMP implementation. |
semop semtimedop |
System V semaphores and timed semaphores. |
getcpu /dev/cpuset |
Determine CPU and NUMA node on which the calling thread is running.
/dev/cpuset is a pseudo-file-system interface to confine processes to processor and memory node subsets. This is actually a Linux system feature and is not part of GNU libc. |
confstr | Get GNU libc's pthread version information (use _CS_GNU_LIBPTHREAD_VERSION argument) |
get_nprocs_conf get_nprocs |
Get number of processors in the system (configured and actually on-line ones).
Alternatively, one can use sysconf(_SC_NPROCESSORS_CONF) and sysconf(_SC_NPROCESSORS_ONLN). |
get_mempolicy set_mempolicy |
Get/set default NUMA memory policy for a process and its children.
This is actually a Linux system feature and is not part of GNU libc. |
move_pages migrate_pages mbind |
Move/Bind the specified/all pages of a process to the specified memory nodes.
This is actually a Linux system feature and is not part of GNU libc. |
pex_* |
[libiberty] These pex-prefix functions
give more control over executing
one or more programs and retrieving their output. See here for details. |
Timing
get_run_time | [libiberty] Get the time used so far, in microseconds. |
usleep nanosleep clock_nanosleep |
High-resolution sleep. |
CPU cycles |
rdtsc is an x86 instruction which counts the number of clock cycles since reset. However, it must be used with care on multi-core systems.
    #include <stdint.h>
    uint64_t get_cycles() {
        unsigned l, h;
        __asm__ __volatile__("rdtsc" : "=a" (l), "=d" (h));
        return l + (((uint64_t)h) << 32);
    }
Alternatively, gcc has a built-in function called __builtin_ia32_rdtsc (to use, include header file x86intrin.h).
For Itanium:
    uint64_t get_cycles() {
        uint64_t t;
        __asm__ __volatile__("mov %0=ar.itc" : "=r"(t));
        return t;
    }
For PowerPC:
    uint64_t get_cycles() {
        uint64_t t;
        unsigned tbu, tbl, tbu2;
        if (sizeof(void *) == 8) {
            __asm__ __volatile__("mftb %0" : "=r" (t));
            return t;
        } else {
            do {
                __asm__ __volatile__("mftbu %0" : "=r" (tbu));
                __asm__ __volatile__("mftb %0" : "=r" (tbl));
                __asm__ __volatile__("mftbu %0" : "=r" (tbu2));
            } while (tbu != tbu2);
            return ((uint64_t)tbu << 32) + tbl;
        }
    }
|
clock |
Report CPU time used. The result must
be divided by CLOCKS_PER_SEC to determine the time in seconds.
Beware of overflow issue with this function. |
times | Same as above, but less likely to have overflows. |
getrusage vtimes |
Report the time spent in user space ("user") and the time spent in kernel space ("sys") by the process.
It can also report memory usage. |
clock_getres clock_gettime clock_settime clock_getcpuclockid |
Nanosecond-resolution clocks.
These are POSIX (1003.1) interfaces; on Linux they are backed by system calls, with glibc providing the wrappers. |
HPET |
High Precision Event Timer.
It is accessible via ioctl calls.
This is available in Linux kernel version 2.6.x. |
getitimer setitimer |
Interval timers in three different time domains (wall clock time, user time, or user+system time). When the timer expires, signals SIGALRM, SIGVTALRM, and SIGPROF, respectively, will be delivered. |
profil | Profile the user program itself by periodically sampling its program counter into a histogram buffer. |