Introduction to Linux Libraries

You can practice troubleshooting a Linux library issue in a real server with this SadServers Scenario

Linux Libraries

When developing an application, we often rely on external code libraries rather than writing everything from scratch. A library is a collection of reusable code, such as functions, that applications can call.

For instance, if we need encryption in our code, we have two options. The first is to master encryption principles, including the underlying mathematics and formulas, and then implement them from scratch. Assuming everything goes perfectly, our code would work and adhere to all necessary standards. The second option is to use someone else’s work: libraries that have already gone through this process and proven their functionality and reliability (at least up to the point of our usage; future discoveries of bugs or security issues notwithstanding). In most cases, using existing libraries is the preferred and advisable path.

We have two kinds of libraries:

  • Static: These libraries are included in the compiled application. We add static libraries at compile time, and they become part of the application binary. Since they are already linked during compilation, the application does not have any dependency on these libraries at runtime.
  • Shared/Dynamic: These libraries are used in our application but are not included in the binary. They must be provided at runtime to the application. Instead of being compiled into the application, they are dynamically linked when the application runs.
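As a small illustration of a shared library being provided at runtime, the sketch below loads the C library into a running Python process with the standard ctypes module. This is loading on request (dlopen-style), not the load-time linking that the dynamic linker performs at program start, and the helper name load_c_library is a hypothetical one chosen for this example.

```python
import ctypes
import ctypes.util
import os


def load_c_library():
    """Load the C library at runtime and return a handle to it."""
    # find_library returns a name such as "libc.so.6" on glibc systems,
    # or None when it cannot locate the library.
    name = ctypes.util.find_library("c")
    # CDLL(None) falls back to the symbols already loaded into this process.
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_library()
# getpid() is served by the shared C library, not by code compiled into our script.
print(libc.getpid() == os.getpid())
```

Nothing from the C library is copied into the script itself; the code is found and mapped only when the process asks for it.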

How Linux Libraries Work

In Linux systems, a component known as the dynamic linker/loader (ld.so, ld-linux*.so*) is responsible for locating the required libraries and loading them before executing the application.

By default, Linux libraries reside in directories such as /lib and /usr/lib and, in some distributions, /lib64 and /usr/lib64 for the x86_64 architecture. Often, /lib and /lib64 are symbolic links to /usr/lib and /usr/lib64. Libraries follow a naming convention such as libLIBRARYNAME.so.VERSION, for example, liblzma.so.5. While this naming convention is common, it is not mandatory or enforced, and library files may be named differently.
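The naming convention can be sketched as a small parser. The regular expression and the function name parse_library_filename are assumptions made for this illustration; since the convention is not enforced, file names that do not fit simply return None.

```python
import re

# Matches the common "libNAME.so" and "libNAME.so.VERSION" layouts only.
LIBRARY_NAME_RE = re.compile(r"^lib(?P<name>.+?)\.so(?:\.(?P<version>[0-9.]+))?$")


def parse_library_filename(filename):
    """Return (name, version) for a conventional library file name, else None."""
    match = LIBRARY_NAME_RE.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")


print(parse_library_filename("liblzma.so.5"))          # ('lzma', '5')
print(parse_library_filename("liblzma.so.5.2.5"))      # ('lzma', '5.2.5')
print(parse_library_filename("ld-linux-x86-64.so.2"))  # None: no "lib" prefix
```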

When the dynamic linker starts an application and loads its dependencies, it consults several environment variables and files in a specific order. Some well-known ones include:

  • LD_PRELOAD: The LD_PRELOAD environment variable instructs the dynamic linker to load specified libraries even if they are not listed in the application’s dependencies.
  • /etc/ld.so.preload: Similar to LD_PRELOAD, this file contains a list of libraries to be loaded by the dynamic linker, but the libraries are specified in the file rather than in an environment variable.
  • LD_LIBRARY_PATH: The LD_LIBRARY_PATH variable sets directories for the dynamic linker to check before searching the default locations. It functions similarly to the PATH variable for executable files.
  • /etc/ld.so.cache: This file is a cache of the libraries available on the system. It’s crucial to update this cache whenever libraries are added, removed, or changed. For example, when installing a package, the ldconfig command is executed to update this cache.

The ld.so.cache file is crucial because the system relies on it to quickly locate system libraries, avoiding the need to search through all mentioned directories such as /usr/lib. This optimization significantly reduces the time and potential delays in starting applications.
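The lookup order described above can be modeled with a simplified sketch. It covers only LD_LIBRARY_PATH, a cache lookup, and the default directories. It deliberately ignores preloading, paths embedded in the binary, and hardware-specific subdirectories. The function resolve_library, its parameters, and the fake file set are all hypothetical names used for illustration; the real dynamic linker does not expose such an API.

```python
import posixpath

DEFAULT_DIRS = ("/lib", "/usr/lib")


def resolve_library(soname, ld_library_path="", cache=None,
                    default_dirs=DEFAULT_DIRS, exists=posixpath.exists):
    """Return the first location where soname is found, or None."""
    # 1. Directories listed in LD_LIBRARY_PATH, in the order given
    #    (empty entries are skipped here to keep the sketch simple).
    for directory in filter(None, ld_library_path.split(":")):
        candidate = posixpath.join(directory, soname)
        if exists(candidate):
            return candidate
    # 2. The cache: a direct name-to-path lookup, with no directory scan.
    cached = (cache or {}).get(soname)
    if cached and exists(cached):
        return cached
    # 3. The default directories as a last resort.
    for directory in default_dirs:
        candidate = posixpath.join(directory, soname)
        if exists(candidate):
            return candidate
    return None


# A fake file system keeps the example independent of the host machine.
files = {"/lib/x86_64-linux-gnu/libc.so.6"}
cache = {"libc.so.6": "/lib/x86_64-linux-gnu/libc.so.6"}
search = "/tmp/mylibs/dir1:/tmp/mylibs/dir2"

print(resolve_library("libc.so.6", search, cache, exists=files.__contains__))
# /lib/x86_64-linux-gnu/libc.so.6 (found through the cache)

files.add("/tmp/mylibs/dir2/libc.so.6")
print(resolve_library("libc.so.6", search, cache, exists=files.__contains__))
# /tmp/mylibs/dir2/libc.so.6 (LD_LIBRARY_PATH wins over the cache)
```

The cache step is a single dictionary lookup, which mirrors why ld.so.cache saves the linker from scanning directories.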

It’s important to note that some of the behaviors mentioned above depend on how the application was built and how it is executed. For example, the LD_LIBRARY_PATH variable is ignored if the application is executed in secure-execution mode, which applies to set-user-ID and set-group-ID programs, among others.

Diving Into Practice

Now, let’s move on from theory and dive into practice.

The ldd command shows which libraries an executable file depends on. Here’s its output for the sleep command:

$ ldd `which sleep`
        linux-vdso.so.1 (0x00007fffd6d23000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3ad2400000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3ad2688000)

This output lists the libraries that the sleep command depends on. Note that the ldd command may not always reveal dependencies, especially when dealing with interpreted languages like Python. For instance, consider the Python script below:

#!/usr/bin/env python3
import lzma
import time

print(lzma.__name__)

time.sleep(10)

If we attempt to run ldd directly on the source file, it won’t show anything, as demonstrated below:

$ ldd ./main.py
        not a dynamic executable

This result is expected because Python is an interpreted language, and the script itself is not a compiled executable file. Instead, it is executed by the Python interpreter at runtime. Therefore, ldd won’t detect any dependencies when applied directly to Python source files. Let’s run the code and check what lsof shows us:

COMMAND    PID  USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
python3 102408 myuser  cwd    DIR  252,1     4096  7046183 /home/myuser/main.py
python3 102408 myuser  rtd    DIR  252,1     4096        2 /
python3 102408 myuser  txt    REG  252,1  5904904 26745137 /usr/bin/python3.10
python3 102408 myuser  mem    REG  252,1 11477872 26747919 /usr/lib/locale/locale-archive
python3 102408 myuser  mem    REG  252,1   170456 26766181 /usr/lib/x86_64-linux-gnu/liblzma.so.5.2.5
python3 102408 myuser  mem    REG  252,1  2220400 26740839 /usr/lib/x86_64-linux-gnu/libc.so.6
python3 102408 myuser  mem    REG  252,1   108936 26766887 /usr/lib/x86_64-linux-gnu/libz.so.1.2.11
python3 102408 myuser  mem    REG  252,1   194872 26765755 /usr/lib/x86_64-linux-gnu/libexpat.so.1.8.7
python3 102408 myuser  mem    REG  252,1   940560 26740857 /usr/lib/x86_64-linux-gnu/libm.so.6
python3 102408 myuser  mem    REG  252,1    45240 26768959 /usr/lib/python3.10/lib-dynload/_lzma.cpython-310-x86_64-linux-gnu.so
python3 102408 myuser  mem    REG  252,1    27002 26748343 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
python3 102408 myuser  mem    REG  252,1   240936 26740828 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
python3 102408 myuser    0u   CHR  136,0      0t0        3 /dev/pts/0
python3 102408 myuser    1u   CHR  136,0      0t0        3 /dev/pts/0
python3 102408 myuser    2u   CHR  136,0      0t0        3 /dev/pts/0
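The mem entries in the lsof listing above are files mapped into the memory of the process. On Linux, the kernel exposes the same information in /proc/PID/maps. The sketch below extracts the file-backed mappings from text in that format. The function name mapped_files and the sample text are made up for this illustration.

```python
import os


def mapped_files(maps_text):
    """Return the unique file paths found in /proc/PID/maps style text."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address range, permissions, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/") and fields[5] not in paths:
            paths.append(fields[5])
    return paths


# Hypothetical sample text, not captured from a real process.
SAMPLE = """\
55d0c0a00000-55d0c0a02000 r--p 00000000 fc:01 26788412 /usr/bin/sleep
7f3ad2400000-7f3ad2428000 r--p 00000000 fc:01 26740839 /usr/lib/x86_64-linux-gnu/libc.so.6
7f3ad2428000-7f3ad25bd000 r-xp 00028000 fc:01 26740839 /usr/lib/x86_64-linux-gnu/libc.so.6
7f3ad2600000-7f3ad2602000 rw-p 00000000 00:00 0
7fffd6d00000-7fffd6d21000 rw-p 00000000 00:00 0 [stack]
"""

print(mapped_files(SAMPLE))
# ['/usr/bin/sleep', '/usr/lib/x86_64-linux-gnu/libc.so.6']

# On a Linux host, the same function works on the current process.
if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as handle:
        print(len(mapped_files(handle.read())), "files mapped into this process")
```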

When we run the Python script and check the running process with lsof, we can see that many libraries are indeed loaded, including liblzma, which backs the lzma module used in our code. In this case, the lzma module relies on _lzma, a C extension module compiled for CPython, and when we check its dependencies using ldd, we find that it depends on other libraries such as liblzma:

$ ldd /usr/lib/python3.10/lib-dynload/_lzma.cpython-310-x86_64-linux-gnu.so
        linux-vdso.so.1 (0x00007ffef99fd000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fd768603000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd768200000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fd768654000)

This demonstrates the intricate dependencies between system libraries, highlighting the importance of understanding and managing them properly. It’s important to be cautious with system libraries, as even small changes can have significant consequences.
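When many binaries need checking, ldd output like the listing above can be turned into a dictionary. The following is a minimal sketch; the regular expression and the function name parse_ldd are assumptions for this example, and lines that do not match (such as "not a dynamic executable") are skipped.

```python
import re

# "name => path (address)" or "name (address)".
LDD_LINE_RE = re.compile(
    r"^\s*(?P<name>\S+)(?:\s+=>\s+(?P<path>\S+))?\s+\((?P<addr>0x[0-9a-f]+)\)\s*$"
)


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path (or None)."""
    dependencies = {}
    for line in output.splitlines():
        match = LDD_LINE_RE.match(line)
        if match:
            dependencies[match.group("name")] = match.group("path")
    return dependencies


LDD_OUTPUT = """\
        linux-vdso.so.1 (0x00007ffef99fd000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fd768603000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd768200000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fd768654000)
"""

for name, path in parse_ldd(LDD_OUTPUT).items():
    print(name, "->", path)
```

Entries without a "=>" part, such as linux-vdso.so.1 and the dynamic linker itself, map to None.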

Now let’s go back to our sleep command. Here’s the output of strace, limited to a few file-related system calls, showing which files are accessed when we run sleep:

$ strace -e openat,open,access -f sleep 5s
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3

This output confirms that the system is reading ld.so.preload and ld.so.cache, as mentioned earlier.

Now, let’s set the LD_* variables and check strace again:

$ mkdir -pv /tmp/mylibs/dir{1,2}
mkdir: created directory '/tmp/mylibs'
mkdir: created directory '/tmp/mylibs/dir1'
mkdir: created directory '/tmp/mylibs/dir2'

$ cd /tmp/mylibs/dir1

$ export LD_LIBRARY_PATH="${PWD}:/tmp/mylibs/dir2" LD_PRELOAD="/lib/x86_64-linux-gnu/liblzma.so.5"

$ strace -e openat,open,access -f sleep 5s
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/tmp/mylibs/dir1/glibc-hwcaps/x86-64-v3/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/glibc-hwcaps/x86-64-v2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/tls/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/tls/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/glibc-hwcaps/x86-64-v3/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/glibc-hwcaps/x86-64-v2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/tls/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/tls/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

As you can see, there’s considerably more activity when running sleep with the LD_* variables set. It checks for libraries in directories specified in LD_LIBRARY_PATH and also opens /lib/x86_64-linux-gnu/liblzma.so.5, which we set in LD_PRELOAD.
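Long traces like the one above are easier to read once successful and failed lookups are separated. The sketch below does that for a shortened excerpt of the trace. The regular expression and the function name split_lookups are assumptions for this example, and it expects plain lines without the "[pid N]" prefixes that strace adds when it follows several processes.

```python
import re

# Captures the quoted path and the numeric return value of each call.
STRACE_LINE_RE = re.compile(
    r'^(?P<call>\w+)\((?:AT_FDCWD, )?"(?P<path>[^"]*)".*\)\s+=\s+(?P<ret>-?\d+)'
)


def split_lookups(trace):
    """Return (found, missing) path lists from strace open/access lines."""
    found, missing = [], []
    for line in trace.splitlines():
        match = STRACE_LINE_RE.match(line.strip())
        if match is None:
            continue
        # A negative return value means the call failed (ENOENT in this trace).
        target = missing if int(match.group("ret")) < 0 else found
        target.append(match.group("path"))
    return found, missing


TRACE = """\
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/tmp/mylibs/dir1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/tmp/mylibs/dir2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
"""

found, missing = split_lookups(TRACE)
print("found:", found)
print("missing:", missing)
```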

By using lsof, we can validate that liblzma is being loaded for the sleep process. Here’s the output of the lsof command:

$ sleep infinity &

$ lsof -p $(pgrep -f 'sleep infinity')
lsof: WARNING: can't stat() overlay file system /var/lib/docker/overlay2/ade36c155b2e1d19b09046cd6cbe4d9348b813bd2ec5791bed818820798d8189/merged
      Output information may be incomplete.
lsof: WARNING: can't stat() nsfs file system /run/docker/netns/c5e2679fc7fb
      Output information may be incomplete.
COMMAND    PID  USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
sleep   147063 myuser  cwd    DIR  252,1     4096 11313226 /tmp/mylibs/dir1
sleep   147063 myuser  rtd    DIR  252,1     4096        2 /
sleep   147063 myuser  txt    REG  252,1    35336 26788412 /usr/bin/sleep
sleep   147063 myuser  mem    REG  252,1 11477872 26747919 /usr/lib/locale/locale-archive
sleep   147063 myuser  mem    REG  252,1  2220400 26740839 /usr/lib/x86_64-linux-gnu/libc.so.6
sleep   147063 myuser  mem    REG  252,1   170456 26766181 /usr/lib/x86_64-linux-gnu/liblzma.so.5.2.5
sleep   147063 myuser  mem    REG  252,1   240936 26740828 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
sleep   147063 myuser    0u   CHR  136,2      0t0        5 /dev/pts/2
sleep   147063 myuser    1u   CHR  136,2      0t0        5 /dev/pts/2
sleep   147063 myuser    2u   CHR  136,2      0t0        5 /dev/pts/2

As you can see, /usr/lib/x86_64-linux-gnu/liblzma.so.5.2.5, the library we set in the LD_PRELOAD variable, is indeed loaded into the memory of the sleep process.

These tricks have various practical applications:

  • Developing: During development, you can utilize temporary paths to test new or updated libraries before integrating them into applications that depend on them, ensuring they work correctly.
  • Management: In cases where an application relies on a library version incompatible with the system’s default, you can override library paths for that specific application, allowing it to use alternative libraries.
  • Security: Unfortunately, these techniques can also be exploited for malicious purposes. Attackers may hijack libraries to manipulate application behavior, potentially introducing backdoors or other security vulnerabilities.
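For the management case, the override can be limited to a single application by changing only the environment of that child process instead of exporting the variable in the shell. Here is a minimal sketch; the helper name env_with_library_path is hypothetical, and the directories are the temporary ones created earlier.

```python
import os


def env_with_library_path(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # New directories go first so they are searched before existing entries.
    parts = list(extra_dirs) + ([current] if current else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


env = env_with_library_path(
    ["/tmp/mylibs/dir1", "/tmp/mylibs/dir2"],
    base_env={"PATH": "/usr/bin"},
)
print(env["LD_LIBRARY_PATH"])  # /tmp/mylibs/dir1:/tmp/mylibs/dir2
```

Passing the returned dictionary as the env argument of subprocess.run applies the override to that one child process and leaves the rest of the system untouched.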

Conclusion

In conclusion, libraries are essential building blocks in the toolkit of system administrators, SREs, and DevOps professionals. Understanding their nuances, dependencies, and loading mechanisms is key to optimizing system performance, ensuring reliability, and enhancing security.

Although hands-on library management is less common nowadays due to the increasing use of containers, it’s still vital to understand these concepts. At the OS level, you might need to troubleshoot, patch, or investigate, and that work depends on this understanding.