LetsBeRealAboutDependencies

Introduction

I’ve been wanting to do this for a few weeks, ever since I typed apt install xorg-dev on my Debian system and it installed literally 30 megabytes of header files. A perennial complaint among various sections of the Rust community is “all these darn programs that use 300 crates to do anything”. These complaints are valid, but my argument is that they’re also not NEW, and they’re certainly not unique to Rust. Rather the difference in dependencies between something written in Rust vs. “traditional” C or C++ is that on Unix systems, all these dependencies are still there, just handled by the system instead of the compiler directly. The distro maintainers do more of the work, and our build systems assume the presence of various system libraries in system places. The only thing new about it is that programmers are exposed to more of the costs of it up-front.

First Attempt

So to start out, let’s collect some data. Let’s take a look at a random Real Life C++ Program What Does Useful Stuff, RViz. To begin, let’s just look at the dynamic libraries it uses, as a hopefully-accurate-ish proxy for the full dependency tree:

$ ldd /usr/bin/rviz
	linux-vdso.so.1 (0x00007ffd93ff7000)
	librviz.so.2d => /usr/lib/x86_64-linux-gnu/librviz.so.2d (0x00007f3e194b1000)
	libQt5Widgets.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5 (0x00007f3e18c6a000)
	libQt5Core.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 (0x00007f3e1851f000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f3e18196000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3e17f7e000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e17b8d000)
	libboost_filesystem.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.65.1 (0x00007f3e17973000)
	libboost_program_options.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_program_options.so.1.65.1 (0x00007f3e176f2000)
	libboost_system.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_system.so.1.65.1 (0x00007f3e174ed000)
	libboost_thread.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1 (0x00007f3e172c8000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3e170a9000)
	libimage_transport.so.0d => /usr/lib/x86_64-linux-gnu/libimage_transport.so.0d (0x00007f3e16e23000)
	libtinyxml.so.2.6.2 => /usr/lib/x86_64-linux-gnu/libtinyxml.so.2.6.2 (0x00007f3e16c0e000)
	libclass_loader.so.0d => /usr/lib/x86_64-linux-gnu/libclass_loader.so.0d (0x00007f3e169e6000)
	libresource_retriever.so.0d => /usr/lib/x86_64-linux-gnu/libresource_retriever.so.0d (0x00007f3e167e0000)
	libroslib.so.0d => /usr/lib/x86_64-linux-gnu/libroslib.so.0d (0x00007f3e165cd000)
	libtf.so.0d => /usr/lib/x86_64-linux-gnu/libtf.so.0d (0x00007f3e163a1000)
	libmessage_filters.so.1d => /usr/lib/x86_64-linux-gnu/libmessage_filters.so.1d (0x00007f3e1619c000)
	libroscpp.so.1d => /usr/lib/x86_64-linux-gnu/libroscpp.so.1d (0x00007f3e15de4000)
	librosconsole.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole.so.2d (0x00007f3e15bac000)
	libroscpp_serialization.so.0d => /usr/lib/x86_64-linux-gnu/libroscpp_serialization.so.0d (0x00007f3e159a9000)
	librostime.so.0d => /usr/lib/x86_64-linux-gnu/librostime.so.0d (0x00007f3e15789000)
	libconsole_bridge.so.0.4 => /usr/lib/x86_64-linux-gnu/libconsole_bridge.so.0.4 (0x00007f3e15584000)
	libOgreOverlay.so.1.9.0 => /usr/lib/x86_64-linux-gnu/libOgreOverlay.so.1.9.0 (0x00007f3e15324000)
	libOgreMain.so.1.9.0 => /usr/lib/x86_64-linux-gnu/libOgreMain.so.1.9.0 (0x00007f3e14bad000)
	libGL.so.1 => /usr/lib/x86_64-linux-gnu/libGL.so.1 (0x00007f3e14921000)
	libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f3e145e9000)
	libassimp.so.4 => /usr/lib/x86_64-linux-gnu/libassimp.so.4 (0x00007f3e13c1e000)
	libyaml-cpp.so.0.3 => /usr/lib/x86_64-linux-gnu/libyaml-cpp.so.0.3 (0x00007f3e139ae000)
	libQt5Gui.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5 (0x00007f3e13245000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3e12ea7000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3e12c8a000)
	libicui18n.so.60 => /usr/lib/x86_64-linux-gnu/libicui18n.so.60 (0x00007f3e127e9000)
	libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007f3e12432000)
	libdouble-conversion.so.1 => /usr/lib/x86_64-linux-gnu/libdouble-conversion.so.1 (0x00007f3e12221000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3e1201d000)
	libglib-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x00007f3e11d06000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f3e19ae9000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f3e11afe000)
	libPocoFoundation.so.50 => /usr/lib/libPocoFoundation.so.50 (0x00007f3e11755000)
	libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f3e114d6000)
	librospack.so.0d => /usr/lib/x86_64-linux-gnu/librospack.so.0d (0x00007f3e11293000)
	libtf2_ros.so.0d => /usr/lib/x86_64-linux-gnu/libtf2_ros.so.0d (0x00007f3e10fdd000)
	libtf2.so.0d => /usr/lib/x86_64-linux-gnu/libtf2.so.0d (0x00007f3e10da5000)
	libxmlrpcpp.so.1d => /usr/lib/x86_64-linux-gnu/libxmlrpcpp.so.1d (0x00007f3e10b89000)
	libcpp_common.so.0d => /usr/lib/x86_64-linux-gnu/libcpp_common.so.0d (0x00007f3e10980000)
	librosconsole_log4cxx.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole_log4cxx.so.2d (0x00007f3e10765000)
	librosconsole_backend_interface.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole_backend_interface.so.2d (0x00007f3e10563000)
	liblog4cxx.so.10 => /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10 (0x00007f3e1019a000)
	libboost_regex.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.65.1 (0x00007f3e0fe92000)
	libfreetype.so.6 => /usr/lib/x86_64-linux-gnu/libfreetype.so.6 (0x00007f3e0fbde000)
	libXt.so.6 => /usr/lib/x86_64-linux-gnu/libXt.so.6 (0x00007f3e0f975000)
	libXaw.so.7 => /usr/lib/x86_64-linux-gnu/libXaw.so.7 (0x00007f3e0f701000)
	libfreeimage.so.3 => /usr/lib/x86_64-linux-gnu/libfreeimage.so.3 (0x00007f3e0f451000)
	libzzip-0.so.13 => /usr/lib/x86_64-linux-gnu/libzzip-0.so.13 (0x00007f3e0f24a000)
	libGLX.so.0 => /usr/lib/x86_64-linux-gnu/libGLX.so.0 (0x00007f3e0f019000)
	libGLdispatch.so.0 => /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0 (0x00007f3e0ed63000)
	libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f3e0eb3b000)
	libminizip.so.1 => /usr/lib/x86_64-linux-gnu/libminizip.so.1 (0x00007f3e0e930000)
	libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f3e0e6fe000)
	libharfbuzz.so.0 => /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0 (0x00007f3e0e460000)
	libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007f3e0c8b7000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f3e0c645000)
	libnghttp2.so.14 => /usr/lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f3e0c420000)
	libidn2.so.0 => /usr/lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f3e0c203000)
	librtmp.so.1 => /usr/lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f3e0bfe7000)
	libpsl.so.5 => /usr/lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f3e0bdd9000)
	libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f3e0bb4c000)
	libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f3e0b681000)
	libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f3e0b436000)
	libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f3e0b1e4000)
	liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f3e0afd6000)
	libtinyxml2.so.6 => /usr/lib/x86_64-linux-gnu/libtinyxml2.so.6 (0x00007f3e0adc2000)
	libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 (0x00007f3e0a845000)
	libactionlib.so.0d => /usr/lib/x86_64-linux-gnu/libactionlib.so.0d (0x00007f3e0a623000)
	libb64.so.0d => /usr/lib/x86_64-linux-gnu/libb64.so.0d (0x00007f3e0a420000)
	libapr-1.so.0 => /usr/lib/x86_64-linux-gnu/libapr-1.so.0 (0x00007f3e0a1eb000)
	libaprutil-1.so.0 => /usr/lib/x86_64-linux-gnu/libaprutil-1.so.0 (0x00007f3e09fc0000)
	libSM.so.6 => /usr/lib/x86_64-linux-gnu/libSM.so.6 (0x00007f3e09db8000)
	libICE.so.6 => /usr/lib/x86_64-linux-gnu/libICE.so.6 (0x00007f3e09b9d000)
	libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f3e0998b000)
	libXmu.so.6 => /usr/lib/x86_64-linux-gnu/libXmu.so.6 (0x00007f3e09772000)
	libXpm.so.4 => /usr/lib/x86_64-linux-gnu/libXpm.so.4 (0x00007f3e09560000)
	libjxrglue.so.0 => /usr/lib/x86_64-linux-gnu/libjxrglue.so.0 (0x00007f3e09340000)
	libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f3e090d8000)
	libopenjp2.so.7 => /usr/lib/x86_64-linux-gnu/libopenjp2.so.7 (0x00007f3e08e82000)
	libraw.so.16 => /usr/lib/x86_64-linux-gnu/libraw.so.16 (0x00007f3e08baf000)
	libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f3e08938000)
	libwebpmux.so.3 => /usr/lib/x86_64-linux-gnu/libwebpmux.so.3 (0x00007f3e0872e000)
	libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f3e084c5000)
	libIlmImf-2_2.so.22 => /usr/lib/x86_64-linux-gnu/libIlmImf-2_2.so.22 (0x00007f3e08002000)
	libHalf.so.12 => /usr/lib/x86_64-linux-gnu/libHalf.so.12 (0x00007f3e07dbf000)
	libIex-2_2.so.12 => /usr/lib/x86_64-linux-gnu/libIex-2_2.so.12 (0x00007f3e07ba1000)
	libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f3e0799d000)
	libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f3e07797000)
	libgraphite2.so.3 => /usr/lib/x86_64-linux-gnu/libgraphite2.so.3 (0x00007f3e0756a000)
	libunistring.so.2 => /usr/lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f3e071ec000)
	libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f3e06e86000)
	libhogweed.so.4 => /usr/lib/x86_64-linux-gnu/libhogweed.so.4 (0x00007f3e06c52000)
	libnettle.so.6 => /usr/lib/x86_64-linux-gnu/libnettle.so.6 (0x00007f3e06a1c000)
	libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f3e0679b000)
	libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f3e064c5000)
	libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f3e06293000)
	libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f3e0608f000)
	libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f3e05e84000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f3e05c69000)
	libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f3e05a4e000)
	libgssapi.so.3 => /usr/lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007f3e0580d000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3e0560a000)
	libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f3e05403000)
	libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f3e051cb000)
	libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f3e04f99000)
	libbsd.so.0 => /lib/x86_64-linux-gnu/libbsd.so.0 (0x00007f3e04d84000)
	libjpegxr.so.0 => /usr/lib/x86_64-linux-gnu/libjpegxr.so.0 (0x00007f3e04b50000)
	liblcms2.so.2 => /usr/lib/x86_64-linux-gnu/liblcms2.so.2 (0x00007f3e048f8000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f3e046c9000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f3e044a3000)
	libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f3e04295000)
	libIlmThread-2_2.so.12 => /usr/lib/x86_64-linux-gnu/libIlmThread-2_2.so.12 (0x00007f3e0408e000)
	libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f3e03d5f000)
	libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f3e03b4c000)
	libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f3e03948000)
	libheimntlm.so.0 => /usr/lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007f3e0373f000)
	libkrb5.so.26 => /usr/lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007f3e034b2000)
	libasn1.so.8 => /usr/lib/x86_64-linux-gnu/libasn1.so.8 (0x00007f3e03210000)
	libhcrypto.so.4 => /usr/lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007f3e02fda000)
	libroken.so.18 => /usr/lib/x86_64-linux-gnu/libroken.so.18 (0x00007f3e02dc4000)
	libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f3e02bbc000)
	libwind.so.0 => /usr/lib/x86_64-linux-gnu/libwind.so.0 (0x00007f3e02993000)
	libheimbase.so.1 => /usr/lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007f3e02784000)
	libhx509.so.5 => /usr/lib/x86_64-linux-gnu/libhx509.so.5 (0x00007f3e0253a000)
	libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f3e02231000)

That’s 133 libs. Sure, one .so file doesn’t necessarily mean one build dependency, but still, what are they all doing? Some of them are pretty obvious: SQLite is basically everywhere (for good reason), some things like libc and linux-vdso are just part of the process of doing business on an Ubuntu system, and then there’s a few like libGLX and libxcb that are graphics/GUI libraries that I’d expect a graphics/GUI program to use. Okay, great, now why the heck is all that Kerberos stuff in there? And why are there both libheimbase, which is from the Heimdal Kerberos implementation, and libkrb5, which is the MIT implementation? What the heck is libutil? libIlmThread-2_2 (not to be confused with 2_1 or 2_3 despite the shared object versioning, I expect!)? What about libapr-1?
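Eyeballing 133 lines gets old fast, so here’s a minimal Python sketch for chewing on ldd output instead. The names parse_ldd and count_by_directory are mine, SAMPLE is five lines copied from the listing above, and the parser only assumes the line shapes seen there plus the “not found” form ldd prints for unresolved libraries. Treat it as a rough tool, not a complete ldd parser.

```python
import os
import re
from collections import Counter

# Five lines copied from the RViz listing above.
SAMPLE = """
    linux-vdso.so.1 (0x00007ffd93ff7000)
    libQt5Core.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 (0x00007f3e1851f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e17b8d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f3e19ae9000)
    libPocoFoundation.so.50 => /usr/lib/libPocoFoundation.so.50 (0x00007f3e11755000)
"""


def parse_ldd(output):
    """Return a list of (name, resolved_path_or_None), one per ldd line."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing load address, e.g. "(0x00007f3e194b1000)".
        line = re.sub(r"\s*\(0x[0-9a-fA-F]+\)$", "", line)
        if "=>" in line:
            name, path = (part.strip() for part in line.split("=>", 1))
            entries.append((name, None if path == "not found" else path))
        else:
            # The vDSO and the dynamic loader are printed without "=>".
            entries.append((line, None))
    return entries


def count_by_directory(entries):
    """Count resolved libraries per install directory."""
    return Counter(os.path.dirname(path) for _, path in entries if path)


if __name__ == "__main__":
    parsed = parse_ldd(SAMPLE)
    print(len(parsed), "entries")
    for directory, count in count_by_directory(parsed).most_common():
        print(count, directory)
```

Grouping by directory is only one way to slice it; the same list of pairs could just as easily be bucketed by name prefix to pull out, say, all the Kerberos bits.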

Note this is not a contrived example. I literally asked myself “what’s a non-trivial C or C++ program that isn’t a web browser?” and RViz was the first thing that popped into my head. Now let’s see what its actual dependencies are. You can get the source for it here, the most recent stable branch is melodic-devel. tokei puts it at about 75k lines of C++, which in my mind classifies it nicely towards the small end of “medium sized”. And according to the CMakeLists find_package directives, its direct dependencies are:

Boost: filesystem, program_options, system, thread
urdfdom_headers
PkgConfig
OGRE
OpenGL
Qt5: QtCore, QtGui, QtOpenGL
About 22 ROS libs
Python
Eigen3 (Optional)
TinyXML2

So, MOST of those DLLs are coming from transitive dependencies, not things that it depends on directly. There’s two mega-libraries (Boost and Qt), and OGRE and Eigen are pretty chunky, but there’s still lots of various DLLs that don’t obviously come from any of those. What is libHalf.so, and where does it come from?
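The direct/transitive split can also be checked at the binary level. An ELF file records only its own direct dependencies, as NEEDED entries in its dynamic section, and readelf -d (from binutils) prints them, while ldd walks the whole tree. Here’s a sketch that splits an ldd name list using those entries. The function names are mine, and READELF_SAMPLE and LDD_NAMES are made up for illustration; I did not capture them from the real RViz binary. Also note that “direct” here means “linked directly”, which is not quite the same thing as a find_package line in CMake.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")

# Hypothetical `readelf -d` excerpt, NOT taken from the real binary.
READELF_SAMPLE = """
 0x0000000000000001 (NEEDED)             Shared library: [librviz.so.2d]
 0x0000000000000001 (NEEDED)             Shared library: [libQt5Widgets.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x1000
"""

# Hypothetical subset of names that ldd might report for the same binary.
LDD_NAMES = [
    "librviz.so.2d",
    "libQt5Widgets.so.5",
    "libc.so.6",
    "libHalf.so.12",
    "libkrb5.so.3",
]


def direct_needed(readelf_dynamic_output):
    """Return the library names listed as NEEDED, in file order."""
    return NEEDED.findall(readelf_dynamic_output)


def split_direct_transitive(needed, ldd_names):
    """Partition ldd's names into directly linked and pulled-in-by-others."""
    needed_set = set(needed)
    direct = [name for name in ldd_names if name in needed_set]
    transitive = [name for name in ldd_names if name not in needed_set]
    return direct, transitive


if __name__ == "__main__":
    direct, transitive = split_direct_transitive(
        direct_needed(READELF_SAMPLE), LDD_NAMES
    )
    print("direct:", direct)
    print("transitive:", transitive)
```

Running readelf -d on each transitive library in turn would, in principle, let you trace which direct dependency drags in something like libHalf.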

Let’s try building this and see how it goes! …wait, there’s no build instructions that I can find. It looks like it just uses CMake, so let’s try that? It apparently uses Catkin, ROS’s pile of custom CMake build scripts, so you can’t just do the usual cd build; cmake ..; make but need some other stuff too.

…You know, let’s not try building it. I’ve used Catkin just enough to know that it’s a great way to convert time into high blood pressure.

Great, what’s next?

Now, maybe ROS’s toolbase isn’t the greatest, cleanest code in the world. I can certainly believe that. Let’s try a few other real programs from different places. What’s another nontrivial C/C++ program I use a lot? Uh, the Evolution email client, why not:

$ ldd /usr/bin/evolution | wc -l
192

…Yeah I don’t even want to look at that. Why not something that isn’t part of GNOME? OBS Studio?

$ ldd /usr/bin/obs | wc -l
151

Ummmmm. VLC?

$ ldd /usr/bin/vlc 
	linux-vdso.so.1 (0x00007ffe586d8000)
	libvlc.so.5 => /usr/lib/x86_64-linux-gnu/libvlc.so.5 (0x00007fe5cc6dc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe5cc4bd000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe5cc2b9000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe5cbec8000)
	libvlccore.so.9 => /usr/lib/x86_64-linux-gnu/libvlccore.so.9 (0x00007fe5cbbb8000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe5cb81a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fe5ccb06000)
	libidn.so.11 => /lib/x86_64-linux-gnu/libidn.so.11 (0x00007fe5cb5e7000)
	libdbus-1.so.3 => /lib/x86_64-linux-gnu/libdbus-1.so.3 (0x00007fe5cb39a000)
	libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007fe5cb116000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe5caf0e000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fe5cace8000)
	liblz4.so.1 => /usr/lib/x86_64-linux-gnu/liblz4.so.1 (0x00007fe5caacc000)
	libgcrypt.so.20 => /lib/x86_64-linux-gnu/libgcrypt.so.20 (0x00007fe5ca7b0000)
	libgpg-error.so.0 => /lib/x86_64-linux-gnu/libgpg-error.so.0 (0x00007fe5ca59b000)

There we go! Now that’s actually small enough to be interesting, especially contrasted with those other programs. There’s some system stuff (why is libsystemd in there? For DBus stuff?), some compression libs and other miscellaneous stuff, GPG for some damn reason, and libvlc and libvlccore.

Now this should be an interesting contrast. Let’s try building it and see what happens.

Okay, first off, it has an actual source code release, with a tarball to download, not just a GitHub page saying “git clone from the master branch”. Take a few moments and think about what that actually implies. In 2010 this was the norm, back when SourceForge didn’t quite completely suck yet, and now in 2020 it’s uncommon enough to find as part of my normal dev workflow that it deserves mention. Think about how much smaller that makes the divide between the upstream developer making a thing and the developer using the thing as part of another thing. Second, it has build instructions (unlike RViz) and they start with the disclaimer “This guide is intended for developers and power users. Compiling VLC is not an easy task.”

I’m building on Ubuntu, so following the instructions I start off with sudo apt-get install git build-essential pkg-config libtool automake autopoint gettext. That’s basically just tooling: gettext is the GNU i18n tools and libraries, and autopoint, which I’ve never seen before, is some peripheral tool for working with it. Then there’s a TON of 3rd-party plugins and stuff, but I don’t have all day so let’s just use the stock Ubuntu install’s build deps. Turns out there’s a cool program named debfoster that lists all the transitive dependencies that a package requires:

$ debfoster -d vlc
Package vlc depends on:
  adduser adwaita-icon-theme apt apt-utils at-spi2-core bzip2 ca-certificates coreutils cpp cpp-7
  dbus dconf-gsettings-backend dconf-service debconf debconf-i18n dpkg fdisk file fontconfig
  fontconfig-config fonts-dejavu-core fonts-freefont-ttf gcc-7-base gcc-8-base glib-networking
  glib-networking-common glib-networking-services gpgv gsettings-desktop-schemas
  gtk-update-icon-cache hicolor-icon-theme humanity-icon-theme i965-va-driver init-system-helpers
  krb5-locales liba52-0.7.4 libaa1 libaacs0 libacl1 libapparmor1 libapt-inst2.0 libapt-pkg5.0
  libarchive13 libaribb24-0 libasn1-8-heimdal libasound2 libasound2-data libass9 libasyncns0
  libatk-bridge2.0-0 libatk1.0-0 libatk1.0-data libatomic1 libatspi2.0-0 libattr1 libaudit-common
  libaudit1 libauthen-sasl-perl libavahi-client3 libavahi-common-data libavahi-common3 libavc1394-0
  libavcodec57 libavformat57 libavutil55 libbasicusageenvironment1 libbdplus0 libblkid1 libbluray2
  libbsd0 libbz2-1.0 libc6 libcaca0 libcairo-gobject2 libcairo2 libcap-ng0 libcap2 libcddb2
  libchromaprint1 libcolord2 libcom-err2 libcomerr2 libcroco3 libcrystalhd3 libcups2
  libdata-dump-perl libdatrie1 libdb5.3 libdbus-1-3 libdc1394-22 libdca0 libdconf1
  libdouble-conversion1 libdrm-amdgpu1 libdrm-common libdrm-intel1 libdrm-nouveau2 libdrm-radeon1
  libdrm2 libdvbpsi10 libdvdnav4 libdvdread4 libebml4v5 libedit2 libegl-mesa0 libegl1 libelf1
  libencode-locale-perl libepoxy0 libevdev2 libexpat1 libfaad2 libfdisk1 libffi6 libfile-basedir-perl
  libfile-desktopentry-perl libfile-listing-perl libfile-mimeinfo-perl libflac8 libfont-afm-perl
  libfontconfig1 libfontenc1 libfreetype6 libfribidi0 libgbm1 libgcc1 libgcrypt20 libgdbm-compat4
  libgdbm5 libgdk-pixbuf2.0-0 libgdk-pixbuf2.0-common libgl1 libgl1-mesa-dri libgl1-mesa-glx
  libglapi-mesa libgles2 libglib2.0-0 libglib2.0-data libglvnd0 libglx-mesa0 libglx0 libgme0 libgmp10
  libgnutls30 libgomp1 libgpg-error0 libgpm2 libgraphite2-3 libgroupsock8 libgsm1 libgssapi-krb5-2
  libgssapi3-heimdal libgtk-3-0 libgtk-3-bin libgtk-3-common libgudev-1.0-0 libharfbuzz0b
  libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhogweed4 libhtml-form-perl
  libhtml-format-perl libhtml-parser-perl libhtml-tagset-perl libhtml-tree-perl libhttp-cookies-perl
  libhttp-daemon-perl libhttp-date-perl libhttp-message-perl libhttp-negotiate-perl
  libhx509-5-heimdal libice6 libicu60 libidn11 libidn2-0 libinput-bin libinput10 libio-html-perl
  libio-socket-ssl-perl libipc-system-simple-perl libisl19 libjansson4 libjbig0 libjpeg-turbo8
  libjpeg8 libjson-glib-1.0-0 libjson-glib-1.0-common libk5crypto3 libkate1 libkeyutils1 libkmod2
  libkrb5-26-heimdal libkrb5-3 libkrb5support0 liblcms2-2 libldap-2.4-2 libldap-common libldb1
  liblirc-client0 liblivemedia62 libllvm9 liblocale-gettext-perl liblua5.2-0 liblwp-mediatypes-perl
  liblwp-protocol-https-perl liblz4-1 liblzma5 liblzo2-2 libmad0 libmagic-mgc libmagic1
  libmailtools-perl libmatroska6v5 libmicrodns0 libmount1 libmp3lame0 libmpc3 libmpcdec6 libmpeg2-4
  libmpfr6 libmpg123-0 libmtdev1 libmtp-common libmtp-runtime libmtp9 libncurses5 libncursesw5
  libnet-dbus-perl libnet-http-perl libnet-libidn-perl libnet-smtp-ssl-perl libnet-ssleay-perl
  libnettle6 libnfs11 libnotify4 libnuma1 libogg0 libopenjp2-7 libopenmpt-modplug1 libopenmpt0
  libopus0 libp11-kit0 libpam-modules libpam-modules-bin libpam0g libpango-1.0-0 libpangocairo-1.0-0
  libpangoft2-1.0-0 libpciaccess0 libpcre3 libperl5.26 libpixman-1-0 libplacebo4 libpng16-16 libpopt0
  libpostproc54 libprocps6 libprotobuf-lite10 libproxy-tools libproxy1v5 libpulse0 libpython2.7
  libpython2.7-minimal libpython2.7-stdlib libqt5core5a libqt5dbus5 libqt5gui5 libqt5network5
  libqt5svg5 libqt5widgets5 libqt5x11extras5 libraw1394-11 libreadline7 libresid-builder0c2a
  librest-0.7-0 libroken18-heimdal librsvg2-2 librsvg2-common libsamplerate0 libsasl2-2
  libsasl2-modules libsasl2-modules-db libsdl-image1.2 libsdl1.2debian libseccomp2 libsecret-1-0
  libsecret-common libselinux1 libsemanage-common libsemanage1 libsensors4 libsepol1 libshine3
  libshout3 libsidplay2 libslang2 libsm6 libsmartcols1 libsmbclient libsnappy1v5 libsndfile1
  libsndio6.1 libsoup-gnome2.4-1 libsoup2.4-1 libsoxr0 libspeex1 libspeexdsp1 libsqlite3-0
  libssh-gcrypt-4 libssh2-1 libssl1.1 libstdc++6 libswresample2 libswscale4 libsystemd0 libtag1v5
  libtag1v5-vanilla libtalloc2 libtasn1-6 libtdb1 libtevent0 libtext-charwidth-perl
  libtext-iconv-perl libtext-wrapi18n-perl libthai-data libthai0 libtheora0 libtie-ixhash-perl
  libtiff5 libtimedate-perl libtinfo5 libtry-tiny-perl libtwolame0 libudev1 libunistring2 libupnp6
  liburi-perl libusageenvironment3 libusb-1.0-0 libuuid1 libva-drm2 libva-wayland2 libva-x11-2 libva2
  libvdpau1 libvlc-bin libvlc5 libvlccore9 libvorbis0a libvorbisenc2 libvorbisfile3 libvpx5
  libvulkan1 libwacom-bin libwacom-common libwacom2 libwavpack1 libwayland-client0 libwayland-cursor0
  libwayland-egl1 libwayland-egl1-mesa libwayland-server0 libwbclient0 libwebp6 libwebpmux3
  libwind0-heimdal libwrap0 libwww-perl libwww-robotrules-perl libx11-6 libx11-data
  libx11-protocol-perl libx11-xcb1 libx264-152 libx265-146 libxau6 libxaw7 libxcb-dri2-0
  libxcb-dri3-0 libxcb-glx0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-present0 libxcb-randr0
  libxcb-render-util0 libxcb-render0 libxcb-shape0 libxcb-shm0 libxcb-sync1 libxcb-util1
  libxcb-xfixes0 libxcb-xinerama0 libxcb-xkb1 libxcb-xv0 libxcb1 libxcomposite1 libxcursor1
  libxdamage1 libxdmcp6 libxext6 libxfixes3 libxft2 libxi6 libxinerama1 libxkbcommon-x11-0
  libxkbcommon0 libxml-parser-perl libxml-twig-perl libxml-xpathengine-perl libxml2 libxmu6 libxmuu1
  libxpm4 libxrandr2 libxrender1 libxshmfence1 libxt6 libxtst6 libxv1 libxvidcore4 libxxf86dga1
  libxxf86vm1 libzstd1 libzvbi-common libzvbi0 lsb-base mesa-va-drivers mesa-vdpau-drivers
  mime-support multiarch-support netbase openssl passwd perl perl-base perl-modules-5.26
  perl-openssl-defaults procps psmisc python-talloc qttranslations5-l10n readline-common samba-libs
  sensible-utils shared-mime-info tar ubuntu-keyring ubuntu-mono ucf udev util-linux uuid-runtime
  va-driver-all vdpau-driver-all vlc-bin vlc-data vlc-l10n vlc-plugin-base vlc-plugin-notify
  vlc-plugin-qt vlc-plugin-samba vlc-plugin-skins2 vlc-plugin-video-output vlc-plugin-video-splitter
  vlc-plugin-visualization x11-common x11-utils x11-xserver-utils xdg-user-dirs xdg-utils xkb-data
  xz-utils zlib1g

…welp, in for a penny I guess. Just run apt-get build-dep vlc, and let’s gooooooo!

…Well, I have to admit, half a million lines of C compiles a lot faster than half a million lines of Rust generally does. It took about 2 minutes on one core, and produced a binary and libvlc.so that are together under a meg in size. Where’s the rest of it? Oh, it also spat out another 110 MB of dynamically-loaded libraries. So, the reason the ldd output is small is now clear: VLC includes a module loading system, and all the actual media code lives in modules that are loaded on demand instead of by the system loader.
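To make the “loaded on demand instead of by the system loader” bit concrete, here’s a small Python analogy. It is not VLC’s actual mechanism (VLC loads native shared objects at runtime); the LazyModules class and the registry contents are hypothetical, with standard-library modules standing in for plugins. The point is only that nothing in the registry is loaded until someone asks for it, which is why a static listing like ldd never sees it.

```python
import importlib


class LazyModules:
    """Map a capability name to a module that is imported on first use."""

    def __init__(self, registry):
        self._registry = dict(registry)  # capability -> module name
        self._loaded = {}                # capability -> module object

    def get(self, capability):
        """Import the backing module the first time it is requested."""
        if capability not in self._loaded:
            module_name = self._registry[capability]
            self._loaded[capability] = importlib.import_module(module_name)
        return self._loaded[capability]

    def loaded(self):
        """Names of the capabilities that have actually been loaded."""
        return sorted(self._loaded)


# Hypothetical registry: stdlib modules play the role of media plugins.
PLUGINS = LazyModules({"wav": "wave", "json": "json", "csv": "csv"})

if __name__ == "__main__":
    print(PLUGINS.loaded())   # nothing requested yet
    PLUGINS.get("wav")        # first use triggers the import
    print(PLUGINS.loaded())
```

The upside of this design is a tiny startup footprint; the downside is exactly what we just ran into, namely that the real dependency list is invisible until runtime.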

Also, contrary to the official build docs, building VLC is in fact pretty easy… if you’re using Ubuntu and the libs it packages.

Yeah, but those are all big complex GUI programs!

Obviously, if we want a look at how many dependencies are used by Real Nontrivial C Programs, looking at media-rich, network-touching GUI programs is going to get us a pretty biased view. Okay, what about non-GUI stuff? That’s bound to be simpler, right? Let’s take something command-line only, definitely nontrivial but not too big, preferably not something interactive or media-heavy. lighttpd maybe?

$ debfoster -d lighttpd
Package lighttpd depends on:
  adduser apt apt-utils bzip2 ca-certificates debconf debconf-i18n dpkg file gcc-8-base gpgv libacl1
  libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
  libdb5.3 libfam0 libffi6 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
  libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libmagic-mgc libmagic1 libnettle6 libp11-kit0
  libpam-modules libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common
  libsemanage1 libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl
  libtext-iconv-perl libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 lsb-base mime-support
  openssl passwd perl-base spawn-fcgi tar ubuntu-keyring xz-utils zlib1g

That’s not small, but it’s far smaller than anything else we’ve looked at so far. In fact, it seems pretty reasonable to my eye. There’s the usual industry-standard libraries that you would expect, like pcre3 and libbz2, some domain-specific stuff like libtasn1 and libmagic, and some miscellaneous other bits and pieces. There’s still some weirdness though; what does a web server need with libgmp10 or libseccomp2, anyway? Maybe those are just needed by something like ubuntu-keyring, which appears to basically just be administrative? But dash, below, doesn’t need ubuntu-keyring.

For the sake of SCIENCE, let’s take a look at the ldd output for lighttpd as well, to get a sense of how well our two proxies, “apt dependencies” and “dynamically linked libraries”, line up with each other:

$ ldd $(which lighttpd)
	linux-vdso.so.1 (0x00007ffd443fe000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f756969e000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f756949a000)
	libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007f7569295000)
	libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f7569008000)
	libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f7568b3d000)
	libfam.so.0 => /usr/lib/x86_64-linux-gnu/libfam.so.0 (0x00007f7568934000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7568543000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7568324000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7569b4f000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7567f9b000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7567d83000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f75679e5000)

It looks like, similar to vlc, lighttpd manages a lot of modules at runtime instead of specifying them all at compile time. So debfoster would be generally a better guess at a program’s full dependencies, though it may end up saying a program needs more than it does – I rather doubt that you absolutely require debconf to build lighttpd, but you need it to build the Ubuntu package as distributed.

Now I want to see how low we can go. Let’s take a look at a small, simple command-line tool that does one thing well… heck, why not debfoster itself?

$ debfoster -d debfoster
Package debfoster depends on:
  adduser apt apt-utils ca-certificates debconf debconf-i18n dpkg gcc-8-base gpgv libacl1
  libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
  libdb5.3 libffi6 libgc1c2 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
  libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libnettle6 libp11-kit0 libpam-modules
  libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common libsemanage1
  libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl
  libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 openssl passwd perl-base tar ubuntu-keyring
  zlib1g

Now that’s weird, that’s almost as many dependencies as lighttpd. Let’s see… I’ll give libc and libgcc and such a pass for being system libraries, but apart from those there’s 40 libraries by my count. Some, like libsystemd0 and libselinux1, are probably everywhere whether we like it or not, but others like libhogweed4 are a little more exotic (and bewildering). This is a program that is not even 2500 significant lines of C code, by tokei’s count. Looking at its configure.in and trying to decipher as much as I can, all it actually tries to use directly is libgettext, libavl, and libgc. Weird.
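Rather than squinting at two walls of package names, we can diff them. This sketch parses the two debfoster -d listings shown above (pasted in verbatim) and compares them as sets. The parse_debfoster helper is my own and assumes the output shape seen here: one “Package X depends on:” header followed by whitespace-separated package names.

```python
LIGHTTPD = """
Package lighttpd depends on:
  adduser apt apt-utils bzip2 ca-certificates debconf debconf-i18n dpkg file gcc-8-base gpgv libacl1
  libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
  libdb5.3 libfam0 libffi6 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
  libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libmagic-mgc libmagic1 libnettle6 libp11-kit0
  libpam-modules libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common
  libsemanage1 libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl
  libtext-iconv-perl libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 lsb-base mime-support
  openssl passwd perl-base spawn-fcgi tar ubuntu-keyring xz-utils zlib1g
"""

DEBFOSTER = """
Package debfoster depends on:
  adduser apt apt-utils ca-certificates debconf debconf-i18n dpkg gcc-8-base gpgv libacl1
  libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
  libdb5.3 libffi6 libgc1c2 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
  libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libnettle6 libp11-kit0 libpam-modules
  libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common libsemanage1
  libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl
  libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 openssl passwd perl-base tar ubuntu-keyring
  zlib1g
"""


def parse_debfoster(output):
    """Return the set of package names from `debfoster -d` output."""
    packages = set()
    for line in output.splitlines():
        # Skip blank lines and the "Package X depends on:" header.
        if not line.strip() or line.startswith("Package "):
            continue
        packages.update(line.split())
    return packages


lighttpd = parse_debfoster(LIGHTTPD)
debfoster = parse_debfoster(DEBFOSTER)
shared = lighttpd & debfoster
only_lighttpd = lighttpd - debfoster
only_debfoster = debfoster - lighttpd

if __name__ == "__main__":
    print(len(shared), "shared")
    print("only lighttpd:", sorted(only_lighttpd))
    print("only debfoster:", sorted(only_debfoster))
```

On these two listings it finds 57 shared packages, nine that only lighttpd pulls in, and exactly one (libgc1c2) that only debfoster does. That suggests most of the list is a common base dragged in by packaging machinery like apt and dpkg rather than anything either program asked for.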

“Yeah but that’s GNU software” I hear you say, “GNU tools are bloat-y”. All right, let’s look at something that’s designed to be minimal, the dash shell. 13k SLOC, all written in plain C, no frills.

$ debfoster -d dash
Package dash depends on:
  debianutils dpkg gcc-8-base libacl1 libattr1 libbz2-1.0 libc6 libgcc1 liblzma5 libpcre3 libselinux1
  libzstd1 tar zlib1g

Okay, that’s actually minimal. System stuff, some compression stuff, a little shell-y stuff, regexes, that’s it. Let’s build it.

$ ./autogen.sh; ./configure; time make
...
real	0m0.734s
user	0m0.628s
sys	0m0.118s

FINALLY, reasonable software! Builds fast, bare minimum of deps, nothin’ but what’s absolutely necessary. And it only took… how much work to find? Awesome, let’s try it out!

fish ~ > ~/tmp/dash-0.5.10.2/src/dash
$ cd cd ~/tmp/dash-0.5.10.2█
./src/dash: 1: cd: can't cd to cd

shit a typo, hang on

$ ^[[A^[[A^[[A█

wait, arrow keys don't work

$ cd ~/tmp/dash-0.               █

oh yeah no tab-complete

$ dc ~/tmp/dash-0.5.10.2^A^A█

crap no line-editing at all?  uh...

$ ^D

dash is designed to do one single thing with no frills: run shell scripts. If you’re not using it to run shell scripts, it’s not very useful.

There’s an interesting observation here: The biggest programs so far are the ones designed for humans to interact with. Turns out humans are complicated and making computers do things they want is hard. There’s the argument that humans need to learn computers better, with simpler and more composable primitives, and it will result in great benefits in smaller and more powerful programs. Then there’s the counter-argument that if a program is written once and used many, many times, the extra time a programmer puts into making the learning curve shallower will rapidly be made up for by the time saved by the users. Both these arguments are valid. “Design” is the art of striking a balance between them, and different people will want different balances.

Gotta go deeper

While we’re here, let’s look a little deeper into some of these programs. Part of my theory is that C programs commonly omit deps by re-implementing the bits they need themselves, because for small stuff it’s far easier to just write or copy-paste your own code than to actually use a library. Let’s rummage through some of these programs, briefly, and see if that’s the case.

  • dash: Well, there’s a handful of linked lists and a simple memory allocator, but no gratuitous re-implementations of memcpy or anything like that.
  • lighttpd: Aha, here we are. There’s a SHA1 impl, a base64 impl, a resizeable array, a URL parser, a CRC32 implementation, a layer over the poll(2) equivalent on various platforms, a LALR parser generator called LEMON (specifically designed to be vendored into other code bases), an MD5 impl, an RNG that tries to get secure randomness from several different sources depending on the system, a safe_memclear() that hopefully hasn’t been broken by compiler changes since it was written, a splay tree, and another, different, resizeable array. Or, in terms of Rust crates: sha1, base64, std::vec::Vec, url, crc, mio, lalrpop or something, md5, rand, actually I can’t find a simple safe way to zero memory in Rust, std::collections::BTreeSet, and again std::vec::Vec. These add up to about 8000 significant lines of code, roughly 15% of lighttpd.
  • vlc: Why not, I ain’t scared! This won’t be a complete list, but a skim finds a command line parser, some chunks of libc that are apparently commonly missing on some platforms (mostly Unicode stuff), what looks like a threadpool, a block memory allocator, a thread-safe FIFO, a subsystem for recognizing file magic numbers, an HTTP cookie library, an MD5 implementation, a small MIME type guesser, a bunch of time parsing stuff, a thread-safe wrapper around the POSIX drand48(3) function, a pile of PGP key stuff, an XML parser, a bunch of string functions, a base64 decoder… You get the idea.
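
About that safe_memclear() equivalent: here’s a minimal sketch of how the zeroing is usually done by hand in Rust, assuming volatile writes followed by a compiler fence are good enough for your purposes. The name secure_zero is made up for illustration, it isn’t a standard-library function, and it needs an unsafe block inside, so it’s only “safe” in the sense of having a safe signature.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrite `buf` with zeros in a way the optimizer should not remove.
/// `secure_zero` is a hypothetical helper name, not a standard-library API.
pub fn secure_zero(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // A volatile write must be performed even if `buf` is never read again,
        // which is exactly the case where a plain store could be optimized out.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Stop the compiler from moving later memory operations ahead of the wipe.
    compiler_fence(Ordering::SeqCst);
}
```

Calling secure_zero(&mut key) on a byte buffer leaves every byte at zero. It does nothing about copies of the secret that ended up elsewhere (registers, moved values, swap), which is part of why this is harder than it looks.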

So, yeah. An argument I’ve heard against the Go/Rust paradigm of statically linking everything is “you end up with bunches of copies of the same library code compiled into each different program!” All I can say to that is… lol. That said, with these vendored utility libraries there’s a strong incentive to keep them small, simple and task-specific, which is good. On the flip side, I wonder how many times vlc’s XML parser has been fuzzed?

Conclusions

Okay, so what have we learned? Well, first off, my thesis of “it isn’t just Rust or JS that has this problem, you know”… I’m not going to call it conclusively demonstrated, but I’ve found some strong support and a couple decent counterpoints. There are potentially a lot of unexpected dependencies hiding in even a quite small C program. Linux package managers do hide the complexity from you by making it all just “part of what the computer does anyway”, and sometimes that involves a staggering amount of STUFF. A medium-sized Rust project can easily tip the scales at 200-300 crates, which is still rather more dependencies than anything I’ve looked at here, but that’s explained by the simple fact that using libraries in C is such a monumental pain in the ass that it’s not worth trying for anything unless it’s bigger than… well, a base64 parser or a hash function.

The real thing with tools like go, cargo and npm is they move that library management out of the distro’s domain and into the programmer’s. If I’m writing software in C, I’m almost certainly just going to be using the system packages that Ubuntu or Debian provide for all the deps I need, so really it’s a question of “does compiling this program need a lot of deps” or “does compiling this program on a recent Debian system need a lot of deps”. I consider myself a fairly typical programmer, because I’m lazy and if I need to do something that isn’t directly associated with the actual problem I’m trying to solve then I don’t want to spend a lot of time on it. If a Debian package doesn’t exist for a C lib I want, or it’s not up to date enough, I have a 90% chance of saying “well it’s not that important anyway”, an 8% chance of vendoring it into my own code, and a 2% chance of saying “Well I guess I can try to make a real apt package for it”. Debian is big enough that most things you would want are pretty up to date and are in there somewhere. Then, if someone wants to run my C program on Red Hat, I’ll say “good luck, I tried to make it portable but no promises”. Maybe it works and maybe it doesn’t but either way it’s their problem, not mine, unless I’m being paid for it to be my problem. And if a computer doesn’t run Debian, Red Hat, Windows, or a close derivative of one of those things, then I’m just not going to write complex C software for it. The amount of work it will take to get anything done will far outweigh the novelty. Especially when I can use a different language instead, one that’s more capable and takes less work to deal with.

For an example, just look at my reaction to the programs I tried building. RViz: “Heck, I’m not spending 30 minutes just figuring out how to compile it, I’ll pass”. VLC: “Oh, there’s an apt command to just install everything it might need in one go, let’s just do that instead of spending literal hours getting the source needed for each plugin”. By doing that, building a very complex C program becomes almost as easy as using cargo.

go-style package management tools are designed to build programs that work. Not programs that work on Debian if you have the right libs installed, or on Windows 7 or newer, or whatever, but “if you compile it on an OS, it works on that OS”. Rust doesn’t have compiler targets for Debian, or Red Hat, or Arch Linux: it has x86_64-unknown-linux-gnu. If you actually want reproducible software which can use libraries that other people write, and which runs on any platform it builds on, then you have only a few realistic options:

  • One, you can control all the software on the computer that your software might need to talk with. Make sure you have all the right libs installed, in the right places, with the right versions, and your program will work fine. This is what Debian and Red Hat do, this is what Windows does from a different direction. When you don’t have mechanisms for handling this then DLL Hell starts, you can’t move binaries from one computer to another, and this is why installing packages from source all the time on your dev machine is a great way to result in programs with Funky Behavior that break whenever you touch stuff. This is why devops is now a profession, because to make a complex program work you have to be able to control the system outside the program itself. Being able to do this and not need permission to manage the whole computer is also exactly what Docker and other container solutions do. They do other things too, but that automated install and configuration of complex software, without needing root on the target system and without the programs being able to accidentally step on each other, is their killer feature.
  • Two, you can control all the dependencies of your software as part of your build process. This is what go-style package managers do. This is also one reason why go, npm and cargo all statically compile stuff, and recompile everything from scratch at every opportunity: it removes a variable from the process, which isn’t strictly necessary but which simplifies things a lot. This is also exactly what systems like Nix and Guix do: they just take the “manage all dependencies for all programs independently and from source” portion and apply it to the whole OS environment.
  • Or, I suppose, number three, write as much as possible yourself, vendor everything else, and you have 100% control of all dependencies and can do whatever you want. The cost of this is duplicating a lot of code, and making it hard to update any 3rd party code you do use. It also means that you are in charge of testing, verifying and maintaining all this code, which is no small task. Sometimes this is feasible, but it still takes a lot of work that is often tangential to the problem you actually want to solve. Building large, complex systems this way often takes a kinda special type of person.

(Note that containers are another rant: what we actually want for robust, reliable infrastructure is an environment and API that deals with all the above stuff one way or another, and provides explicit, deny-by-default control of everything programs are allowed to do, a la a capability system. Linux is not this, but people keep trying to turn Linux into that with tools like Docker. I think that if an API like WASI or such ever gets popular, that has these properties designed in from the start, life will become much better. Making such a system that is actually nice for humans to use interactively is still a problem, but a different problem.)

If you take a complicated Go or Rust program that doesn’t depend on OS-level details and statically compile it with the same compiler on a Debian system and a Red Hat system, you will get the same program to within some Sufficiently Small delta. There are always caveats of course, software is complicated, but let’s pretend it’s not actively trying to be complicated. You will get a binary that will not freak out because it wanted libPocoFoundation.so.50 but the latest version it can find is libPocoFoundation.so.48, or because libPocoFoundation.so.50 is in /usr/lib instead of /usr/local/lib. It will not suddenly crash because a system’s copy of libjpeg.so.8 was compiled with a different set of features than it expected. That sort of nonsense is the problem these systems are trying to solve, and the three solutions above are the ones we’ve discovered so far. None is perfect; with all of them you’re making a tradeoff between things they handle well and things they don’t.
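
To make that freak-out concrete, here’s a toy model of the lookup a dynamic loader does, assuming it boils down to an exact file-name match in a fixed list of directories. The resolve function and its parameters are made up for illustration; real loaders such as ld.so also consult a cache, RPATH/RUNPATH entries and LD_LIBRARY_PATH.

```rust
/// Toy model of shared-library lookup (hypothetical, heavily simplified):
/// try each search directory in order and accept only an exact file-name match.
/// `installed` stands in for the files that actually exist on the system.
pub fn resolve(soname: &str, search_dirs: &[&str], installed: &[&str]) -> Option<String> {
    search_dirs
        .iter()
        // Build the candidate path the loader would try in this directory.
        .map(|dir| format!("{}/{}", dir, soname))
        // The first candidate that exists wins; "close enough" versions don't count.
        .find(|path| installed.contains(&path.as_str()))
}
```

With only libPocoFoundation.so.48 in /usr/lib, asking for libPocoFoundation.so.50 comes back None no matter how similar the two versions are, and a copy of .so.50 sitting in /usr/local/lib only helps if that directory is on the search list. A statically linked binary never has to ask the question.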

There are undoubtedly other useful ones to be made though, with different sets of tradeoffs. So, what other solutions can we invent?