LetsBeRealAboutDependencies
Introduction
I’ve been wanting to do this for a few weeks, ever since I typed
apt install xorg-dev
on my Debian system and it installed
literally 30 megabytes of header files. A perennial complaint among
various sections of the Rust community is “all these darn programs that
use 300 crates to do anything”. These complaints are valid, but
my argument is that they’re also not NEW, and they’re certainly not
unique to Rust. Rather the difference in dependencies between something
written in Rust vs. “traditional” C or C++ is that on Unix systems, all
these dependencies are still there, just handled by the system instead
of the compiler directly. The distro maintainers do more of the work,
and our build systems assume the presence of various system libraries in
system places. The only thing new about it is that programmers are
exposed to more of the costs of it up-front.
First Attempt
So to start out, let’s collect some data. Let’s take a look at a random Real Life C++ Program What Does Useful Stuff, RViz. To begin, let’s just look at the dynamic libraries it uses, as a hopefully-accurate-ish proxy for the full dependency tree:
$ ldd /usr/bin/rviz
linux-vdso.so.1 (0x00007ffd93ff7000)
librviz.so.2d => /usr/lib/x86_64-linux-gnu/librviz.so.2d (0x00007f3e194b1000)
libQt5Widgets.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5 (0x00007f3e18c6a000)
libQt5Core.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 (0x00007f3e1851f000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f3e18196000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3e17f7e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e17b8d000)
libboost_filesystem.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.65.1 (0x00007f3e17973000)
libboost_program_options.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_program_options.so.1.65.1 (0x00007f3e176f2000)
libboost_system.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_system.so.1.65.1 (0x00007f3e174ed000)
libboost_thread.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1 (0x00007f3e172c8000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3e170a9000)
libimage_transport.so.0d => /usr/lib/x86_64-linux-gnu/libimage_transport.so.0d (0x00007f3e16e23000)
libtinyxml.so.2.6.2 => /usr/lib/x86_64-linux-gnu/libtinyxml.so.2.6.2 (0x00007f3e16c0e000)
libclass_loader.so.0d => /usr/lib/x86_64-linux-gnu/libclass_loader.so.0d (0x00007f3e169e6000)
libresource_retriever.so.0d => /usr/lib/x86_64-linux-gnu/libresource_retriever.so.0d (0x00007f3e167e0000)
libroslib.so.0d => /usr/lib/x86_64-linux-gnu/libroslib.so.0d (0x00007f3e165cd000)
libtf.so.0d => /usr/lib/x86_64-linux-gnu/libtf.so.0d (0x00007f3e163a1000)
libmessage_filters.so.1d => /usr/lib/x86_64-linux-gnu/libmessage_filters.so.1d (0x00007f3e1619c000)
libroscpp.so.1d => /usr/lib/x86_64-linux-gnu/libroscpp.so.1d (0x00007f3e15de4000)
librosconsole.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole.so.2d (0x00007f3e15bac000)
libroscpp_serialization.so.0d => /usr/lib/x86_64-linux-gnu/libroscpp_serialization.so.0d (0x00007f3e159a9000)
librostime.so.0d => /usr/lib/x86_64-linux-gnu/librostime.so.0d (0x00007f3e15789000)
libconsole_bridge.so.0.4 => /usr/lib/x86_64-linux-gnu/libconsole_bridge.so.0.4 (0x00007f3e15584000)
libOgreOverlay.so.1.9.0 => /usr/lib/x86_64-linux-gnu/libOgreOverlay.so.1.9.0 (0x00007f3e15324000)
libOgreMain.so.1.9.0 => /usr/lib/x86_64-linux-gnu/libOgreMain.so.1.9.0 (0x00007f3e14bad000)
libGL.so.1 => /usr/lib/x86_64-linux-gnu/libGL.so.1 (0x00007f3e14921000)
libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f3e145e9000)
libassimp.so.4 => /usr/lib/x86_64-linux-gnu/libassimp.so.4 (0x00007f3e13c1e000)
libyaml-cpp.so.0.3 => /usr/lib/x86_64-linux-gnu/libyaml-cpp.so.0.3 (0x00007f3e139ae000)
libQt5Gui.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5 (0x00007f3e13245000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3e12ea7000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3e12c8a000)
libicui18n.so.60 => /usr/lib/x86_64-linux-gnu/libicui18n.so.60 (0x00007f3e127e9000)
libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007f3e12432000)
libdouble-conversion.so.1 => /usr/lib/x86_64-linux-gnu/libdouble-conversion.so.1 (0x00007f3e12221000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3e1201d000)
libglib-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x00007f3e11d06000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3e19ae9000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f3e11afe000)
libPocoFoundation.so.50 => /usr/lib/libPocoFoundation.so.50 (0x00007f3e11755000)
libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f3e114d6000)
librospack.so.0d => /usr/lib/x86_64-linux-gnu/librospack.so.0d (0x00007f3e11293000)
libtf2_ros.so.0d => /usr/lib/x86_64-linux-gnu/libtf2_ros.so.0d (0x00007f3e10fdd000)
libtf2.so.0d => /usr/lib/x86_64-linux-gnu/libtf2.so.0d (0x00007f3e10da5000)
libxmlrpcpp.so.1d => /usr/lib/x86_64-linux-gnu/libxmlrpcpp.so.1d (0x00007f3e10b89000)
libcpp_common.so.0d => /usr/lib/x86_64-linux-gnu/libcpp_common.so.0d (0x00007f3e10980000)
librosconsole_log4cxx.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole_log4cxx.so.2d (0x00007f3e10765000)
librosconsole_backend_interface.so.2d => /usr/lib/x86_64-linux-gnu/librosconsole_backend_interface.so.2d (0x00007f3e10563000)
liblog4cxx.so.10 => /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10 (0x00007f3e1019a000)
libboost_regex.so.1.65.1 => /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.65.1 (0x00007f3e0fe92000)
libfreetype.so.6 => /usr/lib/x86_64-linux-gnu/libfreetype.so.6 (0x00007f3e0fbde000)
libXt.so.6 => /usr/lib/x86_64-linux-gnu/libXt.so.6 (0x00007f3e0f975000)
libXaw.so.7 => /usr/lib/x86_64-linux-gnu/libXaw.so.7 (0x00007f3e0f701000)
libfreeimage.so.3 => /usr/lib/x86_64-linux-gnu/libfreeimage.so.3 (0x00007f3e0f451000)
libzzip-0.so.13 => /usr/lib/x86_64-linux-gnu/libzzip-0.so.13 (0x00007f3e0f24a000)
libGLX.so.0 => /usr/lib/x86_64-linux-gnu/libGLX.so.0 (0x00007f3e0f019000)
libGLdispatch.so.0 => /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0 (0x00007f3e0ed63000)
libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f3e0eb3b000)
libminizip.so.1 => /usr/lib/x86_64-linux-gnu/libminizip.so.1 (0x00007f3e0e930000)
libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f3e0e6fe000)
libharfbuzz.so.0 => /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0 (0x00007f3e0e460000)
libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007f3e0c8b7000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f3e0c645000)
libnghttp2.so.14 => /usr/lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f3e0c420000)
libidn2.so.0 => /usr/lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f3e0c203000)
librtmp.so.1 => /usr/lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f3e0bfe7000)
libpsl.so.5 => /usr/lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f3e0bdd9000)
libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f3e0bb4c000)
libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f3e0b681000)
libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f3e0b436000)
libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f3e0b1e4000)
liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f3e0afd6000)
libtinyxml2.so.6 => /usr/lib/x86_64-linux-gnu/libtinyxml2.so.6 (0x00007f3e0adc2000)
libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 (0x00007f3e0a845000)
libactionlib.so.0d => /usr/lib/x86_64-linux-gnu/libactionlib.so.0d (0x00007f3e0a623000)
libb64.so.0d => /usr/lib/x86_64-linux-gnu/libb64.so.0d (0x00007f3e0a420000)
libapr-1.so.0 => /usr/lib/x86_64-linux-gnu/libapr-1.so.0 (0x00007f3e0a1eb000)
libaprutil-1.so.0 => /usr/lib/x86_64-linux-gnu/libaprutil-1.so.0 (0x00007f3e09fc0000)
libSM.so.6 => /usr/lib/x86_64-linux-gnu/libSM.so.6 (0x00007f3e09db8000)
libICE.so.6 => /usr/lib/x86_64-linux-gnu/libICE.so.6 (0x00007f3e09b9d000)
libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f3e0998b000)
libXmu.so.6 => /usr/lib/x86_64-linux-gnu/libXmu.so.6 (0x00007f3e09772000)
libXpm.so.4 => /usr/lib/x86_64-linux-gnu/libXpm.so.4 (0x00007f3e09560000)
libjxrglue.so.0 => /usr/lib/x86_64-linux-gnu/libjxrglue.so.0 (0x00007f3e09340000)
libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f3e090d8000)
libopenjp2.so.7 => /usr/lib/x86_64-linux-gnu/libopenjp2.so.7 (0x00007f3e08e82000)
libraw.so.16 => /usr/lib/x86_64-linux-gnu/libraw.so.16 (0x00007f3e08baf000)
libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f3e08938000)
libwebpmux.so.3 => /usr/lib/x86_64-linux-gnu/libwebpmux.so.3 (0x00007f3e0872e000)
libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f3e084c5000)
libIlmImf-2_2.so.22 => /usr/lib/x86_64-linux-gnu/libIlmImf-2_2.so.22 (0x00007f3e08002000)
libHalf.so.12 => /usr/lib/x86_64-linux-gnu/libHalf.so.12 (0x00007f3e07dbf000)
libIex-2_2.so.12 => /usr/lib/x86_64-linux-gnu/libIex-2_2.so.12 (0x00007f3e07ba1000)
libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f3e0799d000)
libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f3e07797000)
libgraphite2.so.3 => /usr/lib/x86_64-linux-gnu/libgraphite2.so.3 (0x00007f3e0756a000)
libunistring.so.2 => /usr/lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f3e071ec000)
libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f3e06e86000)
libhogweed.so.4 => /usr/lib/x86_64-linux-gnu/libhogweed.so.4 (0x00007f3e06c52000)
libnettle.so.6 => /usr/lib/x86_64-linux-gnu/libnettle.so.6 (0x00007f3e06a1c000)
libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f3e0679b000)
libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f3e064c5000)
libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f3e06293000)
libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f3e0608f000)
libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f3e05e84000)
libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f3e05c69000)
libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f3e05a4e000)
libgssapi.so.3 => /usr/lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007f3e0580d000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3e0560a000)
libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f3e05403000)
libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f3e051cb000)
libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f3e04f99000)
libbsd.so.0 => /lib/x86_64-linux-gnu/libbsd.so.0 (0x00007f3e04d84000)
libjpegxr.so.0 => /usr/lib/x86_64-linux-gnu/libjpegxr.so.0 (0x00007f3e04b50000)
liblcms2.so.2 => /usr/lib/x86_64-linux-gnu/liblcms2.so.2 (0x00007f3e048f8000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f3e046c9000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f3e044a3000)
libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f3e04295000)
libIlmThread-2_2.so.12 => /usr/lib/x86_64-linux-gnu/libIlmThread-2_2.so.12 (0x00007f3e0408e000)
libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f3e03d5f000)
libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f3e03b4c000)
libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f3e03948000)
libheimntlm.so.0 => /usr/lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007f3e0373f000)
libkrb5.so.26 => /usr/lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007f3e034b2000)
libasn1.so.8 => /usr/lib/x86_64-linux-gnu/libasn1.so.8 (0x00007f3e03210000)
libhcrypto.so.4 => /usr/lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007f3e02fda000)
libroken.so.18 => /usr/lib/x86_64-linux-gnu/libroken.so.18 (0x00007f3e02dc4000)
libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f3e02bbc000)
libwind.so.0 => /usr/lib/x86_64-linux-gnu/libwind.so.0 (0x00007f3e02993000)
libheimbase.so.1 => /usr/lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007f3e02784000)
libhx509.so.5 => /usr/lib/x86_64-linux-gnu/libhx509.so.5 (0x00007f3e0253a000)
libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f3e02231000)
That’s 133 libs. Sure, one .so file doesn’t necessarily mean one build
dependency, but still, what are they all doing? Some of them are pretty
obvious: SQLite is basically everywhere (for good reason), some things
like libc and linux-vdso are just part of the process of doing business
on an Ubuntu system, and then there’s a few like libGLX and libxcb that
are graphics/GUI libraries that I’d expect a graphics/GUI program to
use. Okay, great, now why the heck is all that Kerberos stuff in there?
And why are there both libheimbase, which is from the Heimdal Kerberos
implementation, and libkrb5, which is the MIT implementation? What the
heck is libutil? libIlmThread-2_2 (not to be confused with 2_1 or 2_3
despite the shared object versioning, I expect!)? What about libapr-1?
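If you want to do more than eyeball a listing like this, a few lines of Python will tally it. This is only a sketch: parse_ldd and the SAMPLE excerpt are made up for illustration, and the regex assumes the two line shapes that ldd prints above (with and without a => path).

```python
import re
from collections import defaultdict

# A few lines copied from the rviz listing above; the full output has 133.
SAMPLE = """\
linux-vdso.so.1 (0x00007ffd93ff7000)
librviz.so.2d => /usr/lib/x86_64-linux-gnu/librviz.so.2d (0x00007f3e194b1000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e17b8d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3e19ae9000)
libPocoFoundation.so.50 => /usr/lib/libPocoFoundation.so.50 (0x00007f3e11755000)
libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f3e064c5000)
libkrb5.so.26 => /usr/lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007f3e034b2000)
"""

# name, optional "=> path", then the load address in parentheses.
LINE = re.compile(r"^\s*(\S+)(?:\s+=>\s+(\S+))?\s+\(0x[0-9a-f]+\)\s*$")

def parse_ldd(text):
    """Return (name, path) pairs; path is None for vdso/loader lines."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries

entries = parse_ldd(SAMPLE)
resolved = [(name, path) for name, path in entries if path is not None]

# Group by the part before ".so" to spot one library name loaded twice.
by_stem = defaultdict(list)
for name, _ in resolved:
    by_stem[name.split(".so")[0]].append(name)
dupes = {stem: names for stem, names in by_stem.items() if len(names) > 1}

print(len(entries), "entries,", len(resolved), "resolved to a file on disk")
print("same name, different versions:", dupes)
```

On this excerpt it reports 7 entries, 5 of them resolved, and flags libkrb5 as showing up under two different version numbers, which is the Kerberos oddity in miniature.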
Note this is not a contrived example. I literally asked myself
“what’s a non-trivial C or C++ program that isn’t a web browser?” and
RViz was the first thing that popped into my head. Now let’s see what
its actual dependencies are. You can get the source for it here, the most
recent stable branch is melodic-devel. tokei puts it at about 75k lines
of C++, which in my mind classifies it nicely towards the small end of
“medium sized”. And according to the CMakeLists find_package directives,
its direct dependencies are:
Boost: filesystem, program_options, system, thread
urdfdom_headers
PkgConfig
OGRE
OpenGL
Qt5: QtCore, QtGui, QtOpenGL
About 22 ROS libs
Python
Eigen3 (Optional)
TinyXML2
So, MOST of those DLL’s are coming from transitive dependencies, not
things that it depends on directly. There’s two mega-libraries (Boost
and Qt), and OGRE and Eigen are pretty chunky, but there’s still
lots of various DLL’s that don’t obviously come from any of
those. What is libHalf.so, and where does it come from?
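The direct-versus-transitive gap is easy to see with a toy graph. A minimal sketch follows; the names are taken from the lists above, but the edges in DEPENDS_ON are guesses made up for illustration and have not been checked against the real packages, so treat the chain it prints as a hypothesis rather than an answer.

```python
from collections import deque

# HYPOTHETICAL edges: who-links-whom here is a guess, not verified.
DEPENDS_ON = {
    "rviz": ["OGRE", "Qt5", "Boost"],
    "OGRE": ["freeimage", "zzip"],
    "freeimage": ["IlmImf", "libjpeg", "libtiff"],
    "IlmImf": ["Half", "Iex", "IlmThread"],
    "Qt5": ["harfbuzz", "icu"],
    "harfbuzz": ["graphite2"],
}

def transitive_deps(graph, root):
    """Breadth-first walk collecting everything reachable from root."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen

def path_to(graph, root, target):
    """Return one dependency chain from root to target, or None."""
    queue = deque([[root]])
    visited = {root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in visited:
                visited.add(dep)
                queue.append(path + [dep])
    return None

direct = DEPENDS_ON["rviz"]
everything = transitive_deps(DEPENDS_ON, "rviz")
chain = path_to(DEPENDS_ON, "rviz", "Half")

print(len(direct), "direct,", len(everything), "total")
print(" -> ".join(chain))
```

Even in this made-up graph, three direct dependencies turn into fourteen libraries, and the one you have never heard of sits four hops away from anything named in the build file.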
Let’s try building this and see how it goes! …wait, there’s no build
instructions that I can find. It looks like it just uses CMake, so let’s
try that? It apparently uses Catkin, ROS’s pile of custom CMake build
scripts, so you can’t just do the usual
cd build; cmake ..; make
but need some other stuff too.
…You know, let’s not try building it. I’ve used Catkin just enough to know that it’s a great way to convert time into high blood pressure.
Great, what’s next?
Now, maybe ROS’s toolbase isn’t the greatest, cleanest code in the world. I can certainly believe that. Let’s try a few other real programs from different places. What’s another nontrivial C/C++ program I use a lot? Uh, the Evolution email client, why not:
$ ldd /usr/bin/evolution | wc -l
192
…Yeah I don’t even want to look at that. Why not something that isn’t part of GNOME? OBS Studio?
$ ldd /usr/bin/obs | wc -l
151
Ummmmm. VLC?
$ ldd /usr/bin/vlc
linux-vdso.so.1 (0x00007ffe586d8000)
libvlc.so.5 => /usr/lib/x86_64-linux-gnu/libvlc.so.5 (0x00007fe5cc6dc000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe5cc4bd000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe5cc2b9000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe5cbec8000)
libvlccore.so.9 => /usr/lib/x86_64-linux-gnu/libvlccore.so.9 (0x00007fe5cbbb8000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe5cb81a000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe5ccb06000)
libidn.so.11 => /lib/x86_64-linux-gnu/libidn.so.11 (0x00007fe5cb5e7000)
libdbus-1.so.3 => /lib/x86_64-linux-gnu/libdbus-1.so.3 (0x00007fe5cb39a000)
libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007fe5cb116000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe5caf0e000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fe5cace8000)
liblz4.so.1 => /usr/lib/x86_64-linux-gnu/liblz4.so.1 (0x00007fe5caacc000)
libgcrypt.so.20 => /lib/x86_64-linux-gnu/libgcrypt.so.20 (0x00007fe5ca7b0000)
libgpg-error.so.0 => /lib/x86_64-linux-gnu/libgpg-error.so.0 (0x00007fe5ca59b000)
There we go! Now that’s actually small enough to be interesting,
especially contrasted with those other programs. There’s some system
stuff (why is libsystemd in there? For DBus stuff?), some compression
libs and other miscellaneous stuff, GPG for some damn reason, and libvlc
and libvlccore.
Now this should be an interesting contrast. Let’s try building it and see what happens.
Okay, first off, it has an actual source code release, with a tarball to download, not just a Github page saying “git clone from the master branch”. Take a few moments and think about what that actually implies. In 2010 this was the norm, back when Sourceforge didn’t quite completely suck yet, and now in 2020 it’s uncommon enough to find as part of my normal dev workflow that it deserves mention. Think about how much smaller that makes the divide between the upstream developer making a thing and the developer using the thing as part of another thing. Second, it has build instructions (unlike RViz) and they start with the disclaimer “This guide is intended for developers and power users. Compiling VLC is not an easy task.”
I’m building on Ubuntu, so following the instructions I start off
with
sudo apt-get install git build-essential pkg-config libtool automake autopoint gettext
That’s basically just tooling: gettext is the GNU i18n tools and
libraries, and autopoint, which I’ve never seen before, is some
peripheral tool for working with it. Then there’s a TON of 3rd-party
plugins and stuff, but I don’t have all day so let’s just use the stock
Ubuntu install’s build deps. Turns out there’s a cool program named
debfoster that lists all the transitive dependencies that a package
requires:
$ debfoster -d vlc
Package vlc depends on:
adduser adwaita-icon-theme apt apt-utils at-spi2-core bzip2 ca-certificates coreutils cpp cpp-7
dbus dconf-gsettings-backend dconf-service debconf debconf-i18n dpkg fdisk file fontconfig
fontconfig-config fonts-dejavu-core fonts-freefont-ttf gcc-7-base gcc-8-base glib-networking
glib-networking-common glib-networking-services gpgv gsettings-desktop-schemas
gtk-update-icon-cache hicolor-icon-theme humanity-icon-theme i965-va-driver init-system-helpers
krb5-locales liba52-0.7.4 libaa1 libaacs0 libacl1 libapparmor1 libapt-inst2.0 libapt-pkg5.0
libarchive13 libaribb24-0 libasn1-8-heimdal libasound2 libasound2-data libass9 libasyncns0
libatk-bridge2.0-0 libatk1.0-0 libatk1.0-data libatomic1 libatspi2.0-0 libattr1 libaudit-common
libaudit1 libauthen-sasl-perl libavahi-client3 libavahi-common-data libavahi-common3 libavc1394-0
libavcodec57 libavformat57 libavutil55 libbasicusageenvironment1 libbdplus0 libblkid1 libbluray2
libbsd0 libbz2-1.0 libc6 libcaca0 libcairo-gobject2 libcairo2 libcap-ng0 libcap2 libcddb2
libchromaprint1 libcolord2 libcom-err2 libcomerr2 libcroco3 libcrystalhd3 libcups2
libdata-dump-perl libdatrie1 libdb5.3 libdbus-1-3 libdc1394-22 libdca0 libdconf1
libdouble-conversion1 libdrm-amdgpu1 libdrm-common libdrm-intel1 libdrm-nouveau2 libdrm-radeon1
libdrm2 libdvbpsi10 libdvdnav4 libdvdread4 libebml4v5 libedit2 libegl-mesa0 libegl1 libelf1
libencode-locale-perl libepoxy0 libevdev2 libexpat1 libfaad2 libfdisk1 libffi6 libfile-basedir-perl
libfile-desktopentry-perl libfile-listing-perl libfile-mimeinfo-perl libflac8 libfont-afm-perl
libfontconfig1 libfontenc1 libfreetype6 libfribidi0 libgbm1 libgcc1 libgcrypt20 libgdbm-compat4
libgdbm5 libgdk-pixbuf2.0-0 libgdk-pixbuf2.0-common libgl1 libgl1-mesa-dri libgl1-mesa-glx
libglapi-mesa libgles2 libglib2.0-0 libglib2.0-data libglvnd0 libglx-mesa0 libglx0 libgme0 libgmp10
libgnutls30 libgomp1 libgpg-error0 libgpm2 libgraphite2-3 libgroupsock8 libgsm1 libgssapi-krb5-2
libgssapi3-heimdal libgtk-3-0 libgtk-3-bin libgtk-3-common libgudev-1.0-0 libharfbuzz0b
libhcrypto4-heimdal libheimbase1-heimdal libheimntlm0-heimdal libhogweed4 libhtml-form-perl
libhtml-format-perl libhtml-parser-perl libhtml-tagset-perl libhtml-tree-perl libhttp-cookies-perl
libhttp-daemon-perl libhttp-date-perl libhttp-message-perl libhttp-negotiate-perl
libhx509-5-heimdal libice6 libicu60 libidn11 libidn2-0 libinput-bin libinput10 libio-html-perl
libio-socket-ssl-perl libipc-system-simple-perl libisl19 libjansson4 libjbig0 libjpeg-turbo8
libjpeg8 libjson-glib-1.0-0 libjson-glib-1.0-common libk5crypto3 libkate1 libkeyutils1 libkmod2
libkrb5-26-heimdal libkrb5-3 libkrb5support0 liblcms2-2 libldap-2.4-2 libldap-common libldb1
liblirc-client0 liblivemedia62 libllvm9 liblocale-gettext-perl liblua5.2-0 liblwp-mediatypes-perl
liblwp-protocol-https-perl liblz4-1 liblzma5 liblzo2-2 libmad0 libmagic-mgc libmagic1
libmailtools-perl libmatroska6v5 libmicrodns0 libmount1 libmp3lame0 libmpc3 libmpcdec6 libmpeg2-4
libmpfr6 libmpg123-0 libmtdev1 libmtp-common libmtp-runtime libmtp9 libncurses5 libncursesw5
libnet-dbus-perl libnet-http-perl libnet-libidn-perl libnet-smtp-ssl-perl libnet-ssleay-perl
libnettle6 libnfs11 libnotify4 libnuma1 libogg0 libopenjp2-7 libopenmpt-modplug1 libopenmpt0
libopus0 libp11-kit0 libpam-modules libpam-modules-bin libpam0g libpango-1.0-0 libpangocairo-1.0-0
libpangoft2-1.0-0 libpciaccess0 libpcre3 libperl5.26 libpixman-1-0 libplacebo4 libpng16-16 libpopt0
libpostproc54 libprocps6 libprotobuf-lite10 libproxy-tools libproxy1v5 libpulse0 libpython2.7
libpython2.7-minimal libpython2.7-stdlib libqt5core5a libqt5dbus5 libqt5gui5 libqt5network5
libqt5svg5 libqt5widgets5 libqt5x11extras5 libraw1394-11 libreadline7 libresid-builder0c2a
librest-0.7-0 libroken18-heimdal librsvg2-2 librsvg2-common libsamplerate0 libsasl2-2
libsasl2-modules libsasl2-modules-db libsdl-image1.2 libsdl1.2debian libseccomp2 libsecret-1-0
libsecret-common libselinux1 libsemanage-common libsemanage1 libsensors4 libsepol1 libshine3
libshout3 libsidplay2 libslang2 libsm6 libsmartcols1 libsmbclient libsnappy1v5 libsndfile1
libsndio6.1 libsoup-gnome2.4-1 libsoup2.4-1 libsoxr0 libspeex1 libspeexdsp1 libsqlite3-0
libssh-gcrypt-4 libssh2-1 libssl1.1 libstdc++6 libswresample2 libswscale4 libsystemd0 libtag1v5
libtag1v5-vanilla libtalloc2 libtasn1-6 libtdb1 libtevent0 libtext-charwidth-perl
libtext-iconv-perl libtext-wrapi18n-perl libthai-data libthai0 libtheora0 libtie-ixhash-perl
libtiff5 libtimedate-perl libtinfo5 libtry-tiny-perl libtwolame0 libudev1 libunistring2 libupnp6
liburi-perl libusageenvironment3 libusb-1.0-0 libuuid1 libva-drm2 libva-wayland2 libva-x11-2 libva2
libvdpau1 libvlc-bin libvlc5 libvlccore9 libvorbis0a libvorbisenc2 libvorbisfile3 libvpx5
libvulkan1 libwacom-bin libwacom-common libwacom2 libwavpack1 libwayland-client0 libwayland-cursor0
libwayland-egl1 libwayland-egl1-mesa libwayland-server0 libwbclient0 libwebp6 libwebpmux3
libwind0-heimdal libwrap0 libwww-perl libwww-robotrules-perl libx11-6 libx11-data
libx11-protocol-perl libx11-xcb1 libx264-152 libx265-146 libxau6 libxaw7 libxcb-dri2-0
libxcb-dri3-0 libxcb-glx0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-present0 libxcb-randr0
libxcb-render-util0 libxcb-render0 libxcb-shape0 libxcb-shm0 libxcb-sync1 libxcb-util1
libxcb-xfixes0 libxcb-xinerama0 libxcb-xkb1 libxcb-xv0 libxcb1 libxcomposite1 libxcursor1
libxdamage1 libxdmcp6 libxext6 libxfixes3 libxft2 libxi6 libxinerama1 libxkbcommon-x11-0
libxkbcommon0 libxml-parser-perl libxml-twig-perl libxml-xpathengine-perl libxml2 libxmu6 libxmuu1
libxpm4 libxrandr2 libxrender1 libxshmfence1 libxt6 libxtst6 libxv1 libxvidcore4 libxxf86dga1
libxxf86vm1 libzstd1 libzvbi-common libzvbi0 lsb-base mesa-va-drivers mesa-vdpau-drivers
mime-support multiarch-support netbase openssl passwd perl perl-base perl-modules-5.26
perl-openssl-defaults procps psmisc python-talloc qttranslations5-l10n readline-common samba-libs
sensible-utils shared-mime-info tar ubuntu-keyring ubuntu-mono ucf udev util-linux uuid-runtime
va-driver-all vdpau-driver-all vlc-bin vlc-data vlc-l10n vlc-plugin-base vlc-plugin-notify
vlc-plugin-qt vlc-plugin-samba vlc-plugin-skins2 vlc-plugin-video-output vlc-plugin-video-splitter
vlc-plugin-visualization x11-common x11-utils x11-xserver-utils xdg-user-dirs xdg-utils xkb-data
xz-utils zlib1g
…welp, in for a penny I guess. Just run apt-get build-dep vlc, and let’s gooooooo!
…Well, I have to admit, half a million lines of C compiles a lot
faster than half a million lines of Rust generally does. It took about 2
minutes on one core, and produced a binary and libvlc.so that are
together under a meg in size. Where’s the rest of it? Oh, it also spat
out another 110 MB of dynamically-loaded libraries. So, the reason the
ldd output is small is now clear: VLC includes a module loading system,
and all the actual media code lives in modules that are loaded on demand
instead of by the system loader.
Also, contrary to the official build docs, building VLC is in fact pretty easy… if you’re using Ubuntu and the libs it packages.
Yeah, but those are all big complex GUI programs!
Obviously, if we want a look at how many dependencies are used by
Real Nontrivial C Programs, looking at media-rich, network-touching GUI
programs is going to get us a pretty biased view. Okay, what about
non-GUI stuff? That’s bound to be simpler, right? Let’s take something
command-line only, definitely nontrivial but not too big, preferably not
something interactive or media-heavy. lighttpd maybe?
$ debfoster -d lighttpd
Package lighttpd depends on:
adduser apt apt-utils bzip2 ca-certificates debconf debconf-i18n dpkg file gcc-8-base gpgv libacl1
libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
libdb5.3 libfam0 libffi6 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libmagic-mgc libmagic1 libnettle6 libp11-kit0
libpam-modules libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common
libsemanage1 libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl
libtext-iconv-perl libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 lsb-base mime-support
openssl passwd perl-base spawn-fcgi tar ubuntu-keyring xz-utils zlib1g
That’s not small, but it’s far smaller than anything else we’ve
looked at so far. In fact, it seems pretty reasonable to my eye. There’s
the usual industry-standard libraries that you would expect, like pcre3
and libbz2, some domain-specific stuff like libtasn1 and libmagic, and
some miscellaneous other bits and pieces. There’s still some weirdness
though; what does a web server need with libgmp10 or libseccomp2,
anyway? Maybe those are just needed by something like ubuntu-keyring,
which appears to basically just be administrative? But dash, below,
doesn’t need ubuntu-keyring…
For the sake of SCIENCE, let’s take a look at the ldd output for
lighttpd as well, to get a sense of how well our two proxies, “apt
dependencies” and “dynamically linked libraries”, line up with each
other:
$ ldd $(which lighttpd)
linux-vdso.so.1 (0x00007ffd443fe000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f756969e000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f756949a000)
libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007f7569295000)
libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f7569008000)
libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f7568b3d000)
libfam.so.0 => /usr/lib/x86_64-linux-gnu/libfam.so.0 (0x00007f7568934000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7568543000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7568324000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7569b4f000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7567f9b000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7567d83000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f75679e5000)
It looks like, similar to vlc, lighttpd manages a lot of modules at
runtime instead of specifying them all at compile time. So debfoster
would be generally a better guess at a program’s full dependencies,
though it may end up saying a program needs more than it does – I rather
doubt that you absolutely require debconf to build lighttpd, but you
need it to build the Ubuntu package as distributed.
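One rough way to line the two proxies up is to guess a package name from each soname and see whether debfoster listed it. The sketch below assumes the common Debian habit of naming a library package after the library plus its version (with a hyphen if the name ends in a digit); guess_package is a hypothetical helper built on that assumption, not a real tool, and the rule is known to be leaky.

```python
# Sonames from the lighttpd ldd output above (vdso and loader left out).
SONAMES = """
libpcre.so.3 libdl.so.2 libattr.so.1 libssl.so.1.1 libcrypto.so.1.1
libfam.so.0 libc.so.6 libpthread.so.0 libstdc++.so.6 libgcc_s.so.1 libm.so.6
""".split()

# Package names from the debfoster output for lighttpd above.
PACKAGES = set("""
adduser apt apt-utils bzip2 ca-certificates debconf debconf-i18n dpkg file
gcc-8-base gpgv libacl1 libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common
libaudit1 libbz2-1.0 libc6 libcap-ng0 libdb5.3 libfam0 libffi6 libgcc1
libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4 libidn2-0
liblocale-gettext-perl liblz4-1 liblzma5 libmagic-mgc libmagic1 libnettle6
libp11-kit0 libpam-modules libpam-modules-bin libpam0g libpcre3 libseccomp2
libselinux1 libsemanage-common libsemanage1 libsepol1 libssl1.1 libstdc++6
libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl
libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 lsb-base mime-support
openssl passwd perl-base spawn-fcgi tar ubuntu-keyring xz-utils zlib1g
""".split())

def guess_package(soname):
    """Naive guess (an assumption): library name plus its version."""
    name, _, version = soname.partition(".so.")
    # A library name ending in a digit gets a hyphen before the version.
    separator = "-" if name[-1].isdigit() else ""
    return (name + separator + version).lower().replace("_", "-")

matched = sorted(s for s in SONAMES if guess_package(s) in PACKAGES)
unmatched = sorted(s for s in SONAMES if guess_package(s) not in PACKAGES)

print(len(PACKAGES), "packages,", len(SONAMES), "sonames")
print("guess found a package for:", matched)
print("no package by that name:", unmatched)
```

Six of the eleven sonames map onto a listed package this way. The other five have no package of their own name, which fits the earlier caveat that one .so file doesn’t necessarily mean one dependency; presumably they ship inside some other package.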
Now I want to see how low we can go. Let’s take a look at a small,
simple command-line tool that does one thing well… heck, why not
debfoster itself?
$ debfoster -d debfoster
Package debfoster depends on:
adduser apt apt-utils ca-certificates debconf debconf-i18n dpkg gcc-8-base gpgv libacl1
libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1 libbz2-1.0 libc6 libcap-ng0
libdb5.3 libffi6 libgc1c2 libgcc1 libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4
libidn2-0 liblocale-gettext-perl liblz4-1 liblzma5 libnettle6 libp11-kit0 libpam-modules
libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1 libsemanage-common libsemanage1
libsepol1 libssl1.1 libstdc++6 libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl
libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 openssl passwd perl-base tar ubuntu-keyring
zlib1g
Now that’s weird, that’s almost as many dependencies as lighttpd. Let’s
see… I’ll give libc and libgcc and such a pass for being system
libraries, but apart from those there’s 40 libraries by my count. Some,
like libsystemd0 and libselinux1 are probably everywhere whether we like
it or not, but others like libhogweed4 are a little more exotic (and
bewildering). This is a program that is not even 2500 significant lines
of C code, by tokei’s count. Looking at its configure.in and trying to
decipher as much as I can, all it actually tries to use directly is
libgettext, libavl, and libgc. Weird.
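Since the two lists look so alike, it’s worth diffing them rather than squinting. A small sketch using plain Python sets on the package names copied from the two debfoster listings above; the SYSTEM set is just the three packages given a pass in the text, spelled as they appear in the list.

```python
LIGHTTPD = set("""
adduser apt apt-utils bzip2 ca-certificates debconf debconf-i18n dpkg file
gcc-8-base gpgv libacl1 libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common
libaudit1 libbz2-1.0 libc6 libcap-ng0 libdb5.3 libfam0 libffi6 libgcc1
libgcrypt20 libgmp10 libgnutls30 libgpg-error0 libhogweed4 libidn2-0
liblocale-gettext-perl liblz4-1 liblzma5 libmagic-mgc libmagic1 libnettle6
libp11-kit0 libpam-modules libpam-modules-bin libpam0g libpcre3 libseccomp2
libselinux1 libsemanage-common libsemanage1 libsepol1 libssl1.1 libstdc++6
libsystemd0 libtasn1-6 libtext-charwidth-perl libtext-iconv-perl
libtext-wrapi18n-perl libudev1 libunistring2 libzstd1 lsb-base mime-support
openssl passwd perl-base spawn-fcgi tar ubuntu-keyring xz-utils zlib1g
""".split())

DEBFOSTER = set("""
adduser apt apt-utils ca-certificates debconf debconf-i18n dpkg gcc-8-base
gpgv libacl1 libapt-inst2.0 libapt-pkg5.0 libattr1 libaudit-common libaudit1
libbz2-1.0 libc6 libcap-ng0 libdb5.3 libffi6 libgc1c2 libgcc1 libgcrypt20
libgmp10 libgnutls30 libgpg-error0 libhogweed4 libidn2-0
liblocale-gettext-perl liblz4-1 liblzma5 libnettle6 libp11-kit0 libpam-modules
libpam-modules-bin libpam0g libpcre3 libseccomp2 libselinux1
libsemanage-common libsemanage1 libsepol1 libssl1.1 libstdc++6 libsystemd0
libtasn1-6 libtext-charwidth-perl libtext-iconv-perl libtext-wrapi18n-perl
libudev1 libunistring2 libzstd1 openssl passwd perl-base tar ubuntu-keyring
zlib1g
""".split())

shared = LIGHTTPD & DEBFOSTER
only_lighttpd = sorted(LIGHTTPD - DEBFOSTER)
only_debfoster = sorted(DEBFOSTER - LIGHTTPD)

# The three "system" packages given a pass in the text above.
SYSTEM = {"libc6", "libgcc1", "libstdc++6"}
libs = [p for p in sorted(DEBFOSTER) if p.startswith("lib") and p not in SYSTEM]

print(len(LIGHTTPD), "vs", len(DEBFOSTER), "packages,", len(shared), "shared")
print("only lighttpd:", only_lighttpd)
print("only debfoster:", only_debfoster)
print(len(libs), "non-system lib* packages for debfoster")
```

That gives 66 packages against 58 with 57 in common, and libgc1c2 as the only one unique to debfoster. The lib* count comes out at 40, agreeing with the hand count. The heavy overlap suggests, without proving it, that most of both lists is a shared base pulled in by something common to the two rather than by either program’s own code.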
“Yeah but that’s GNU software” I hear you say, “GNU tools are
bloat-y”. All right, let’s look at something that’s designed to be
minimal, the dash shell. 13k SLOC, all written in plain C, no frills.
$ debfoster -d dash
Package dash depends on:
debianutils dpkg gcc-8-base libacl1 libattr1 libbz2-1.0 libc6 libgcc1 liblzma5 libpcre3 libselinux1
libzstd1 tar zlib1g
Okay, that’s actually minimal. System stuff, some compression stuff, a little shell-y stuff, regexes, that’s it. Let’s build it.
$ ./autogen.sh; ./configure; time make
...
real 0m0.734s
user 0m0.628s
sys 0m0.118s
FINALLY, reasonable software! Builds fast, bare minimum of deps, nothin’ but what’s absolutely necessary. And it only took… how much work to find? Awesome, let’s try it out!
fish ~ > ~/tmp/dash-0.5.10.2/src/dash
$ cd cd ~/tmp/dash-0.5.10.2█
./src/dash: 1: cd: can't cd to cd
shit a typo, hang on
$ ^[[A^[[A^[[A█
wait, arrow keys don't work
$ cd ~/tmp/dash-0. █
oh yeah no tab-complete
$ dc ~/tmp/dash-0.5.10.2^A^A█
crap no line-editing at all? uh...
$ ^D
dash is designed to do one single thing with no frills: run shell
scripts. If you’re not using it to run shell scripts, it’s not very
useful.
There’s an interesting observation here: The biggest programs so far are the ones designed for humans to interact with. Turns out humans are complicated and making computers do things they want is hard. There’s the argument that humans need to learn computers better, with simpler and more composable primitives, and it will result in great benefits in smaller and more powerful programs. Then there’s the counter-argument that if a program is written once and used many, many times, the extra time a programmer puts into making the learning curve shallower will rapidly be made up for by the time saved by the users. Both these arguments are valid. “Design” is the art of striking a balance between them, and different people will want different balances.
Gotta go deeper
While we’re here, let’s look a little deeper into some of these programs. Part of my theory is that C programs commonly omit deps by re-implementing the bits they need themselves, because for small stuff it’s far easier to just write or copy-paste your own code than to actually use a library. Let’s rummage through some of these programs, briefly, and see if that’s the case.
dash: Well, there’s a handful of linked lists and a simple memory allocator, but no gratuitous re-implementations of memcpy or anything like that.
lighttpd: Aha, here we are. There’s a SHA1 impl, a base64 impl, a resizeable array, a URL parser, a CRC32 implementation, a layer over the poll(2) equivalent on various platforms, a LALR parser generator called LEMON (specifically designed to be vendor’ed into other code bases), an MD5 impl, a RNG that tries to get secure randomness from several different sources depending on the system, a safe_memclear() that hopefully hasn’t been broken by compiler changes since it was written, a splay tree, and another, different, resizeable array. Or, in terms of Rust crates: sha1, base64, std::vec::Vec, url, crc, mio, lalrpop or something, md5, rand, actually I can’t find a simple safe way to zero memory in Rust, std::collections::BTreeSet, and again std::vec::Vec. These add up to about 8000 significant lines of code, roughly 15% of lighttpd.
vlc: Why not, I ain’t scared! This won’t be a complete list, but a skim finds a command line parser, some chunks of libc that are apparently commonly missing on some platforms (mostly Unicode stuff), what looks like a threadpool, a block memory allocator, a thread-safe FIFO, a subsystem for recognizing file magic numbers, a HTTP cookie library, an MD5 implementation, a small MIME type guesser, a bunch of time parsing stuff, a thread-safe wrapper around the POSIX drand48(3) function, a pile of PGP key stuff, an XML parser, a bunch of string functions, a base64 decoder… You get the idea.
So, yeah. An argument I’ve heard against the Go/Rust paradigm of
statically linking everything is “you end up with bunches of copies of
the same library code compiled into each different program!” All I can
say to that is… lol. That said, with these vendored utility libraries
there’s a strong incentive to keep them small, simple and task-specific,
which is good. On the flip side, I wonder how many times
vlc’s XML parser has been fuzzed?
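To give a sense of scale for the kind of utility that ends up vendored, here is a minimal sketch of a padded base64 encoder in Rust. It is not taken from lighttpd or vlc; the name base64_encode is just a placeholder for illustration, and it only covers the standard RFC 4648 alphabet. The point is that the whole thing fits in about twenty lines, which is exactly why copy-pasting it is so tempting.

```rust
/// Standard base64 alphabet (RFC 4648, with '+' and '/').
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode `input` as padded base64 text.
/// Hypothetical helper for illustration only, not code from any project above.
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32,
        // treating missing trailing bytes as zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit groups; groups that cover no input bytes become '='.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 {
            ALPHABET[(n >> 6) as usize & 63] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[n as usize & 63] as char
        } else {
            '='
        });
    }
    out
}
```

Calling base64_encode(b"Man") yields "TWFu". Of course, the short version has no decoder, no URL-safe alphabet, and nobody else testing it, which is the other half of the tradeoff.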
Conclusions
Okay, so what have we learned? Well, first off, my thesis of “it isn’t just Rust or JS that has this problem, you know”… I’m not going to call it conclusively demonstrated, but I’ve found some strong support and a couple decent counterpoints. There are potentially a lot of unexpected dependencies hiding in even a quite small C program. Linux package managers do hide the complexity from you by making it all just “part of what the computer does anyway”, and sometimes that involves a staggering amount of STUFF. A medium-sized Rust project can easily tip the scales at 200-300 crates, which is still rather more dependencies than anything I’ve looked at here, but that’s explained by the simple fact that using libraries in C is such a monumental pain in the ass that it’s not worth trying for anything unless it’s bigger than… well, a base64 parser or a hash function.
The real thing with tools like go, cargo and npm
is they move that library management out of the
distro’s domain and into the programmer’s. If I’m writing software in C,
I’m almost certainly just going to be using the system packages that
Ubuntu or Debian provide for all the deps I need, so really it’s a
question of “does compiling this program need a lot of deps” or “does
compiling this program on a recent Debian system need a
lot of deps”. I consider myself a fairly typical programmer, because I’m
lazy and if I need to do something that isn’t directly associated with
the actual problem I’m trying to solve then I don’t want to spend a lot
of time on it. If a Debian package doesn’t exist for a C lib I want, or
it’s not up to date enough, I have a 90% chance of saying “well it’s not
that important anyway”, an 8% chance of vendoring it into my own code,
and a 2% chance of saying “Well I guess I can try to make a real apt
package for it”. Debian is big enough that most things you would want
are pretty up to date and are in there somewhere. Then, if
someone wants to run my C program on Red Hat, I’ll say “good luck, I
tried to make it portable but no promises”. Maybe it works and maybe it
doesn’t but either way it’s their problem, not mine, unless I’m being
paid for it to be my problem. And if a computer doesn’t run Debian, Red
Hat, Windows, or a close derivative of one of those things, then I’m
just not going to write complex C software for it. The amount of work it
will take to get anything done will far outweigh the novelty. Especially
when I can use a different language instead, one that’s more capable and
takes less work to deal with.
For an example, just look at my reaction to the programs I tried
building. RViz: “Heck, I’m not spending 30 minutes just figuring out how
to compile it, I’ll pass”. VLC: “Oh, there’s an apt command
to just install everything it might need in one go, let’s just do that
instead of spending literal hours getting the source needed for each
plugin”. By doing that, building a very complex C program becomes almost
as easy as using cargo.
go-style package management tools are designed to
build programs that work. Not programs that work on Debian if
you have the right libs installed, or on Windows 7 or newer, or
whatever, but “if you compile it on an OS, it works on that OS”. Rust
doesn’t have compiler targets for Debian, or Red Hat, or Arch Linux: it
has x86_64-unknown-linux-gnu. If you actually want
reproducible software which can use libraries that other people write,
and which runs on any platform it builds on, then you
have only a few realistic options:
- One, you can control all the software on the computer that your software might need to talk with. Make sure you have all the right libs installed, in the right places, with the right versions, and your program will work fine. This is what Debian and Red Hat do, and it is what Windows does from a different direction. When you don’t have mechanisms for handling this, DLL Hell starts and you can’t move binaries from one computer to another. It is also why installing packages from source all the time on your dev machine is a great way to end up with programs with Funky Behavior that break whenever you touch stuff, and why devops is now a profession: to make a complex program work you have to be able to control the system outside the program itself. Being able to do this without needing permission to manage the whole computer is also exactly what Docker and other container solutions do. They do other things too, but that automated install and configuration of complex software, without needing root on the target system and without the programs being able to accidentally step on each other, is their killer feature.
- Two, you can control all the dependencies of your software as part of your build process. This is what go-style package managers do. This is also one reason why go, npm and cargo all statically compile stuff, and recompile everything from scratch at every opportunity: it removes a variable from the process, which isn’t strictly necessary but which simplifies things a lot. This is also exactly what systems like Nix and Guix do; they just take the “manage all dependencies for all programs independently and from source” portion and apply it to the whole OS environment.
- Or, I suppose, number three, write as much as possible yourself, vendor everything else, and you have 100% control of all dependencies and can do whatever you want. The cost of this is duplicating a lot of code, and making it hard to update any 3rd party code you do use. It also means that you are in charge of testing, verifying and maintaining all this code, which is no small task. Sometimes this is feasible, but it still takes a lot of work that is often tangential to the problem you actually want to solve. Building large, complex systems this way often takes a kinda special type of person.
(Note that containers are another rant: what we actually want for robust, reliable infrastructure is an environment and API that deals with all the above stuff one way or another, and provides explicit, deny-by-default control of everything programs are allowed to do, a la a capability system. Linux is not this, but people keep trying to turn Linux into that with tools like Docker. I think that if an API like WASI or such ever gets popular, that has these properties designed in from the start, life will become much better. Making such a system that is actually nice for humans to use interactively is still a problem, but a different problem.)
If you take a complicated Go or Rust program that doesn’t depend on
OS-level details and statically compile it with the same compiler on a
Debian system and a Red Hat system, you will get the same
program to within some Sufficiently Small delta. There are always caveats
of course, software is complicated, but let’s pretend it’s not actively
trying to be complicated. You will get a binary that will not
freak out because it wanted libPocoFoundation.so.50 but the
latest version it can find is libPocoFoundation.so.48, or
because libPocoFoundation.so.50 is in /usr/lib
instead of /usr/local/lib. It will not suddenly
crash because a system’s copy of libjpeg.so.8 was compiled
with a different set of features than it expected. That sort of
nonsense is the problem these systems are trying to solve, and the three
solutions above are the ones we’ve discovered so far. None is perfect;
with all of them you’re making a tradeoff between things they handle
well and things they don’t.
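For a rough picture of why a dynamically linked binary can “freak out” like this, here is a toy model of shared-library lookup in Rust. It is a deliberate simplification and an assumption on my part, not the real loader algorithm: the actual dynamic linker also consults things like rpath entries, LD_LIBRARY_PATH and a cache. The names Installed and resolve are made up for the sketch. The only behaviour it captures is that the requested name must match exactly and must sit in a directory that actually gets searched.

```rust
/// One shared library as it sits on disk: its directory and its soname.
/// Hypothetical type for illustration only.
pub struct Installed<'a> {
    pub dir: &'a str,
    pub soname: &'a str,
}

/// Toy lookup: walk the search directories in order and return the path of
/// the first library whose soname matches `needed` exactly.
/// A "close enough" version number never matches.
pub fn resolve(
    needed: &str,
    search_dirs: &[&str],
    installed: &[Installed<'_>],
) -> Option<String> {
    search_dirs.iter().find_map(|dir| {
        installed
            .iter()
            .find(|lib| lib.dir == *dir && lib.soname == needed)
            .map(|lib| format!("{}/{}", lib.dir, lib.soname))
    })
}
```

With only libPocoFoundation.so.48 installed, asking for libPocoFoundation.so.50 returns None, and so does asking for a library that exists but lives in a directory missing from search_dirs. A statically linked binary never performs this lookup at all, which is the variable that gets removed.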
There are undoubtedly other useful ones to be made though, with different sets of tradeoffs. So, what other solutions can we invent?