November 26, 2013

Third party libraries in Chromium sources

One of the most common obstacles to including Chromium in Linux distro repositories is bundled libraries. The last attempt to blog about this that I know of is Evan Martin's "Forking upstream software" post. I decided to take another look.

It is important to know that even if something appears in a third_party directory in the Chromium codebase, it is not necessarily a bundled library. Third party code - yes, but not necessarily a bundled library. What's the difference? Well, even Fedora in its excellent "No Bundled Libraries" article lists e.g. copylibs as a possible exception. And what about code that was never intended to be used as a shared library and is part of a larger codebase, but is still useful on its own? This will come up in some examples below.

Here is my list of third_party code still present in Gentoo's Chromium packages as of version 33.0.1711.3 (dev channel). Libraries that have been successfully unbundled are not included, and neither is code that is not used on Linux. This reflects the intended audience - mostly Linux users, some of them packagers. A star (*) means that the code was already there in 2009.
  1. base/third_party/dmg_fp (*) - David M. Gay's floating point routines (dtoa, g_fmt). I don't think there is a shared library for these. There is also crbug.com/95729 about using V8's routines.
  2. base/third_party/dynamic_annotations - a single .c file with a corresponding header containing annotations for dynamic tools like valgrind and tsan. Doesn't seem worth extracting, since that would likely add unwanted dependencies on these tools.
  3. base/third_party/icu (*) - this could be extracted to use system icu; supporting nacl may be a challenge there (base is most likely also compiled by the nacl toolchain, and it's not obvious to me how shared libraries would work there - if at all, or whether it would make sense).
  4. base/third_party/nspr (*) - it may be possible to remove it now that Gentoo dropped nacl support for other reasons (crbug.com/269560).
  5. base/third_party/symbolize - part of another Google project, google-glog. Technically it should be possible to extract, and glog has even made a release with a tarball.
  6. base/third_party/valgrind (*) - bundled to avoid depending on valgrind just for build... IMHO fine.
  7. base/third_party/xdg_mime (*) - looks like the code was not intended to be used as a library, but maybe the intention was to avoid forking a process. Probably worth a closer look.
  8. base/third_party/xdg_user_dirs (*) - see this comment in the source code:
    /*
      This file is not licenced under the GPL like the rest of the code.
      Its is under the MIT license, to encourage reuse by cut-and-paste.
    
      Copyright (c) 2007 Red Hat, inc  ...*/
    
    
    
  9. breakpad/src/third_party/curl - great candidate for unbundling (or just disabling breakpad for Chromium builds in a way that doesn't try to touch curl even when disabled).
  10. chrome/third_party/mozilla_security_manager - parts of Mozilla code; doesn't seem to be designed as a shared library; has local modifications.
  11. crypto/third_party/nss - selected files extracted from NSS; there are some modifications, but with enough effort it may be possible to unbundle.
  12. net/third_party/mozilla_security_manager - parts of Mozilla code, different from the chrome bits above.
  13. net/third_party/nss (*) - parts of Mozilla's NSS (libssl) with experimental patches. Note that an NSS developer works on this copy, so it can be seen as even more bleeding-edge than NSS trunk.
  14. third_party/WebKit (*) - now Blink, developed as part of the same Chromium project, but a fork of third party code. Not designed to be used as a shared library.
  15. third_party/angle_dx11 - developed by Google/Chromium developers; doesn't seem to be designed to be used as a shared library, but with enough effort it should be possible.
  16. third_party/cacheinvalidation - same as above.
  17. third_party/cld (*) - developed by Google/Chromium developers, probably to be replaced with cld2, which will hopefully be closer to a shared library design.
  18. third_party/cros_system_api - related to ChromeOS, not really bundled but rather just part of the project.
  19. third_party/ffmpeg (*) - Chrome uses very recent ffmpeg; I think the local modifications status has improved greatly since 2009: looks like patches make it upstream pretty quickly.
  20. third_party/flot - JS library, AFAIK there isn't really a concept of having system JS libraries. It could actually be useful to have one, but it's not obvious.
  21. third_party/hunspell (*) - modified to support running under sandbox and loading dictionaries in a different format; maintainers do respond but are very busy. This is doable but requires a fair amount of effort to figure out what to do with the different dictionary format.
  22. third_party/iccjpeg - taken out of lcms library, and the maintainers don't want to expose it.
  23. third_party/jstemplate (*) - Google's JS templating library.
  24. third_party/khronos - GL headers, unfortunately with local modifications.
  25. third_party/leveldatabase - needs a redesign to allow applying Chromium-specific behavior (env_chromium.cc) at run-time instead of at compile time. I've seen a Debian package for leveldb, looks like there is some interest in using it as a library.
  26. third_party/libjingle (*) - used to have a semi-inactive upstream, now seems to be becoming part of WebRTC. Worth another look when things stabilize more.
  27. third_party/libphonenumber - upstream seems to be more focused on the Java version, which actually has releases; the C++ version doesn't seem to be designed to be used as a shared library.
  28. third_party/libsrtp - used to be inactive but now has a new home at https://github.com/cisco/libsrtp and there are Googlers helping out with it. Worth taking another look when things stabilize. Note that even if it compiles it doesn't mean it works, see bug #459932.
  29. third_party/libusb - locally made incompatible change needs to be upstreamed (crbug.com/266149).
  30. third_party/libvpx - waiting for upstream release supporting vp9, see bug #487926.
  31. third_party/libwebp - waiting for upstream release supporting APIs Chromium depends on, see http://crbug.com/288019.
  32. third_party/libxml/chromium - this is ugly: code is actually part of Chromium codebase; at least it's not really bundled.
  33. third_party/libXNVCtrl - part of nvidia-settings. Not sure if it's intended to be used as a shared library, but it seems totally possible technically, and I even remember some success reports with it.
  34. third_party/libyuv - Google/Chromium project. Should be possible to use as a shared library, but doesn't seem to make releases.
  35. third_party/lss - Linux Syscall Support; a header based on Linux kernel headers.
  36. third_party/lzma_sdk (*) - lzma library from 7-zip.org; it would be great to replace it with xz-utils, which distros already package.
  37. third_party/mesa - I think only headers are used, but it's complicated.
  38. third_party/modp_b64 (*) - README.chromium points to https://code.google.com/p/stringencoders/. Doesn't seem to be designed to be used as a shared library, but it seems possible.
  39. third_party/mt19937ar - not designed as a shared library, rather small; can be removed after the move to C++11 (looks like <random> would support the needed functionality).
  40. third_party/npapi (*) - NPAPI headers with modifications.
  41. third_party/ots (*) - OpenType sanitizer, may be possible to package as a shared library, although it doesn't seem to have releases.
  42. third_party/polymer - JS library by Google, see polymer-project.org.
  43. third_party/pywebsocket (*) - Python WebSocket server used for testing. Should be possible to package it separately.
  44. third_party/qcms - color management library. Last upstream commits seem to be over a year ago, but the bundled copy continued to receive various updates, at least for more recent toolchain support.
  45. third_party/sfntly - font-related library; doesn't seem to have releases, doesn't seem to be designed to be a shared library.
  46. third_party/skia (*) - graphics library, changes very often.
  47. third_party/smhasher - hash function library - doesn't seem to have releases or be designed to be a shared library.
  48. third_party/sqlite (*) - available as a package; the biggest obstacle is the lack of a good API to use it in a multi-process sandboxed context and also to test it. See http://crbug.com/22208. That obstacle would disappear, however, once Chromium drops support for the abandoned webdatabase spec.
  49. third_party/tcmalloc (*) - although theoretically available separately, the Chromium copy is heavily modified, and that includes hardening changes important for security.
  50. third_party/tlslite (*) - Python crypto library, only used for testing but appears to be modified in a non-compatible way.
  51. third_party/trace-viewer - not obvious what it really is, and it contains several more bundled libraries inside.
  52. third_party/undoview - code extracted from gtksourceview.
  53. third_party/usrsctp - user-space SCTP implementation with local changes.
  54. third_party/webdriver - mostly some minified JS embedded in C++ code.
  55. third_party/webrtc - Real-Time Communications library - doesn't seem to have releases, and seems to be moving pretty fast.
  56. third_party/widevine - stubs for proprietary content distribution module.
  57. third_party/x86inc - asm code extracted from x264 with local modifications; I don't really see a good way to provide that as a system package.
  58. third_party/zlib/google - this is ugly: code is actually part of Chromium codebase; at least it's not really bundled.
  59. url/third_party/mozilla - parts of Mozilla code; doesn't seem to be designed as a shared library; has local modifications.
  60. v8 (*) - although the path doesn't contain third_party, I consider it bundled code. See "When the libraries you use are moving too fast" for the reasons it's there. While technically not part of Evan's 2009 list, it has obviously been there since the beginning.
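
To illustrate entry 39 (third_party/mt19937ar): C++11's <random> provides std::mt19937, a 32-bit Mersenne Twister engine. Below is a minimal sketch of what a replacement could look like, assuming callers only need the plain 32-bit generator seeded from a single integer. The helper name NthMersenneOutput is hypothetical and is not taken from Chromium; I have not checked how Chromium actually calls the bundled code.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper (not from Chromium): returns the n-th 32-bit output
// (1-based) of a std::mt19937 engine seeded with `seed`.
std::uint32_t NthMersenneOutput(std::uint32_t seed, unsigned n) {
  assert(n >= 1);            // outputs are counted starting from 1
  std::mt19937 engine(seed); // C++11 standard Mersenne Twister engine
  engine.discard(n - 1);     // skip the first n-1 outputs
  return static_cast<std::uint32_t>(engine());
}
```

The C++ standard requires that the 10000th output of a default-constructed std::mt19937 (seed 5489) is 4123659995, so NthMersenneOutput(5489u, 10000) is a handy sanity check when comparing the output against a bundled copy.
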
60 entries look like a lot. I would like that number to be smaller. On the other hand, note that many of these codebases were not designed to be used as shared libraries, and some were developed as part of the Chromium project. The project is also very careful to put code it borrows from outside in third_party directories, whereas it's not uncommon for open source projects in general to incorporate such code directly into their codebases. In Chromium it's just much more visible.

Also note that while 23 of the items above were already there in 2009, for some other entries from the 2009 list we're now using system libraries, at least in Gentoo. Just to give you a few examples of libraries now taken from the system (the list is not necessarily complete - a star means it's present on the 2009 list):
  1. flac
  2. harfbuzz (*)
  3. icu (*)
  4. jsoncpp
  5. libevent (*)
  6. libjpeg (*)
  7. libpng (*)
  8. libxml (*)
  9. libxslt (*)
  10. minizip
  11. nspr (*)
  12. openssl
  13. opus
  14. protobuf (*)
  15. re2
  16. snappy
  17. speex
  18. xdg-utils (*)
  19. yasm (*)
  20. zlib (*)
I'm interested in your opinions, so feel free to add your comment below. If you liked this post, you may also like "State of Chromium Open Source packages".
