Smaller Clojure Docker builds with multi-stage builds

A common pattern in Docker is to use a build environment that's separate from the runtime environment. Many platforms need different things to produce a runnable artifact than they do to run it.

In languages like Go, Rust or C, where the most common implementations produce native binaries, the resulting artifact may require nothing from the environment at all, or perhaps as little as a C standard library. Even in languages like Python that don't typically have a build step, you might indirectly use code that still requires compilation. Common examples include OpenSSL with pyca/cryptography or NETLIB and other numerical libraries with numpy/scipy.

In Clojure, you can easily build "uberjars" with both lein and boot. These are jars (the standard JVM deployable artifact) that come with all dependencies prepackaged, requiring nothing beyond what's in the Java standard library (rt.jar). You still need a JRE to run them, but a JRE is much smaller than a full development environment.
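
For example, with a stock lein project, building and running an uberjar looks roughly like this. The project name, version and paths below are placeholders, and the exact output location depends on your lein version and :target-path:

lein uberjar
# ...
# Created /path/to/myapp/target/uberjar/myapp-0.1.0-SNAPSHOT-standalone.jar
java -jar target/uberjar/myapp-0.1.0-SNAPSHOT-standalone.jar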

There are a few advantages to separating the environments, and they all boil down to not having anything in them you don't need. That has clear performance benefits, although Docker has historically mitigated the size problem with layered pulls. It has security benefits as well: you can't have bugs in software you don't ship. This applies even to software that plays no direct role in the build: build environments often contain plenty of packages that are never used at all, and without a separate runtime image they carry over into your production environments.

Historically, most Docker users haven't bothered. Even if there are advantages, they haven't been worth the hassle of maintaining separate Docker environments and ferrying data between them. While various ways of sharing data between containers have been available for years, people who wanted a separate build step have mostly had to write their own tooling. For example, my icecap project has a bash script with an embedded Dockerfile that builds libsodium debs.

A recent release of Docker added support for a new feature called multi-stage builds, which makes this pattern much simpler: Dockerfiles now know about your precursor environments, and later stages have full access to earlier stages for copying build artifacts around. This requires Docker 17.05 or newer.

Here's an example Dockerfile that builds an uberjar from a standard lein-based app, and puts it in a new JRE image:

# Build stage: full Clojure tooling (lein) for producing the uberjar
FROM clojure AS build-env
WORKDIR /usr/src/myapp
# Copy project.clj and fetch dependencies first, so they're cached in their own layer
COPY project.clj /usr/src/myapp/
RUN lein deps
COPY . /usr/src/myapp
# Build the uberjar and give it a fixed name, so the next stage can find it
RUN mv "$(lein uberjar | sed -n 's/^Created \(.*standalone\.jar\)/\1/p')" myapp-standalone.jar

# Runtime stage: only a JRE
FROM openjdk:8-jre-alpine
WORKDIR /myapp
COPY --from=build-env /usr/src/myapp/myapp-standalone.jar /myapp/myapp.jar
ENTRYPOINT ["java", "-jar", "/myapp/myapp.jar"]
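
Building and running the resulting image then works as usual; the myapp tag here is just an example name:

docker build -t myapp .
docker run --rm myapp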

This captures the uberjar path from the lein uberjar output. If your uberjar name doesn't end in standalone.jar, that won't work. You can change the name of the uberjar with the :uberjar-name setting in project.clj. If you set it to myapp-standalone.jar, you don't need the gnarly sed expression at all and can just call lein uberjar. (Thanks to Łukasz Korecki for the suggestion!)
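
For reference, a minimal project.clj with that setting might look like this; the project name, version and dependency versions are placeholders:

(defproject myapp "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]]
  :main ^:skip-aot myapp.core
  :uberjar-name "myapp-standalone.jar"
  :profiles {:uberjar {:aot :all}})

Note that lein still puts the jar under target/ (exactly where depends on your :target-path), so the COPY --from=build-env line in the second stage needs to point at that path instead of the manually renamed file.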

The full clojure base image is a whopping 629MB (according to docker images), whereas openjdk:8-jre-alpine clocks in at 81.4MB. That's a little bit of an unfair comparison: clojure also has an alpine-based variant. Still, it illustrates the savings compared to the most commonly used Clojure image.
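
If you want to check the sizes on your own machine (the exact numbers will drift as the images get updated), docker images can filter by repository and tag:

docker images clojure
docker images openjdk:8-jre-alpine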

There are still good reasons for not using multi-stage builds. In the icecap example above, the entire point is to use Docker as a build system to produce a deb artifact outside of Docker. However, that's a pretty exotic use case: for most people this will hopefully make smaller Docker images an easy reality.

Edited: The original blog post said that the Docker version to support this feature was in beta at time of writing. That was/is correct, but it's since been released, so I updated the post.

Edited: Łukasz Korecki pointed out that project.clj has an :uberjar-name parameter which can be used to avoid the gnarly sed expression. Thanks Łukasz!

2016 rMBP caveats

I bought the 2016 15" retina MacBook Pro as soon as it became available. I've had it for a week now, and there have been some issues you might want to be aware of if you'd like to get one.

(There are a bunch of links to Amazon in this article. They're not affiliate links.)

System Integrity Protection is often disabled

I noticed via Twitter that some people were reporting that System Integrity Protection (SIP) was disabled by default on their Macs. SIP is a mechanism via which macOS protects critical system files from being overwritten.

You can check whether SIP is enabled on your system by running csrutil status in a terminal. Sure enough, SIP was disabled on both my and my wife's new rMBPs. To enable SIP, boot into recovery mode (hold ⌘-R while booting), open a terminal, type csrutil enable and reboot.
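
For reference, the check and the fix look like this; the exact wording of the status line may differ slightly between macOS versions:

csrutil status
# System Integrity Protection status: disabled.

# ...and later, from a terminal in recovery mode:
csrutil enable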

Perhaps unrelatedly, different out-of-the-box rMBPs appear to ship with different builds of macOS Sierra 10.12.1.

Thunderbolt 2 dongle doesn't work with external screens

I have a Dell 27" 4k monitor (P2715Q). With my previous-generation rMBP, I connected it over a DisplayPort-to-mDP2 cable plugged into the laptop's Thunderbolt 2 port. When I bought my new laptop, Apple suggested I get a Thunderbolt 3 to Thunderbolt 2 dongle, and I expected it to behave like the Thunderbolt 2 port on my previous Mac. When I plugged it into my monitor, the monitor reported that a cable was plugged in, but that no signal was coming from the computer.

My understanding was that the Thunderbolt spec implies PCIe lanes and other protocols over the same port. Specifically, Thunderbolt 2 means 4 PCI Express 2.0 lanes with DisplayPort 1.2; at a cursory glance, Wikipedia agrees. (Thunderbolt 3 adds HDMI 2.0 and USB 3.1 gen 2.)

I spent about an hour and a half on the phone with AppleCare folks. The Apple support people were very friendly. (I'm guessing their instructions tell them to never, under any circumstances, interrupt a customer. It was a little weird.) I was redirected a few times. They had a variety of suggestions, including:

  • Changing my monitor to MST mode. This shouldn't be necessary for devices that support DisplayPort 1.2, and it did nothing except make my monitor stop working with my old rMBP as well. Fortunately I was able to recover by connecting my old laptop over HDMI.
  • Buying the Apple Digital AV Adapter instead. That adapter used HDMI instead of mDP2. That's a significant downgrade; my use of DisplayPort was intentional, because DisplayPort 1.2 is the only way I can power the 4K display at 60Hz. (The new adapter does not support HDMI 2.0, which is necessary for 4k@60Hz.)
  • Buying a third-party DisplayPort adapter or dock. This is precarious at best. Most existing devices don't work with the new rMBP, because they use a previous-generation TI chip. There are plenty of docks that won't work, by StarTech, Dell, Kensington and Plugable. I found one dock by CalDigit that will ostensibly work with the new rMBP, but it doesn't supply enough power to charge it.

Eventually, we found a KB article that spells out that the Thunderbolt dongle doesn't work for DisplayPort displays:

The Thunderbolt 3 (USB-C) to Thunderbolt 2 Adapter doesn't support connections to these devices:

  • Apple DisplayPort display
  • DisplayPort devices or accessories, such as Mini DisplayPort to HDMI or Mini DisplayPort to VGA adapters
  • 4K Mini DisplayPort displays that don’t have Thunderbolt

I'm a little vindicated by the Mac Store review page for the dongle; apparently I wasn't the only person to expect that. (I was unable to see the reviews before my purchase, because I bought the adapter together with my Mac, and that purchase flow doesn't show reviews. Also, the product was brand new at the time, and didn't have these reviews yet.)

Belkin and OWC will be shipping docks that allegedly work with the new rMBP, but Belkin's is currently unavailable with no ship date mentioned, and OWC claims February 2017.

WiFi failing with USB-C devices plugged in

Just as I was going to start writing this post, I noticed that I wasn't able to sync my blog repository from GitHub:

Get https://api.github.com/repos/lvh/lvh.github.io: dial tcp 192.30.253.116:443: connect: network is unreachable

It didn't click at first what was going on. I restarted my router, connected to different networks, tried a different machine -- all telling me it was this laptop that was misbehaving. I started trying everything, and realized I had recently plugged in my WD backup drive, from which I was copying over an SSH key. It's a USB 3.0 drive that I connect via an AUKEY USB 3 to USB-C converter. I removed the drive, and my WiFi started working again. Plugging it back in doesn't break WiFi instantly, but it eventually does.

After searching, I was able to find someone with the same problem. It is unclear to me if this issue is related to the first-gen TI chip issue mentioned above. In that video, the authors are also using a USB 3.0 to USB-C plug, albeit a different one from mine. I don't have a reference USB-C machine that isn't a new 2016 rMBP to test with. However, this seems plausible, because the USB 3.0 dongle I purchased from Apple ostensibly works fine.

This does not seem like a reasonable failure mode.

The escape key, and the new keyboard

I spend most of my day in Emacs. I'm perfectly happy with the new keyboard. I've also used the regular MacBook butterfly keyboard, and the new version is significantly better. I've never had a problem with not having an escape key; every app where I would've cared to press it had an escape key drawn on the new Touch Bar. However, not having tactile feedback for the escape key is annoying. When I was setting up my box and quickly editing a file in vim, I successfully pressed Escape to exit insert mode -- but I ended up pressing it five times because I thought I didn't hit it. Apparently the visual feedback vim gives me that I've exited insert mode is not, actually, what my brain relies on. I'll let you know if I get used to it.

Charging

I'll miss the safety of MagSafe, but being able to plug in your charger on either side is an unexpectedly nice benefit.

Conclusion

I was ready to accept a transition period of dongles; I bought into it, literally and figuratively. However, most of the dongles don't actually work, and that sucks. So, maybe wait for the refresh, or at least until the high-quality docks are available.