Reposted from: http://whitfin.io/speeding-up-rust-docker-builds/

This post will be the first of several addressing Docker image optimizations for different project types. It stems from my recent experiences with badly written Dockerfiles, which result in sitting around for 10 minutes every time you build, only to then need to upload images of over 1GB in size. These are extreme examples, but it's likely in the best interest of any developer to have an optimally written Dockerfile (if only because time is precious). What follows is mainly a collection of notes picked up for those working with Rust and Cargo builds. Other posts will follow for other languages as I learn the necessary practices.

Base Dockerfile

Let's look at a base Dockerfile for a typical Cargo project. I have opted to use a real project I work on as an example (although I can't go into too much detail). This project is relatively small in terms of code space, but does have several dependencies on projects such as futures-rs, tokio, etc.

A very simple Dockerfile (and I'm sure one we have all written) could look something like this:

# select image
FROM rust:1.23

# copy your source tree
COPY ./ ./

# build for release
RUN cargo build --release

# set the startup command to run your binary
CMD ["./target/release/my_project"]

It's minimal, but works just fine - you copy across your project and build it ready for release. Let's look at how a build like this fares (assuming you have the base image rust:1.23 already downloaded):

Cargo build time: 227.7s
Total time taken: 266.6s
Final image size: 1.66GB

That's roughly a 4.5 minute build, and a very large image (I was genuinely surprised by that number). But now we have our baseline! Let's see what we can do to improve it.

Optimizing Build Times

The first aspect we're going to look at is the amount of time taken to build an image (partly because otherwise I'll have to suffer 5 minute builds for the rest of this post).

It might surprise you to learn that given the base Dockerfile above, the Docker cache is almost totally useless. Every time you copy over your project directory, if anything in it has changed, the cache for that layer and every layer after it is invalidated. This means you'll have to sit through the whole build all over again. Disaster!

So the first (and most important) thing we need to do is avoid that cache invalidation. Part of the reason the build is so long is that Cargo is building all of your dependencies, as they're pulled in at the same time as your source is being compiled. Lucky for us, there's a neat little trick which can get your dependencies into the cache (and therefore speed up your builds).

In essence, you need to create a new Cargo project inside the image, with all of the same dependencies as you, and compile that before you move your source code across. An example could look something like this:

# select image
FROM rust:1.23

# create a new empty shell project
RUN USER=root cargo new --bin my_project
WORKDIR /my_project

# copy over your manifests
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml

# this build step will cache your dependencies
RUN cargo build --release
RUN rm src/*.rs

# copy your source tree
COPY ./src ./src

# build for release
RUN rm ./target/release/deps/my_project*
RUN cargo build --release

# set the startup command to run your binary
CMD ["./target/release/my_project"]

This image will build all dependencies before you introduce your source code, which means they'll be cached most of the time. Only when you change your actual dependencies will they need to be recompiled (if you change Cargo.toml or Cargo.lock). Make sure to note that we've also changed to copy a specific set of files to avoid accidentally invalidating the cache. Let's see how this fares:

Cargo build time: 191.6s + 30.6s (222.2s)
Total time taken: 262.3s
Final image size: 1.67GB

This looks almost the same, which is expected because there's nothing in the cache yet. On the first build you'll see almost no improvement; it's the next build where these changes really shine (to test this, make sure to change a file in your src tree):

Cargo build time: 0s + 30.9s (30.9s)
Total time taken: 33.6s
Final image size: 1.67GB

The dependencies are the same so your cache is hit, and bang, you're now building in roughly 13% of the original time. As far as I know (at this point), there's very little else we can do to gain a faster build time.

Update: in later versions of Cargo it's necessary to remove the build artifacts from target/release/deps. Cargo seems to skip rebuilding in some cases if this is not included. I'm not entirely sure when this was introduced, but it appears to have been occurring since around 1.28?

Optimizing Build Sizes

Even with our changes to speed up builds, the image sizes are still large. There are a couple of things we can do at this point, with the major one being to make use of a multi-stage Docker build. These builds allow you to "chain" builds to copy artifacts from one build to another, thus lowering the amount of churn in your final image.

It's super simple to change, too. Here's our Dockerfile from before, but with an extra stage added to hold only the build artifact (no Cargo caches, etc):

# select build image
FROM rust:1.23 as build

# create a new empty shell project
RUN USER=root cargo new --bin my_project
WORKDIR /my_project

# copy over your manifests
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml

# this build step will cache your dependencies
RUN cargo build --release
RUN rm src/*.rs

# copy your source tree
COPY ./src ./src

# build for release
RUN rm ./target/release/deps/my_project*
RUN cargo build --release

# our final base
FROM rust:1.23

# copy the build artifact from the build stage
COPY --from=build /my_project/target/release/my_project .

# set the startup command to run your binary
CMD ["./my_project"]

At this point, our image has been cut down from 1.67GB to 1.45GB, which is an improvement but still not what we're looking for. This is actually the base size of rust:1.23, which gives us a hint of where to look next. The tag we're using is pulling in the stretch version of the image. We can optimize here by moving to their jessie version, since we don't care about the extra stuff. Simply changing "rust:1.23" to "rust:1.23-jessie" in the Dockerfile above gets us from 1.45GB to 1.23GB, which puts us at roughly 75% of the size we started with.

For some projects (i.e. not Rust), this is as far as you can get - we're basically the same size as the official builds. However, we have one last trick up our sleeve, and it's related to the fact that these Rust projects compile to a single (executable) binary. Once we have this binary, we don't actually need any of the Rust toolchain, Cargo, etc. This means that we can just use a raw jessie image as the base image of the second stage in our Dockerfile, so let's substitute jessie-slim. The results are amazing:

REPOSITORY      TAG       IMAGE ID          CREATED              SIZE
my-project      dev       9bc6aeae4190      50 seconds ago       86.5MB
my-project      base      b881102368b6      37 minutes ago       1.66GB

That's not a typo, our image is functionally the same (for the purposes of our project), and yet it's only 87MB in size. Fantastic!
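
Putting the steps above together, the final Dockerfile could look something like the sketch below. Two things here are assumptions rather than something spelled out above: that "jessie-slim" refers to the debian:jessie-slim tag, and that the build stage uses rust:1.23-jessie so the binary is built against the same base system it will run on. It also assumes your binary needs no shared libraries beyond what the slim image ships with; if it does, you'd have to install them in the final stage.

```dockerfile
# select build image (jessie variant, assumed so build and runtime bases match)
FROM rust:1.23-jessie as build

# create a new empty shell project
RUN USER=root cargo new --bin my_project
WORKDIR /my_project

# copy over your manifests
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml

# this build step will cache your dependencies
RUN cargo build --release
RUN rm src/*.rs

# copy your source tree
COPY ./src ./src

# build for release
RUN rm ./target/release/deps/my_project*
RUN cargo build --release

# our final base: no Rust toolchain, only a slim Debian image
# (debian:jessie-slim is an assumed tag for the "jessie-slim" image)
FROM debian:jessie-slim

# copy the build artifact from the build stage
COPY --from=build /my_project/target/release/my_project .

# set the startup command to run your binary
CMD ["./my_project"]
```

Only the last three instructions end up in the image you ship; everything in the build stage (toolchain, Cargo caches, intermediate artifacts) is left behind.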

Optimization Results

When we first looked at the build results for our base Dockerfile, it didn't seem so bad (but I bet it does now!). Below is a comparison of the base and the final build times and sizes:

# base statistics
Cargo build time: 227.7s
Total time taken: 266.6s
Final image size: 1.66GB

# optimized statistics
Cargo build time: 30.9s (~13.5%)
Total time taken: 33.6s (~12.6%)
Final image size: 86.5MB (~5.2%)

Perhaps the best part here is that all of this can be changed in a half hour (in fact, this blog post took me ~45 minutes whilst also working back through the changes myself). We're not building anything new, we're just making good use of the existing Docker tooling. I highly recommend that everyone take some time out to think about how they can apply similar practices to their projects (there are similar tactics for things like Maven). That half hour will be paid back after building the base image above another 8 times, after all.

Please reach out if you have any questions, or if anything needs clarification. If you have actual build improvements, please also reach out (most of this is stuff I've stumbled over, so I'm sure there are other neat tricks).
