Fix Docker Startup Failures with "layer does not exist"

I recently ran into a server where Docker would not start at all. The log message was:

msg="Error starting daemon: layer does not exist"

Symptom

Restarting the service repeatedly did not help. The same error kept appearing:

msg="Error starting daemon: layer does not exist"

Sometimes there may also be other filesystem-related errors in the logs.
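
To confirm that a host is hitting this specific error rather than one of the other filesystem-related ones, a small check like the following can help. It is only a sketch: `has_layer_error` is a hypothetical helper name, and the commented `journalctl` line assumes a systemd host where the unit is called `docker`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the log text on stdin contains the
# "layer does not exist" startup error.
has_layer_error() {
    grep -q -F 'Error starting daemon: layer does not exist'
}

# Self-contained demonstration with a sample log line.
printf '%s\n' 'msg="Error starting daemon: layer does not exist"' \
    | has_layer_error && echo "layer error found"

# On a systemd host (assumption: the unit is named "docker"):
#   journalctl -u docker --no-pager | has_layer_error && echo "layer error found"
```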

Fix

After searching around, I found that there is often no graceful recovery path. In practice, the usual answer is to clear /var/lib/docker, which also discards every local image, container, and volume stored there.

The Docker project also published a slightly safer cleanup script:

#!/bin/sh
set -e

dir="$1"

if [ -z "$dir" ]; then
    {
        echo 'This script is for destroying old /var/lib/docker directories more safely than'
        echo ' "rm -rf", which can cause data loss or other serious issues.'
        echo
        echo "usage: $0 directory"
        echo " ie: $0 /var/lib/docker"
    } >&2
    exit 1
fi

if [ "$(id -u)" != 0 ]; then
    echo >&2 "error: $0 must be run as root"
    exit 1
fi

if [ ! -d "$dir" ]; then
    echo >&2 "error: $dir is not a directory"
    exit 1
fi

# resolve symlinks so the prefix comparisons below use the real path
dir="$(readlink -f "$dir")"

echo
echo "Nuking $dir ..."
echo ' (if this is wrong, press Ctrl+C NOW!)'
echo

# give the operator ten seconds to abort
( set -x; sleep 10 )
echo

# succeeds when the string $1 starts with the string $2
dir_in_dir() {
    inner="$1"
    outer="$2"
    [ "${inner#$outer}" != "$inner" ]
}

# unmount everything mounted under $dir before deleting anything
for mount in $(awk '{ print $5 }' /proc/self/mountinfo); do
    mount="$(readlink -f "$mount" || true)"
    if dir_in_dir "$mount" "$dir"; then
        ( set -x; umount -f "$mount" )
    fi
done

# delete btrfs subvolumes under $dir, which "rm -rf" cannot remove
if command -v btrfs > /dev/null 2>&1; then
    root="$(df "$dir" | awk 'NR>1 { print $NF }')"
    root="${root#/}"
    for subvol in $(btrfs subvolume list -o "$root/" 2>/dev/null | awk -F' path ' '{ print $2 }' | sort -r); do
        subvolDir="$root/$subvol"
        if dir_in_dir "$subvolDir" "$dir"; then
            ( set -x; btrfs subvolume delete "$subvolDir" )
        fi
    done
fi

( set -x; rm -rf "$dir" )
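
The dir_in_dir helper in the script decides which mounts and subvolumes get touched, so it is worth seeing what it matches. The snippet below copies the function unchanged and tries it on a few example paths chosen only for illustration. The last call shows that the test is a plain string-prefix comparison, so a sibling directory whose name merely starts with the target path also counts as "inside".

```shell
#!/bin/sh
# Same prefix test as in the script: strip $outer from the front of $inner
# and see whether anything changed.
dir_in_dir() {
    inner="$1"
    outer="$2"
    [ "${inner#$outer}" != "$inner" ]
}

# A path below the target matches.
dir_in_dir /var/lib/docker/overlay2/abc/merged /var/lib/docker && echo "inside"

# An unrelated path does not.
dir_in_dir /home/user /var/lib/docker || echo "outside"

# Caveat: this is a string prefix, not a path comparison, so a sibling
# directory with the same leading characters matches too.
dir_in_dir /var/lib/docker-old /var/lib/docker && echo "also treated as inside"
```

If directories such as /var/lib/docker-old exist and contain mounts, it may be worth checking /proc/self/mountinfo before running the script.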

After I saved the cleanup script, ran it against /var/lib/docker, and restarted Docker, the daemon started successfully.
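
The whole sequence can be sketched as a small wrapper. This is an illustration under assumptions, not the exact commands from my session: it assumes systemd manages Docker as the `docker` unit and that the cleanup script was saved as ./nuke-graph-directory.sh (a hypothetical filename). The `run` and `recover_docker` names are made up for this example. By default it only prints the commands, and it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# Stop the daemon, wipe the data directory with the cleanup script,
# then start the daemon again. Stops at the first failing step.
# Assumptions: a systemd unit named "docker" and a script saved as
# ./nuke-graph-directory.sh (hypothetical filename).
recover_docker() {
    data_dir="${1:-/var/lib/docker}"
    run systemctl stop docker &&
    run sh ./nuke-graph-directory.sh "$data_dir" &&
    run systemctl start docker
}

# Dry run by default: this only prints the three commands.
recover_docker /var/lib/docker
```

To actually perform the steps, run it as root with DRY_RUN=0 in the environment.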
