While we can re-create the kernel source code, we don't have it
handily available in one place. This commit stashes the kernel
and the WireGuard source as /src/linux.tar.xz and
/src/wireguard.tar.xz in the kernel package.
This increases the size of the hub image by around 100MB.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
This microcode bundle comes with a file called "list"
which seems to confuse the 'iucode_tool', so we just
remove it.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
For example kernel module signatures if you do not provide a key. So add
to the dependencies for kernel builds.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Update the build process to add s390 support.
The patch serial-forbid-8250-on-s390.patch has been added to disable
8250 serial for s390.
The patch is available upstream https://patchwork.kernel.org/patch/10106437/
but it is not backported.
Signed-off-by: Alice Frosi <alice@linux.vnet.ibm.com>
* receive: treat packet checking as irrelevant for timers
Small simplification to the state machine, as discussed with Mathias
Hall-Andersen.
* socket: check for null socket before fishing out sport
* wg-quick: ifnames have max len of 15
* tools: plug memleak in config error path
Important bug fixes.
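The 15-character limit on interface names follows from the kernel's IFNAMSIZ of 16 bytes, which includes the terminating NUL. A minimal sketch of such a length check, written in Python purely for illustration (wg-quick itself is a shell script, and the function name here is hypothetical):

```python
IFNAMSIZ = 16  # kernel interface-name buffer size, including the trailing NUL


def is_valid_ifname_length(name: str) -> bool:
    """Return True if name is non-empty and fits in IFNAMSIZ - 1 bytes."""
    encoded = name.encode()
    return 0 < len(encoded) <= IFNAMSIZ - 1
```

This only checks the length; which characters are acceptable in a name is a separate question that the sketch does not address.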
* external-tests: add python implementation
Piotr Lizonczyk has contributed a test vector written in Python.
* poly1305: remove indirect calls
From Samuel Neves, we now are in a better position to mitigate speculative
execution attacks.
* curve25519: modularize implementation
* curve25519: import 32-bit fiat-crypto implementation
* curve25519: import 64-bit hacl-star implementation
* curve25519: resolve symbol clash between fe types
* curve25519: wire up new impls and remove donna
* tools: import new curve25519 implementations
* contrib: keygen-html: update curve25519 implementation
Two of our Curve25519 implementations now use formally verified C. Read this
mailing list post for more information:
https://lists.zx2c4.com/pipermail/wireguard/2018-January/002304.html
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Download and verify the Intel microcode package and convert it
to a cpio archive which can be prepended to the initrd.
It also adds the license file to the kernel package.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
* curve25519: explicitly depend on AS_AVX
* curve25519: modularize dispatch
It's now much cleaner to see which implementation we're calling, and it will
be simpler to add more implementations in the future.
* compat: support RAP in assembly
This should fix PaX/Grsecurity support.
* device: do not clear keys during sleep on Android
While we want to clear keys when going to sleep on ordinary Linux, this
doesn't make sense in the Android world, where phones often sleep but are
woken up every few milliseconds by the radios to process packets.
* compat: fix 3.10 backport
Important compat fixes for non-x86.
* device: clear last handshake timer on ifdown
When bringing up an interface, we don't want handshake rate limiting
to apply.
* netlink: rename symbol to avoid clashes
Allows coexistence with horrible Android drivers.
* kernel-tree: jury rig is the more common spelling
* tools: no need to put this on the stack
* blake2s-x86_64: fix spacing
Small fixes.
* contrib: keygen-html for generating keys in the browser
This was covered here:
https://lists.zx2c4.com/pipermail/wireguard/2017-December/002127.html
* tools: remove undocumented unused syntax
Not only did nobody know about this or use it, but the implementation actually
exposed compiler bugs in Qualcomm's "Snapdragon Clang".
* poly1305: update x86-64 kernel to AVX512F only
From Samuel Neves, this pulls in Andy Polyakov's changes to only require F and
not VL for the Poly implementation.
* chacha20-arm: fix with clang -fno-integrated-as.
This pulls in David Benjamin's clang fix.
* global: add SPDX tags to all files
From Greg KH, we now have SPDX annotations on all files, matching upstream
kernel's new approach to file licenses.
* chacha20poly1305: cleaner generic code
This entirely removes the last remains of Martin Willi's ChaCha
implementation, and now the generic C implementation is extremely small and
clearly written, while delivering a small performance boost too.
* poly1305: fix avx512f alignment bug
Unlucky people may have had their linkers misalign a constant. This fixes that
potential.
* chacha20: avx512vl implementation
From Samuel Neves, this imports Andy Polyakov's AVX512VL implementation of
ChaCha which should have a ~50% performance improvement over AVX2, though it
is still much slower than our AVX512F implementation.
* chacha20poly1305: wire up avx512vl for skylake-x
Some Skylake machines do not have two FMA units (though others do), so we
prefer the AVX512VL implementation over the should-be-faster AVX512F
implementation on those machines. What's needed now is to read the PIROM in
order to determine at runtime whether the particular Skylake-X machine
actually has the second FMA unit or not, but until that happens, we just fall
back to the VL implementation for all Skylake-X.
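The selection rule described here can be sketched as a small dispatch function. This is an illustration only, in Python rather than the kernel's C; the function name, the flag strings, and the is_skylake_x parameter are all hypothetical and do not mirror the actual code:

```python
def pick_chacha20_impl(features: set, is_skylake_x: bool) -> str:
    """Pick a ChaCha20 implementation name from CPU feature flags.

    Hypothetical helper: flag names and return values are illustrative.
    """
    # AVX512F should be the fastest path, but it is skipped on every
    # Skylake-X part because some of them lack the second FMA unit.
    if "avx512f" in features and not is_skylake_x:
        return "avx512f"
    if "avx512vl" in features:
        return "avx512vl"
    if "avx2" in features:
        return "avx2"
    if "ssse3" in features:
        return "ssse3"
    return "generic"
```

Once the presence of the second FMA unit can be detected at runtime, the is_skylake_x test would be replaced by that finer-grained check.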
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This is a double bump.
Changes 0.0.20171122:
* chacha20poly1305: fast primitives from Andy Polyakov
Samuel Neves and I have spent considerable time and headaches porting,
reworking, and partially rewriting Andy's optimized implementations of
ChaCha20 and Poly1305. We now support the following:
On x86_64:
- Poly1305: integer unit
- ChaCha20: SSSE3
- HChaCha20: SSSE3
- Poly1305: AVX
- ChaCha20: AVX2
- Poly1305: AVX2
- ChaCha20: AVX512
- Poly1305: AVX512
On ARM:
- Poly1305: integer unit
- ChaCha20: NEON
- Poly1305: NEON
On ARM64:
- Poly1305: integer unit
- ChaCha20: NEON
- Poly1305: NEON
On MIPS64:
- Poly1305: integer unit
All others:
- ChaCha20: generic C
- Poly1305: generic C
This is a pretty substantial amount of new handrolled assembly. It will
perhaps MURDER KITTENS, so please tread lightly with this snapshot and adjust
expectations accordingly. I'm looking forward to quickly fixing any issues
folks find while testing.
Performance-wise, this should see increases all around. The biggest speedups
will be on ARM and ARM64, but x86_64 and MIPS64 should also see modest speed
improvements, especially on Skylake systems supporting AVX512.
* chacha20poly1305: add more test vectors, some of which are weird
Test vectors are pretty important, so we added more to catch odd edge cases
using the following butcher's code:
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305
import os
def encode_blob(blob):
    a = ""
    for i in blob:
        a += "\\x" + hex(i)[2:]
    return a
enc = [ ]
dec = [ ]
def make_vector(plen, adlen):
    key = os.urandom(32)
    nonce = os.urandom(8)
    p = os.urandom(plen)
    ad = os.urandom(adlen)
    c = ChaCha20Poly1305(key).encrypt(nonce=bytes(4) + nonce, data=p, associated_data=ad)
    out = "{\n"
    out += "\t.key\t= \"" + encode_blob(key) + "\",\n"
    out += "\t.nonce\t= \"" + encode_blob(nonce) + "\",\n"
    out += "\t.assoc\t= \"" + encode_blob(ad) + "\",\n"
    out += "\t.alen\t= " + str(len(ad)) + ",\n"
    out += "\t.input\t= \"" + encode_blob(p) + "\",\n"
    out += "\t.ilen\t= " + str(len(p)) + ",\n"
    out += "\t.result\t= \"" + encode_blob(c) + "\"\n"
    out += "}"
    enc.append(out)
    out = "{\n"
    out += "\t.key\t= \"" + encode_blob(key) + "\",\n"
    out += "\t.nonce\t= \"" + encode_blob(nonce) + "\",\n"
    out += "\t.assoc\t= \"" + encode_blob(ad) + "\",\n"
    out += "\t.alen\t= " + str(len(ad)) + ",\n"
    out += "\t.input\t= \"" + encode_blob(c) + "\",\n"
    out += "\t.ilen\t= " + str(len(c)) + ",\n"
    out += "\t.result\t= \"" + encode_blob(p) + "\"\n"
    out += "}"
    dec.append(out)
make_vector(0, 0)
make_vector(0, 8)
make_vector(1, 8)
make_vector(1, 0)
make_vector(129, 7)
make_vector(256, 0)
make_vector(512, 0)
make_vector(513, 9)
make_vector(1024, 16)
make_vector(1933, 7)
make_vector(2011, 63)
print("======== encryption vectors ========")
print(", ".join(enc))
print("\n\n\n======== decryption vectors ========")
print(", ".join(dec))
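The encode_blob helper in the script emits unpadded hex escapes, so a byte such as 0x05 comes out as \x5. Should fixed-width output be preferred for readability, a variant along these lines might be used instead. It is a sketch and not part of the original script; the name encode_blob_padded is hypothetical:

```python
def encode_blob_padded(blob: bytes) -> str:
    """Render bytes as C string escapes with exactly two hex digits each."""
    return "".join("\\x{:02x}".format(b) for b in blob)
```

Every byte is still escaped individually, so the generated C string literal denotes the same bytes as before.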
* wg-quick: document localhost exception and v6 rule
Probably a "kill switch" wants this too:
-m addrtype ! --dst-type LOCAL
so that basic local services can continue to work.
* selftest: allowedips: randomized test mutex update
* allowedips: do not write out of bounds
* device: uninitialize socket first in destruction
* tools: tighten up strtoul parsing
Small fixups.
* qemu: update kernel
* qemu: use unprefixed strip when not cross-compiling
Fedora/Redhat doesn't ship with a prefixed strip, and we don't need
to use it anyway when we're not cross compiling, so don't.
* compat: 3.16.50 got proper rt6_get_cookie
* compat: stable finally backported fix
* compat: new kernels have netlink fixes
* compat: fix compilation with PaX
Usual set of compatibility updates.
* curve25519-neon: compile in thumb mode
In thumb mode, it's not possible to use sp as an operand of and, so
we have to muck around with r3 as a scratch register.
* socket: only free socket after successful creation of new
When an interface is down, the socket port can change freely. A socket
will be allocated when the interface comes up, and if a socket can't be
allocated, the interface doesn't come up.
However, a socket port can change while the interface is up. In this
case, if a new socket with a new port cannot be allocated, it's
important to keep the interface in a consistent state. The choices are
either to bring down the interface or to preserve the old socket. This
patch implements the latter.
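The "allocate the new socket first, release the old one afterwards" pattern can be sketched as follows. This is a toy Python model and not the kernel code: the Device class, the make_socket callable, and the use of OSError to signal allocation failure are all assumptions made for illustration:

```python
class Device:
    """Toy stand-in for an interface that owns one listening socket."""

    def __init__(self, make_socket, port):
        # make_socket is a hypothetical callable(port) -> socket-like object
        # that raises OSError when the socket cannot be allocated.
        self._make_socket = make_socket
        self.sock = make_socket(port)
        self.port = port

    def set_port(self, new_port):
        """Switch to new_port, keeping the old socket if allocation fails."""
        try:
            new_sock = self._make_socket(new_port)
        except OSError:
            # Allocation failed: the old socket and port stay in place.
            return False
        old_sock, self.sock, self.port = self.sock, new_sock, new_port
        old_sock.close()  # released only after the replacement exists
        return True
```

Because the old socket is closed only after the new one exists, a failed port change leaves the device exactly as it was.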
* global: switch from timeval to timespec
This gets us nanoseconds instead of microseconds, which is better, and
we can do this pretty much without freaking out existing userspace,
which doesn't actually make use of the nano/microseconds field. The below
test program shows that this won't break existing sizes:
zx2c4@thinkpad ~ $ cat a.c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>
int main(void)
{
	puts(sizeof(struct timeval) == sizeof(struct timespec) ?
	     "success" : "failure");
	return 0;
}
zx2c4@thinkpad ~ $ gcc a.c -m64 && ./a.out
success
zx2c4@thinkpad ~ $ gcc a.c -m32 && ./a.out
success
Changes 0.0.20171127:
* compat: support timespec64 on old kernels
* compat: support AVX512BW+VL by lying
* compat: fix typo and ranges
* compat: support 4.15's netlink and barrier changes
* poly1305-avx512: requires AVX512F+VL+BW
Numerous compat fixes which should keep us supporting 3.10-4.15-rc1.
* blake2s: AVX512F+VL implementation
* blake2s: tweak avx512 code
* blake2s: hmac space optimization
Another terrific submission from Samuel Neves: we now have an implementation
of Blake2s using AVX512, which is extremely fast.
* allowedips: optimize
* allowedips: simplify
* chacha20: directly assign constant and initial state
Small performance tweaks.
* tools: fix removing preshared keys
* qemu: use netfilter.org https site
* qemu: take shared lock for untarring
Small bug fixes.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>