The peralsm in aesni-xts-avx512 currently checks for GNU assembler 2.26
or higher. According to reporters it looks like we need 2.30.
This PR just attempts fix version check so people with older
tool chains can build OpenSSL.
Fixes#27049
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/27078)
We were using the first (or second) argument containing a '.' as the
output name file, but it may be incorrect as -march=la64v1.0 may be in
the command line. If the builder specifies -march=la64v1.0 in the
CFLAGS, the script will write to a file named "-march=la64v1.0" and
cause a build error with cryptic message:
ld: crypto/pem/loader_attic-dso-pvkfmt.o: in function `i2b_PVK':
.../openssl-3.4.1/crypto/pem/pvkfmt.c:1070:(.text+0x11a8): undefined reference to `OPENSSL_cleanse'
Adapt the approach of ARM and RISC-V (they have similar flags like
-march=v8.1-a or -misa-spec=2.2) to fix the issue.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26717)
Commit 1d1ca79fe3 introduced
save and restore for the registers, saving them as
stp d8,d9,[sp, #16]
stp d10,d11,[sp, #32]
stp d12,d13,[sp, #48]
stp d14,d15,[sp, #64]
But the restore code was inadvertently typoed:
ldp d8,d9,[sp, #16]
ldp d10,d11,[sp, #32]
ldp d12,d13,[sp, #48]
ldp d15,d16,[sp, #64]
Restoring [sp, #64] into d15,d16 instead of d14,d15.
Fixes: #26466
CLA: trivial
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26469)
This changeset brings a finishing touch to stuff we got from botovoq@
Changes to `crypto/perlasm/arm-xlate.pl` deal with verious assembler
flavours to keep various assembler compilers happy.
We also need to keep original code for 32-bit flavour in
`crypto/aes/asm/aesv8-armx.pl`.
Reviewed-by: Hugo Landau <hlandau@devever.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24137)
In order to get asm code running on OpenBSD we must place
all constants into .rodata sections.
The change to crypto/perlasm/arm-xlate.pl adjusts changes
from Theo for additional assembler variants/flavours we
use for building OpenSSL.
Fixes#23312
Reviewed-by: Hugo Landau <hlandau@devever.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24137)
Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24518)
In order to get asm code running on OpenBSD we must place
all constants into .rodata sections.
davidben@ also pointed out we need to adjust `x86_64-xlate.pl` perlasm
script to adjust read-olny sections for various flavors (OSes). Those
changes were cherry-picked from boringssl.
closes#23312
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23997)
Reviewed-by: Neil Horman <nhorman@openssl.org>
Release: yes
(cherry picked from commit 0ce7d1f355)
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24034)
The following files referred to ../liblegacy.a when they should have
referred to ../../liblegacy.a. This cause the creation of a mysterious
directory 'crypto/providers', and because of an increased strictness
with regards to where directories are created, configuration failure
on some platforms.
Fixes#23436
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/23452)
(cherry picked from commit 667b45454a)
Fixes#22818
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22860)
The AES-CTR assembly code uses v8-v15 registers, they are
callee-saved registers, they must be preserved before the
use and restored after the use.
Change-Id: If9192d1f0f3cea7295f4b0d72ace88e6e8067493
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23233)
This typedef is already created in aes_local.h as `typedef uint64_t u64;`.
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22969)
This patch provides stream and multi-block implementations for
AES-128-ECB, AES-192-ECB, and AES-256-ECB to accelerate AES-ECB.
Also, refactor functions to share the same variable
declaration in aes-riscv64-zvkned.pl.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
To accelerate the performance of the AES-XTS mode, in this patch, we
have the specialized multi-block implementation for AES-128-XTS and
AES-256-XTS.
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Support zvbb-zvkned based rvv AES-128/192/256-CTR encryption.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
To accelerate the performance of the AES-128/192/256-CBC block cipher
encryption, we used the vaesz, vaesem and vaesef instructions, which
implement a single round of AES encryption.
Similarly, to optimize the performance of AES-128/192/256-CBC block
cipher decryption, we have utilized the vaesz, vaesdm, and vaesdf
instructions, which facilitate a single round of AES decryption.
Furthermore, we optimize the key and initialization vector (IV) step by
keeping the rounding key in vector registers.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Interleave key loading and aes decrypt computing for single block aes.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Interleave key loading and aes encrypt computing for single block aes.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Even though the RISC-V vector instructions only support AES-128 and
AES-256 for key generation, the round instructions themselves can
easily be used to implement AES-192 too - we just need to fallback to
the generic key generation routines in this case.
Note that the vector instructions use the encryption key schedule (but
in reverse order) so we need to generate the encryption key schedule
even when doing decryption using the vector instructions.
Signed-off-by: Ard Biesheuvel <ardb@google.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions provide
the Zvkned extension, that provides a AES-specific instructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
In the definition of the latest revised LoongArch64 vector instruction manual,
it is clearly pointed out that the undefined upper three bits of each byte in
the control register of the vshuf.b instruction should not be used, otherwise
uncertain results may be obtained. Therefore, it is necessary to correct the
use of the vshuf.b instruction in the existing vpaes-loongarch64.pl code to
avoid erroneous calculation results in future LoongArch64 processors.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21530)
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21467)
The original text for the Apache + BSD dual licensing for riscv GCM and AES
perlasm was taken from other openSSL users like crypto/crypto/LPdir_unix.c .
Though Eric pointed out that the dual-licensing text could be read in a
way negating the second license [0] and suggested to clarify the text
even more.
So do this here for all of the GCM, AES and shared riscv.pm .
We already had the agreement of all involved developers for the actual
dual licensing in [0] and [1], so this is only a better clarification
for this.
[0] https://github.com/openssl/openssl/pull/20649#issuecomment-1589558790
[1] https://github.com/openssl/openssl/pull/21018
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21357)
To allow re-use of the already reviewed openSSL crypto code for RISC-V in
other projects - like the Linux kernel, add a second license (2-clause BSD)
to the 32+64bit aes implementations using the Zkn extension.
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/21018)
Signed-off-by: Liu-ErMeng <liuermeng2@huawei.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20797)
Original author: Nevine Ebeid (Amazon)
Fixes: CVE-2023-1255
The buffer overread happens on decrypts of 4 mod 5 sizes.
Unless the memory just after the buffer is unmapped this is harmless.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/20759)
The mention of the GPL shouldn't have been there.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20517)
Move helper functions and instruction encoding functions
into a riscv.pm Perl module to avoid pointless code duplication.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
"adrl" is a pseudo-instruction used to calculate an address relative
to PC. It's not recognized by clang resulting in a compilation error.
I've stumbled upon it when trying to integrate the bsaes-armv7 assmebly
logic into FreeBSD kernel, which uses clang as it's default compiler.
Note that this affect the build only if BSAES_ASM_EXTENDED_KEY is
defined, which is not the default option in OpenSSL.
The solution here is to replace it with an add instruction.
This mimics what has already been done in !BSAES_ASM_EXTENDED_KEY logic.
Because of that I've marked this as trivial CLA.
CLA: trivial
Signed-off-by: Kornel Dulęba <mindal@semihalf.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20458)
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture
to AES. The test result on the 3A5000 improves performance by about 40%~50%.
Signed-off-by: zhuchen <zhuchen@loongson.cn>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19364)
Also fix conditional branch out of range when using sanitisers.
Fixes#18813
Signed-off-by: Tom Cosgrove <tom.cosgrove@arm.com>
Change-Id: Ic543885091ed3ef2ddcbe21de0a4ac0bca1e2494
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18816)
This restores the implementation prior to
commit 2621751 ("aes/asm/aesv8-armx.pl: avoid 32-bit lane assignment in CTR mode")
for 64bit targets only, since it is reportedly 2-17% slower,
and the silicon errata only affects 32bit targets.
Only for 32bit targets the new algorithm is used.
Fixes#18445
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18581)
aesni_ocb_encrypt and aesni_ocb_decrypt operate by having a fast-path
that performs operations on 6 16-byte blocks concurrently (the
"grandloop") and then proceeds to handle the "short" tail (which can
be anywhere from 0 to 5 blocks) that remain.
As part of initialization, the assembly initializes $len to the true
length, less 96 bytes and converts it to a pointer so that the $inp
can be compared to it. Each iteration of "grandloop" checks to see if
there's a full 96-byte chunk to process, and if so, continues. Once
this has been exhausted, it falls through to "short", which handles
the remaining zero to five blocks.
Unfortunately, the jump at the end of "grandloop" had a fencepost
error, doing a `jb` ("jump below") rather than `jbe` (jump below or
equal). This should be `jbe`, as $inp is pointing to the *end* of the
chunk currently being handled. If $inp == $len, that means that
there's a whole 96-byte chunk waiting to be handled. If $inp > $len,
then there's 5 or fewer 16-byte blocks left to be handled, and the
fall-through is intended.
The net effect of `jb` instead of `jbe` is that the last 16-byte block
of the last 96-byte chunk was completely omitted. The contents of
`out` in this position were never written to. Additionally, since
those bytes were never processed, the authentication tag generated is
also incorrect.
The same fencepost error, and identical logic, exists in both
aesni_ocb_encrypt and aesni_ocb_decrypt.
This addresses CVE-2022-2097.
Co-authored-by: Alejandro Sedeño <asedeno@google.com>
Co-authored-by: David Benjamin <davidben@google.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
This implementation is based on the four-table approach, along the same
lines as the non-constant-time implementation in aes_core.c The
implementation is in perlasm.
Utility functions are defined to automatically stack/unstack registers
as needed for prologues and epilogues. See riscv-elf-psabi-doc at
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/ for ABI details.
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Henry Brausen <henry.brausen@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17640)
gcc6.3 doesn't seem to support the register aliases fp and lr for x29 and x30,
so use the x names.
Fixes#18114
Change-Id: I077edda42af4c7cdb7b24f28ac82d1603f550108
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18127)