AES-NI support for OpenSSL

Asked by Jason Campbell

I'm not sure this is the right place to ask, but would it be possible to add AES-NI support to the OpenSSL package? AES-NI greatly accelerates AES ciphers on Intel processors from Westmere onward, making AES faster than RC4.

Supported processors:
http://en.wikipedia.org/wiki/AES_instruction_set#Supporting_CPUs
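On Linux, whether a given machine is one of the supported CPUs can usually be checked from the flags the kernel reports. The following is a minimal sketch, assuming /proc/cpuinfo is available and that the feature is listed under the flag name "aes"; has_aesni is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Return success if the given cpuinfo file (default: /proc/cpuinfo)
# lists "aes" as a whole word on its first "flags" line.
# has_aesni is a hypothetical helper name used only for this sketch.
has_aesni() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw aes
}

if has_aesni; then
    echo "CPU reports the aes flag"
else
    echo "no aes flag found"
fi
```

This only shows that the CPU advertises the instructions; it says nothing about whether a particular OpenSSL build uses them.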

Unscientific benchmark here to show the difference:
http://zombe.es/post/4078724716/openssl-cipher-selection

AES-NI is supported in the official RHEL package, but unfortunately that package doesn't support TLSv1.1/1.2 like the IUS one does.

Thanks

Question information

Language:
English
Status:
Solved
For:
IUS Community Project
Assignee:
No assignee
Solved by:
Jason Campbell
Solved:
Last query:
Last reply:
Jeffrey Ness (jeffrey-ness) said :
#1

Hello Jason,

Thank you for taking the time to post your question.

The IUS OpenSSL package was built using the same SPEC as Red Hat,
but then increased in version. You will notice the Red Hat changelogs
do mention that AES-NI support is available.

     $ rpm -qp http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/openssl10-1.0.1e-1.ius.el6.x86_64.rpm --changelog | grep -B5 AES-NI

     * Wed Sep 07 2011 Tomas Mraz <email address hidden> 1.0.0e-1
     - new upstream release fixing CVE-2011-3207 (#736088)

     * Wed Aug 24 2011 Tomas Mraz <email address hidden> 1.0.0d-8
     - drop the separate engine for Intel acceleration improvements
       and merge in the AES-NI, SHA1, and RC4 optimizations
     - add support for OPENSSL_DISABLE_AES_NI environment variable
       that disables the AES-NI support
     --
     * Wed Jul 20 2011 Tomas Mraz <email address hidden> 1.0.0d-6
     - add support for newest Intel acceleration improvements backported
       from upstream by Intel in form of a separate engine

     * Thu Jun 09 2011 Tomas Mraz <email address hidden> 1.0.0d-5
     - allow the AES-NI engine in the FIPS mode
     --
     - fix CVE-2009-4355 - leak in applications incorrectly calling
       CRYPTO_free_all_ex_data() before application exit (#546707)
     - upstream fix for future TLS protocol version handling

     * Wed Jan 13 2010 Tomas Mraz <email address hidden> 1.0.0-0.18.beta4
     - add support for Intel AES-NI

When building this package I do not recall disabling features, and AES-NI doesn't sound familiar (as something I would have removed).

Can you post the output you are seeing on an Intel machine, and show that AES-NI is not present?

Thanks

Jason Campbell (j-campbell7) said :
#2

From RHEL:

$ rpm -qa | grep openssl
openssl-1.0.0-27.el6.x86_64

$ openssl version
OpenSSL 1.0.0-fips 29 Mar 2010

$ openssl engine
(aesni) Intel AES-NI engine
(dynamic) Dynamic engine loading support

From IUS:

$ rpm -qa | grep openssl
openssl10-libs-1.0.1e-1.ius.el6.x86_64
openssl10-1.0.1e-1.ius.el6.x86_64

$ openssl version
OpenSSL 1.0.1e 11 Feb 2013

$ openssl engine
(rsax) RSAX engine support
(dynamic) Dynamic engine loading support

Jason Campbell (j-campbell7) said :
#3

After finding an Ubuntu thread, it seems the optimizations are built in now. Here is a benchmark comparison:

RHEL:

$ openssl speed -engine aesni aes-256-cbc
engine "aesni" set.
OpenSSL 1.0.0-fips 29 Mar 2010
built on: Fri Oct 12 05:52:01 EDT 2012
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wa,--noexecstack -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-256 cbc     74433.36k   80299.29k   81754.11k  172597.93k  173405.53k

IUS:

$ openssl speed aes-256-cbc
OpenSSL 1.0.1e 11 Feb 2013
built on: Wed Feb 13 11:31:32 EST 2013
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-256 cbc     74276.77k   78848.39k   81812.65k  173653.67k  171010.73k

I think that clearly falls within the margin of error, so this can be considered closed. Thanks.

Jeffrey Ness (jeffrey-ness) said :
#4

I'm glad to hear it!

Thanks for updating us :)

Jeffrey-

Sandy (san-patil) said :
#5

The above output is confusing: the two results are nearly identical, so it is not clear whether AES-NI is enabled or not.

openssl speed -engine aesni aes-256-cbc
aes-256 cbc 74433.36k 80299.29k 81754.11k 172597.93k 173405.53k

Vs

$ openssl speed aes-256-cbc
aes-256 cbc 74276.77k 78848.39k 81812.65k 173653.67k 171010.73k

Should one be using
openssl speed -evp aes-256-cbc

to check whether the speed is good and, if so, conclude that the Intel engine is enabled?

I learnt from the thread at https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1001424 that the engine is embedded into EVP and that the above should be the command, but I don't know whether that is true for OpenSSL 1.0.0-fips 29 Mar 2010.

Thoughts?
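One way to answer this on a given machine is to run both forms of the benchmark and compare only the final throughput rows. The sketch below is illustrative: speed_row and bench are hypothetical helper names, and the benchmarks are left commented out because each run takes several seconds per block size. The OPENSSL_ia32cap line is an assumption that should only apply to OpenSSL 1.0.1 and later; it may not behave the same way on the patched RHEL 1.0.0 build.

```shell
#!/bin/sh
# Print only the final throughput row of `openssl speed` output read
# from stdin, e.g. "aes-256 cbc ..." or "aes-256-cbc ...".
speed_row() {
    grep -E '^aes-[0-9]+[ -][a-z]+ ' | tail -n 1
}

# Run one benchmark and label its result row.
# $1 = label, remaining arguments are passed to `openssl speed`.
bench() {
    label=$1
    shift
    printf '%s: ' "$label"
    openssl speed "$@" 2>/dev/null | speed_row
}

# Example comparison (uncomment to run):
# bench "non-EVP" aes-256-cbc
# bench "EVP"     -evp aes-256-cbc
#
# Assumption (OpenSSL 1.0.1+): masking the AES-NI and PCLMULQDQ capability
# bits should force the software path even with -evp:
# OPENSSL_ia32cap="~0x200000200000000" openssl speed -evp aes-256-cbc 2>/dev/null | speed_row
```

If the EVP row is several times higher than the non-EVP row, and drops back when the capability bits are masked, that would suggest the hardware path is in use for EVP.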

Jason Campbell (j-campbell7) said :
#6

You're right, sorry. Neither of those tests was using AES-NI; I'm getting much better numbers now. Here are the updated results:

RHEL:

$ openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 111448922 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 64 size blocks: 30659103 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 7787212 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1959846 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 244826 aes-256-cbc's in 2.99s
OpenSSL 1.0.0-fips 29 Mar 2010
built on: Wed Aug 15 12:48:02 EDT 2012
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wa,--noexecstack -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-256-cbc    596382.19k  654060.86k  664508.76k  668960.77k  670774.11k

IUS:

$ openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 111595718 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 29942257 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 256 size blocks: 7799239 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1958848 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 239345 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Wed Feb 13 11:31:32 EST 2013
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-256-cbc    595177.16k  640904.50k  665535.06k  668620.12k  653571.41k

So to summarise:

Without AES-NI:

RHEL: openssl speed -engine aesni aes-256-cbc
aes-256 cbc 74433.36k 80299.29k 81754.11k 172597.93k 173405.53k
IUS: openssl speed aes-256-cbc
aes-256 cbc 74276.77k 78848.39k 81812.65k 173653.67k 171010.73k

With AES-NI:

RHEL: openssl speed -evp aes-256-cbc
aes-256-cbc 596382.19k 654060.86k 664508.76k 668960.77k 670774.11k
IUS: openssl speed -evp aes-256-cbc
aes-256-cbc 595177.16k 640904.50k 665535.06k 668620.12k 653571.41k

So that is roughly a 4-8x improvement in throughput, depending on block size. However, the results are still essentially the same between the RHEL and IUS versions, so this bug can stay closed.
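The per-block-size speedup can be worked out directly from the figures above. This is a small sketch using the RHEL numbers copied from this thread; speedups is a hypothetical helper name, and the only tool assumed is a standard awk.

```shell
#!/bin/sh
# Throughput figures (in 1000s of bytes per second) copied from the RHEL
# runs in this thread, for 16, 64, 256, 1024 and 8192 byte blocks.
without="74433.36 80299.29 81754.11 172597.93 173405.53"
with="596382.19 654060.86 664508.76 668960.77 670774.11"

# Print the ratio accelerated/baseline for each block size.
# $1 = baseline figures, $2 = accelerated figures (space-separated).
speedups() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, base, " ")
        split(b, fast, " ")
        for (i = 1; i <= n; i++)
            printf "%.1fx%s", fast[i] / base[i], (i < n ? " " : "\n")
    }'
}

speedups "$without" "$with"
# prints: 8.0x 8.1x 8.1x 3.9x 3.9x
```

The gain is about 8x for blocks up to 256 bytes and just under 4x for the 1024 and 8192 byte blocks, since the non-EVP figures roughly double at the larger sizes.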