Debugging DataPower TLS / SSL Errors

Executive Summary

SSL / TLS is a core requirement for a secure infrastructure. Often, the first time two systems try to communicate, the handshake fails. In older firmware versions, little information was logged about the specific cause of the problem. Both sides of the connection then had to walk through their respective configurations in detail, usually after a round of blaming each other. Newer firmware improves on this considerably by allowing the SSL library to emit errors into the DataPower log.

This article goes into depth about common handshake errors through examples. It should help reduce the time spent debugging so that the project can move on to higher-value business testing.

Introduction

SSL / TLS is the most popular method of securing a connection between two endpoints. It allows the client and server to verify identity and then encrypt the subsequent data exchange. In an IBM DataPower Gateway environment, this is the most common integration point and also the most common location for trouble.

In the older firmware, you would have to walk a checklist of the entire SSL Proxy Profile to guess at what had gone wrong. It was a tedious process. Luckily, since firmware levels 5.0.0.18, 6.0.0.10, 6.0.1.6, 7.0.0.3, 7.1.0.0 and 7.2.0.0, DataPower writes SSL library errors to the log. These errors occur at a lower abstraction level and therefore provide better granularity on the specific cause of the failure. This improves the debugging process substantially, but it does not relieve the DataPower administrator of the need for a solid knowledge of TLS.

In this article, we issue deficient TLS requests to a DataPower endpoint and show what is captured in the log. To reproduce these tests, simply grab a copy of openssl and use its ‘s_client’ subcommand.
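If you want to script the test calls rather than type them by hand, a small Python helper can assemble the ‘s_client’ command line for each scenario. This is only a sketch: the host name, port and file names are placeholders that do not come from this article, and whether a given protocol flag works depends on how your local OpenSSL was built.

```python
import shlex

# Protocol flags understood by "openssl s_client". Availability of each
# flag depends on the local OpenSSL build.
PROTOCOL_FLAGS = {
    "tls1.0": "-tls1",
    "tls1.1": "-tls1_1",
    "tls1.2": "-tls1_2",
}


def s_client_command(host, port, protocol=None, cipher=None, cert=None, key=None):
    """Build the argument list for one s_client test call."""
    args = ["openssl", "s_client", "-connect", f"{host}:{port}"]
    if protocol is not None:
        args.append(PROTOCOL_FLAGS[protocol])
    if cipher is not None:
        args += ["-cipher", cipher]
    if cert is not None and key is not None:
        # Present a client certificate and its private key (PEM files).
        args += ["-cert", cert, "-key", key]
    return args


if __name__ == "__main__":
    # Hypothetical endpoint; replace with your own DataPower service address.
    cmd = s_client_command("datapower.example.com", 8443, protocol="tls1.0")
    print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run, or printed and pasted into a shell.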

The Red Herring Error

Before we begin, we need to point out this error message that will almost always occur in the DataPower logs when SSL goes wrong:

A junior DataPower administrator will home in on this message and proudly declare that the client was misconfigured because it did not send a certificate. This sends the client team scurrying to their server to stare at their certificate. While the DataPower admin is technically correct that a certificate never arrived, it is more likely that the handshake failed long before the point where certificate presentation occurs.

It would be nice if the DataPower development team stopped logging this message, or at least changed the text for accuracy. It’s written too authoritatively. It’s almost never the root cause and it wastes critical debugging time. This waste gets much worse when the client is an external vendor and inexperienced in SSL troubleshooting.

In all of our examples, you will see this message written even though it had nothing to do with the certificate itself. Be aware.

Client Sends HTTP Request

In this scenario, the client forgot that we’re an HTTPS endpoint and hit it with plain ol’ HTTP. In the old firmware, you would see an error about writing a descriptor(8). Today, you clearly see that the SSL library identified the client request as HTTP. This error is common when network health checkers are misconfigured and send cleartext probes instead of using a half-open TCP check to determine service availability. Such health checks should be corrected so that these extraneous errors stop appearing in the log.
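To reproduce this case without a misbehaving health checker, you can push a cleartext request at the TLS port yourself. The sketch below assumes a placeholder host and port of your choosing; what comes back (if anything) depends on the server, so the function simply returns the raw bytes.

```python
import socket


def plain_http_request(host):
    """Return the bytes of a minimal cleartext HTTP/1.1 request."""
    return (
        f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")


def send_plain_http(host, port, timeout=5.0):
    """Send cleartext HTTP to a TLS port and return whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(plain_http_request(host))
        try:
            return sock.recv(4096)
        except (socket.timeout, ConnectionResetError):
            # The server may drop the connection without replying.
            return b""
```

Calling send_plain_http against your HTTPS front side should produce the log entry described in this section.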

TLS 1.0 when only 1.1/1.2 allowed

In this scenario, the client attempted a TLS 1.0 connection but DataPower is configured to accept only TLS 1.1 or TLS 1.2 connections. This is common when integrating with legacy environments that may not support the latest TLS protocols. DataPower writes that an unknown protocol was encountered.
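The version mismatch scenarios can also be driven from Python's standard ssl module by pinning the client to a single protocol version. Treat this as a lab-only sketch: certificate checks are switched off on purpose so that the protocol version is the only variable, the host and port are whatever test endpoint you supply, and some current OpenSSL builds refuse TLS 1.0 and 1.1 by policy, in which case the failure happens on the client before DataPower ever sees it.

```python
import socket
import ssl

VERSIONS = {
    "1.0": ssl.TLSVersion.TLSv1,
    "1.1": ssl.TLSVersion.TLSv1_1,
    "1.2": ssl.TLSVersion.TLSv1_2,
}


def pinned_context(version):
    """Create a client context that offers exactly one TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Lab use only: skip server certificate checks so that the protocol
    # version is the only thing being tested.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = VERSIONS[version]
    ctx.maximum_version = VERSIONS[version]
    return ctx


def try_handshake(host, port, version, timeout=5.0):
    """Return the negotiated protocol name, or the error text on failure."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except (ssl.SSLError, OSError) as exc:
        return f"handshake failed: {exc}"
```

Running try_handshake once per entry in VERSIONS gives a quick picture of which protocol levels the endpoint accepts.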

TLS 1.2 when only 1.1 allowed

In this scenario, the client sends a TLS 1.2 connection but DataPower is configured to accept only TLS 1.1. This error message occurs when the client requests a higher TLS version than the server supports. For completeness, the same error occurs when the client sends TLS 1.1 but DataPower accepts only 1.0.

SSLv3 when only TLS 1.0+ allowed

In this scenario, the client sends an SSLv3 connection but DataPower is configured to accept only TLS 1.0, 1.1 and 1.2. The same error message occurs if the client sends SSLv2. SSLv3 and SSLv2 are ancient versions, and any occurrence of this error in the DataPower environment should be forwarded to the enterprise security team to confirm that it’s still acceptable for the client to be so far back-level.

Cipher mismatch

In this scenario, the client’s allowable cipher list and the DataPower allowable cipher list have no entry in common. This should be an extremely rare scenario, and the resolution will require the two administrators to talk, as it’s difficult to know from the DataPower side which ciphers the client supports.
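Once both administrators have their lists in hand, finding the overlap is mechanical. The sketch below uses Python's ssl module to expand an OpenSSL cipher string into the names a client would offer, then intersects two lists. It assumes both sides express their ciphers in the same OpenSSL naming convention, which you should confirm before trusting the result.

```python
import ssl


def offered_ciphers(cipher_string):
    """List the cipher names a client context offers for a cipher string."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Raises ssl.SSLError if the string selects no cipher at all.
    ctx.set_ciphers(cipher_string)
    return [entry["name"] for entry in ctx.get_ciphers()]


def shared_ciphers(client_names, server_names):
    """Return the cipher names present in both lists, in client order."""
    allowed = set(server_names)
    return [name for name in client_names if name in allowed]
```

An empty result from shared_ciphers corresponds to the mismatch described in this section. Note that get_ciphers may also report TLS 1.3 suites, which a plain cipher string does not control.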

No Client certificate

In this scenario, the client legitimately did not present a certificate to the DataPower endpoint. We can tell that this is the root cause because the SSL library error itself confirms that no certificate was received.

Expired certificate

In this scenario, the client has presented an expired certificate to DataPower. This happens more often than it should: certificate management is a critical component of a secure infrastructure, yet it rarely gets enough attention in the enterprise.
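An expiry date is easy to check ahead of time. As a minimal sketch, the function below takes the notAfter value in the text form printed by "openssl x509 -enddate" (the part after "notAfter=") and compares it with the current time. The sample date used in the usage note is illustrative only.

```python
import ssl
import time


def is_expired(not_after, now=None):
    """Return True when a certificate's notAfter time lies in the past.

    not_after uses the text format printed by "openssl x509 -enddate",
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return expires_at < now
```

For example, is_expired("Jan  5 09:34:43 2018 GMT") reports whether a certificate with that end date has already lapsed. The optional now argument (seconds since the epoch) makes it possible to ask whether a certificate will have expired by some future date.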

DataPower unable to verify client certificate

In this scenario, DataPower was unable to validate the certificate presented by the client. The validation settings are stored in the Validation Credentials associated with the SSL Server Profile. Unfortunately, this error occurs for a broad range of verification failures, such as:

  • The client certificate is not in the certificate list when the certificate validation mode is ‘Exact Certificate’.
  • An intermediate signer chain certificate is missing when the certificate validation mode is ‘Full Certificate Chain Checking (PKIX)’ or ‘Match Exact Certificate or Immediate Issuer’.
  • The client certificate is self-signed and is not in the certificate list.

Debugging this error will require a copy of the client certificate, either to inspect the signer chain or to add the certificate to the validation credentials.

Client Certificate Validated

In this scenario, the client certificate was successfully validated by DataPower. The log message includes the DN and the validation credential used. This message will only be seen in the logs when the log target is at the ‘info’ level for the ‘crypto’ object.

Client cannot verify DataPower Server certificate

In this scenario, it’s the client that was unable to verify the certificate presented by DataPower via the identification credentials. Certificate verification is a two-way street: if the client cannot verify the server certificate, it closes the connection and this error occurs. In previous firmware levels, you would just see the connection close. The rule of thumb was that the side that closed the connection would also be the one that wrote the error in its local logs. Luckily, in the new firmware, we get a hint that the client didn’t like our cert.
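You can see the client's side of this failure by connecting with full verification switched on. The following sketch uses Python's default TLS settings, which check both the signer chain and the host name. The host, port and optional CA file are placeholders for your own environment.

```python
import socket
import ssl


def check_server_certificate(host, port, cafile=None, timeout=5.0):
    """Handshake with full verification and report why it failed, if it did."""
    # create_default_context() verifies the chain and the host name.
    # cafile optionally names the CA bundle the client is meant to trust.
    ctx = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return "verified"
    except ssl.SSLCertVerificationError as exc:
        return f"certificate rejected: {exc.verify_message}"
    except (ssl.SSLError, OSError) as exc:
        return f"connection failed: {exc}"
```

A "certificate rejected" result here is the client-side counterpart of the error that DataPower logs when the client walks away from the handshake.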

Testing

At the practical-datapower GitHub page, you can download the OSC_PD_SSLError domain and install it into your own device. This will save you the time of creating the keys and certs to test the various scenarios. It contains an XML Firewall configured with SSL with all of the certs and keys uploaded to the domain.

The client certificates and corresponding private keys can be downloaded as well.

Conclusion

In this article, we ran real-world scenarios covering the common reasons that TLS handshakes fail. The SSL library errors logged by the latest firmware levels are a great improvement and should be the first place to look to determine why a connection failed. If you are stuck on a lower firmware level, then the SSL library errors won’t be seen and you will have to inspect the config manually.

About the Author

Dan Zrobok

Dan is the owner of Orange Specs Consulting, with over 14 years of experience working in enterprise systems integration. He is an advocate of the IBM DataPower Gateway platform and looks to improve environments that have embraced it. He also occasionally fights dragons with his three-year-old daughter Ruby, and newborn Clementine.