Challenges and Misconceptions of Certificate Revocation in PKI
By Qamar Peer Bellary Sadiq, CISSP, CCSP
Public Key Infrastructure (PKI) is the most commonly used technology in the security space for establishing authentication, data integrity and non-repudiation, and for securing email and SSL/TLS connections with X.509 certificates (also known as digital certificates). A digital certificate is a form of identity document for the digital world and helps identify users, entities and servers.
PKI is an amalgamation of protocols, people, processes and technologies that must work in a synchronized manner to create, store, distribute, manage and revoke digital identities. However, there are real-world challenges, pitfalls and misconceptions around certificate status validation in PKI that need to be highlighted.
Misconceptions about Certificate Revocation
- Revocation of digital certificates is for expired certificates
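This is the most commonly misunderstood concept. Revocation applies only to certificates that are still within their validity period and have to be invalidated before they expire, for example because the private key was compromised or the certificate is no longer needed. An expired certificate is already rejected on its dates and does not need to be revoked.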
- Revocation of digital certificates is not needed for unused certificates
Many certificate owners assume that “unused certificates” are not worth revoking. However, an unused certificate is still valid and is among the riskiest to be exploited, so certificate owners must revoke it without delay. As a general principle, an exception can be made only for short-lived certificates whose validity is less than 90 days, subject to the organization’s risk appetite and business requirements.
- Revocation of digital certificates is automatic and immediate
Another major misconception about certificate revocation is that it is “automatic” and “immediate.” In fact, it is neither. It is not automatic, because it depends on the certificate owner taking the initiative to request revocation. It is not immediate, because the Certificate Authority (CA) must follow certain mandatory steps to revoke a certificate, which takes procedural time before the CA can publish the new Certificate Revocation List (CRL). Only then can consumers download the CRL again, or refresh their Online Certificate Status Protocol (OCSP) cache, and stop relying on stale status information.
Challenges in Consuming CRL/OCSP for validation
- CRL Downloading & CRL File Size
Downloading a CRL from a central repository has its own challenges, especially as the CRL grows, because larger files take longer to download and process. Some applications or products cannot consume CRL files that exceed a certain number of entries or a certain file size. This constraint drives the CRL publishing design and the housekeeping activities needed to manage or reduce the CRL file size.
- Managing Stale CRLs or Refreshing CRLs
Applications that rely on CRLs may have issues validating certificate status if the CRLs are not refreshed periodically. In a CRL-based revocation check, a certificate is considered good whenever its serial number is not found in the CRL, so an application holding a stale CRL will continue to accept a certificate that has since been revoked.
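The check described above can be sketched as follows. This is a minimal illustration, not a real CRL parser: `CrlSnapshot` and `check_status` are hypothetical names, and the CRL is modelled as an in-memory set of serial numbers together with its Effective Date and Next Update. It shows why absence from the list should only count as "good" while the list is still current.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CrlSnapshot:
    """Hypothetical in-memory model of a downloaded CRL."""
    this_update: datetime        # Effective Date of the CRL
    next_update: datetime        # Next Update of the CRL
    revoked_serials: frozenset   # serial numbers listed as revoked


def check_status(crl: CrlSnapshot, serial: int, now: datetime) -> str:
    """Return 'revoked', 'good', or 'stale' for a certificate serial number."""
    if serial in crl.revoked_serials:
        return "revoked"         # listed serials are treated as revoked
    if now > crl.next_update:
        return "stale"           # absence from an expired CRL proves nothing
    return "good"                # not listed in a CRL that is still current


# Sample data (assumed values for illustration only).
crl = CrlSnapshot(
    this_update=datetime(2024, 1, 1, tzinfo=timezone.utc),
    next_update=datetime(2024, 1, 8, tzinfo=timezone.utc),
    revoked_serials=frozenset({0x1001, 0x1002}),
)
```

Returning a distinct "stale" result, rather than "good", lets the calling application decide whether to refresh the CRL or reject the certificate.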
- Immediate CRL Publish vs. Scheduled CRL Publish
A revoked certificate will not appear in the CRL immediately unless the solution is designed and implemented to publish the CRL after every revocation event. With scheduled publishing, this opens a window in which the certificate can be used for malicious purposes, from the time it is revoked to the time the revocation becomes effective.
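To make that window concrete, the sketch below computes the worst-case delay between a revocation event and the moment a relying application sees it under scheduled publishing. The function name and the sample values (a 24-hour publishing schedule, a 6-hour client refresh interval) are illustrative assumptions, not figures from any particular CA.

```python
from datetime import datetime, timedelta, timezone


def exposure_window(revoked_at, last_publish, publish_interval, client_refresh):
    """Worst-case time a revoked certificate is still seen as good.

    Assumes the CA publishes on a fixed schedule and the client re-downloads
    the CRL at a fixed interval; both are simplifications.
    """
    # Find the first scheduled publish at or after the revocation event.
    next_publish = last_publish
    while next_publish < revoked_at:
        next_publish += publish_interval
    # Worst case: the client refreshed just before the new CRL appeared.
    visible_at = next_publish + client_refresh
    return visible_at - revoked_at


window = exposure_window(
    revoked_at=datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
    last_publish=datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
    publish_interval=timedelta(hours=24),
    client_refresh=timedelta(hours=6),
)
print(window)  # 1 day, 4:00:00
```

With these assumed values, a certificate revoked two hours after a scheduled publish can remain usable for up to 28 hours.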
- Validation Challenges of OCSP Service
If an OCSP solution is based on CRLs, it can have the same issues as explained for stale CRLs: until the OCSP responder is refreshed and reloaded with the latest CRLs, a revoked certificate can still be reported as good. Hence it is paramount that OCSP clients refresh their cached results based on Next Update.
- Availability Challenges of Real Time OCSP Service
If the OCSP solution is real-time, checking every certificate’s status directly (not relying on CRLs), then the challenge of ensuring the OCSP responder’s availability and response times increases manifold. Applications that ignore Next Update, or that do not cache the last response, are bound to flood the network and increase the load on the OCSP responder service.
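One way a client can avoid flooding the responder is to cache each response until its Next Update time. The sketch below is a simplified model: `OcspCache` and the `fetch` callback are hypothetical, the fake responder exists only to demonstrate the behaviour, and a real client would also verify the response signature before caching it.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Tuple

# A fetch callback returns (status, next_update) for a serial number.
Fetch = Callable[[int], Tuple[str, datetime]]


class OcspCache:
    """Hypothetical client-side cache keyed by certificate serial number."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[int, Tuple[str, datetime]] = {}

    def status(self, serial: int, now: datetime) -> str:
        cached = self._entries.get(serial)
        if cached is not None and now < cached[1]:
            return cached[0]                       # still fresh: no network call
        status, next_update = self._fetch(serial)  # missing or expired: ask the responder
        self._entries[serial] = (status, next_update)
        return status


# Stand-in responder that records how often it is queried.
calls: List[int] = []


def fake_responder(serial: int) -> Tuple[str, datetime]:
    calls.append(serial)
    return "good", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


cache = OcspCache(fake_responder)
```

Repeated lookups for the same serial number before Next Update are answered from the cache, so the responder is contacted only once per validity period of the response.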
Correct Implementation of Certificate Revocation
- Understand the business requirements clearly and perform a thorough risk analysis before deploying a particular certificate status validation solution (CRL-based vs. real-time)
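- Understand the differences between CRL and OCSP correctly: a CRL is a kind of blacklist of revoked serial numbers, while OCSP answers a status query for an individual certificate (good, revoked or unknown) and can behave like a whitelist when the responder checks against the CA’s record of issued certificates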
- Know how to calculate the CRL validity: the CRL file has the attributes Effective Date (thisUpdate in the X.509 CRL) and Next Update (nextUpdate). The difference between the Next Update and the Effective Date is the validity of the CRL file
- Ensure new CRLs are published before the Next Update time, so that consuming applications have sufficient time to pick up the new CRL while the existing one is about to expire
- Have both CRL and OCSP URIs incorporated in the certificate, as OCSP should be the preferred method to check the certificate status and CRL should be the fallback mechanism
- Ensure CRL and OCSP servers are designed with High Availability in mind as the revocation providers are the most critical piece of a PKI
- Perform periodic housekeeping to keep the CRL size in check
- Recommend that application owners implement caching (see RFC 5019, Section 6) and ensure timely refresh of CRLs
- Recommend that application owners keep CRL file download as a backup option instead of relying only on the OCSP service
- OCSP clients must have an accurate source of time, or use nonces if the OCSP responder supports them, to ensure that the OCSP responses they receive are sufficiently fresh
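The validity calculation and the early-publication advice in the list above can be sketched like this. The function names and the sample overlap of one day are assumptions for illustration; the appropriate overlap depends on how often the consuming applications refresh.

```python
from datetime import datetime, timedelta, timezone


def crl_validity(effective_date: datetime, next_update: datetime) -> timedelta:
    """Validity of a CRL: Next Update minus Effective Date."""
    return next_update - effective_date


def latest_publish_time(next_update: datetime, overlap: timedelta) -> datetime:
    """Latest moment to publish the replacement CRL, leaving `overlap`
    for consumers to pick it up before the current CRL expires."""
    return next_update - overlap


# Sample dates (assumed values for illustration only).
effective = datetime(2024, 1, 1, tzinfo=timezone.utc)
next_upd = datetime(2024, 1, 8, tzinfo=timezone.utc)

print(crl_validity(effective, next_upd))                 # 7 days, 0:00:00
print(latest_publish_time(next_upd, timedelta(days=1)))  # 2024-01-07 00:00:00+00:00
```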
Summary
Each revocation method has its pros and cons, but CRL and OCSP are the most widely used methods today. OCSP is the preferred method and CRL is the fallback. When both methods are incorporated into the certificates and used judiciously by the consuming applications, they go a long way toward ensuring that revocation status is available to those applications when needed.
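On the consuming side, the preference order of OCSP first and CRL second could look roughly like the following. `check_ocsp` and `check_crl` are hypothetical callables standing in for whatever OCSP and CRL clients an application actually uses; the sketch shows only the fallback logic.

```python
from typing import Callable


def revocation_status(
    serial: int,
    check_ocsp: Callable[[int], str],
    check_crl: Callable[[int], str],
) -> str:
    """Try OCSP first; fall back to the CRL if the responder is unreachable."""
    try:
        return check_ocsp(serial)
    except (ConnectionError, TimeoutError):
        # Responder unavailable: consult the CRL rather than assuming "good".
        return check_crl(serial)
```

The point of the fallback is that an unreachable responder leads to a second check and never to silently accepting the certificate.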