x509 vs Precertificate CT Entries: What You Can Decode and How to Normalize
A common way for CT ingestion pipelines to break is treating precerts like normal x509 certificates.
CT leaf types
Each CT entry has a Merkle tree leaf. The leaf indicates an entry type:
- x509: leaf contains a DER-encoded X.509 certificate
- precert: leaf contains issuer key hash + TBSCertificate bytes for a precertificate
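As a rough illustration of the two leaf shapes, here is a minimal sketch that splits a raw leaf into its entry type and payload bytes. It assumes the MerkleTreeLeaf layout from RFC 6962 (version, leaf type, timestamp, entry type, then a length-prefixed payload). The function name `parse_merkle_tree_leaf` and the returned dictionary keys are illustrative, not taken from any particular pipeline.

```python
import struct

X509_ENTRY = 0      # LogEntryType x509_entry in RFC 6962
PRECERT_ENTRY = 1   # LogEntryType precert_entry in RFC 6962


def parse_merkle_tree_leaf(leaf: bytes) -> dict:
    """Split a raw MerkleTreeLeaf into entry type and payload bytes (hypothetical helper)."""
    # Header: version (1 byte), leaf_type (1), timestamp in ms (8), entry_type (2).
    version, leaf_type, timestamp_ms, entry_type = struct.unpack_from(">BBQH", leaf, 0)
    if version != 0 or leaf_type != 0:
        raise ValueError("unsupported leaf version or leaf type")
    offset = 12
    record = {"timestamp_ms": timestamp_ms}

    if entry_type == X509_ENTRY:
        record["entry_type"] = "x509"
    elif entry_type == PRECERT_ENTRY:
        record["entry_type"] = "precert"
        # Precert leaves carry a 32-byte issuer key hash before the TBSCertificate.
        issuer_key_hash = leaf[offset:offset + 32]
        if len(issuer_key_hash) != 32:
            raise ValueError("truncated issuer_key_hash")
        record["issuer_key_hash"] = issuer_key_hash
        offset += 32
    else:
        raise ValueError(f"unknown entry type {entry_type}")

    # Both payloads are prefixed with a 3-byte big-endian length.
    length = int.from_bytes(leaf[offset:offset + 3], "big")
    offset += 3
    payload = leaf[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated leaf payload")
    # For x509 this is a full certificate; for precert it is a TBSCertificate.
    record["der"] = payload
    return record
```

The sketch ignores the trailing extensions field and does not interpret the DER at all; it only tells you which decoding path the payload belongs on.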
x509 entries: decode the leaf cert, optionally decode chain
For x509 entries, you can decode the leaf DER normally (subject, issuer, SAN, validity, key algorithm, etc.).
The extra_data often contains chain certs; decoding them is optional, and a chain cert that fails to decode should not fail the whole entry.
- Decode leaf cert DER into normalized fields
- Store leaf DER b64 if you want verifiability
- Decode chain certs best-effort (non-fatal failures)
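The best-effort chain step might look something like the sketch below. It assumes the x509 extra_data framing from RFC 6962: a 3-byte total length followed by certificates that each carry their own 3-byte length. The certificate decoder is passed in as `decode_cert`, a hypothetical callable (for example a thin wrapper around `cryptography.x509.load_der_x509_certificate`), so the sketch itself needs only the standard library.

```python
def split_cert_chain(extra_data: bytes) -> list[bytes]:
    """Split an x509 entry's extra_data into individual DER blobs."""
    if len(extra_data) < 3:
        raise ValueError("extra_data too short for a chain length prefix")
    total = int.from_bytes(extra_data[0:3], "big")
    body = extra_data[3:3 + total]
    if len(body) != total:
        raise ValueError("truncated chain")
    certs, offset = [], 0
    while offset < len(body):
        if offset + 3 > len(body):
            raise ValueError("truncated certificate length prefix")
        length = int.from_bytes(body[offset:offset + 3], "big")
        offset += 3
        cert = body[offset:offset + length]
        if len(cert) != length:
            raise ValueError("truncated certificate in chain")
        certs.append(cert)
        offset += length
    return certs


def decode_chain_best_effort(chain_ders, decode_cert):
    """Decode each chain cert; record failures instead of raising."""
    decoded = []
    for position, der in enumerate(chain_ders):
        try:
            decoded.append({"position": position, "ok": True, "fields": decode_cert(der)})
        except Exception as exc:  # a bad chain cert must not fail the entry
            decoded.append({"position": position, "ok": False, "error": str(exc)})
    return decoded
```

Keeping the failed positions in the output, rather than silently dropping them, lets downstream consumers see that a chain was present but only partially decoded.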
Precert entries: preserve TBSCertificate bytes and decode what you can
A precert leaf does not contain a normal x509 certificate. It contains:
- issuer_key_hash
- tbs_certificate: DER bytes (TBSCertificate)
Many libraries won’t parse TBSCertificate as a full certificate object, because it is only the to-be-signed portion and lacks the outer signature algorithm and signature. So the safe move is:
- Store issuer_key_hash
- Store tbs_certificate_der_b64
- Decode chain certs from extra_data to get issuer/subject fields when possible
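A minimal sketch of the "preserve the bytes" step follows. The field names `issuer_key_hash` and `tbs_certificate_der_b64` come from the list above; hex-encoding the hash and the helper names `precert_leaf_fields` and `recover_tbs_der` are assumptions made for illustration.

```python
import base64


def precert_leaf_fields(issuer_key_hash: bytes, tbs_der: bytes) -> dict:
    """Store precert leaf bytes losslessly, without trying to parse them."""
    if len(issuer_key_hash) != 32:
        raise ValueError("issuer_key_hash must be a 32-byte SHA-256 digest")
    return {
        # Hex is an arbitrary choice here; base64 would work equally well.
        "issuer_key_hash": issuer_key_hash.hex(),
        "tbs_certificate_der_b64": base64.b64encode(tbs_der).decode("ascii"),
    }


def recover_tbs_der(fields: dict) -> bytes:
    """Return the exact TBSCertificate bytes that were stored."""
    return base64.b64decode(fields["tbs_certificate_der_b64"])
```

Because the stored value round-trips to the original DER, anyone with a better parser can re-decode the TBSCertificate later without going back to the log.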
Normalization strategy: don’t lie
Your normalized record should remain schema-stable across both entry types. That usually means:
- Top-level common metadata: log, index, CT timestamp, entry_type
- For x509: normalized x509 fields + optional chain
- For precert: issuer key hash + TBSCertificate bytes + optional chain + optional explicitly-labeled leaf guess
Downstream can decide whether to treat precert leaf guesses as usable inventory signals. Your job is to publish facts and be honest about their provenance.
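One way to keep the record shape stable is sketched below. Both entry types produce the same top-level keys, the section that does not apply is set to null, and any leaf guess is wrapped with a provenance label. All key names, including `leaf_guess` and `provenance`, are illustrative assumptions rather than a published schema.

```python
def normalize_entry(log, index, ct_timestamp, entry_type, payload,
                    chain=None, leaf_guess=None):
    """Return a record whose top-level keys are identical for both entry types."""
    if entry_type not in ("x509", "precert"):
        raise ValueError(f"unknown entry_type {entry_type!r}")
    if leaf_guess is not None and entry_type != "precert":
        raise ValueError("leaf_guess only applies to precert entries")

    record = {
        "log": log,
        "index": index,
        "ct_timestamp": ct_timestamp,
        "entry_type": entry_type,
        "x509": None,      # filled only for x509 entries
        "precert": None,   # filled only for precert entries
        "chain": list(chain or []),
    }
    if entry_type == "x509":
        record["x509"] = dict(payload)
    else:
        record["precert"] = {
            **payload,
            # A guess is never mixed into the x509 section; it is labeled as derived.
            "leaf_guess": None if leaf_guess is None else {
                "provenance": "derived_from_tbs_certificate",
                "fields": dict(leaf_guess),
            },
        }
    return record
```

With this shape, a consumer that only trusts fully decoded certificates can read `x509` and ignore `precert` entirely, while one that wants early inventory signals can opt in to `leaf_guess` knowingly.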
Schema enforcement prevents drift
The fastest way to wreck a dataset is “just add a field.” ct-cert-feed enforces schemas by validating each normalized record against a versioned JSON Schema before it is written.
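To show the idea of a validation gate without pulling in a dependency, here is a simplified stand-in. It is not a JSON Schema implementation and not ct-cert-feed's actual code: the `SCHEMAS` table, the `schema_version` field, and `validate_record` are all hypothetical. It only checks required fields and types, and rejects fields the schema version does not know about.

```python
# Hypothetical versioned field tables; a real pipeline would load JSON Schema files.
SCHEMAS = {
    "1": {
        "schema_version": str,
        "log": str,
        "index": int,
        "ct_timestamp": int,
        "entry_type": str,
    },
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record may be written."""
    version = record.get("schema_version")
    fields = SCHEMAS.get(version)
    if fields is None:
        return [f"unknown schema_version {version!r}"]

    errors = []
    for key, expected_type in fields.items():
        if key not in record:
            errors.append(f"missing field: {key}")
        elif not isinstance(record[key], expected_type):
            errors.append(f"wrong type for field: {key}")
    # Rejecting unknown keys is what stops "just add a field" from slipping through.
    for key in record:
        if key not in fields:
            errors.append(f"unexpected field: {key}")
    return errors
```

The design point is that a new field requires a new schema version, so every change to the dataset's shape is deliberate and visible.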
That is also why publishing manifest.json with SHA-256 hashes matters: it makes daily artifacts verifiable.
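A manifest of this kind could be produced and checked roughly as follows. The manifest layout (an `artifacts` list with `file`, `sha256`, and `bytes` keys) and the function names are assumptions for illustration; the actual manifest.json format may differ.

```python
import hashlib
import json


def build_manifest(artifacts: dict[str, bytes]) -> str:
    """Return manifest JSON listing a SHA-256 digest and size for each artifact."""
    entries = [
        {"file": name, "sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
        for name, data in sorted(artifacts.items())  # sorted for stable output
    ]
    return json.dumps({"artifacts": entries}, indent=2, sort_keys=True)


def verify_artifact(manifest_json: str, name: str, data: bytes) -> bool:
    """Check downloaded bytes against the digest recorded in the manifest."""
    for entry in json.loads(manifest_json)["artifacts"]:
        if entry["file"] == name:
            return entry["sha256"] == hashlib.sha256(data).hexdigest()
    return False  # a file not listed in the manifest is not verifiable
```

A consumer would fetch manifest.json first, then call `verify_artifact` on each downloaded file before loading it.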