PyASN1: Building and Decoding ASN.1 Structures Step-by-Step

Advanced PyASN1 Techniques: Custom Types and Encoding Rules

Abstract: This article explores advanced usage patterns in pyasn1 for defining custom ASN.1 types, implementing bespoke encoding/decoding behavior, and working reliably with different encoding rules (BER, DER, CER). It assumes familiarity with basic ASN.1 concepts and the pyasn1 library’s core classes (e.g., Integer, OctetString, Sequence, Set, ObjectIdentifier) and APIs for encoding/decoding.


Table of contents

  • Introduction: why extend pyasn1
  • Recap: pyasn1 fundamentals
  • Defining custom ASN.1 types
    • Subclassing builtin types
    • Adding constraints and tags
    • Example: fixed-format timestamp type
  • Custom constructed types (Sequences, Sets)
    • Optional and default fields
    • Explicit vs implicit tagging
    • Example: complex sequence with versioning
  • Custom codecs and codec options
    • BER vs DER vs CER differences
    • Forcing definite/indefinite forms
    • Example: encoding large constructed OCTET STRINGs with CER
  • Working with Object Identifiers and open types
    • Dynamic type maps and Any/OpenType handling
    • Example: decoding protocol TLV where payload type is OID-driven
  • Validation, constraints, and smarter error reporting
    • Using ValueSizeConstraint, SingleValueConstraint, ValueRangeConstraint
    • Implementing custom validation logic
  • Practical examples: certificates, CRLs, and custom protocols
    • Parsing unusual X.509 extensions
    • Building compact binary protocols with ASN.1
  • Performance tips and memory considerations
  • Troubleshooting common pitfalls
  • Conclusion and further reading

Introduction: why extend pyasn1

pyasn1 is flexible: it models ASN.1 types as Python classes and supports BER/DER/CER encoding. For many real-world protocols you must extend base types (to enforce specific structure or semantics), control tagging behavior precisely, or tweak encoding rules for compatibility. This article shows how to do that safely and idiomatically.


Recap: pyasn1 fundamentals

At its core pyasn1 represents ASN.1 types as Python classes. Instances of these classes hold values and know how to be encoded/decoded through codec modules in pyasn1.codec (e.g., codec.ber, codec.der, codec.cer). Types are declared by subclassing classes from pyasn1.type.univ and applying constraints and tags via helper modules.

Basic pattern:

  • Define types by subclassing (Sequence, Integer, OctetString, etc.).
  • Use tag.Tag and tag.TagSet to control tagging.
  • Use constraint classes (constraint.ValueSizeConstraint, constraint.ValueRangeConstraint).
  • Use encoder.encode() / decoder.decode() from the selected codec.
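The pattern above can be sketched as a minimal round trip. The Record type and its field names are hypothetical and exist only for illustration; the calls are the standard DER encoder and decoder entry points.

```python
from pyasn1.codec.der import decoder, encoder
from pyasn1.type import namedtype, univ


class Record(univ.Sequence):
    # Hypothetical two-field record used only for illustration.
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('id', univ.Integer()),
        namedtype.NamedType('name', univ.OctetString()),
    )


record = Record()
record['id'] = 7
record['name'] = b'demo'

wire = encoder.encode(record)  # DER bytes

# decode() returns the decoded value plus any unconsumed trailing bytes.
decoded, rest = decoder.decode(wire, asn1Spec=Record())
```

Passing asn1Spec tells the decoder which schema to expect, so the result comes back as a Record with named fields rather than a generic Sequence.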

Defining custom ASN.1 types

There are two common reasons to create custom types:

  1. Constrain representation or permitted values (e.g., restricted string length).
  2. Alter the ASN.1 tag or class used for encoding.

Subclassing builtin types

Example: a fixed-format timestamp encoded as GeneralizedTime, restricted to a single pattern and convertible to a Python datetime:

from datetime import datetime

from pyasn1.error import PyAsn1Error
from pyasn1.type import constraint, useful

class FixedTimestamp(useful.GeneralizedTime):
    # Exactly 15 characters: YYYYMMDDHHMMSS (14 digits) plus a trailing 'Z'.
    subtypeSpec = useful.GeneralizedTime.subtypeSpec + constraint.ValueSizeConstraint(15, 15)
    # The standard GeneralizedTime tag is kept unchanged.

    def toDatetime(self):
        # Parse the stored string into a datetime; raise PyAsn1Error on failure.
        try:
            return datetime.strptime(str(self), "%Y%m%d%H%M%SZ")
        except ValueError as e:
            raise PyAsn1Error(f"Invalid FixedTimestamp: {e}")

Notes:

  • Use subtypeSpec to add constraints.
  • prettyIn(value) normalizes a value on input, so it is not the place for a read accessor. Use a custom method such as toDatetime, or the asDateTime property that pyasn1's time types already provide, to present Python-native representations.

Adding constraints and tags

To change the tag (for example, to make an INTEGER encoded with application class tag 3):

from pyasn1.type import tag, univ

class AppSpecificInt(univ.Integer):
    # Wrap the universal INTEGER tag in an explicit [APPLICATION 3] tag.
    tagSet = univ.Integer.tagSet.tagExplicitly(
        tag.Tag(tag.tagClassApplication, tag.tagFormatSimple, 3)
    )

Use tagImplicitly when replacing the existing tag rather than wrapping it in an explicit tag.
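The difference is easiest to see in the encoded bytes. The following sketch defines two hypothetical Integer subclasses that differ only in tagging mode and encodes the value 5 with each.

```python
from pyasn1.codec.der import encoder
from pyasn1.type import tag, univ

APP_3 = tag.Tag(tag.tagClassApplication, tag.tagFormatSimple, 3)


class ExplicitInt(univ.Integer):
    # [APPLICATION 3] EXPLICIT INTEGER: the new tag wraps the universal one.
    tagSet = univ.Integer.tagSet.tagExplicitly(APP_3)


class ImplicitInt(univ.Integer):
    # [APPLICATION 3] IMPLICIT INTEGER: the new tag replaces the universal one.
    tagSet = univ.Integer.tagSet.tagImplicitly(APP_3)


explicit_bytes = encoder.encode(ExplicitInt(5))
implicit_bytes = encoder.encode(ImplicitInt(5))

print(explicit_bytes.hex())  # 6303020105
print(implicit_bytes.hex())  # 430105
```

The explicit form carries the outer tag (0x63), a length, and the full inner INTEGER encoding (02 01 05); the implicit form keeps only the new tag (0x43) and saves two bytes.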


Custom constructed types (Sequences, Sets)

A Sequence is defined by components and their position, optionality, and default values.

Example: versioned message:

from pyasn1.type import char, constraint, namedtype, namedval, univ

class Metadata(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('version', univ.Integer(
            namedValues=namedval.NamedValues(('v1', 1), ('v2', 2)))),
        # UTF8String lives in pyasn1.type.char, not univ.
        namedtype.OptionalNamedType('comment', char.UTF8String().subtype(
            subtypeSpec=constraint.ValueSizeConstraint(0, 256))),
    )

class Payload(univ.OctetString):
    pass

class VersionedMessage(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('meta', Metadata()),
        namedtype.NamedType('payload', Payload()),
    )

Optional and default fields

  • Use OptionalNamedType for optional.
  • Use DefaultedNamedType to define defaults; pyasn1 omits a component that equals its default when encoding, as DER requires.
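A small sketch of how defaulted components behave on the wire. The Settings type and its fields are hypothetical; the point is that a value equal to the declared default produces the same encoding as leaving the field unset.

```python
from pyasn1.codec.der import encoder
from pyasn1.type import namedtype, univ


class Settings(univ.Sequence):
    # Hypothetical type: 'retries' defaults to 0, 'label' is optional.
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('id', univ.Integer()),
        namedtype.DefaultedNamedType('retries', univ.Integer(0)),
        namedtype.OptionalNamedType('label', univ.OctetString()),
    )


unset_default = Settings()
unset_default['id'] = 1

explicit_default = Settings()
explicit_default['id'] = 1
explicit_default['retries'] = 0   # equals the default, so it is not encoded

non_default = Settings()
non_default['id'] = 1
non_default['retries'] = 3        # differs from the default, so it is encoded

same = encoder.encode(unset_default) == encoder.encode(explicit_default)
longer = len(encoder.encode(non_default)) > len(encoder.encode(unset_default))
```

Both same and longer should be True here, which is the behaviour a signature check over DER data relies on.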

Explicit vs implicit tagging

  • Explicit tagging adds an extra tag wrapper; implicit replaces the existing tag.
  • Common API: .tagExplicitly(tag.Tag(…)) and .tagImplicitly(…).

Example: give ‘payload’ an explicit context-specific tag 0:

from pyasn1.type import tag

# Payload is the OctetString subclass defined in the versioned message example.
payload_tagged = Payload().subtype(
    explicitTag=tag.Tag(tag.tagClassContext, tag.tagFormatConstructed, 0)
)

Custom codecs and codec options

pyasn1 provides separate codec modules: pyasn1.codec.ber, pyasn1.codec.der, and pyasn1.codec.cer. Import encoder and decoder from the module for the chosen rule set, e.g. pyasn1.codec.der.encoder and pyasn1.codec.der.decoder.

BER vs DER vs CER differences

  • BER: flexible; allows definite or indefinite lengths and several valid encodings of the same value.
  • DER: canonical subset of BER; definite lengths only, primitive form for string types, and sorted SET / SET OF components.
  • CER: canonical subset of BER aimed at large values; constructed types use indefinite lengths, and strings longer than 1000 octets are split into 1000-octet fragments.
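A concrete case where the rule sets disagree is BOOLEAN TRUE: BER permits any non-zero content octet, while DER requires 0xFF. The sketch below shows what pyasn1's BER and DER encoders emit for the same value; the BER output is a choice made by pyasn1, not something BER mandates.

```python
from pyasn1.codec.ber import decoder as ber_decoder
from pyasn1.codec.ber import encoder as ber_encoder
from pyasn1.codec.der import encoder as der_encoder
from pyasn1.type import univ

flag = univ.Boolean(True)

ber_bytes = ber_encoder.encode(flag)
der_bytes = der_encoder.encode(flag)

print(ber_bytes.hex())  # 010101
print(der_bytes.hex())  # 0101ff

# A BER decoder accepts both spellings of TRUE.
from_ber, _ = ber_decoder.decode(ber_bytes)
from_der, _ = ber_decoder.decode(der_bytes)
```

Because the two byte strings differ, a signature computed over one will not verify against the other, even though both decode to the same value.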

Forcing definite/indefinite forms

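Encoder functions accept keyword options. With the BER encoder, defMode=False requests indefinite-length encoding for constructed values, and maxChunkSize=N splits string values into fragments of at most N octets (details can vary by pyasn1 version). The DER and CER encoders fix these settings as their rules require.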
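As a rough illustration of those options, the sketch below encodes one OCTET STRING three ways with the BER encoder and decodes each result back. It assumes a pyasn1 release in which the BER encoder honours the defMode and maxChunkSize keyword options.

```python
from pyasn1.codec.ber import decoder, encoder
from pyasn1.type import univ

data = univ.OctetString(b'abcdefgh')

# Primitive form with a definite length (the default).
definite = encoder.encode(data)

# Constructed form: 4-octet fragments under one constructed tag.
chunked = encoder.encode(data, maxChunkSize=4)

# Constructed form with an indefinite length, closed by end-of-contents octets.
indefinite = encoder.encode(data, defMode=False, maxChunkSize=4)

# All three encodings decode to the same value under BER.
decoded = [
    decoder.decode(wire, asn1Spec=univ.OctetString())[0].asOctets()
    for wire in (definite, chunked, indefinite)
]
```

Only the first form is valid DER, so these options are mainly useful for interoperating with peers that expect a particular BER shape.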

Example: encoding large OCTET STRING with CER chunking

from pyasn1.type import univ
from pyasn1.codec.cer import encoder

large = univ.OctetString(b'A' * 20000)
# The CER encoder splits long OCTET STRINGs into fragments per CER rules.
encoded = encoder.encode(large)

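If you need explicit control over the fragment size, or need to choose between constructed and primitive form, use the BER encoder with its maxChunkSize option. A constructed OCTET STRING is a series of OCTET STRING fragments under one constructed tag, not a SequenceOf.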


Working with Object Identifiers and open types

ObjectIdentifier values often dictate how an open type (ANY) field should be decoded.

Dynamic type maps and Any/OpenType handling

pyasn1 supports decoding open types by supplying a map from OID to ASN.1 type. This pattern is common when parsing X.509 extensions:

from pyasn1.codec.der import decoder
from pyasn1.type import univ
from pyasn1.type import opentype

# Example map: OID -> ASN.1 type object
ext_map = {
    univ.ObjectIdentifier('1.2.840.113549.1.9.14'): univ.Sequence(),  # example only
}

# Declare the open-type field with opentype.OpenType in the container's
# componentType, then pass decodeOpenTypes=True (and optionally
# openTypes=ext_map) to decoder.decode().

Example: decoding TLV where payload type depends on OID

  • First decode an outer Sequence to get the OID.
  • Lookup OID in mapping; then call decoder.decode(payload_bytes, asn1Spec=mapped_type).
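The two steps above can be sketched as follows. The Envelope type, the PAYLOAD_TYPES map, and the OIDs under the 1.3.6.1.4.1.99999 arc are all hypothetical; the payload is modelled as ANY so the first decode leaves it as raw bytes.

```python
from pyasn1.codec.der import decoder, encoder
from pyasn1.type import char, namedtype, univ


class Envelope(univ.Sequence):
    # Hypothetical TLV-style container: an OID followed by an opaque payload.
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('type', univ.ObjectIdentifier()),
        namedtype.NamedType('payload', univ.Any()),
    )


# Hypothetical OIDs mapped to the ASN.1 type of the payload they announce.
PAYLOAD_TYPES = {
    univ.ObjectIdentifier('1.3.6.1.4.1.99999.1'): univ.Integer(),
    univ.ObjectIdentifier('1.3.6.1.4.1.99999.2'): char.UTF8String(),
}


def decode_envelope(data):
    """Decode the outer Sequence, then the payload selected by its OID."""
    envelope, _ = decoder.decode(data, asn1Spec=Envelope())
    spec = PAYLOAD_TYPES.get(envelope['type'])
    if spec is None:
        raise ValueError('unknown payload OID: ' + envelope['type'].prettyPrint())
    payload, _ = decoder.decode(envelope['payload'].asOctets(), asn1Spec=spec)
    return envelope['type'], payload


# Build a sample message and decode it again.
sample = Envelope()
sample['type'] = univ.ObjectIdentifier('1.3.6.1.4.1.99999.1')
sample['payload'] = encoder.encode(univ.Integer(42))
oid, payload = decode_envelope(encoder.encode(sample))
```

Keeping the lookup in ordinary Python code makes the unknown-OID case explicit, which is often easier to debug than a silently undecoded field.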

Validation, constraints, and smarter error reporting

Use constraint classes to enforce sizes and value ranges:

  • ValueSizeConstraint(min, max)
  • SingleValueConstraint(*allowed_values)
  • ValueRangeConstraint(min, max)

Attach constraints via subtypeSpec or by calling .subtype(subtypeSpec=…).
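A short sketch of constraints in action. Percent and ShortName are hypothetical types, and is_valid is an illustrative helper rather than part of pyasn1; constraint violations surface as a PyAsn1Error subclass when a value is instantiated.

```python
from pyasn1.error import PyAsn1Error
from pyasn1.type import constraint, univ


class Percent(univ.Integer):
    # Hypothetical type: INTEGER (0..100)
    subtypeSpec = univ.Integer.subtypeSpec + constraint.ValueRangeConstraint(0, 100)


class ShortName(univ.OctetString):
    # Hypothetical type: OCTET STRING (SIZE (1..8))
    subtypeSpec = univ.OctetString.subtypeSpec + constraint.ValueSizeConstraint(1, 8)


def is_valid(asn1_type, value):
    """Return True if value satisfies the type's constraints."""
    try:
        asn1_type(value)
        return True
    except PyAsn1Error:
        return False


checks = [
    is_valid(Percent, 50),               # inside the range
    is_valid(Percent, 101),              # above the range
    is_valid(ShortName, b'abc'),         # inside the size limits
    is_valid(ShortName, b'123456789'),   # nine octets, too long
]
print(checks)  # [True, False, True, False]
```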

Custom validation logic

For complex rules (cross-field validation, dependent constraints) validate after full decode by inspecting fields and raising ValueError or PyAsn1Error. Example: ensure timestamp <= current time or version-dependent mandatory fields.
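A minimal sketch of post-decode validation, assuming a hypothetical Notice message in which the schema marks comment as optional but the protocol requires it from version 2 onward. It also assumes a pyasn1 release where an unset optional component exposes isValue as False.

```python
from pyasn1.codec.der import decoder, encoder
from pyasn1.error import PyAsn1Error
from pyasn1.type import char, namedtype, univ


class Notice(univ.Sequence):
    # Hypothetical message: 'comment' is optional in the schema, but this
    # sketch assumes the protocol makes it mandatory from version 2 onward.
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('version', univ.Integer()),
        namedtype.OptionalNamedType('comment', char.UTF8String()),
    )


def decode_notice(data):
    """Decode a Notice, then apply the version-dependent rule."""
    notice, rest = decoder.decode(data, asn1Spec=Notice())
    if rest:
        raise PyAsn1Error('%d trailing bytes after Notice' % len(rest))
    if int(notice['version']) >= 2 and not notice['comment'].isValue:
        raise PyAsn1Error('comment is mandatory for version >= 2')
    return notice
```

Raising PyAsn1Error keeps callers on a single exception type for both schema-level and protocol-level failures; ValueError would work equally well if you prefer to separate the two.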


Practical examples

Parsing X.509 extensions (unusual encodings)

  • Some CAs use non-standard tagging or OCTET STRING wrapping in extensions. Use explicit/implicit tag adjustments and double-decode inner OCTET STRINGs if necessary.

Building compact binary protocols

  • ASN.1’s tagging and definite/indefinite length forms let you craft compact wire formats. Use implicit tags to minimize bytes and DER to ensure canonical form for signatures.

Example: custom protocol message (compact):

from pyasn1.type import namedtype, tag, univ

class MsgHeader(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('msgId', univ.Integer()),
        # 'flags' defaults to a single zero bit and is omitted when unchanged.
        namedtype.DefaultedNamedType('flags', univ.BitString("'0'B")),
    )

class Msg(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('hdr', MsgHeader()),
        # An implicit [0] tag replaces the OCTET STRING tag and saves two bytes.
        namedtype.NamedType('body', univ.OctetString().subtype(
            implicitTag=tag.Tag(tag.tagClassContext, tag.tagFormatSimple, 0))),
    )

Performance tips and memory considerations

  • Reuse type instances for asn1Spec where possible to avoid rebuilding the same schema objects on every call.
  • For large binary data, prefer streaming approaches (chunking with CER or manual chunking) to reduce peak memory.
  • Avoid excessive Python-side conversion (e.g., converting giant octet strings to Python bytes unnecessarily).

Troubleshooting common pitfalls

  • Tagging confusion: check whether a tag is explicit or implicit — mismatch causes decoding failure.
  • Unexpected extra bytes after decode: ensure you examine decoder returns (value, remainder) and that asn1Spec matches structure.
  • DER vs BER mismatches in signatures: use DER for canonical signed data.
  • OpenType decoding: if the OID is missing from the map, the decoder leaves the payload as raw bytes (an Any or OctetString value); decode the inner bytes yourself once the type is known.
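The remainder returned by decode() is also what lets you walk a buffer of back-to-back values. The sketch below uses a hypothetical decode_all helper and reuses one spec instance for every call, in line with the performance tip on asn1Spec reuse.

```python
from pyasn1.codec.der import decoder, encoder
from pyasn1.type import univ

INT_SPEC = univ.Integer()  # built once and reused for every decode call


def decode_all(data, spec=INT_SPEC):
    """Decode back-to-back values, following the remainder each time."""
    values = []
    while data:
        value, data = decoder.decode(data, asn1Spec=spec)
        values.append(value)
    return values


stream = b''.join(encoder.encode(univ.Integer(n)) for n in (1, 2, 300))
numbers = [int(v) for v in decode_all(stream)]
print(numbers)  # [1, 2, 300]
```

If a single message is expected, a non-empty remainder is usually a sign that the spec is too narrow or that the input carries trailing garbage, and is worth treating as an error.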

Conclusion and further reading

pyasn1 is powerful for both protocol parsing and construction when you need precise control over types and encodings. Mastering subtypeSpec, tagging, and codec selection (BER/DER/CER) is essential for robust implementations.

Further reading: pyasn1 API docs, ASN.1 syntax and encoding rules (X.690), and examples from X.509 / PKCS specifications.
