=head1 NAME

Convert::BER::XS - I<very> low level BER en-/decoding

=head1 SYNOPSIS

 use Convert::BER::XS ':all';

 my $ber = ber_decode $buf
    or die "unable to decode SNMP message";

 # The above results in a data structure consisting of
 # (class, tag, constructed, data)
 # tuples. Below is such a message, an SNMPv1 trap
 # with a Cisco mac change notification.
 # Did you know that Cisco is in the news almost
 # every week because of some backdoor password
 # or other extremely stupid security bug?

 [ ASN_UNIVERSAL, ASN_SEQUENCE, 1,
   [
      [ ASN_UNIVERSAL, ASN_INTEGER32, 0, 0 ], # snmp version 1
      [ ASN_UNIVERSAL, ASN_OCTET_STRING, 0, "public" ], # community
      [ ASN_CONTEXT, 4, 1, # CHOICE, constructed - trap PDU
         [
            [ ASN_UNIVERSAL, ASN_OBJECT_IDENTIFIER, 0, "1.3.6.1.4.1.9.9.215.2" ], # enterprise oid
            [ ASN_APPLICATION, SNMP_IPADDRESS, 0, "\x0a\x00\x00\x01" ], # SNMP IpAddress, 10.0.0.1
            [ ASN_UNIVERSAL, ASN_INTEGER32, 0, 6 ], # generic trap
            [ ASN_UNIVERSAL, ASN_INTEGER32, 0, 1 ], # specific trap
            [ ASN_APPLICATION, SNMP_TIMETICKS, 0, 1817903850 ], # SNMP TimeTicks
            [ ASN_UNIVERSAL, ASN_SEQUENCE, 1, # the varbindlist
               [
                  [ ASN_UNIVERSAL, ASN_SEQUENCE, 1, # a single varbind, "key value" pair
                     [
                        [ ASN_UNIVERSAL, ASN_OBJECT_IDENTIFIER, 0, "1.3.6.1.4.1.9.9.215.1.1.8.1.2.1" ],
                        [ ASN_UNIVERSAL, ASN_OCTET_STRING, 0, "...data..." # the value
                        ]
                     ]
                  ],
                  ...

 # let's decode it a bit with some helper functions

 my $msg = ber_is_seq $ber
    or die "SNMP message does not start with a sequence";

 ber_is $msg->[0], ASN_UNIVERSAL, ASN_INTEGER32, 0
    or die "SNMP message does not start with snmp version\n";

 # message is SNMP v1 or v2c?
 if ($msg->[0][BER_DATA] == 0 || $msg->[0][BER_DATA] == 1) {

    # message is v1 trap?
    if (ber_is $msg->[2], ASN_CONTEXT, 4, 1) {
       my $trap = $msg->[2][BER_DATA];

       # check whether trap is a cisco mac notification mac changed message
       if (
          (ber_is_oid $trap->[0], "1.3.6.1.4.1.9.9.215.2") # cmnInterfaceObjects
          and (ber_is_i32 $trap->[2], 6)
          and (ber_is_i32 $trap->[3], 1) # mac changed msg
       ) {
          ... and so on

 # finally, let's encode it again and hope it results in the same bit pattern

 my $buf = ber_encode $ber;

=head1 DESCRIPTION

WARNING: Before release 1.0, the API is not considered stable in any way.

This module implements a I<very> low level BER/DER en-/decoder.

It is tuned for low memory and high speed, while still maintaining some
level of user-friendliness.

=head2 ASN.1/BER/DER/... BASICS

ASN.1 is a strange language that can be used to describe protocols and
data structures. It supports various mappings to JSON and XML, but most
importantly, to various binary encodings such as BER, which is the topic
of this module and is used in SNMP and LDAP, for example.

While ASN.1 defines a schema that is useful to interpret encoded data,
the BER encoding is actually somewhat self-describing: you might not know
whether something is a string or a number or a sequence or something else,
but you can nevertheless decode the overall structure, even if you end up
with just a binary blob for the actual value.

This works because BER values are tagged with a type and a namespace,
and also have a flag that says whether a value consists of subvalues (is
"constructed") or not (is "primitive").

Tags are simple integers, and ASN.1 defines a somewhat weird assortment of
those - for example, you have 32 bit signed integers and 16(!) different
string types, but there is no unsigned32 type. Different applications
work around this in different ways: SNMP, for example, defines
application-specific Gauge32, Counter32 and Unsigned32, which are mapped
to just two different tags, so you can distinguish between Counter32 and
the others, but not between Gauge32 and Unsigned32, without the ASN.1
schema.

Ugh.
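
To make the tag sharing concrete, here is a small pure-perl lookup table
of the SNMP application tags as assigned by the SMIv2 specification (RFC
2578). The table and the C<candidate_types> helper are illustrations only
and are not part of this module.

   ```perl
   use strict;
   use warnings;

   # SNMP application-class tag numbers as assigned in RFC 2578.
   my %snmp_types_by_tag = (
      0 => ["IpAddress"],
      1 => ["Counter32"],
      2 => ["Gauge32", "Unsigned32"], # two types, one tag
      3 => ["TimeTicks"],
      4 => ["Opaque"],
      6 => ["Counter64"],
   );

   # List the type names a decoder could report for an application tag.
   sub candidate_types {
      my ($tag) = @_;
      return @{ $snmp_types_by_tag{$tag} || [] };
   }

   printf "tag 2 could be: %s\n", join "/", candidate_types(2);
   ```

Without the schema, a decoder that sees application tag 2 can only say "one
of these", which is why both types end up as the same kind of perl value.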

=head2 DECODED BER REPRESENTATION

This module represents every BER value as a 4-element tuple (actually an
array-reference):

   [CLASS, TAG, CONSTRUCTED, DATA]

To avoid non-descriptive hardcoded array index numbers, this module
defines symbolic constants to access these members: C<BER_CLASS>,
C<BER_TAG>, C<BER_CONSTRUCTED> and C<BER_DATA>.

Also, the first three members are integers with a little caveat: for
performance reasons, these are readonly and shared, so you must not modify
them (increment, assign to them etc.) in any way. You may modify the
I<DATA> member, and you may re-assign the array itself, e.g.:

   $ber = ber_decode $binbuf;

   # the following is NOT legal:
   $ber->[BER_CLASS] = ASN_PRIVATE; # ERROR, CLASS/TAG/CONSTRUCTED are READ ONLY(!)

   # but all of the following are fine:
   $ber->[BER_DATA] = "string";
   $ber->[BER_DATA] = [ASN_UNIVERSAL, ASN_INTEGER32, 0, 123];
   @$ber = (ASN_APPLICATION, SNMP_TIMETICKS, 0, 1000);

I<CLASS> is something like a namespace for I<TAG>s - there is the
C<ASN_UNIVERSAL> namespace which defines tags common to all ASN.1
implementations, the C<ASN_APPLICATION> namespace which defines tags for
specific applications (for example, the SNMP C<Unsigned32> type is in this
namespace), a special-purpose context namespace (C<ASN_CONTEXT>, used e.g.
for C<CHOICE>) and a private namespace (C<ASN_PRIVATE>).

The meaning of the I<TAG> depends on the namespace, and defines a
(partial) interpretation of the data value. For example, the SNMP profile
(see L</PROFILES>) knows that SNMP C<Unsigned32> values in the application
namespace need to be decoded into actual perl integers.

The most common tags in the C<ASN_UNIVERSAL> namespace are
C<ASN_INTEGER32>, C<ASN_BIT_STRING>, C<ASN_NULL>, C<ASN_OCTET_STRING>,
C<ASN_OBJECT_IDENTIFIER>, C<ASN_SEQUENCE>, C<ASN_SET> and
C<ASN_IA5_STRING>.

The most common tags in SNMP's C<ASN_APPLICATION> namespace
are C<SNMP_IPADDRESS>, C<SNMP_COUNTER32>, C<SNMP_UNSIGNED32>,
C<SNMP_TIMETICKS>, C<SNMP_OPAQUE> and C<SNMP_COUNTER64>.

The I<CONSTRUCTED> flag is really just a boolean - if it is false, the
value is "primitive" and contains no subvalues, kind of like a
non-reference perl scalar. If it is true, then the value is "constructed",
which just means it contains a list of subvalues which this module will
en-/decode as BER tuples themselves.
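
On the wire, class, constructed flag and tag usually share a single
identifier octet. The following pure-perl sketch splits such an octet
according to the X.690 bit layout. The name C<split_identifier> is made up
for this example and is not part of this module, and the sketch ignores the
multi-octet form used for tag numbers above 30.

   ```perl
   use strict;
   use warnings;

   # Split a single-octet BER identifier into its three fields.
   # Only low tag numbers (0..30) are handled; a tag field of 31
   # announces the multi-octet form, which this sketch does not decode.
   sub split_identifier {
      my ($octet) = @_;

      my $class       = ($octet >> 6) & 0x03; # 0 universal, 1 application, 2 context, 3 private
      my $constructed = ($octet >> 5) & 0x01; # 1 if the value contains subvalues
      my $tag         =  $octet       & 0x1f; # tag number within the class

      return ($class, $constructed, $tag);
   }

   for my $octet (0x30, 0x02, 0xa4) {
      printf "0x%02x: class=%d constructed=%d tag=%d\n", $octet, split_identifier($octet);
   }
   ```

The C<0xa4> octet corresponds to the trap PDU in the SYNOPSIS example:
context class, constructed, tag 4.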

The I<DATA> value is either a reference to an array of further tuples (if
the value is I<CONSTRUCTED>), some decoded representation of the value,
if this module knows how to decode it (e.g. for the integer types above)
or a binary string with the raw octets if this module doesn't know how to
interpret the namespace/tag.

Thus, you can always decode a BER data structure and at worst you get a
string in place of some nice decoded value.

See the SYNOPSIS for an example of such a decoded tuple representation.
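
Because every value has the same four-element shape, walking a decoded
structure is a short recursive function. The sketch below is pure perl and
does not load this module: it defines its own stand-ins for the index
constants (their values are an assumption derived from the tuple order), and
C<dump_tuple> is a hypothetical helper, not something this module provides.

   ```perl
   use strict;
   use warnings;

   # Stand-ins for the module's index constants; the values are assumed
   # from the [CLASS, TAG, CONSTRUCTED, DATA] order.
   use constant { BER_CLASS => 0, BER_TAG => 1, BER_CONSTRUCTED => 2, BER_DATA => 3 };

   # Render a tuple tree as indented text, one line per value.
   sub dump_tuple {
      my ($tuple, $depth) = @_;
      $depth //= 0;

      my $line = sprintf "%sclass=%d tag=%d", "  " x $depth, @$tuple[BER_CLASS, BER_TAG];

      # primitive: DATA is the (decoded or raw) value itself
      return "$line data=$tuple->[BER_DATA]\n"
         unless $tuple->[BER_CONSTRUCTED];

      # constructed: DATA is a reference to an array of further tuples
      return join "", "$line constructed\n",
         map dump_tuple($_, $depth + 1), @{ $tuple->[BER_DATA] };
   }

   # universal SEQUENCE (tag 16) holding an INTEGER (tag 2) and an OCTET STRING (tag 4)
   my $tree = [0, 16, 1, [ [0, 2, 0, 0], [0, 4, 0, "public"] ]];

   print dump_tuple($tree);
   ```

Note that the function only reads the first three members and never
assigns to them, which keeps it compatible with the read-only caveat above.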

=head2 DECODING AND ENCODING

=over

=item $tuple = ber_decode $bindata

Decodes binary BER data in C<$bindata> and returns the resulting BER
tuple. Croaks on any decoding error, so the returned C<$tuple> is always
valid.

=item $bindata = ber_encode $tuple

Encodes the BER tuple C<$tuple> and returns the resulting binary BER/DER
data.

=back

=head2 HELPER FUNCTIONS

Working with a 4-tuple for every value can be annoying. Or, rather, I<is>
annoying. To reduce this a bit, this module defines a number of helper
functions, both to match BER tuples and to construct BER tuples:

=head3 MATCH HELPERS

These functions accept a BER tuple as first argument and either partially
or fully match it. They often come in two forms, one which exactly matches
a value, and one which only matches the type and returns the value.

They do check whether valid tuples are passed in and croak otherwise. As
an ease-of-use exception, they usually also accept C<undef> instead of a
tuple reference, in which case they silently fail to match.

=over

=item $bool = ber_is $tuple, $class, $tag, $constructed, $data

This takes a BER C<$tuple> and matches its elements against the provided
values, all of which are optional - values that are either missing or
C<undef> will be ignored, the others will be matched exactly (e.g. as if
you used C<==> or C<eq> (for C<$data>)).

Some examples:

   ber_is $tuple, ASN_UNIVERSAL, ASN_SEQUENCE, 1
      or die "tuple is not an ASN SEQUENCE";

   ber_is $tuple, ASN_UNIVERSAL, ASN_NULL
      or die "tuple is not an ASN NULL value";

   ber_is $tuple, ASN_UNIVERSAL, ASN_INTEGER32, 0, 50
      or die "BER integer must be 50";

=item $seq = ber_is_seq $tuple

Returns the sequence members (the array of subvalues) if the C<$tuple> is
an ASN SEQUENCE, i.e. the C<BER_DATA> member. If the C<$tuple> is not a
sequence it returns C<undef>. For example, SNMP version 1/2c/3 packets all
consist of an outer SEQUENCE value:

   my $ber = ber_decode $snmp_data;

   my $snmp = ber_is_seq $ber
      or die "SNMP packet invalid: does not start with SEQUENCE";

   # now we know $snmp is a sequence, so decode the SNMP version

   my $version = ber_is_i32 $snmp->[0]
      or die "SNMP packet invalid: does not start with version number";

=item $bool = ber_is_i32 $tuple, $i32

Returns a true value if the C<$tuple> represents an ASN INTEGER32 with
the value C<$i32>.

=item $i32 = ber_is_i32 $tuple

Returns true (and extracts the integer value) if the C<$tuple> is an ASN
INTEGER32. For C<0>, this function returns a special value that is 0 but
true.

=item $bool = ber_is_oid $tuple, $oid_string

Returns true if the C<$tuple> represents an ASN_OBJECT_IDENTIFIER
that exactly matches C<$oid_string>. Example:

   ber_is_oid $tuple, "1.3.6.1.4"
      or die "oid must be 1.3.6.1.4";

=item $oid = ber_is_oid $tuple

Returns true (and extracts the OID string) if the C<$tuple> is an ASN
OBJECT IDENTIFIER. Otherwise, it returns C<undef>.

=back
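
The two forms and the C<undef> handling can be mimicked in pure perl. The
following is only a sketch of the documented behaviour of C<ber_is_i32>,
not the module's actual (XS) implementation: C<match_i32> is a hypothetical
name, the literal C<0> and C<2> for the universal class and the INTEGER tag
are assumptions taken from the ASN.1 standard rather than from this module's
constants, and the croak on invalid tuples is left out.

   ```perl
   use strict;
   use warnings;

   # Pure-perl stand-in for the documented ber_is_i32 behaviour (hypothetical).
   # Tuples are [CLASS, TAG, CONSTRUCTED, DATA].
   sub match_i32 {
      my ($tuple, $want) = @_;

      return undef unless defined $tuple;   # undef silently fails to match
      return undef unless $tuple->[0] == 0  # universal class (assumed value)
                       && $tuple->[1] == 2  # INTEGER tag (assumed value)
                       && !$tuple->[2];     # primitive

      # exact-match form: plain boolean
      return $tuple->[3] == $want if defined $want;

      # extracting form: zero comes back as a true value that is numerically 0
      return $tuple->[3] || "0 but true";
   }

   my $version = match_i32([0, 2, 0, 0])
      or die "not an integer";

   printf "version %d\n", $version;
   ```

The "0 but true" return value is what makes the C<or die> idiom safe even
for an integer value of zero, such as the SNMP version field of a v1
message.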

=head3 CONSTRUCTION HELPERS

=over

=item $tuple = ber_i32 $value

Constructs a new C<ASN_INTEGER32> tuple.

=back

=head2 RELATIONSHIP TO L<Convert::BER> and L<Convert::ASN1>

This module is I<not> the XS version of L<Convert::BER>, but a different
take on doing the same thing. I imagine this module would be a good base
for speeding up either of these, for writing a similar module, or for
writing your own LDAP or SNMP module, for example.

=cut

package Convert::BER::XS;

use common::sense;

use XSLoader ();
use Exporter qw(import);

our $VERSION;

BEGIN {
   $VERSION = 0.8;
   XSLoader::load __PACKAGE__, $VERSION;
}

our %EXPORT_TAGS = (
   const => [qw(
      BER_CLASS BER_TAG BER_CONSTRUCTED BER_DATA

      ASN_BOOLEAN ASN_INTEGER32 ASN_BIT_STRING ASN_OCTET_STRING ASN_NULL ASN_OBJECT_IDENTIFIER
      ASN_OBJECT_DESCRIPTOR ASN_OID ASN_EXTERNAL ASN_REAL ASN_SEQUENCE ASN_ENUMERATED
      ASN_EMBEDDED_PDV ASN_UTF8_STRING ASN_RELATIVE_OID ASN_SET ASN_NUMERIC_STRING
      ASN_PRINTABLE_STRING ASN_TELETEX_STRING ASN_T61_STRING ASN_VIDEOTEX_STRING ASN_IA5_STRING
      ASN_ASCII_STRING ASN_UTC_TIME ASN_GENERALIZED_TIME ASN_GRAPHIC_STRING ASN_VISIBLE_STRING
      ASN_ISO646_STRING ASN_GENERAL_STRING ASN_UNIVERSAL_STRING ASN_CHARACTER_STRING ASN_BMP_STRING

      ASN_UNIVERSAL ASN_APPLICATION ASN_CONTEXT ASN_PRIVATE

      BER_TYPE_BYTES BER_TYPE_UTF8 BER_TYPE_UCS2 BER_TYPE_UCS4 BER_TYPE_INT
      BER_TYPE_OID BER_TYPE_RELOID BER_TYPE_NULL BER_TYPE_BOOL BER_TYPE_REAL
      BER_TYPE_IPADDRESS BER_TYPE_CROAK
   )],
   const_snmp => [qw(
      SNMP_IPADDRESS SNMP_COUNTER32 SNMP_UNSIGNED32 SNMP_TIMETICKS SNMP_OPAQUE SNMP_COUNTER64
   )],
   decode => [qw(
      ber_decode
      ber_is ber_is_seq ber_is_i32 ber_is_oid
   )],
   encode => [qw(
      ber_encode
      ber_i32
   )],
);

our @EXPORT_OK = map @$_, values %EXPORT_TAGS;

$EXPORT_TAGS{all} = \@EXPORT_OK;

=head1 PROFILES

While any BER data can be correctly encoded and decoded out of the box, it
can be inconvenient to have to manually decode some values into a "better"
format: for instance, SNMP TimeTicks values are by default decoded into
the raw octet strings of their BER representation, which are quite
cumbersome to work with. With profiles, you can change which class/tag
combinations map to which decoder function inside C<ber_decode> (and of
course also which encoder functions are used in C<ber_encode>).

This works by mapping specific class/tag combinations to an internal "ber
type".

The default profile supports the standard ASN.1 types, but no
application-specific ones. This means that class/tag combinations not in
the base set of ASN.1 are decoded into their raw octet strings.

C<Convert::BER::XS> defines two profile variables you can use out of the box:

=over

=item C<$Convert::BER::XS::DEFAULT_PROFILE>

This is the default profile, i.e. the profile that is used when no
profile is specified for de-/encoding.

You can modify it, but remember that this modifies the defaults for all
callers that rely on the default profile.

=item C<$Convert::BER::XS::SNMP_PROFILE>

A profile with mappings for SNMP-specific application tags added. This is
useful when de-/encoding SNMP data.

Example:

   $ber = ber_decode $data, $Convert::BER::XS::SNMP_PROFILE;

=back

=head2 The Convert::BER::XS::Profile class

=over

=item $profile = new Convert::BER::XS::Profile

Create a new profile. The profile will be identical to the default
profile.

=item $profile->set ($class, $tag, $type)

Sets the mapping for the given C<$class>/C<$tag> combination to C<$type>,
which must be one of the C<BER_TYPE_*> constants.

Note that currently, the mapping is stored in a flat array, so large
values of C<$tag> will consume large amounts of memory.

Example:

   $profile = new Convert::BER::XS::Profile;
   $profile->set (ASN_APPLICATION, SNMP_COUNTER32, BER_TYPE_INT);
   $ber = ber_decode $data, $profile;

=item $type = $profile->get ($class, $tag)

Returns the BER type mapped to the given C<$class>/C<$tag> combination.

=back

=head2 BER TYPES

This lists the predefined BER types - you can map any C<CLASS>/C<TAG>
combination to any C<BER_TYPE_*>.

=over

=item C<BER_TYPE_BYTES>

The raw octets of the value. This is the default type for unknown tags and
de-/encodes the value as if it were an octet string, i.e. by copying the
raw bytes.

=item C<BER_TYPE_UTF8>

Like C<BER_TYPE_BYTES>, but decodes the value as if it were a UTF-8 string
(without validation!) and encodes a perl unicode string into a UTF-8 BER
string.

=item C<BER_TYPE_UCS2>

Similar to C<BER_TYPE_UTF8>, but treats the BER value as UCS-2 encoded
string.

=item C<BER_TYPE_UCS4>

Similar to C<BER_TYPE_UTF8>, but treats the BER value as UCS-4 encoded
string.

=item C<BER_TYPE_INT>

Encodes and decodes a BER integer value to a perl integer scalar. This
should correctly handle 64 bit signed and unsigned values.

=item C<BER_TYPE_OID>

Encodes and decodes an OBJECT IDENTIFIER into dotted form without leading
dot, e.g. C<1.3.6.1.213>.

=item C<BER_TYPE_RELOID>

Same as C<BER_TYPE_OID> but uses relative object identifier
encoding: ASN.1 has this hack of encoding the first two OID components
into a single integer in a weird attempt to save an insignificant amount
of space in an otherwise wasteful encoding, and relative OIDs are
basically OIDs without this hack. The practical difference is that the
second component of an OID is restricted to the values 0..39 whenever the
first component is 0 or 1, while relative OIDs do not have this
restriction.

=item C<BER_TYPE_NULL>

Decodes an C<ASN_NULL> value into C<undef>, and always encodes an
C<ASN_NULL> type, regardless of the perl value.

=item C<BER_TYPE_BOOL>

Decodes an C<ASN_BOOLEAN> value into C<0> or C<1>, and encodes a perl
boolean value into an C<ASN_BOOLEAN>.

=item C<BER_TYPE_REAL>

Decodes/encodes a BER real value. NOT IMPLEMENTED.

=item C<BER_TYPE_IPADDRESS>

Decodes/encodes a four byte string into an IPv4 dotted-quad address string
in Perl. Given the obsolete nature of this type, this is a low-effort
implementation that simply uses C<sprintf> and C<sscanf>-style conversion,
so it won't handle all string forms supported by C<inet_aton> for example.

=item C<BER_TYPE_CROAK>

Always croaks when encountered during encoding or decoding - the
default behaviour when encountering an unknown type is to treat it as
C<BER_TYPE_BYTES>. When you don't want that but instead prefer a hard
error for some types, then C<BER_TYPE_CROAK> is for you.

=back
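
The difference between C<BER_TYPE_OID> and C<BER_TYPE_RELOID> is easiest to
see on the content octets. The following pure-perl sketch follows the X.690
encoding rules. The function names are invented for this example, it does
not use this module, and it does no validation of its input.

   ```perl
   use strict;
   use warnings;

   # Encode one component in base 128, high bit set on all but the last octet.
   sub encode_component {
      my ($n) = @_;
      my @octets = $n & 0x7f;
      unshift @octets, ($n & 0x7f) | 0x80 while $n >>= 7;
      return pack "C*", @octets;
   }

   # Content octets of an OBJECT IDENTIFIER: the first two components
   # are folded into a single value, 40 * first + second.
   sub encode_oid {
      my ($first, $second, @rest) = split /\./, $_[0];
      return join "", map encode_component($_), 40 * $first + $second, @rest;
   }

   # Content octets of a RELATIVE-OID: every component is encoded on its own.
   sub encode_reloid {
      return join "", map encode_component($_), split /\./, $_[0];
   }

   printf "oid:    %s\n", unpack "H*", encode_oid("1.3.6.1");
   printf "reloid: %s\n", unpack "H*", encode_reloid("1.3.6.1");
   ```

For the same dotted string, the relative encoding is one octet longer here,
because C<1.3> is not folded into a single octet.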

=cut

our $DEFAULT_PROFILE = new Convert::BER::XS::Profile;
our $SNMP_PROFILE    = new Convert::BER::XS::Profile;

$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_IPADDRESS , BER_TYPE_IPADDRESS);
$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_COUNTER32 , BER_TYPE_INT);
$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_UNSIGNED32, BER_TYPE_INT);
$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_TIMETICKS , BER_TYPE_INT);
$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_OPAQUE    , BER_TYPE_BYTES);
$SNMP_PROFILE->set (ASN_APPLICATION, SNMP_COUNTER64 , BER_TYPE_INT);

$DEFAULT_PROFILE->_set_default;

1;

=head2 LIMITATIONS

This module can only en-/decode 64 bit signed and unsigned integers, and
only when your perl supports those.
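
Whether a given perl has 64 bit integers can be checked with the core
L<Config> module. This is a generic perl check, shown here as a sketch. It
is not something this module exports, and the variable name is only
illustrative.

   ```perl
   use strict;
   use warnings;
   use Config;

   # ivsize is the size of perl's native integer type in bytes.
   my $has_64bit_ints = $Config{ivsize} >= 8;

   my $message = $has_64bit_ints
      ? "this perl has 64 bit integers\n"
      : "this perl only has $Config{ivsize}-byte integers\n";

   print $message;
   ```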

OBJECT IDENTIFIERs cannot have unlimited length, although the limit is
much larger than e.g. the one imposed by SNMP or other protocols.

REAL values are not supported and will croak.

This module has undergone little to no testing so far.

=head2 ITHREADS SUPPORT

This module is unlikely to work when the (officially discouraged) ithreads
are in use.

=head1 AUTHOR

 Marc Lehmann <schmorp@schmorp.de>
 http://software.schmorp.de/pkg/Convert-BER-XS

=cut

