NAME
    Unicode::UTF8 - Encoding and decoding of UTF-8 encoding form.

SYNOPSIS
        use Unicode::UTF8 qw[decode_utf8 encode_utf8];
    
        $string = decode_utf8($octets);
        $octets = encode_utf8($string);

DESCRIPTION
    This module provides functions to encode and decode the UTF-8 encoding
    form as defined by Unicode and ISO/IEC 10646:2011.

FUNCTIONS
  decode_utf8($octets)
    Returns a decoded representation of $octets in UTF-8 encoding as a
    character string.

    Throws an exception if $octets contains an ill-formed UTF-8 sequence or
    an encoded code point which can't be interchanged.
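
    A minimal sketch of decoding with error handling, assuming the
    exception behaviour described above (other releases of the module may
    behave differently). The sample octets are illustrative only.

```perl
use strict;
use warnings;
use Unicode::UTF8 qw[decode_utf8];

# "\xC3\xA9" is the well-formed UTF-8 encoding of U+00E9.
my $string = decode_utf8("\xC3\xA9");
printf "decoded %d character(s), first is U+%04X\n",
    length $string, ord $string;

# "\xC3" on its own is a truncated, ill-formed sequence. Per the
# description above this is expected to throw, so wrap the call in eval.
my $result = eval { decode_utf8("\xC3") };
print "decode failed: $@" unless defined $result;
```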

  encode_utf8($string)
    Returns an encoded representation of $string in UTF-8 encoding as an
    octet string.

    Throws an exception if $string contains a code point which can't be
    interchanged or represented in UTF-8 encoding form.
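
    A minimal sketch of encoding with error handling, assuming the
    exception behaviour described above (other releases of the module may
    behave differently). The sample code points are illustrative only.

```perl
use strict;
use warnings;
use Unicode::UTF8 qw[encode_utf8];

# U+263A needs three octets in UTF-8; print them as hexadecimal.
my $octets = encode_utf8("\x{263A}");
print unpack('H*', $octets), "\n";

# U+D800 is a surrogate code point. Per the description above this is
# expected to throw, so wrap the call in eval.
my $result = eval { encode_utf8(chr 0xD800) };
print "encode failed: $@" unless defined $result;
```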

EXPORTS
    None by default. All functions can be exported using the ":all" tag or
    individually.
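
    For illustration, importing everything through the ":all" tag and
    round-tripping a short string might look like this (the sample string
    is an assumption, not part of the module):

```perl
use strict;
use warnings;
use Unicode::UTF8 qw[:all];

# Both functions are now available without naming them individually.
my $octets = encode_utf8("caf\x{E9}");
my $string = decode_utf8($octets);
print "round trip ok\n" if $string eq "caf\x{E9}";
```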

DIAGNOSTICS
    Can't decode a wide character string
        (F) $octets contains a wide character (a code point above U+00FF),
        so it cannot be an octet string.

    Can't decode ill-formed %s octet sequence <%s> in position %u
        (F) Encountered an ill-formed octet sequence.

    Can't interchange noncharacter code point U+%.4X
        (F) Noncharacters are permanently reserved for internal use and
        should never be interchanged. Noncharacters consist of the values
        U+nFFFE and U+nFFFF (where n is from 0 to 10 hexadecimal, that is
        0x0 to 0x10) and the values U+FDD0..U+FDEF.

    Can't represent surrogate code point U+%.4X in UTF-8 encoding form
        (F) Surrogate code points are designated only for surrogate code
        units in the UTF-16 character encoding form. Surrogates consist of
        code points in the range U+D800 to U+DFFF.

    Can't represent restricted code point U-%.8X in UTF-8 encoding form
        (F) The code point lies in the range U-00110000 to U-7FFFFFFF,
        which is outside the Unicode codespace.

        ISO/IEC 10646 originally defined codespace up to U-7FFFFFFF. This
        was restricted by JTC1/SC2/WG2 Resolution M38.6 (Restriction of
        encoding space) to U-0010FFFF in 2000.

    Can't represent super code point \x{%X} in UTF-8 encoding form
        (F) The code point lies in the range 2^31 to 2^64-1, which is
        Perl's extended codespace.
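
    The code point ranges named in the diagnostics above can be sketched as
    a small pure-Perl classifier. The function name classify_code_point and
    its return values are hypothetical and not part of this module; the
    sketch only restates the ranges listed above and assumes a Perl built
    with 64-bit integers.

```perl
use strict;
use warnings;

# Hypothetical helper: maps a code point to the diagnostic category it
# would fall under, or 'ok' if none of the ranges above apply.
sub classify_code_point {
    my ($cp) = @_;
    return 'super'      if $cp >= 0x80000000;               # 2^31 and above
    return 'restricted' if $cp >= 0x110000;                 # U-00110000..U-7FFFFFFF
    return 'surrogate'  if $cp >= 0xD800 && $cp <= 0xDFFF;  # U+D800..U+DFFF
    return 'noncharacter'
        if ($cp >= 0xFDD0 && $cp <= 0xFDEF)                 # U+FDD0..U+FDEF
        || ($cp & 0xFFFE) == 0xFFFE;                        # U+nFFFE, U+nFFFF
    return 'ok';
}

printf "U+%04X => %s\n", $_, classify_code_point($_)
    for 0x41, 0xD800, 0xFDD0, 0x10FFFF, 0x110000;
```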

SEE ALSO
    Encode

SUPPORT
  Bugs / Feature Requests
    Please report any bugs or feature requests by email to "bug-unicode-utf8
    at rt.cpan.org", or through the web interface at
    <http://rt.cpan.org/Public/Dist/Display.html?Name=Unicode-UTF8>. You
    will be automatically notified of any progress on the request by the
    system.

  Source Code
    This is open source software. The code repository is available for
    public review and contribution under the terms of the license.

    <http://github.com/chansen/p5-unicode-utf8>

        git clone http://github.com/chansen/p5-unicode-utf8

AUTHOR
    Christian Hansen "chansen@cpan.org"

COPYRIGHT
    Copyright 2011 by Christian Hansen.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.

