NAME
    HTML::Microformats - parse microformats in HTML

SYNOPSIS
     use HTML::Microformats;
 
     my $doc = HTML::Microformats
                 ->new_document($html, $uri)
                 ->assume_profile(qw(hCard hCalendar));
     print $doc->json(pretty => 1);
 
     use RDF::TrineShortcuts qw(rdf_query);
     my $results = rdf_query($sparql, $doc->model);

VERSION
    0.00_00

DESCRIPTION
    The HTML::Microformats module is a wrapper for parser and handler
    modules of various individual microformats (each of those modules has a
    name like HTML::Microformats::Foo).

    The general pattern of usage is to create an HTML::Microformats object
    (which corresponds to an HTML document) using the "new_document" method;
    then ask for the data, as a Perl hashref, a JSON string, or an
    RDF::Trine model.

  Constructor
    "$doc = HTML::Microformats->new_document($html, $uri, %opts)"
        Constructs a document object.

        $html is the HTML or XHTML source (string) or an
        XML::LibXML::Document.

        $uri is the document URI, important for resolving relative URL
        references.

        %opts are additional parameters; currently only one option is
        defined: $opts{'type'} may be set to 'text/html' or
        'application/xhtml+xml' to control how $html is parsed.
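
    As a minimal sketch of the constructor: the markup, the example.com
    URI and the variable names below are illustrative assumptions, and
    the profile URI is assumed to be the one the hCard module recognises.
    An XHTML string could be parsed explicitly as XML like this:

     use strict;
     use warnings;
     use HTML::Microformats;

     # Hypothetical XHTML input containing a single hCard.
     my $xhtml = q{<html xmlns="http://www.w3.org/1999/xhtml">
     <head profile="http://microformats.org/profile/hcard">
     <title>Example</title>
     </head>
     <body><p class="vcard"><span class="fn">Alice Example</span></p></body>
     </html>};

     # The document URI is used to resolve relative URL references.
     my $uri = 'http://example.com/people/alice';

     # Ask for the source to be parsed as XHTML rather than as HTML.
     my $doc = HTML::Microformats->new_document(
         $xhtml, $uri, type => 'application/xhtml+xml');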

  Profile Management
    HTML::Microformats uses HTML profiles (i.e. the profile attribute on the
    HTML <head> element) to detect which Microformats are used on a page.
    Any microformats which do not have a profile URI declared will not be
    parsed.

    Because many pages fail to properly declare which profiles they use,
    there are various profile management methods to tell HTML::Microformats
    to assume the presence of particular profile URIs, even if they're
    actually missing.

    "$doc->profiles"
        This method returns a list of profile URIs declared by the document.

    "$doc->has_profile(@profiles)"
        This method returns true if and only if one or more of the profile
        URIs in @profiles is declared by the document.

    "$doc->add_profile(@profiles)"
        Using "add_profile" you can add one or more profile URIs, and they
        are treated as if they were found on the document.

        For example:

         $doc->add_profile('http://microformats.org/profile/rel-tag')

        This is useful for adding profile URIs declared outside the document
        itself (e.g. in HTTP headers).

    "$doc->assume_profile(@microformats)"
        This method acts similarly to "add_profile" but allows you to use
        names of microformats rather than URIs.

        For example:

         $doc->assume_profile(qw(hCard adr geo))

        Microformat names are case sensitive, and must match
        HTML::Microformats::Foo module names.

    "$doc->assume_all_profiles"
        This method is equivalent to calling "assume_profile" for all known
        microformats.
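
    The profile methods might be combined roughly as follows. This is a
    sketch only: the markup and the example.com URI are made up, and the
    page is assumed to use hCard markup without declaring any profile.

     use strict;
     use warnings;
     use HTML::Microformats;

     # Hypothetical page with hCard markup but no profile attribute.
     my $html = '<html><head><title>Example</title></head>'
              . '<body><p class="vcard"><span class="fn">Alice</span></p>'
              . '</body></html>';
     my $doc = HTML::Microformats->new_document(
         $html, 'http://example.com/');

     my $rel_tag = 'http://microformats.org/profile/rel-tag';

     # Treat a profile URI learned elsewhere (e.g. from an HTTP header)
     # as if it had been declared by the document.
     $doc->add_profile($rel_tag) unless $doc->has_profile($rel_tag);

     # Assume a profile by microformat name rather than by URI.
     $doc->assume_profile('hCard');

     # List the profile URIs now associated with the document.
     print "$_\n" for $doc->profiles;

    Using "assume_profile" avoids hard-coding profile URIs in calling
    code; "add_profile" is the better fit when the URI itself is what you
    have in hand.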

  Parsing Microformats
    Generally speaking, you can skip this. The data retrieval methods
    (such as "json" and "model") will automatically do this for you.

    "$doc->parse_microformats"
        Scans through the document, finding microformat objects.

        On subsequent calls, does nothing (as everything is already parsed).

    "$doc->clear_microformats"
        Forgets information gleaned by "parse_microformats" and thus allows
        "parse_microformats" to be run again. This is useful if you've
        added some profiles between runs of "parse_microformats".
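
    A sketch of re-parsing after a profile has been added late (the
    markup and the example.com URI are illustrative assumptions):

     use strict;
     use warnings;
     use HTML::Microformats;

     # Hypothetical page with hCard markup but no declared profile.
     my $html = '<html><head><title>Example</title></head>'
              . '<body><p class="vcard"><span class="fn">Alice</span></p>'
              . '</body></html>';
     my $doc = HTML::Microformats->new_document(
         $html, 'http://example.com/');

     # First pass: only profiles known at this point are considered.
     $doc->parse_microformats;

     # A profile is added after the document has already been parsed.
     $doc->assume_profile('hCard');

     # Discard the earlier results so that the next call really does
     # scan the document again, this time with hCard assumed.
     $doc->clear_microformats;
     $doc->parse_microformats;

    Without the call to "clear_microformats", the second call to
    "parse_microformats" would do nothing.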

  Retrieving Data
    These methods allow you to retrieve the document's data as Perl
    objects, as a JSON string, or as an RDF::Trine model.

    "$doc->objects($format)"
        $format is, for example, 'hCard', 'adr' or 'RelTag'.

        Returns a list of objects of that type. (If called in scalar
        context, returns an arrayref.)

        Each object is, for example, an HTML::Microformats::hCard object,
        or an HTML::Microformats::RelTag object, etc. See the relevant
        documentation for details.

    "$doc->all_objects"
        Returns a hashref of data. Each hashref key is the name of a
        microformat (e.g. 'hCard', 'RelTag', etc), and the values are
        arrayrefs of objects.

        Each object is, for example, an HTML::Microformats::hCard object,
        or an HTML::Microformats::RelTag object, etc. See the relevant
        documentation for details.

    "$doc->json(%opts)"
        Returns data roughly equivalent to the "all_objects" method, but as
        a JSON string.

        %opts is a hash of options, suitable for passing to the JSON
        module's to_json function.

    "$doc->model"
        Returns data as an RDF::Trine::Model, suitable for serialising as
        RDF or running SPARQL queries.

    "$doc->add_to_model($model)"
        Adds data to an existing RDF::Trine::Model.
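
    The retrieval methods above might be used together along these lines.
    This is a sketch rather than a definitive recipe: the markup and the
    example.com URI are made up, and the Turtle output step uses
    RDF::Trine directly rather than anything provided by this module.

     use strict;
     use warnings;
     use HTML::Microformats;
     use RDF::Trine;

     # Hypothetical page containing one hCard.
     my $html = '<html><head><title>Example</title></head>'
              . '<body><p class="vcard"><span class="fn">Alice</span></p>'
              . '</body></html>';
     my $doc = HTML::Microformats
                 ->new_document($html, 'http://example.com/')
                 ->assume_profile('hCard');

     # List context gives a list of objects of the requested type.
     my @cards = $doc->objects('hCard');
     printf "Found %d hCard object(s)\n", scalar @cards;

     # A hashref keyed by microformat name; each value is an arrayref.
     my $all = $doc->all_objects;
     printf "%s: %d\n", $_, scalar @{ $all->{$_} } for sort keys %$all;

     # Options are handed on to the JSON module's to_json function.
     print $doc->json(pretty => 1);

     # Merge the data into an existing model, then write it as Turtle.
     my $model = RDF::Trine::Model->temporary_model;
     $doc->add_to_model($model);
     print RDF::Trine::Serializer->new('turtle')
             ->serialize_model_to_string($model);

    "add_to_model" is convenient when data from several documents should
    end up in one model; "model" is simpler when one document is all you
    need.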

WHY ANOTHER MICROFORMATS MODULE?
    There already exist two microformats packages on CPAN (see
    Text::Microformat and Data::Microformat), so why create another?

    Firstly, HTML::Microformats isn't being created from scratch. It's
    actually a fork/clean-up of a non-CPAN application (Swignition), and in
    that sense predates Text::Microformat (though not Data::Microformat).

    It has a number of other features that distinguish it from the existing
    packages:

    *   It supports more formats.

        Swignition (and eventually HTML::Microformats) supports hCard,
        hCalendar, rel-tag, geo, adr, rel-enclosure, rel-license, hReview,
        hResume, hRecipe, xFolk, XFN and more.

    *   It supports more patterns.

        HTML::Microformats supports the include pattern, abbr pattern, table
        cell header pattern, value excerpting and other intricacies of
        microformat parsing better than the other modules on CPAN.

    *   It offers RDF support.

        One of the key features of HTML::Microformats is that it makes data
        available as RDF::Trine models. This allows your application to
        benefit from a rich, feature-laden Semantic Web toolkit. Data
        gleaned from microformats can be stored in a triple store; output in
        RDF/XML or Turtle; queried using the SPARQL or RDQL query languages;
        and more.

        If you're not comfortable using RDF, HTML::Microformats also makes
        all its data available as native Perl objects.

BUGS
    Please report any bugs to <http://rt.cpan.org/>.

SEE ALSO
    RDF::RDFa::Parser, HTML::HTML5::Microdata::Parser.

    <http://www.perlrdf.org/>.

    Individual microformat modules:

    *   HTML::Microformats::adr

    *   HTML::Microformats::geo

    *   HTML::Microformats::hCard

    *   HTML::Microformats::RelTag

AUTHOR
    Toby Inkster <tobyink@cpan.org>.

COPYRIGHT
    Copyright 2010 Toby Inkster

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

