NAME

    WWW::YaCyBlacklist - a Perl module to parse and execute YaCy blacklists

VERSION

    version 0.4

SYNOPSIS

        use WWW::YaCyBlacklist;
    
        my $ycb = WWW::YaCyBlacklist->new( { 'use_regex' => 1 } );
        $ycb->read_from_array(
            'test1.co/fullpath',
            'test2.co/.*',
        );
        $ycb->read_from_files(
            '/path/to/1.black',
            '/path/to/2.black',
        );
    
        print "Match!" if $ycb->check_url( 'http://test1.co/fullpath' );
        my @urls = (
            'https://www.perlmonks.org/',
            'https://metacpan.org/',
        );
        my @matches = $ycb->find_matches( @urls );
        my @nonmatches = $ycb->find_non_matches( @urls );
    
        $ycb->sortorder( 1 );
        $ycb->sorting( 'alphabetical' );
        $ycb->store_list( '/path/to/new.black' );

METHODS

 new(%options)

 use_regex => 0|1 (default 1)

    Can only be set in the constructor and cannot be changed afterwards.
    If false, patterns whose host part is a regular expression are not
    checked (but they remain in the list).

 filename => '/path/to/file.black' (default ycb.black)

    The file that store_list writes to.

 sortorder => 0|1 (default 0)

    0 ascending, 1 descending. Configures sort_list.

 sorting => 'alphabetical|length|origorder|random|reverse_host' (default
 'origorder')

    Configures sort_list.
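
    As a minimal sketch, the options above can be combined in one
    constructor call. The hash reference form follows the SYNOPSIS; the
    values are only illustrative and the path is a placeholder.

```perl
use strict;
use warnings;
use WWW::YaCyBlacklist;

# All four documented options set explicitly (values are examples only).
my $ycb = WWW::YaCyBlacklist->new( {
    'use_regex' => 0,                      # skip patterns with a regex host part
    'filename'  => '/path/to/file.black',  # placeholder target for store_list
    'sortorder' => 1,                      # descending
    'sorting'   => 'length',
} );
```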

 void read_from_array( @patterns )

    Reads a list of YaCy blacklist patterns.

 void read_from_files( @files )

    Reads a list of YaCy blacklist files.

 int length( )

    Returns the number of patterns in the current list.

 bool check_url( $URL )

    Returns 1 if the URL is matched by any pattern, 0 otherwise.

 @URLS_OUT find_matches( @URLS_IN )

    Returns all URLs that are matched by the current list.

 @URLS_OUT find_non_matches( @URLS_IN )

    Returns all URLs that are not matched by the current list.

 void delete_pattern( $pattern )

    Removes a pattern from the current list.

 @patterns sort_list( )

    Returns the list of patterns, ordered as configured by sorting and
    sortorder.

 void store_list( )

    Writes the current list to a file. Calls sort_list( ).
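
    The following sketch strings several of the methods above together.
    It uses only the calls documented in this file; the patterns and URLs
    are example data.

```perl
use strict;
use warnings;
use WWW::YaCyBlacklist;

my $ycb = WWW::YaCyBlacklist->new( { 'use_regex' => 1 } );
$ycb->read_from_array(
    'test1.co/fullpath',
    'test2.co/.*',
);

# Number of patterns currently held.
print 'Patterns loaded: ', $ycb->length( ), "\n";

# Split a list of URLs into matched and unmatched ones.
my @urls = (
    'http://test1.co/fullpath',
    'https://metacpan.org/',
);
my @matches    = $ycb->find_matches( @urls );
my @nonmatches = $ycb->find_non_matches( @urls );
print "Matched: $_\n"     for @matches;
print "Not matched: $_\n" for @nonmatches;

# Drop one pattern, then list what is left in ascending alphabetical order.
$ycb->delete_pattern( 'test2.co/.*' );
$ycb->sorting( 'alphabetical' );
$ycb->sortorder( 0 );
print "$_\n" for $ycb->sort_list( );
```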

OPERATIONAL NOTES

    The error

        ^* matches null string many times in regex; marked by <-- HERE in m/^^* <-- HERE

    is probably caused by a malformed path part of a pattern in your list
    (a bare * where .* was intended).
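
    One way to track down such a pattern is to scan the list before
    loading it. The sketch below is plain Perl and does not use the
    module; suspicious_patterns is a hypothetical helper, and it assumes
    that the path part is everything after the first slash.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::YaCyBlacklist: returns the patterns
# whose path part (assumed to be everything after the first '/') starts
# with a bare '*'.
sub suspicious_patterns {
    my @patterns = @_;
    return grep {
        my ( undef, $path ) = split m{/}, $_, 2;
        defined $path && $path =~ /^\*/;
    } @patterns;
}

my @patterns = (
    'test1.co/fullpath',
    'test2.co/.*',
    'test3.co/*',      # bare star in the path part
);

print "Check this pattern: $_\n" for suspicious_patterns( @patterns );
```

    With the example data only test3.co/* is reported; replacing its path
    part with .* gives a valid expression.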

BUGS

    YaCy does not currently allow host patterns with two stars.
    WWW::YaCyBlacklist does not check for this but simply executes such
    patterns. This is rather a YaCy bug.

    If there is something you would like to tell me, you can use any of
    these channels:

      * GitHub issue tracker
      <https://github.com/CarlOrff/WWW-YaCyBlacklist/issues>

      * CPAN issue tracker
      <https://rt.cpan.org/Public/Dist/Display.html?WWW-YaCyBlacklist>

      * Project page on my homepage
      <https://ingram-braun.net/erga/the-www-yacyblacklist-module/>

      * Contact form on my homepage
      <https://ingram-braun.net/erga/legal-notice-and-contact/>

SOURCE

      * De:Blacklists <https://wiki.yacy.net/index.php/De:Blacklists>
      (German).

      * Dev:APIlist <https://wiki.yacy.net/index.php/Dev:APIlist>

SEE ALSO

      * YaCy homepage <https://yacy.net/>

      * YaCy community <https://community.searchlab.eu/>

AUTHOR

    Ingram Braun <carlorff1@gmail.com>

COPYRIGHT AND LICENSE

    This software is copyright (c) 2025 by Ingram Braun.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.

