Bio::DB::USeq - An adaptor for parsing USeq data files

INTRODUCTION

Bio::DB::USeq is a BioPerl-style adaptor for reading useq files. Useq files are 
compressed, indexed, binary data files supporting modern bioinformatic datasets, 
including genomic points, scores, and intervals. As such, they can be used as a 
replacement for text Wig and BED file formats. They may be used natively by the 
Integrated Genome Browser (IGB) and DAS/2 servers.

More information about the format can be found at 
http://useq.sourceforge.net/useqArchiveFormat.html. 

Useq files may be generated using tools in the USeq package, available at 
http://useq.sourceforge.net. They may be converted from native Bar files,
text Wig files, text Bed files, and UCSC bigWig and bigBed files.


COMPATIBILITY

The adaptor follows most conventions of other BioPerl-style Bio::DB 
adaptors. Observations or features in the useq file archive are 
returned as SeqFeatureI compatible objects. 

Coordinates consumed and returned by the adaptor are 1-based, consistent 
with BioPerl convention. This is not true of the useq file itself, which 
uses the interbase coordinate system.
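
To illustrate the difference, the following Perl sketch converts between 
the two coordinate systems. The helper names are hypothetical and are not 
part of the Bio::DB::USeq API. The sketch assumes the usual interbase 
convention of a 0-based start and an exclusive stop.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not part of Bio::DB::USeq.
# Assumption: interbase coordinates use a 0-based start and an exclusive stop.

# Convert an interbase interval to a 1-based, inclusive interval.
sub interbase_to_one_based {
    my ($start, $stop) = @_;
    return ($start + 1, $stop);
}

# Convert a 1-based, inclusive interval to an interbase interval.
sub one_based_to_interbase {
    my ($start, $end) = @_;
    return ($start - 1, $end);
}

my ($start, $end) = interbase_to_one_based(99, 200);
print "$start..$end\n";    # prints 100..200
```

Only the start coordinate shifts. The stop is numerically the same in both 
systems, so the interval length is stop minus start in interbase and 
end minus start plus one in 1-based coordinates.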

Unlike wig and bigWig files, useq file archives support stranded data, 
which can make data collection simpler for complex experiments.

GBrowse compatibility is limited. The adaptor supports generic glyphs 
for interval or segmented data and, for graphs, the wiggle_xyplot 
glyph. Data for large queries may be returned in bins, which simplifies 
the returned data and reduces server load. Local autoscale is supported, 
but chromosomal autoscale, genome autoscale, and z-score scaling are not.


LIMITATIONS

This adaptor is read-only. It does not modify or write useq files.

No support for genomic sequence is included. Users who need access to 
genomic sequence should seek an alternative BioPerl adaptor, such as 
Bio::DB::Fasta.

Useq files do not have the concept of type, primary_tag, or source 
attributes, as expected with GFF-based database adaptors. However, 
the adaptor supports special feature types for data access, including 
binned wiggle data.

Currently, Bio::DB::USeq can only parse files that do not contain 
text strings. This is a technical limitation in unpacking the binary Java 
text strings. Hence, useq files derived from Bed files with names cannot be 
parsed, but those derived from wig, bedGraph, simple Bed, or native USeq 
Bar files can be.


INSTALLATION

Bio::DB::USeq requires the installation of BioPerl and Archive::Zip.
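
Before building, it can help to confirm that both prerequisites are 
loadable. The following is a minimal sketch, not part of the distribution. 
The helper name is hypothetical, and Bio::Root::Version is used here only 
as a representative BioPerl module, on the assumption that it is present 
in the installed BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    # Turn Foo::Bar into Foo/Bar.pm so that require accepts it as a path.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Assumption: Bio::Root::Version stands in for "BioPerl is installed".
for my $module (qw(Bio::Root::Version Archive::Zip)) {
    printf "%-20s %s\n", $module,
        module_available($module) ? 'found' : 'MISSING';
}
```

Any module reported as MISSING should be installed, for example from CPAN, 
before running the build commands.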

Install Bio::DB::USeq using the standard incantation:
    
    perl ./Build.PL
    ./Build
    ./Build test
    ./Build install


IMPLEMENTATION

Read the Bio::DB::USeq POD documentation for usage, API details, and 
GBrowse configuration.

For a practical implementation of Bio::DB::USeq, see the collection of 
data analysis scripts in the BioToolBox package, available at 
http://biotoolbox.googlecode.com.




