Bio::ViennaNGS
==============

Bio::ViennaNGS is a collection of Perl modules for Next-Generation
Sequencng (NGS) analysis. It comes with a set of utility programs for
accomplishing routine tasks often required in NGS data processing. The
utilities serve as reference implementation of the routines implemented in
the library and reside within the 'scripts' folder:

bam_split.pl: 
Split (paired-end and single-end) BAM alignment files by strand and compute
statistics. Optionally create BED output, as well as normalized bedGraph
and bigWig files for coverage visualization in genome browsers (see
dependencies on third-patry tools below).

gff2bed.pl:
Convert RefSeq GFF3 annotation files to BED12 format. Individual BED12
files are created for each feature type (CDS/tRNA/rRNA/etc.). Tested with
RefSeq bacterial GFF3 annotation.  

motiffinda.pl:
Find motifs in annotated sequences. The motif can be provied as regex.

DEXSeq_gff2bed.pl: 
Convert flat GFF file produced by DEXSeq's dexseq_prepare_annotation.py to
BED6 for easy UCSC integration.

normalize_multicov.pl:
Compute normalized expression data in TPM/RPKM from (raw) read counts in
bedtools multicov format. TPM reference: Wagner et al, Theory
Biosci. 131(4), pp 281-85 (2012)

splice_site_summary.pl:
Identify and characterize splice junctions from RNA-seq data by
intersecting them with annotated splice junctions.

newUCSCdb.pl:
Create a new genome database for a locally installed instance of the UCSC
genome browser. Based on
http://genomewiki.ucsc.edu/index.php/Building_a_new_genome_database


INSTALLATION

To install this module type the following:

   perl Makefile.PL
   make
   make test
   make install

DEPENDENCIES

This module requires these other modules and libraries:

  Bio::Perl >= 1.006924
  Bio::DB::Sam >= 1.39
  File::Basename
  File::Temp
  IPC::Cmd
  Pod::Usage
  Carp

The following modules are required for the utility programs inside the
'scripts' folder:

  Bio::ViennaNGS::AnnoC
  Bio::ViennaNGS::SpliceJunc
  Bio::ViennaNGS::UCSC

Computation of BigWig files from BAM is accomplished via two calls to
third-party tools internally. genomeCoverageBed from the BEDtools suite
(http://bedtools.readthedocs.org/en/latest/content/tools/genomecov.html) is
used to create a BedGraph coverage file at first place. In a second step,
the BedGraph is converted to BigWig by bedGraphToBig, a UCSC Genome Browser
utility (http://hgdownload.cse.ucsc.edu/admin/exe/).  Make sure that both
third-party utilities (genomeCoverageBed and bedGraphToBigWig) are
available on your system.
 
AUTHORS

Michael T. Wolfinger <michael@wolfinger.eu>
Florian Eggenhofer <egg@tbi.univie.ac.at>
Joerg Fallmann <fall@tbi.univie.ac.at>

COPYRIGHT AND LICENCE

Copyright (C) 2014 Michael T. Wolfinger <michael@wolfinger.eu>

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself, either Perl version 5.12.4 or, at your
option, any later version of Perl 5 you may have available.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.

