.TH NOTEANAL 1carl CARL
.SH NAME
noteanal 
\- note event parser for sound files
.SH SYNOPSIS
.B noteanal 
[flags] < floatsams > floatsams
.nf
Input must be a file or pipe.
.RS
.nf
Flags:
All time values are in seconds.  Use the 'S' postoperator for sample counts.
 -wN = simple average using window size of N (128S)
 -mN = mean squared using window size of N (128S)
 -bN = set segment begin threshold to N (.01)
 -eN = set segment end threshold to N (.005); -eN must be <= -bN
 -uN = set minimum segment (utterance) duration to N (1024S)
 -lF = set log file for segmentation statistics to F
 -s  = turn on segmentation (-b, -e, -l and -u automatically set this)
 -x  = skip output by window size
 -z  = output the input samples rather than the average
 -v  = verbose: print summary of noteanal on stderr
 -RN = print segmentation statistics using sampling rate N (48K)
 -h  = help: print this synopsis
.fi
.RE
.SH DESCRIPTION
.B noteanal
reads 32-bit floating-point sound sample data (floatsams) from the standard
input.  It produces one of two outputs, depending on its flags.
Its default mode is a simple envelope follower, which writes the
mean squared envelope of the entire file on the standard output.
Simple averaging is also available with -w.
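.PP
As a rough illustration of the envelope follower described above, the
following Python sketch computes a running average over a window of samples.
It is not taken from the noteanal source: the function name, the use of the
absolute value for the simple average, and the handling of the first
incomplete window are all assumptions.
.nf
```python
from collections import deque

def envelope(samples, window=128, mean_squared=True):
    # Running average over the most recent `window` samples.
    # mean_squared=True averages x*x (in the spirit of -m).
    # mean_squared=False averages abs(x); that is only a guess at
    # what the -w "simple average" does.
    history = deque()
    total = 0.0
    out = []
    for x in samples:
        v = x * x if mean_squared else abs(x)
        history.append(v)
        total += v
        if len(history) > window:
            total -= history.popleft()
        # Assumption: divide by the samples seen so far until the
        # window has filled.
        out.append(total / len(history))
    return out
```
.fi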
.PP
When segmentation is enabled, it will instead try to segment the file
by looking for amplitude divisions between events.  To do this, it
forms a running average over a fixed number of contiguous samples of the file.
The size of this window is set by the argument to the -m (or -w) flag.
When the average rises above a settable beginning threshold (-b), a segment
is started.  The program then waits for the average to drop below an ending
threshold (-e), which marks the end of the segment.  It repeats the process for
the entire file.
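.PP
The two-threshold scheme can be sketched in Python as follows.
This is a minimal sketch, not the program's actual code: the function and
parameter names are hypothetical, the strictness of the comparisons is a
guess, and keeping a segment that is still open at end of file is an
assumption.  The min_len parameter plays the role of -u, in samples.
.nf
```python
def find_segments(env, begin=0.01, end=0.005, min_len=1024):
    # env holds one envelope value per sample.
    # Returns (start, stop) sample index pairs, stop exclusive.
    segments = []
    start = None
    for i, v in enumerate(env):
        if start is None:
            if v > begin:          # envelope rose above the -b level
                start = i
        elif v < end:              # envelope fell below the -e level
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Assumption: a segment still open at end of input is kept.
    if start is not None and len(env) - start >= min_len:
        segments.append((start, len(env)))
    return segments
```
.fi
Because the ending threshold is lower than the beginning threshold, an
envelope that wobbles between the two levels does not end the segment.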
.PP
The result of segmenting depends on the destination of the output. If the
standard output is a file or pipe, the samples of each segment are written
out as floatsams.  If it is a terminal, the sample number (relative
to the beginning of the file), time (calculated
at the current sampling rate, settable by -R) and current 
average value are printed.  Samples in between segments are suppressed.
.PP
It is possible to get a terse version of the envelope with the -x
flag, which writes only one output value for each window's worth of samples.
The number of output points is thus divided by the size of the window.
This is good for previewing or quick analysis.
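.PP
The effect of -x on the number of output points can be pictured with this
small Python sketch.  The function name is hypothetical, and which sample of
each window is kept (here, the last one) is an assumption.
.nf
```python
def decimate(env, window=128):
    # Keep the envelope value at the end of each full window.
    # Which sample of the window noteanal keeps is an assumption.
    return env[window - 1::window]
```
.fi
With a window of 128 samples, 1024 envelope values reduce to 8.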
.PP
Segmentation is enabled explicitly with the -s flag, which will do segmentation
using all defaults.  The -b flag sets the beginning threshold, -e the ending.
The -u flag determines the minimum length segment worth handling and is used
to eliminate false starts and hiccups.  Mentioning any of the -e, -b, -u
or -l flags automatically turns on -s.
.PP
You can get 
.B noteanal 
to output the input samples instead of the envelope
of the samples with the -z flag.  This, in conjunction with
-s and friends, effectively removes inter-event
silence from the sound file.
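.PP
Conceptually, combining -z with segmentation amounts to the Python sketch
below.  The function name and the (start, stop) representation of a segment
are assumptions made for illustration.
.nf
```python
def keep_segments(samples, segments):
    # Copy only the input samples that fall inside a segment.
    # segments is a list of (start, stop) pairs, stop exclusive.
    out = []
    for start, stop in segments:
        out.extend(samples[start:stop])
    return out
```
.fi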
.PP
A summary of the segmentation activity can be obtained with the -l flag.
It is written to the file named as the flag's argument, and provides the
following information for each segment:
the segment number, the beginning sample
(followed by the beginning time in seconds, computed at the current
sampling rate, printed in parentheses), the maximum average amplitude within
the segment, the sample (and time) where this maximum occurs, and the
sample (and time) of the end of the segment.
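.PP
The per-segment fields listed above could be gathered along the lines of
this Python sketch.  The record layout, the function name and numbering
segments from zero are assumptions; the actual log format is not reproduced
here.  The default rate is 48K read with 'K' meaning 1024.
.nf
```python
def segment_stats(env, segments, rate=48 * 1024):
    # env holds one envelope value per sample.
    # segments is a list of (start, stop) pairs, stop exclusive.
    rows = []
    for n, (start, stop) in enumerate(segments):
        # Index of the largest envelope value in the segment.
        peak = max(range(start, stop), key=lambda i: env[i])
        rows.append({
            "segment": n,                    # numbering from 0 is a guess
            "begin": (start, start / rate),  # (sample, seconds)
            "max": env[peak],
            "max_at": (peak, peak / rate),
            "end": (stop, stop / rate),
        })
    return rows
```
.fi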
.PP
These segmentation statistics can be written to the standard error output
with the -v flag.
.SH EXAMPLE
Imagine the file in this example contains speech.
.sp
.nf
% sndin speechfile | noteanal -u.1 -z -llogfile | sndout ...
.fi
.PP
Mentioning the -u flag turns segmentation on.  Segments
shorter than 0.1 second (at the default sampling rate) are ignored.
For most speech, the default value of -u works fine.  The -z flag
causes \fBnoteanal\fR
to write the actual samples read in, rather than the envelopes
of the segments.  The file "logfile" is written, containing the text
description of the segments.
.SH TWEAKING THE PARAMETERS
In plain envelope mode, only the window size (-w or -m) affects the envelope.
It sets the number of samples that are averaged.  The greater the number,
the smoother the envelope, and the greater the phase lag of the envelope
behind the signal.
.PP
In segmentation mode, increasing the window size makes a smoother average.
The effects are to slightly delay the onset of a segment (phase lag), and to
allow only broader changes in the envelope to trigger a segment boundary.
.PP
Increasing -b makes it harder for a bump in the envelope
to qualify as the beginning of a segment.  Decreasing it makes
detection more sensitive.  Increasing -e makes it easier to
.B end
a segment, since a higher threshold will presumably be reached sooner
in a note's decay than a lower one.
The -e value must not exceed the -b value, or
segments will end as soon as they begin.
.PP
Any segment shorter than the -u value is ignored, so increasing -u discards
more short segments.  Decreasing it has little effect once it falls
below the length of the segments being found with -w, -b and -e.
.SH EXPRESSIONS
A modified version of the CARL expr() routine evaluates all numeric arguments,
so expressions involving "+-*/()" and the postoperators 'K', 'S' and 's'
are available.  'K' multiplies the result by 1024.  For flags that
set time values, 'S' treats the result of an expression as a number of samples,
while no postoperator, or the 's' postoperator, treats it as a time in seconds
at the prevailing sampling rate.
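.PP
The effect of the postoperators on a time-valued flag can be illustrated
with the Python sketch below.  It is a hypothetical helper, not the CARL
expr() routine: it accepts a single number with trailing postoperators only,
not full expressions, and stacking 'K' with 'S' (as in "1KS") is an
assumption about how the postoperators combine.
.nf
```python
def to_samples(text, rate=48 * 1024):
    # Convert a flag argument such as "128S" or ".1" to a sample count.
    as_samples = False
    scale = 1
    while text and text[-1] in "KSs":
        op = text[-1]
        text = text[:-1]
        if op == "K":
            scale *= 1024      # K multiplies by 1024
        elif op == "S":
            as_samples = True  # value is already in samples
        # "s" means seconds, which is also the default
    value = float(text) * scale
    return round(value if as_samples else value * rate)
```
.fi
Under these assumptions "128S" is 128 samples, while ".1" at a sampling
rate of 1000 is 100 samples.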
.SH AUTHOR
Gareth Loy
.SH DIAGNOSTICS
At most 2048 segments are recorded.  Beyond that, the program prints
"too many segments" and overwrites the last recorded segment
with the next and all subsequent segments.
.PP
A zero or negative sampling rate causes the program to quit with an error.
