October 1992 -- README for Filtering, Segmentation, and Depth code package Copyright (c) 1990,1,2 Mark Nitzberg and President and Fellows of Harvard College All rights reserved. The code in this directory is provided with all the usual disclaimers. It depends on a vision utilities package distributed by the Harvard Robotics Laboratory called HVision. For a copy, contact George Thomas, thomas@metatron.harvard.edu. Like all vision utilities built from scratch, it has its own file format and coding idioms that are hardly self-evident. The header file "hvision.h" should explain some. FILTERING The pseudo-C code for the nonlinear filter is in filter.pseudo-c. You will have to make serious additions (#define Width and Height and typedef double IMAGE[Width][Height], input the image, deal with border issues) before it will run. The C code for the actual experimental filter, which will run about 5 times slower than the pseudo-code can be made to run, is in filter.c. SEGMENTATION The programs are curves -- find contours, jumping small gaps, produce an edge list smooth -- smooth an edge list contin -- find an optimal set of continuations for disrupted contours from an edge list and image, producing a smoothed edge list and region map pscurves -- produce a postscript file from an edge list DEPTH RECOVERY sp -- produce a segmentation with overlaps given a smoothed edge list and region map and input image. Produces postscript. The programs explain themselves when invoked with no arguments, or with the -help argument. Test images from the book are in the "im" directory. Questions and comments to Mark Nitzberg, nitzberg@math.harvard.edu. Good luck. A "Demo" file that gives the invocation parameters for the book figures will be coming soon. Early notes: ----------------------------------------------------------------------- January 1990 -- intermediate file formats for 2.1-d system. curves foo.im ( gives foo.e ) finds high-gradient edges, jumps gaps, finds T's & corners smooth foo.e [-c foo.c] smooths (mindful of T's) & rejoins junctions. output ( foo.es ) es for edgemap smoothed. or ( foo.ec ) If -c option given, uses the contin. ( foo.r ) pairing in foo.c (see contin) for interpreting the T's, then adds the continuations and puts out a new region map as well. contin foo.es ( foo.c ) finds consistent continuation pairings -- consults foo.im for region intensities c for continuation list sp foo.r ( foo.sp ) Final segmentation. sp is for stacked pictures. Result is postscript. Print utility: pscurves foo.e* ( foo.e*.ps ) converts '.e' or '.ec' files to postscript. --------------------------------------------------------------------------- F I L E F O R M A T S 1. Input image foo.im foo.bw foo.i Image files. Any format that hvReadImage() can read. To obtain the HVision image utilities, contact George Thomas, thomas@gramian.harvard.edu. 2. Edge map foo.e, foo.es, foo.ec Lines beginning with # are comments, except "# CONTOUR" lines. First data line in the file tells the size of the pixel space width height Then point lists follow; information for 1 point per line, with blank lines between point lists. Each line gives four floating point (C scanf %g) numbers x y theta kappa (x,y) are the pixel coordinates (non-optional). theta is an optional edge tangent angle in degrees (sig. only mod 180), and kappa is an optional curvature estimate. Each point list must be preceeded by a special comment line, which begins with "# CORNER ". # CONTOUR points [ words ... ] Here, is the number of points in the contour. The optional words are: CLOSED indicates a closed contour WITHCORNER first point of the contour is a corner point (applies only to closed contours) CONTINUATION this is a continuation contour BOUNDARY this contour is actually part of the image boundary 3. Continuation list foo.c ( an optional number) Lines beginning with # are comments. Each line gives one continuation pair as four numbers, until end of file. This gives a "complete" set of continuations. c1 end1 c2 end2 c1 and c2 are contour numbers, (0 ... NContours-1) which correspond to the order in which contours appear in the corresponding .e or .es file. end1 and end2 can be 0 or 1 as the endpoint is the first or last point in the contour. Again, the first point of a contour is the one listed first in the corresponding .e or .es file. 4. Region graph foo.r Gives info on edges and regions. `nodes' are edge endpoints, implicitly numbered 1 .. N. E D G E S : i j a b length curvature o1 o2 ... one line per edge; implicitly numbered 1 ... E. This edge goes from node i to j, region a on the left and b on the right (1 ... R), with given arc length and total square of curvature. o1 and o2 are angles, in degrees, of the tangents of the contour at nodes i and j. These tangents must point outward at the contour endpoints. *** See NOTES below. A blank line separates edges from regions. R E G I O N S : area mean ... the area of the region, and its mean intensity. Regions are implicitly numbered 1 ... R. NOTES: * Region number 0 means `outside the image'. Use region 0 to identify edges that comprise the image border. * If region 0 never occurs, then region 1 is taken to be the surround. That is, region 1 is the one that touches the picture border everywhere. * Loops (edges with the same node at both ends) are forbidden. If your graph has a loop, add a node. * Regions _may_ have holes. * Several edges _may_ share the same pair of nodes. * Nodes _may_ have as few as two edges (corners). * An edge shared by two regions which are the same region will be deleted during pre-processing. 5. Final output--stacked pictures foo.sp Lines beginning with '#' are comment lines. Each data line is a list of blank-separated region numbers, denoting the regions in one layer. The first line gives the bottom layer, which is usually the whole domain, and each successive line gives a layer `closer' to the viewer. Notes on the "curves" program. Input: a monochrome image I with SNR >~ 10, pixel values 0-255 for black-white. ------------------------------------------ curve detection, "curves" program Compute Ix^2, IxIy, and Iy^2 using nearest neighbors, into 3 images. Convolve each of these images with a circular gaussian of effective diameter 8 Sigma. / Ix^2 IxIy \ Now at each point Xo, the matrix A(Xo) = ( ) \ IxIy Iy^2 / gives edge direction, strength and coherence as follows. For edge strength, we can use sqrt(Ix^2+Iy^2). Let e1,e2 (e1 < e2) be the eigenvalues of A, and v1 and v2 the corresponding eigenvectors. sqrt(e1/e2) measures edge coherence (eccentricity), as the short axis of the ellipse ~ 1/sqrt(e2), and the long axis ~ 1/sqrt(e1). v2 gives edge direction, and e1 is a measure of cornerness within a diameter d neighborhood. These values are stored in four new images, Strength, Edgex and Edgey, Cornerness. Edgex and Edgey are the components of v2 normalized to unit length. Using the hysteresis principle of Canny's edge-finding algorithm, set a high threshold T such that E(T) = { Xo | Strength(Xo) > T } has no more than, say, 40% of all pixels, and a low threshold t that E(t) captures, say, 70% of all pixels. This corresponds to low & hi of 30 and 60 percentile. The low and high thresholds are parameters given in percentiles. Now for each Xo in E(T) such that Strength(Xo) is a local maximum in the +/-v2(Xo) perpendicular directions, start a new contour structure. Build the contour in two passes, one for each direction v2(Xo) and -v2(Xo). Move by one pixel unit distance at a time, ie., to Xo+v2(Xo), and add this point to the contour structure, repeating until the point is not in E(t). Then pursue the other direction, -v2(Xo), similarly. When finished with this contour, go on to the next Xo in E(T) that is a local maximum perpendicular to the edge direction. The contour structure has beginning and end pointers and is a doubly-linked list. Output: `.e' A list of curves. Each curve is a list of points, each with (x,y)=location, (xt,yt)=tangent and k=curvature, all in floating point. The point list for one curve is separated from the next by a blank line. The location (x,y) is in pixel units, but may not fall on a pixel exactly, with (0,0) for the upper-left of the image, x increasing right and y increasing downward. The tangent is a vector of unit length giving the local estimate of slope of the edge at (x,y). Its direction is (arbitrarily) towards the next point on the curve, and away from the previous. The curvature k is a local estimate of 1-dim curvature of the edge at (x,y) in inverse units of distance, (1/pixelwidth)s. It represents 1/r where r is the radius of an osculating circle that best fits the edge at (x,y). The value can be "verified" by computing the change in tangent angle over the distance between points. Thus k=0 means the edge is straight at (x,y), |k|=1 describes a hairpin turn around a neighboring pixel. Larger values for k are unreasonably high and really indicate "corners". Positive values of k indicate a right turn as one travels to the next point (ie. clockwise); negative values indicate a left turn. The "pscurves" program produces a postscript version of an edge map. The program "pscurves" puts out postscript that draws the curves in postscript. This can be printed atop the original image with laserhs -p image; (cat image.ps; curves2ps image.curves) | lpr