NAME
    UMLS::Interface README

  SYNOPSIS
    This package provides a Perl interface to the Unified Medical Language
    System (UMLS). The UMLS is a knowledge representation framework
    designed to support broad-scope biomedical research queries. There are
    three major sources in the UMLS: the Metathesaurus, which is a
    taxonomy of medical concepts; the Semantic Network, which categorizes
    concepts in the Metathesaurus; and the SPECIALIST Lexicon, which contains
    a list of biomedical and general English terms used in the biomedical
    domain. The UMLS-Interface package is set up to access the Metathesaurus
    and the Semantic Network present in a MySQL database.

  CONFIGURATION
    UMLS-Interface allows information to be extracted from the UMLS given a
    specified set of sources and relations through the use of a
    configuration file. The format of the configuration file is as follows:

      SAB :: include FMA, MSH
      REL :: include PAR, CHD, RB, RN
      DEF :: include TERM, CUI, PAR, CHD, RB, RN, SIB, SYN

    where SAB refers to the sources, REL refers to the relations, and DEF
    refers to the relations to include in the extended definition. Another
    example can be found in the configuration file in the samples/ directory.
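
    As a rough illustration of the format above, the following sketch reads
    lines of the form "KEY :: include A, B, C" into a hash. The function
    name parse_config is hypothetical and this is not the parser used by
    the package; it only shows how the three fields could be split apart.

```perl
use strict;
use warnings;

# Parse configuration text of the form "KEY :: include A, B, C" into a
# hash that maps each key (SAB, REL, DEF) to an array reference of values.
# Illustrative only: parse_config is a hypothetical helper, not package code.
sub parse_config {
    my ($text) = @_;
    my %config;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;    # skip blank lines
        if ($line =~ /^\s*(\w+)\s*::\s*include\s+(.+?)\s*$/) {
            my ($key, $values) = ($1, $2);
            $config{$key} = [ split /\s*,\s*/, $values ];
        }
        else {
            warn "Ignoring unrecognized line: $line\n";
        }
    }
    return \%config;
}

my $config = parse_config(<<'END');
SAB :: include FMA, MSH
REL :: include PAR, CHD, RB, RN
END

# Print each key followed by its space-separated values.
for my $key (sort keys %$config) {
    print "$key: @{ $config->{$key} }\n";
}
```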

    You can specify a single source, multiple sources or the entire UMLS
    (using the UMLS_ALL option). Keep in mind that the greater the number of
    sources, the larger the search space, so obtaining path information
    about two concepts will take longer. The names of the sources in the
    configuration file are expected to be in SAB (source abbreviation)
    form. A listing of the sources and their SABs can be found at:

    <http://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/release/source_vocabularies.html>

    You can specify any relations that exist in the set of sources that you
    defined. Only the PAR/CHD and RB/RN relations are directional
    (hierarchical). The other relations (such as RO and SIB) are not
    directional, which means that obtaining path information using these
    relations may take much longer than using the directional relations. A
    listing of the different relations can be found here (scroll down to
    the REL table):

    <http://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/release/abbreviations.html>

    If you plan on using multiple sources or the entire UMLS, we advise
    you to use the --realtime option, which is explained below, in the
    Interface.pm documentation and in the path programs in the utils/
    directory. The UMLS_ALL option selects the entire UMLS, so you do not
    have to specify each and every source. It is used as follows:

      SAB :: include UMLS_ALL

    The extended definition consists of an array containing the definition
    of a concept and the definitions of its relations. The relations to
    include can be specified in the configuration file with the DEF option.
    By default, all of them are included.

  INSTALL
    To install the module, run the following magic commands:

      perl Makefile.PL
      make
      make test
      make install

    This will install the module in the standard location. You will, most
    probably, require root privileges to install in standard system
    directories. To install in a non-standard directory, specify a prefix
    during the 'perl Makefile.PL' stage as:

      perl Makefile.PL PREFIX=/home/programs

    It is possible to modify other parameters during installation. The
    details of these can be found in the ExtUtils::MakeMaker documentation.
    However, we strongly recommend leaving the other parameters unchanged
    unless you know what you're doing.

  DATABASE SETUP
    The interface assumes that the UMLS is present as a MySQL database. The
    name of this database can be passed as a configuration option at
    initialization. If the name of the database is not provided at
    initialization, the default value is used: the database for the UMLS
    is assumed to be called 'umls'.

    The UMLS database must contain seven tables:

      1. MRREL
      2. MRCONSO
      3. MRSAB
      4. MRDOC
      5. MRDEF
      6. SRDEF
      7. MRSTY

    All other tables in the database will be ignored. If any of these
    seven tables is missing, an error is raised.
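
    The table requirement can be checked ahead of time. The sketch below
    compares a list of table names against the required tables and reports
    any that are absent. The helper name missing_tables and the sample table
    list are hypothetical; in practice the list would come from the database
    itself (for example from the output of "SHOW TABLES").

```perl
use strict;
use warnings;

# Tables the README lists as required in the UMLS database.
my @REQUIRED = qw(MRREL MRCONSO MRSAB MRDOC MRDEF SRDEF MRSTY);

# Return the required tables that are absent from the given list.
# Names are compared case-insensitively as a precaution, because table
# name case sensitivity in MySQL depends on the server platform.
sub missing_tables {
    my (@present) = @_;
    my %seen = map { (uc($_) => 1) } @present;
    return grep { !$seen{$_} } @REQUIRED;
}

# Hypothetical table list used only to exercise the check.
my @present = qw(MRREL MRCONSO MRSAB MRDOC MRDEF MRSTY MRRANK);
my @missing = missing_tables(@present);
print @missing ? "Missing tables: @missing\n" : "All required tables found\n";
```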

    The MySQL server can be on the same machine as the module or on a
    remotely accessible machine. The location of the server can be
    provided during initialization of the module.

  INITIALIZING THE MODULE
    To create an instance of the interface object, using default values for
    all configuration options:

      use UMLS::Interface;
      my $interface = UMLS::Interface->new();

    The database configuration options can be included in the MySQL my.cnf
    file, which is the preferred approach. The directions for this are in
    Stage 5, Step D of the INSTALL file.

    The following configuration options can also be passed at initialization:

        'driver'       -> Default value 'mysql'. This option specifies the
                          Perl DBD driver that should be used to access the
                          database. This implies that some other DBMS
                          (such as PostgreSQL) could also be used, as long
                          as a Perl DBD driver exists to access the
                          database.
        'umls'         -> Default value 'umls'. This option specifies the name
                          of the UMLS database.
        'hostname'     -> Default value 'localhost'. The name or the IP 
                          address of the machine on which the database 
                          server is running.
        'socket'       -> Default value '/tmp/mysql.sock'. The socket that
                          the database server is using.
        'port'         -> The port number on which the database server 
                          accepts connections.
        'username'     -> Username to use to connect to the database server. 
                          If not provided, the module attempts to connect as 
                          an anonymous user.
        'password'     -> Password for access to the database server. If not
                          provided, the module attempts to access the server
                          without a password.

        'forcerun'     -> This parameter will bypass any command prompts such 
                          as asking if you would like to continue with the index 
                          creation. 

        'realtime'     -> This parameter will not create a database of path 
                          information (what we refer to as the index) but obtain
                          the path information about a concept on the fly
      
        'cuilist'      -> This parameter specifies a file containing a list
                          of CUIs for which the path information should be
                          stored. If a CUI isn't on the list, the path
                          information for that CUI will not be stored.

        'verbose'      -> This parameter will print out the table information 
                          to a config file in the UMLSINTERFACECONFIG directory
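
    A minimal sketch of passing some of these options at initialization is
    given here. It assumes that the constructor accepts a reference to a
    hash keyed by the option names listed above; please confirm this against
    the Interface.pm documentation for your version. The host, port, user
    name and password are placeholder values, not defaults of the package.

```perl
use strict;
use warnings;

# Connection settings; every value here is a placeholder for illustration.
my %options = (
    driver   => 'mysql',
    umls     => 'umls',
    hostname => 'localhost',
    port     => 3306,           # placeholder port number
    username => 'umls_user',    # placeholder account
    password => 'secret',       # placeholder password
    realtime => 1,              # obtain path information on the fly
);

# Load the module at run time so that the script reports a clear message
# when UMLS::Interface is not installed or the object cannot be created.
# Passing a hash reference to new() is an assumption of this sketch.
my $interface = eval {
    require UMLS::Interface;
    UMLS::Interface->new(\%options);
};
if (!defined $interface) {
    my $reason = $@ || 'constructor returned undef';
    print "Could not create the interface: $reason\n";
}
```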

  USING THE MODULE
    Once the interface object has been created by following the steps
    described in the previous section, a number of methods can be called
    on this object:

          getError()                -- Returns the error code and error string
                                       from the last method call on the object.
          root()                    -- Returns the concept ID of the root of 
                                       the tree.
          depth()                   -- Returns the depth of the tree.
          version()                 -- Returns the version of UMLS.
          exists()                  -- Determines if a CUI exists
          validCui()                -- Checks if CUI is a valid concept
          getSab()                  -- Returns the list of sources the concept
                                       exists in
          getConceptList()          -- Returns the list of all concept IDs for 
                                       the term in a specified set of sources.
          getTermsList()            -- Returns the list of terms and their sources 
                                       given a particular concept ID 
          getAllTerms()             -- Returns the list of terms corresponding to 
                                       a particular concept ID for all sources
          getParents()              -- Returns the parents of a given CUI
          getChildren()             -- Returns the children of a given CUI
          getRelated()              -- Returns the CUIs related to a given CUI 
                                       through a given relation
          getRelations()            -- Returns all of the relations associated 
                                       with a specific CUI in a given source
          pathsToRoot()             -- Returns a list of concept IDs that denote 
                                       the path from the input concept ID to the 
                                       root concept of the taxonomy.
          findShortestPath()        -- Returns the shortest path between two CUIs
          findLeastCommonSubsumer() -- Returns the least common subsumer between
                                       two CUIs
          getCuiDef()               -- Returns the definition(s) of a given CUI
          dropTable()               -- Drops the temporary table created by 
                                       the UMLS-Interface module of path 
                                       information for a specified set of sources
          findMinimumDepth()        -- Returns the minimum depth of a given CUI
                                       in the current view of the UMLS
          findMaximumDepth()        -- Returns the maximum depth of a given CUI
                                       in the current view of the UMLS
          getSts()                  -- Returns the TUI(s) of the semantic type(s)
                                       associated with a given CUI
          getStAbr()                -- Returns the abbreviation of a semantic 
                                       type given its corresponding TUI
          getStString()             -- Returns the name of the semantic type 
                                       given its corresponding abbreviation
          getStDef()                -- Returns the definition of a semantic type
                                       given its corresponding abbreviation
          checkConceptExists()      -- Returns true or false (1 or 0) if a 
                                       concept exists given the current view of 
                                       the UMLS
          returnTableNames()        -- Returns the table names in human and 
                                       hex form created by the package for a 
                                       given configuration

          getIC()                   -- Returns the information content of a CUI

          getFreq()                 -- Returns the propagation count of a CUI

    These methods essentially expose an interface as required by the
    UMLS::Similarity modules. The UMLS::Similarity modules require that any
    interface to a taxonomy provide the above methods.
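
    The sketch below shows one way a caller might combine pathsToRoot()
    with getError(). So that it runs without a database, it uses a small
    stand-in class, Local::ToyInterface, in place of a real UMLS::Interface
    object. The stand-in, its CUIs, its error code and the assumption that
    each path is returned as a space-separated string of CUIs are all
    hypothetical; the argument and return conventions of the real methods
    are described in the Interface.pm documentation.

```perl
use strict;
use warnings;

# Stand-in for a UMLS::Interface object so the sketch runs without a
# database. The CUIs, paths and error code below are invented toy data.
{
    package Local::ToyInterface;

    my %PATHS = (
        C0000003 => ['C0000001 C0000002 C0000003'],
    );

    sub new {
        my ($class) = @_;
        return bless { error => [0, ''] }, $class;
    }

    # Return the error code and error string of the last method call.
    sub getError {
        my ($self) = @_;
        return @{ $self->{error} };
    }

    # Return the toy paths for a CUI, recording an error if it is unknown.
    sub pathsToRoot {
        my ($self, $cui) = @_;
        if (!exists $PATHS{$cui}) {
            $self->{error} = [2, "unknown CUI $cui"];
            return ();
        }
        $self->{error} = [0, ''];
        return @{ $PATHS{$cui} };
    }
}

# Call pathsToRoot() and then consult getError() before using the result.
sub report_paths {
    my ($interface, $cui) = @_;
    my @paths = $interface->pathsToRoot($cui);
    my ($code, $message) = $interface->getError();
    return "error $code: $message" if $code;
    return join('; ', @paths);
}

my $interface = Local::ToyInterface->new();
print report_paths($interface, 'C0000003'), "\n";
print report_paths($interface, 'C9999999'), "\n";
```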

  REFERENCING
        If you write a paper that has used UMLS-Interface in some way, we'd 
        certainly be grateful if you sent us a copy and referenced UMLS-Interface. 
        We have a published paper that provides a suitable reference:

        @inproceedings{McInnesPP09,
           title={{UMLS-Interface and UMLS-Similarity : Open Source Software for Measuring Paths and Semantic Similarity}}, 
           author={McInnes, B.T. and Pedersen, T. and Pakhomov, S.V.}, 
           booktitle={Proceedings of the American Medical Informatics Association (AMIA) Symposium},
           year={2009}, 
           month={November}, 
           address={San Francisco, CA}
        }

        This paper is also available at
        <http://www-users.cs.umn.edu/~bthomson/publications/pubs.html>
        or
        <http://www.d.umn.edu/~tpederse/Pubs/amia09.pdf>

  CONTACT US
    If you have any trouble installing or using UMLS-Interface, please
    contact us via the users mailing list:

    umls-similarity@yahoogroups.com

    You can join this group by going to:

    <http://tech.groups.yahoo.com/group/umls-similarity/>

    You may also contact us directly if you prefer:

        Bridget T. McInnes: bthomson at cs.umn.edu

        Ted Pedersen      : tpederse at d.umn.edu

  SOFTWARE COPYRIGHT AND LICENSE
    Copyright (C) 2004-2009 Bridget T McInnes, Siddharth Patwardhan, Serguei
    Pakhomov and Ted Pedersen

    This suite of programs is free software; you can redistribute it and/or
    modify it under the terms of the GNU General Public License as published
    by the Free Software Foundation; either version 2 of the License, or (at
    your option) any later version.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
    Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

    Note: The text of the GNU General Public License is provided in the file
    'GPL.txt' that you should have received with this distribution.

