NAME
    UMLS-Interface Installation Guide

TESTING PLATFORMS
    UMLS-Interface has been developed and tested on Linux primarily using
    Perl.

SYNOPSIS
     perl Makefile.PL

     make

     make test

     make install

DESCRIPTION
    The UMLS-Interface package is an interface to the Unified Medical
    Language System (UMLS).

REQUIREMENTS
    UMLS-Interface requires the following software packages and data:

  Programming Languages
     Perl (version 5.8.0 or better)
     Java SE Development Kit (JDK)

  CPAN Modules
     DBI

  Database
    MySQL (version 5 or better)

  Data
    Unified Medical Language System (UMLS 2008AA or higher)

INSTALLATION STAGES
    The installation is broken into six stages:
    Stage 1: Install Programming Languages
                  If already installed you need at minimum: 
                      - Perl version 5.8 or better
                      - Java SE Development Kit

    Stage 2: Install CPAN Modules
    Stage 3: Install MySQL
                  If already installed you need at minimum: 
                      - MySQL version 5 or better

    Stage 4: Install the UMLS
                  If already installed you need at minimum: 
                      - the following files in the META directory:
                        -- populate_mysql_db.sh script
                        -- MRREL.RRF
                        -- MRCONSO.RRF
                        -- MRSAB.RRF
                        -- MRDOC.RRF
                        -- MRDEF.RRF
                        -- MRSTY.RRF
                      - the following files in the NET directory:
                        -- populate_net_mysql_db.sh script
                        -- SRDEF

    Stage 5: Load the UMLS into MySQL
                  If already installed you need at minimum: 
                      - the following tables 
                        -- MRREL
                        -- MRCONSO
                        -- MRSAB
                        -- MRDOC
                        -- SRDEF
                        -- MRDEF
                        -- MRSTY

    Stage 6: Install UMLS-Interface and set environment variable
        More details on how to obtain and install appear below.

Stage 1: Install Programming Languages, if already installed go to Stage 2
  Perl (version 5.8.5 or better)
    Perl is freely available at <http://www.perl.org>. It is very likely
    that you will already have Perl installed if you are using a Unix/Linux
    based system.

  Java SE Development Kit (JDK)
    JDK is freely available at <http://java.sun.com/>. The latest version
    today is JDK 6 which can be found here:

    <http://java.sun.com/javase/6/webnotes/install/index.html>

    I installed JDK 6 on Red Hat Enterprise Linux WS release 4 using rpms.
    The instructions that I used can be found here:

    <http://java.sun.com/javase/6/webnotes/install/jdk/install-linux.html#in
    stall-rpm>

    You can also install it using the binaries. Instructions for that can be
    found here:

    <http://java.sun.com/javase/6/webnotes/install/jdk/install-linux.html#se
    lf-extracting>

Stage 2 - Install CPAN modules, if already installed go to Stage 3
CPAN MODULES
  DBI
    ****The installation instructions given here for DBI apply to all
    CPAN modules, and will not be repeated in detail for each module.****

    UMLS-Interface uses DBI to access the mysql database containing the
    UMLS. DBI is freely available at <http://search.cpan.org/~timb/DBI/>

    If you have superuser access, or have configured CPAN for local
    install, you can install via:

        perl -MCPAN -e shell
        cpan> install DBI

    If not, you can install "manually" by downloading the *.tar.gz file,
    unpacking it, and executing the following commands:

        perl Makefile.PL PREFIX=/home/Bit-Vector LIB=/home/MyPerlLib
        make
        make test
        make install

    Note that the PREFIX and LIB settings are just examples to help you
    create a local install if you do not have superuser (su) access.

    You must include /home/MyPerlLib in your PERL5LIB environment variable
    to access this module when running.
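
    As a minimal sketch, the local library directory could be added to
    PERL5LIB as shown here. The path /home/MyPerlLib is the example
    directory used above, and the helper name add_to_perl5lib is made up
    for illustration.

```shell
# Hypothetical helper: prepend a directory to PERL5LIB unless it is already listed.
add_to_perl5lib() {
    dir=$1
    case ":${PERL5LIB:-}:" in
        *":$dir:"*) ;;                                  # already present, nothing to do
        *) PERL5LIB="$dir${PERL5LIB:+:$PERL5LIB}" ;;    # prepend, keeping existing entries
    esac
    export PERL5LIB
}

add_to_perl5lib /home/MyPerlLib

# Optional check (assumes perl is on your PATH): prints the DBI version if
# the module can be loaded.
# perl -MDBI -e 'print "DBI $DBI::VERSION\n"'
```

    Putting the call in your shell startup file keeps the setting across
    logins.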

Stage 3 - Install MySQL, if already installed go to Stage 4
    Requires MySQL (version 5 or better)

    MySQL is a freely available database at <http://www.mysql.com/>. You may
    be able to install this with your package manager. Otherwise you will
    have to download the appropriate files from the MySQL website. Either
    way, make certain that the following are installed:

            1. Server
            2. Client
            3. Shared libraries
            4. Headers and libraries

    Sometimes the headers and libraries are not installed so you want to
    double check on that.

  Installing on Red Hat Enterprise Linux WS release 4 using rpms

    I installed this on Red Hat Enterprise Linux WS release 4 using rpms.
    So, I downloaded the corresponding rpms:

            1. MySQL-server-community-5.0.51a-0.rhel4.i386.rpm
            2. MySQL-client-community-5.0.51a-0.rhel4.i386.rpm
            3. MySQL-shared-community-5.0.51a-0.rhel4.i386.rpm
            4. MySQL-devel-community-5.0.51a-0.rhel4.i386.rpm

    I ran into two problems. The first was I had an older version of MySQL
    that another program was dependent on and it would not let me upgrade. I
    ended up just removing the program and the older version of MySQL. This
    was not recommended by the MySQL documentation but after a certain point
    it just became too frustrating. The second problem that I had was I kept
    receiving the following error:

    Warning: Can't connect to local MySQL server through socket
    '/var/lib/mysql/mysql.sock' (2) in /var/www/html/forum/admin/
    db_mysql.php on line 40

    Warning: MySQL Connection Failed: Can't connect to local MySQL server
    through socket '/var/lib/mysql/mysql.sock' (2) in
    /var/www/html/forum/admin/db_mysql.php on line 40

    I don't completely understand this error but after much googling I ended
    up moving the mysql.sock file to the '/tmp' directory and changed the
    socket location in my 'my.cnf' file, which was located in the '/etc'
    directory. I changed the following line:

            from : socket = /var/lib/mysql/mysql.sock
            to   : socket = /tmp/mysql.sock

    In the /var/lib/mysql/ directory I have a mysql.sock file (a directory
    listing may show it as 'mysql.sock='). I created a symbolic link to it
    from the /tmp directory:

            ln -s /var/lib/mysql/mysql.sock /tmp/mysql.sock

    Then everything seemed to work fine.

    I then set my root password:

         /usr/bin/mysqladmin -u root password 'new-password'

  Installing on Fedora using yum
    Ted installed mysql on Fedora using yum.

    To install mysql using yum, type the following three commands on the
    command line:

        1. yum install mysql
        2. yum install mysql-server
        3. yum install mysql-devel

    After this is complete mysql should be installed. Next you need to start
    the mysql server by typing the following command:

         service mysqld start

  Installing on Ubuntu using apt-get
    I also installed this on Ubuntu. This was 100 times easier than using
    the rpms! Type the following commands:

        sudo apt-get install mysql-common
        sudo apt-get install mysql-client
        sudo apt-get install mysql-server

    You are done - much easier!

  Reminder
    Also for all the installations, remember to set your root password:

         /usr/bin/mysqladmin -u root password 'new-password'

    If you are not going to be using root as your main entry into mysql -
    which is probably a good idea - you can create a user account. Here is
    how I did it:

        mysql> CREATE USER bthomson IDENTIFIED BY 'psswrd';
        Query OK, 0 rows affected (0.00 sec)

        mysql> GRANT ALL ON *.* TO bthomson;
        Query OK, 0 rows affected (0.00 sec)

    Directions are in the mysql documentation:

    <http://dev.mysql.com/doc/refman/5.1/en/adding-users.html>

Stage 4 - Install UMLS, if already installed go to Stage 5
  VERSION
    The UMLS-Interface requires version UMLS 2008AA or higher

  REMINDER
    If you already have the UMLS installed, make certain that you have the
    populate_mysql_db.sh script in the META directory as well as the
    MRREL.RRF, MRCONSO.RRF, MRSAB.RRF, MRDOC.RRF, MRDEF.RRF and MRSTY.RRF
    files, and the SRDEF file in the NET directory. If you do not, you will
    need to run the install program (MetamorphoSys) again.
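
    The check described in this reminder could be scripted roughly as
    follows. This is only a sketch: the function name missing_umls_files
    is made up, the argument is assumed to be the release directory that
    contains META/ and NET/, and the file names use the .RRF extension
    under which UMLS releases ship these files.

```shell
# Hypothetical helper: print every required file that is missing from a UMLS
# release directory (the directory that contains META/ and NET/).
missing_umls_files() {
    release_dir=$1
    for f in META/populate_mysql_db.sh \
             META/MRREL.RRF META/MRCONSO.RRF META/MRSAB.RRF \
             META/MRDOC.RRF META/MRDEF.RRF META/MRSTY.RRF \
             NET/SRDEF; do
        # Report the relative path of each file that does not exist.
        [ -f "$release_dir/$f" ] || echo "$f"
    done
}

# Example call; replace the argument with your own destination directory.
# missing_umls_files /path/to/destination/2008AA
```

    If the function prints nothing, all of the listed files are present.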

  DESCRIPTION OF UMLS
    The UMLS is a freely available knowledge representation framework
    designed to support broad scope biomedical research queries. It includes
    over 100 controlled medical terminologies and classification systems
    encoded with different semantic and syntactic structures. The three
    major sources of UMLS are the Metathesaurus, Semantic Network and
    SPECIALIST Lexicon. To obtain the UMLS you need to register for a
    license. For more information please see:

    <http://www.nlm.nih.gov/research/umls/>

    The UMLS is installed and the MySQL load scripts are created using
    MetamorphoSys. MetamorphoSys is the UMLS installation wizard and
    customization tool included in each UMLS release. It installs one or
    more of the UMLS Knowledge Sources; when the Metathesaurus is selected,
    the user can create customized Metathesaurus subsets. MetamorphoSys may
    be used to exclude vocabularies that are not required or licensed for
    use in local applications and to select from a variety of data output
    options and filters. Below are directions for a basic installation -
    nothing fancy.

    There are quite a few steps and substeps here. The steps aren't
    complicated - there are just a lot of different windows that pop up and
    I tried to break it down into small steps so I wouldn't miss anything.
    The directions can also be found within the UMLS documentation:

    <http://www.nlm.nih.gov/research/umls/load_scripts.html>
    <http://www.nlm.nih.gov/research/umls/meta6.html>

  Step 1: Download the UMLS
         The UMLS is freely available but you need to register for 
         a UMLS license prior to downloading. 

         You can register here: <http://umlsks.nlm.nih.gov/>

         The download link is here: <http://www.nlm.nih.gov/research/umls/>
         
     Download the following files:
           1. <current release>-1-meta.nlm  
           2. <current release>-otherks.nlm    
           3. <current release>-2-meta.nlm  
           4. mmsys.zip       
           5. <current release>.CHK  
           6. Copyright_Notice.txt  
           7. README.txt

  Step 2: Unzip the mmsys.zip file (unzip mmsys.zip)
            The following should be created:
              1. linux_mmsys.sh
              2. solaris_mmsys.sh
              3. macintosh_mmsys.sh
              4. windows_mmsys.bat
              5. MMSYS/ (directory)

  Step 3: Run MetamorphoSys (the install wizard)
           -- Run the appropriate .sh file for your system 
                   --> I ran ./linux_mmsys.sh

           -- Click 'Install UMLS'

           -- 'Install UMLS' window will appear
              - Type in your destination directory. I used 
                the same directory as my Source.
              - At the very least make certain that the Metathesaurus 
                box is checked. Although, I installed all three of the 
                Knowledge Sources: 'Metathesaurus', 'Semantic 
                Network' and 'SPECIALIST Lexicon and Lexical Tools'.
              - Click 'OK'

           -- 'Install UMLS' and 'MetamorphoSys Configuration' windows will appear
               - Click 'New Configuration' in the 'MetamorphoSys Configuration' window
                         
           -- 'License Agreement Notice' window will appear
              - Click 'Accept'

           -- 'Select Default Subset' window will appear
              - Click 'Level 0 + SNOMEDCT'          
                    - If you do not want or do not have a license for
                      SNOMED-CT, click 'Level 0' instead
              - Click 'Done'

           -- 'UMLS Metathesaurus Configuration' window will appear
              - Click on the 'Output Options' tab
              - Check the 'Write MySQL load script' box 
                    - this is under the heading 'Write Database Load Scripts'
              - Click 'Done' and then 'Begin Subset'
                    - this is the last option on the list of options in the 
                      top left hand corner of the window. 
              - A box will appear asking whether you want to save the changes
                    - this is up to you; I clicked 'No'

           Now the UMLS will install and the load scripts will be created.

           -- 'Finished' window will appear
              - Click 'Done'

Stage 5 - Load UMLS into MySQL, if already installed go to Stage 6
  Step 1: Create the MySQL database
            Log into MySQL and create a database called 'umls' 
            as follows:

            CREATE DATABASE IF NOT EXISTS umls CHARACTER SET utf8 COLLATE utf8_unicode_ci;

  Step 2: Modify the 'my.cnf' file.
            This file has been in a different place on every version or
            distribution that I have run this on. I have found it in the
            '/etc' directory as well as the '/etc/mysql' directory.

            You can find where yours is by using the find command:

                    sudo find / -name my.cnf


            If you do not have a my.cnf file check to see if you have a 'my.ini' 
            file (which one you have will depend on your system).

            To optimize for read performance, the MySQL 5 server requires
            changing buffer sizes to make use of the memory available.
            Modify the following options:

                          key_buffer        = 300M
                          table_cache       = 300
                          sort_buffer_size  = 20M
                          read_buffer_size  = 20M
                          query_cache_limit = 3M
                          query_cache_size  = 100M

  Step 3: Populate the MySQL tables using the load scripts
   Step A: Find the MySQL load script for the Metathesaurus
                    Go to the destination directory that you typed
                    in Stage 4, Step 3 (this is where I said my
                    source and destination directory were the
                    same)

                    go to the: <destination>/2008AA/META directory

                    In it you should see the file:

                       populate_mysql_db.sh

                    If you don't see this file - you need to start
                    over. Stage 4, Step 3 did not get done properly.

   Step B: Modify the MySQL load scripts for the Metathesaurus
               In the populate_mysql_db.sh you need to modify the 
               following:

                    MYSQL_HOME=<path to MYSQL_HOME>
                    user=<username>
                    password=<password>
                    db_name=<db_name>

               For example, I modified my file as follows:

                    MYSQL_HOME=/usr/
                    user=bthomson
                    password=<well I am not giving you my actual password>
                    db_name=umls
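
               The four edits above could also be applied with sed
               instead of a text editor. This is a sketch under
               assumptions: the function name configure_load_script is
               made up, and it assumes each setting starts at the
               beginning of a line in the load script, as in the
               listing above.

```shell
# Hypothetical helper: rewrite the four settings in a load script, keeping a
# .bak copy of the original. The values must not contain '|', '&' or a
# backslash, because they are substituted into a sed expression.
configure_load_script() {
    script=$1; mysql_home=$2; db_user=$3; db_pass=$4; db_name=$5
    cp "$script" "$script.bak" || return 1
    sed -e "s|^MYSQL_HOME=.*|MYSQL_HOME=$mysql_home|" \
        -e "s|^user=.*|user=$db_user|" \
        -e "s|^password=.*|password=$db_pass|" \
        -e "s|^db_name=.*|db_name=$db_name|" \
        "$script.bak" > "$script"
}

# Example call, using the values from the example above:
# configure_load_script populate_mysql_db.sh /usr/ bthomson 'your-password' umls
```

               The same helper would work for the Semantic Network
               load script, since it uses the same four settings.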

   Step C:  Run the MySQL load scripts for the Metathesaurus
                    Run this script by typing on the command line:

                        ./populate_mysql_db.sh

                    If you get the following:

                        ./populate_mysql_db.sh: Permission denied.

                    You need to change your permissions and try again. To change 
                    your permissions type:

                         chmod 755 populate_mysql_db.sh


            Now this takes what feels like FOREVER! Now would be a good 
            time to go do something else until sometime late tomorrow 
            (yes, unfortunately, tomorrow) evening ...

   Step D: Find the MySQL load script for the Semantic Network
                    Go to the destination directory that you typed
                    in Stage 4, Step 3 (this is where I said my
                    source and destination directory were the
                    same)

                    go to the: <destination>/2008AA/NET

                    In it you should see the file:

                       LoadScripts.zip

                    Unzip this file as follows:

                       unzip LoadScripts.zip

                    The following file then should appear:

                       populate_net_mysql_db.sh

                    If you don't see this file - you need to start
                    over. Stage 4, Step 3 did not get done properly.

   Step E: Modify the MySQL load script for the Semantic Network
               These are exactly the same modifications that were made
               in the Metathesaurus load script.

               In the populate_net_mysql_db.sh you need to modify the 
               following:

                    MYSQL_HOME=<path to MYSQL_HOME>
                    user=<username>
                    password=<password>
                    db_name=<db_name>

               For example, I modified my file as follows:

                    MYSQL_HOME=/usr/
                    user=bthomson
                    password=<well I am not giving you my actual password>
                    db_name=umls

   Step F:  Run the MySQL load script for the Semantic Network
                    Run this script by typing on the command line:

                        ./populate_net_mysql_db.sh

                    If you get the following:

                        ./populate_net_mysql_db.sh: Permission denied.

                    You need to change your permissions and try again. To change 
                    your permissions type:

                         chmod 755 populate_net_mysql_db.sh


            Now this should not take forever. It goes much much quicker!
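
            Once both load scripts have finished, you may want to
            confirm that the tables listed under Stage 5 exist. The
            helper below is a sketch (the name missing_umls_tables is
            made up); it reads table names on standard input so that
            it can be fed from the mysql client.

```shell
# Hypothetical helper: read table names (one per line) on standard input and
# print each required table that is absent. The comparison ignores case.
missing_umls_tables() {
    present=$(cat)
    for t in MRREL MRCONSO MRSAB MRDOC SRDEF MRDEF MRSTY; do
        printf '%s\n' "$present" | grep -qix "$t" || echo "$t"
    done
}

# Example call, assuming the database is named 'umls' and the user is
# 'bthomson' as in the examples above:
# mysql -u bthomson -p -N -e 'SHOW TABLES' umls | missing_umls_tables
```

            No output means that all seven tables were found.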

  Step G: Modify the my.cnf file (again)
            This file has been in a different place on every version or
            distribution that I have run this on. I have found it in the
            '/etc' directory as well as the '/etc/mysql' directory.

            You can find where yours is by using the find command:

                    sudo find / -name my.cnf

            This time you are modifying it so that the UMLS-Interface
            package (and the utils/ files) can automatically log into the 
            umls database.

                [client]
                user            = <username>
                password        = <password>
                port            = 3306
                socket          = /tmp/mysql.sock
                database        = umls
                
            The port number should already be in the my.cnf file
            under the [mysql] heading. I just copied and pasted that.

            The socket is whatever socket you used in Stage 3.

            Remember we discussed how big a pain this was since
            it seemed like the socket location depended on what
            operating system you were using. My RedHat is:

                          socket = /tmp/mysql.sock

            whereas on my Ubuntu 8.04:

                          socket = /var/lib/mysql/mysql.sock

            and currently on Ubuntu 9.04:

                          socket = /var/run/mysqld/mysqld.sock

            So the sock file is different and its location is different
            depending on what version of Linux you are running.

                You can find it using:

                    sudo find / -name mysqld.sock 

                    or

                    sudo find / -name mysql.sock 

                This is a huge pain. If you need help just send me an email.

                The database is whatever you called the database that 
                the UMLS is installed in. Mine is 'umls'.
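
            Before searching the whole file system, it can be quicker
            to test the socket locations mentioned in this step. The
            following is a sketch only; the function name first_socket
            is made up and the three paths are simply the ones listed
            above.

```shell
# Hypothetical helper: print the first argument that exists as a socket file.
# Returns non-zero when none of the candidates is a socket.
first_socket() {
    for candidate in "$@"; do
        if [ -S "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Try the three locations from this guide; "|| true" keeps the snippet quiet
# when no MySQL server is running on this machine.
first_socket /tmp/mysql.sock /var/lib/mysql/mysql.sock /var/run/mysqld/mysqld.sock || true
```

            Whatever path it prints is the value to use for the socket
            setting in the [client] section.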

Stage 6 - Install UMLS-Interface and set environment variable
    The usual way to install the package is to run the following commands:

         perl Makefile.PL
         make
         make test
         make install


        You will often need root access/superuser privileges to run
        make install. The module can also be installed locally. To do a local
        install, you need to specify a PREFIX option when you run 'perl
        Makefile.PL'. For example,

         perl Makefile.PL PREFIX=/home

        or

         perl Makefile.PL LIB=/home/lib PREFIX=/home

        will install UMLS-Interface into /home. The first method above 
        will install the modules in /home/lib/perl5/site_perl/5.8.3 
        (assuming you are using version 5.8.3 of Perl; otherwise, the 
        directory will be slightly different). The second method will 
        install the modules in /home/lib. In either case the executable 
        scripts will be installed in /home/bin and the man pages will 
        be installed in /home/share.

        Warning: do not put a dash or hyphen in front of PREFIX or LIB.

        In your perl programs that you may write using the modules, you may need
        to add a line like so

         use lib '/home/lib/perl5/site_perl/5.8.3';

        if you used the first method or

         use lib '/home/lib';

        if you used the second method. By doing this, the installed modules are
        found by your program. To run the umls-similarity.pl program, you would 
        need to do

         perl -I/home/lib/perl5/site_perl/5.8.3 umls-similarity.pl

        or

         perl -I/home/lib umls-similarity.pl

        Of course, you could also add the 'use lib' line to the top of the
        program yourself, but you might not want to do that. You will need to
        replace 5.8.3 with whatever version of Perl you are using. The
        preceding instructions should be sufficient for standard and slightly
        non-standard installations. However, if you need to modify other
        makefile options you should look at the ExtUtils::MakeMaker
        documentation. Modifying other makefile options is not recommended
        unless you really, absolutely, and completely know what you're doing!

        NOTE: If one (or more) of the tests run by 'make test' fails, you will
        see a summary of the tests that failed, followed by a message of the
        form "make: *** [test_dynamic] Error Y" where Y is a number between 1
        and 255 (inclusive). If the number is less than 255, then it indicates
        how many tests failed (if more than 254 tests failed, then 254 will still
        be shown). If one or more tests died, then 255 will be shown. For more
        details, see:

    <http://search.cpan.org/dist/Test-Simple/lib/Test/Builder.pm#EXIT_CODES>

        Now please set the UMLSINTERFACE_CONFIGFILE_DIR environment variable.
        This should be a location where you don't mind the UMLS-Interface
        module writing out system files. The system files that are created
        are:

           1. <UMLS_version>_<sources><relations>_config

              This is an example configuration file of the current run.

           2. <UMLS_version>_<sources><relations>_child
              <UMLS_version>_<sources><relations>_parent

              These files store the upper level taxonomy, which is
              created when connecting multiple sources.

           3. <UMLS_version>_<sources><relations>_table

              This file contains the path to the root for each
              concept in the UMLS view. This information is stored
              in a mysql database of the same name. This really
              helps speed up the initialization.
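
        As a minimal sketch, the variable could be set as follows in a
        Bourne-style shell. The directory name umls-interface-config is
        an assumption made for this example; any directory you can write
        to will do.

```shell
# Assumed location for the UMLS-Interface system files; change it to suit.
UMLSINTERFACE_CONFIGFILE_DIR="${HOME:-/tmp}/umls-interface-config"

# Create the directory if it does not exist yet, then make the variable
# visible to programs started from this shell.
mkdir -p "$UMLSINTERFACE_CONFIGFILE_DIR"
export UMLSINTERFACE_CONFIGFILE_DIR
```

        Adding these lines to your shell startup file keeps the setting
        across logins.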

CONTACT US
     If you have any trouble installing and using UMLS-Interface, please 
     contact us via the users mailing list : 

     umls-similarity@yahoogroups.com

     You can join this group by going to:

    <http://tech.groups.yahoo.com/group/umls-similarity/>

     You may also contact us directly if you prefer :

     Bridget T. McInnes: bthomson at cs.umn.edu
     Ted Pedersen      : tpederse at d.umn.edu

