
Program Testbed Status - Status of program testbed

This page describes how to set up a node in the program testbed, get certificates, and install the workflow system and grid software.

General Documentation

Globus Firewall Requirements for SERVERS and CLIENTS, from the System Research Infrastructure meeting on 10/11/2006, gives a clear description of the firewalls and ports that need to be opened, with details for incoming and outgoing connections. Note: you can specify a port range for incoming connections (at least a hundred ports) in addition to the standard ports required for grid services. A range of that size is enough to start with, since we are setting up a small grid.

Once Globus is running and tested, we should keep a list of hosts on the TANGRAM Grid, with IP address and DNS name, so that firewall rules can be restricted to those hosts. We plan to run monitoring software that provides live status of the TANGRAM Grid.

As a group we could agree on a TCP PORTRANGE for Globus Connections.

Network Firewall Ports

Open these ports in both directions:

  1. GRIDFTP 2811
  2. GATEKEEPER 2119
  3. GSISSH 40022
  4. GLOBUS PORT RANGE = 40000,41000 or more if possible.
  5. Allegro Graph Server port = 4567
  6. MYSQL Server port = 3306
  7. HTTP ports (outgoing to connect to various services) 80, 8080, 8443, 443 etc
  8. GANGLIA ports 8655, 8649
  9. DC port ?
  10. PC port ?
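
As a rough illustration of the list above, the following sketch prints iptables commands for the known port numbers. It only prints the commands and does not apply them. It assumes iptables is the firewall in use and that every service is reached over TCP; Ganglia's gmond channel on 8649 is configured as UDP later on this page and is not covered here. The function and variable names are made up for this example.

```shell
#!/bin/sh
# Print (not apply) iptables commands for the ports listed above.
# Assumptions: iptables is the firewall in use and every service is reached
# over TCP. Review the output before running any of it as root.

GRID_PORTS="2811 2119 40022 40000:41000 4567 3306 8655 8649"  # ':' marks a range
OUTGOING_ONLY_PORTS="80 8080 8443 443"                         # HTTP(S) clients

print_grid_firewall_rules() {
    # Ports that must be open in both directions
    for port in $GRID_PORTS; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
        echo "iptables -A OUTPUT -p tcp --dport $port -j ACCEPT"
    done
    # Ports only needed for outgoing connections
    for port in $OUTGOING_ONLY_PORTS; do
        echo "iptables -A OUTPUT -p tcp --dport $port -j ACCEPT"
    done
}

print_grid_firewall_rules
```

Printing instead of applying keeps the sketch safe to run on any machine; the DC and PC ports are left out because their numbers are not yet known.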

Software Installation

Required Software

Head Node/Server (each site should have at least one head node and several worker nodes)

OS : Linux native or Linux VM.

Software packages :

  1. Ant 1.7
  2. Java 1.5 and 1.6
  3. ganglia 3.0.6 or 3.0.7 (optional)
  4. condor 7.1.3
  5. globus 4.0.7
  6. Tangram Software stack

If several machines share a file system, the easiest approach is to install the software once on the shared file system,

e.g. /nfs/software/grid

If there is no shared file system, create the directory /local/software/grid on each machine.

Set the environment variable TANGRAM_ROOT_DIR to the base directory created in the previous step, that is, /nfs/software/grid or /local/software/grid depending on whether a shared file system exists.

Add the matching entry to /etc/profile, for example:

export TANGRAM_ROOT_DIR=/nfs/software/grid


Create a src directory for each package under $TANGRAM_ROOT_DIR, e.g.

cd $TANGRAM_ROOT_DIR

mkdir -p ant/src java/src ganglia/src condor/src globus/src
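
The directory setup described above could be scripted roughly as follows. This is a minimal sketch: setup_tangram_dirs is a hypothetical helper name, and the two default paths are the ones used on this page. It chooses the shared directory if it exists, falls back to the local one otherwise, and creates the per-package src directories.

```shell
#!/bin/sh
# Pick the base directory and create the per-package source directories.
# setup_tangram_dirs is a hypothetical helper, not part of any package.

setup_tangram_dirs() {
    shared=${1:-/nfs/software/grid}       # used when a shared file system exists
    local_dir=${2:-/local/software/grid}  # fallback on machines without one
    if [ -d "$shared" ]; then
        TANGRAM_ROOT_DIR=$shared
    else
        TANGRAM_ROOT_DIR=$local_dir
        mkdir -p "$TANGRAM_ROOT_DIR" || return 1
    fi
    export TANGRAM_ROOT_DIR
    for pkg in ant java ganglia condor globus; do
        mkdir -p "$TANGRAM_ROOT_DIR/$pkg/src" || return 1
    done
    echo "$TANGRAM_ROOT_DIR"
}
```

Calling setup_tangram_dirs with no arguments (as root) uses the default paths and prints the directory it chose. The export line in /etc/profile still has to be added by hand.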


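Download ant:

cd $TANGRAM_ROOT_DIR/ant/src

Save the Ant 1.7.1 binary tarball (apache-ant-1.7.1-bin.tar.gz) in this directory, then unpack it from the parent directory:

cd $TANGRAM_ROOT_DIR/ant

tar zxvf src/apache-ant-1.7.1-bin.tar.gz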

Symlink the installation directory to default:

ln -s apache-ant-1.7.1 default

Add the following paths to /etc/profile:

export ANT_HOME=$TANGRAM_ROOT_DIR/ant/default

export PATH=$ANT_HOME/bin:$PATH


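Download java:

cd $TANGRAM_ROOT_DIR/java/src

Save the JDK installer (jdk-6u5-linux-i586.bin) in this directory and make it executable:

chmod 755 *.bin

cd $TANGRAM_ROOT_DIR/java

sh src/jdk-6u5-linux-i586.bin

Read the license and type yes at the end. The automated installer will then install Java 1.6.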

Symlink the jdk 1.6 to default:

ln -s jdk1.6.0_05 default

Add the entry for java in /etc/profile:

export JAVA_HOME=$TANGRAM_ROOT_DIR/java/default

export PATH=$JAVA_HOME/bin:$PATH
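
Both Ant and Java use a "default" symlink so that /etc/profile does not have to change when a new version is installed. The following is a small sketch of that pattern; link_default is a hypothetical helper, and the page itself uses a plain ln -s. Unlike ln -s, it can be rerun to repoint an existing link.

```shell
#!/bin/sh
# Point <package>/default at a specific installed version.
# link_default is a hypothetical helper; the page itself uses plain `ln -s`.

link_default() {
    pkg_dir=$1      # e.g. $TANGRAM_ROOT_DIR/java
    version=$2      # e.g. jdk1.6.0_05
    if [ ! -d "$pkg_dir/$version" ]; then
        echo "missing $pkg_dir/$version" >&2
        return 1
    fi
    # -n replaces an existing 'default' link instead of descending into it
    (cd "$pkg_dir" && ln -sfn "$version" default)
}
```

For example, link_default $TANGRAM_ROOT_DIR/java jdk1.6.0_05 has the same effect as the ln -s command above, but also works when default already exists.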


Ganglia is a monitoring system that needs to run on each node. It consists of 3 parts:

  • the gmond daemon, which needs to be installed on each node,
  • gmetad, which needs to run on the head node or one of the nodes as a gatherer, and
  • the ganglia-web module, which displays the information and graphs.

Ganglia is available for most distributions via the standard yum and apt repositories. If you can't find the packages there, you can install tarballs from http://www.ganglia.org. A system administrator just needs to run the install command on all nodes. Depending on your Linux installation, you may need the following extra packages:

  • perl-Compress-zlib
  • perl-XML
  • perl-XMLParser
  • Sqlite

Download ganglia-monitor:

Enter either:

  • yum install gmond
  • apt-get install gmond
    or manually download the ganglia-monitor package from the ganglia website.

Download gmetad:

Enter on the head node either:

  • yum install gmetad
  • apt-get install gmetad
    or manually download the ganglia gmetad package from the ganglia website.

Download ganglia-web:


  • yum install ganglia-web
  • apt-get install ganglia-web
    or manually download the ganglia web package from the ganglia website.

Note: ganglia-web is only required at host where all the information is being collected.

If you cannot find these packages, search the repositories:

  • yum list ganglia
  • apt-cache search gmond gmetad ganglia

Note: on Debian etch the versions of ganglia packages are 2.7.x. If someone has information about 3.x packages please update the instructions or install manually.

You can also download the ganglia packages from the ganglia website and install them manually.

After installation, copy the gmond.conf file for your ganglia version (2.x or 3.x) to /etc/gmond.conf on each node.

  • Modify the gmond.conf file for your setup.
    • Edit the following section and put in the values for your cluster.
    • For 2.x versions, change the entries name, owner, url, mcast_if and setuid:

      name "Viz"
      owner "ISI / CGT"
      url ""
      mcast_if eth0
      setuid ganglia

    • For 3.x versions, edit the cluster section:

      cluster {
        name = "Windward"
        owner = "ISI / CGT"
        latlong = "N30.0 W122.23"
        url = ""
      }

  • Edit the udp_send_channel section and set mcast_join to the hostname or IP of the machine where your gmetad daemon is running.

    udp_send_channel {
      mcast_join =
      port = 8649
      ttl = 1
    }

  • Edit the /etc/gmetad.conf file
    • 2.x the following entries need to be changed

      data_source "<Clustername>" localhost
      gridname "<your organization>"
      authority ""
      trusted_hosts <yourhostname or ip>
      setuid_username "ganglia"

    • 3.x the following entries need to be changed.
    • change the gridname to your Grid name.
    • Change trusted_hosts to add and if not already there.
    • Send to the host/ip port info where your gmetad is running.
  • Start all the gmond daemons by running the /etc/init.d/gmond or /etc/init.d/ganglia-monitor and /etc/init.d/gmetad scripts.
  • If you installed the ganglia-web package, you may additionally need to configure and start your httpd server to show graphs. Otherwise your site will still be displayed on the url


  • Download the condor 7.1.3 tarball into the directory $TANGRAM_ROOT_DIR/condor/src
  • You may additionally need to install the compat-libstdc++ libraries if your condor installation does not work. In that case you will get an error when you try to start condor, and the error will appear in the logs.
  • cd $TANGRAM_ROOT_DIR/condor/src
  • tar zxvf *tar.gz
  • cd $TANGRAM_ROOT_DIR/condor/src/condor-7.1.3

Add user

  • adduser --home /home/condor condor
  • ./condor_install --install-dir $TANGRAM_ROOT_DIR/condor/7.1.3 --make-personal-condor --owner=condor
  • If you get the error
    Condor requires '':
    I don't know what package will provide this library
    You need to find and install this
    then run
  • apt-get install gcc-4.1

(The --make-personal-condor option is only for a single-node cluster.

If you plan to install condor on multiple nodes, you may want to make the head node act only as a submit node (--type=submit) and
select some other node as the central manager, which normally does not run any jobs.
All the other cluster nodes will then be of type=execute.)

    • The script may prompt you to specify the local condor directories. Set them to a non-shared file system.
    • If you also plan to use condor as a scheduler for various nodes in the cluster, you will need additional configuration on each node.
    • On each node, set CONDOR_HOST to the machine acting as the central manager. This is generally the machine where you ran the install command with --make-personal-condor or --type=manager.

Check the condor_config file written in $TANGRAM_ROOT_DIR/condor/7.1.3/etc/condor_config

You may additionally need to change an entry in the condor_config file where it says HOSTALLOW_WRITE to be *.yourdomain, *, *

  • Add the following entries in /etc/profile

export PATH=$TANGRAM_ROOT_DIR/condor/7.1.3/bin:$TANGRAM_ROOT_DIR/condor/7.1.3/sbin:$PATH

export CONDOR_CONFIG=$TANGRAM_ROOT_DIR/condor/7.1.3/etc/condor_config

Once condor is installed, run the condor_master command as root.


You should see several condor daemons start up. e.g. master, collector, negotiator( if your machine is set to manager), schedd (if your machine is set to submit), startd (if your machine is set to execute) or all of them if you chose (personal-condor).

You can either write an init script to start condor automatically at boot time, or modify the file $TANGRAM_ROOT_DIR/condor/7.1.3/examples/condor.boot and install it in the appropriate init location.


After starting condor_master , run the following commands

  • condor_status
  • condor_q

Here is an example of what is displayed

root@ttwo:/local/software/grid/condor/src/condor-7.1.3# condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

                   LINUX      INTEL  Unclaimed Idle     0.000  1013  0+00:05:04
                   LINUX      INTEL  Unclaimed Idle     0.000  1013  0+00:05:05

                     Total Owner Claimed Unclaimed Matched Preempting Backfill

         INTEL/LINUX     2     0       0         2       0          0        0

               Total     2     0       0         2       0          0        0

root@ttwo:/local/software/grid/condor/src/condor-7.1.3# condor_q

-- Submitter: : <> :


Download the globus 4.0.8 binary for your system and save it in $TANGRAM_ROOT_DIR/globus/src

  • Untar the binary tarball
  • Run the command
    • ./configure --prefix=$TANGRAM_ROOT_DIR/globus/4.0.8 --enable-wsgram-condor --disable-tests --disable-wstests
      (If you are running torque/pbs or some scheduler other than condor, you will need --enable-wsgram-pbs, --enable-wsgram-lsf, etc. instead.)
    • make
    • make install

Note: If you get an error for a missing perl module ( Required perl module XML::Parser not found ), run

  • apt-get install libxml-parser-perl

  • Globus will be installed in $TANGRAM_ROOT_DIR/globus/4.0.8
  • cd $TANGRAM_ROOT_DIR/globus/
  • Make a symlink named default that points to 4.0.8
  • ln -s 4.0.8 default
  • If you install globus on a shared file system it is recommended to move the globus/var and globus/tmp directories to a local file system and symlink them from the installation directory e.g.

cd $TANGRAM_ROOT_DIR/globus/4.0.8

mkdir /var/spool/globus

mv var /var/spool/globus/var

chmod 1777 /var/spool/globus/var

mv tmp /var/spool/globus/tmp

chmod 1777 /var/spool/globus/tmp

ln -s /var/spool/globus/var var

ln -s /var/spool/globus/tmp tmp

  • Add the following entries in /etc/profile

export GLOBUS_LOCATION=$TANGRAM_ROOT_DIR/globus/default

source $GLOBUS_LOCATION/etc/
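
Once the /etc/profile entries for all packages are in place, a quick check like the one below can confirm that each variable is set and points at something that exists. This is only a sketch; check_grid_env is a hypothetical helper, and the variable list is taken from the entries on this page.

```shell
#!/bin/sh
# Report which of the variables set in /etc/profile are missing or point nowhere.
# check_grid_env is a hypothetical helper name.

check_grid_env() {
    status=0
    for var in TANGRAM_ROOT_DIR ANT_HOME JAVA_HOME CONDOR_CONFIG GLOBUS_LOCATION; do
        # Indirect lookup of the variable whose name is stored in $var
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -e "$value" ]; then
            echo "$var points to a missing path: $value"
            status=1
        fi
    done
    return $status
}

check_grid_env || echo "Fix the entries above in /etc/profile and log in again."
```

The check prints nothing when every variable is set and its path exists.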

  • You will need to write several files to enable globus services.
  • If xinetd is not installed , you need to install it
    • example apt-get install xinetd
  • To start with
    • Write a file called globus-gatekeeper in /etc/xinetd.d directory
    • Replace <$TANGRAM_ROOT_DIR> with the value of the environment variable that you set
service globus-gatekeeper
{
      socket_type  = stream
      protocol     = tcp
      wait         = no
      user         = root
      server       = <$TANGRAM_ROOT_DIR>/globus/default/sbin/globus-gatekeeper
      server_args  = -conf <$TANGRAM_ROOT_DIR>/globus/default/etc/globus-gatekeeper.conf
      disable      = no
      env          = LD_LIBRARY_PATH=<$TANGRAM_ROOT_DIR>/globus/default/lib
      env         += GLOBUS_LOCATION=<$TANGRAM_ROOT_DIR>/globus/default
      env         += GLOBUS_TCP_PORT_RANGE=40000,41000
}
    • Write a file called gridftp in the same directory
    • Replace <$TANGRAM_ROOT_DIR> with the value of the environment variable that you set
service gridftp
{
            instances               = 100
            socket_type             = stream
            wait                    = no
            user                    = root
            server                  = <$TANGRAM_ROOT_DIR>/globus/default/sbin/globus-gridftp-server
            server_args             = -i -d info -l <$TANGRAM_ROOT_DIR>/globus/default/var/gridftp.log
            log_on_success         += DURATION USERID
            log_on_failure         += USERID
            nice                    = 10
            disable                 = no
            env                    += GLOBUS_LOCATION=<$TANGRAM_ROOT_DIR>/globus/default
            env                    += PATH=<$TANGRAM_ROOT_DIR>/globus/default/bin:<$TANGRAM_ROOT_DIR>/globus/default/sbin
            env                    += LD_LIBRARY_PATH=<$TANGRAM_ROOT_DIR>/globus/default/lib
            env                    += GLOBUS_TCP_PORT_RANGE=40000,41000
}
  • Edit all the paths to the globus software in the above files for your environment.
  • Edit GLOBUS_TCP_PORT_RANGE to match the ports you have opened in your firewall.
  • Edit the file /etc/services and add the lines
    gridftp 2811/tcp
    globus-gatekeeper 2119/tcp

  • Restart xinetd
  • /etc/init.d/xinetd restart
  • Start the gsissh server by first copying the file $TANGRAM_ROOT_DIR/globus/default/sbin/SXXsshd to /etc/init.d/gsisshd
  • cd /etc/init.d
  • PLEASE NOTE: Below is a red hat command.
  • Run /sbin/chkconfig --add gsisshd
  • Debian equivalent ( SEA to confirm )
  • root@ttwo:/etc/init.d# update-rc.d gsisshd defaults
  • Edit $TANGRAM_ROOT_DIR/globus/default/etc/ssh/sshd_config
  • Uncomment the port line on the top and change it from 22 to 40022
  • Start the gsissh server by running the script as root /etc/init.d/gsisshd start
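
The placeholder replacement asked for in the two xinetd files above can be done with sed instead of by hand. The following is a minimal sketch: render_xinetd is a hypothetical helper that reads a template on stdin, and it assumes the install path contains no | or & characters.

```shell
#!/bin/sh
# Fill in the <$TANGRAM_ROOT_DIR> placeholder used in the xinetd files above.
# render_xinetd is a hypothetical helper; it reads a template on stdin.
# Assumption: the path contains no '|' or '&', which sed would treat specially.

render_xinetd() {
    root=${1:?usage: render_xinetd /path/to/tangram/root}
    # '|' is the sed delimiter so slashes in the path need no escaping
    sed 's|<\$TANGRAM_ROOT_DIR>|'"$root"'|g'
}

# Example with one line from the gatekeeper file
printf '%s\n' 'server = <$TANGRAM_ROOT_DIR>/globus/default/sbin/globus-gatekeeper' |
    render_xinetd /nfs/software/grid
# prints: server = /nfs/software/grid/globus/default/sbin/globus-gatekeeper
```

With a saved template, the same function can write the final file, for example render_xinetd "$TANGRAM_ROOT_DIR" < gatekeeper.template > /etc/xinetd.d/globus-gatekeeper, where gatekeeper.template is a file name chosen for this example.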


To be done as root.

Download the package

Untar the package as root in /etc/grid-security

This will create a directory called grid-security with the CA certificates etc. in place.


To be done as root
A file named grid-mapfile must be created in /etc/grid-security to map DN credentials to local users on the node.

The file format is

"/DN/FOO/BAR" userid
"/DN/BAR/FOO" userid2

The allocated user DN's are mentioned below on this page.

"/DC=org/DC=doegrids/OU=People/CN=Karan Vahi 476301" sr

The local user must be added if it does not already exist on the system.

adduser --home /home/sr sr
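
Adding entries to the grid-mapfile can be scripted so that the same DN is never mapped twice. The sketch below is one possible way to do it; add_gridmap_entry is a hypothetical helper name, and the file path is passed in so it can be tried on a scratch file first.

```shell
#!/bin/sh
# Append a DN-to-account mapping to a grid-mapfile unless the DN is already mapped.
# add_gridmap_entry is a hypothetical helper name.

add_gridmap_entry() {
    mapfile=$1   # e.g. /etc/grid-security/grid-mapfile
    dn=$2        # distinguished name, without the surrounding quotes
    user=$3      # local unix account
    # -F: fixed string, so characters in the DN are not treated as a regex
    if [ -f "$mapfile" ] && grep -qF "\"$dn\" " "$mapfile"; then
        echo "already mapped: $dn"
        return 0
    fi
    printf '"%s" %s\n' "$dn" "$user" >> "$mapfile"
}
```

For the example entry above, the call would be add_gridmap_entry /etc/grid-security/grid-mapfile "/DC=org/DC=doegrids/OU=People/CN=Karan Vahi 476301" sr, run as root.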

Make sure that the permissions on the /etc/grid-security and /etc/grid-security/certificates directories are 755.
The permissions on all the files in /etc/grid-security/certificates should be 644.
The permissions on /etc/grid-security/grid-mapfile and /etc/grid-security/hostcert.pem should be 644.
The permissions on /etc/grid-security/hostkey.pem should be 400.

All the above files should be owned by root.
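
The permission rules above can be applied in one pass. This is a minimal sketch under the assumption that the layout is exactly as described on this page; fix_grid_security_perms is a hypothetical helper name, and it takes the directory as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Apply the permissions listed above to a grid-security directory.
# fix_grid_security_perms is a hypothetical helper name.

fix_grid_security_perms() {
    dir=${1:-/etc/grid-security}
    chmod 755 "$dir" "$dir/certificates" || return 1
    # CA certificate files are world-readable
    find "$dir/certificates" -type f -exec chmod 644 {} +
    for f in grid-mapfile hostcert.pem; do
        if [ -f "$dir/$f" ]; then chmod 644 "$dir/$f"; fi
    done
    # The host key must be readable by its owner only
    if [ -f "$dir/hostkey.pem" ]; then chmod 400 "$dir/hostkey.pem"; fi
    # Ownership can only be changed by root, so skip it otherwise
    if [ "$(id -u)" -eq 0 ]; then
        chown -R root "$dir"
    fi
    return 0
}
```

Run as root with no arguments, the function works on /etc/grid-security; files that do not exist yet (for example hostkey.pem before the host cert is requested) are skipped.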

Certificates (Tangram CA)

Host Certs

NOTE : This step only needs to be done once by the site admin. A new cert should be generated in case of a compromise, new hostname, or the cert expiring.

On the tangram node at your site, as user root:

  • Make sure env GLOBUS_LOCATION is set
  • source $GLOBUS_LOCATION/etc/
  • Run the command grid-cert-request -ca 0683a0c5 -host "fullhostname-including-domain". e.g.

    grid-cert-request -ca 0683a0c5 -host ""

  • The command will generate /etc/grid-security/hostkey.pem, hostcert_request.pem and hostcert.pem
  • Email [ gmehta at isi dot edu ] the hostcert_request.pem file with the subject Tangram CA Sign Host Certificate
  • After signing, you will get an email with the signed cert. Save this file as /etc/grid-security/hostcert.pem

User Certs

On the tangram node at your site

  • Make sure env GLOBUS_LOCATION is set
  • source $GLOBUS_LOCATION/etc/
  • Run the command grid-cert-request -ca 0683a0c5 -cn "Your Full Name + 6 digit pin". It will prompt you to specify a password. e.g.

    grid-cert-request -ca 0683a0c5 -cn "Gaurang Mehta 123456"

  • The command will generate $HOME/.globus/userkey.pem, usercert_request.pem and usercert.pem
  • Email [ gmehta at isi dot edu ] the usercert_request.pem file with the subject Tangram CA Sign User Certificate
  • After signing, you will get an email with the signed cert. Save this file as $HOME/.globus/usercert.pem
  • Run the grid-proxy-init command and enter the password you provided when you created the certificate. This will create a proxy for you to use.
  • Run grid-proxy-info and you should get output similar to this

gmehta@wind ~$ grid-proxy-info
subject : /O=com/O=STDC/OU=Certificate Authorities/ Mehta/CN=654089001
issuer : /O=com/O=STDC/OU=Certificate Authorities/ Mehta
identity : /O=com/O=STDC/OU=Certificate Authorities/ Mehta
type : Proxy draft (pre-RFC) compliant impersonation proxy
strength : 512 bits
path : /tmp/x509up_u1003
timeleft : 11:59:58
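
Scripts that submit jobs may want to know how long the proxy is still valid. The sketch below converts the timeleft line of the output shown above into seconds. proxy_seconds_left is a hypothetical helper; it assumes the output format shown here and reads it on stdin, so it can be tested without a proxy.

```shell
#!/bin/sh
# Convert the "timeleft" line of grid-proxy-info output into seconds.
# proxy_seconds_left is a hypothetical helper; it reads the output on stdin
# and assumes the "timeleft : HH:MM:SS" format shown above.

proxy_seconds_left() {
    awk '$1 == "timeleft" {
        n = split($NF, t, ":")          # t[1]=hours, t[2]=minutes, t[3]=seconds
        if (n == 3) { print t[1] * 3600 + t[2] * 60 + t[3]; found = 1 }
    }
    END { exit !found }'                # non-zero status if no timeleft line
}

# Example with the line from the output above
echo "timeleft : 11:59:58" | proxy_seconds_left
# prints: 43198
```

In practice the input would come from the real command, as in grid-proxy-info | proxy_seconds_left.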


From a machine where you already have your user cert and env setup

Make sure you set your environment variables to include

GLOBUS_LOCATION=</path to globus dir>

and source $GLOBUS_LOCATION/etc/

Make sure that the following environment variable is set


sukhna 32% grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=Karan Vahi 476301
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Sat Oct 25 03:03:12 2008

Testing the grid ftp server

sukhna 33% telnet 2811
Connected to
Escape character is '^]'.
220 GridFTP Server 2.8 (gcc32, 1217607445-63) [Globus Toolkit 4.0.8] ready.
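
The telnet test above can also be done non-interactively. The following is a minimal sketch that only checks whether a TCP port accepts connections; it does not read the banner. It assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are installed, and port_open is a hypothetical helper name.

```shell
#!/bin/sh
# Check whether a TCP port accepts connections, as the telnet test does by hand.
# Assumptions: bash (for /dev/tcp) and coreutils timeout are installed.
# port_open is a hypothetical helper name.

port_open() {
    host=$1
    port=$2
    # Host and port are passed as arguments so they are never parsed as code
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

A loop such as for p in 2811 2119 40022; do port_open yourhost $p && echo "$p open"; done, with yourhost replaced by your grid server, covers the GridFTP, gatekeeper and gsissh ports in one go.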

Testing the jobmanager

sukhna 34%
sukhna 34% globusrun -a -r

GRAM Authentication test successful
sukhna 35%

sukhna 41% setenv GLOBUS_TCP_PORT_RANGE 40000,41000
sukhna 42% globus-job-run /bin/date
Fri Oct 24 18:09:43 EDT 2008

sukhna 43% globus-job-run /bin/date
Fri Oct 24 18:11:05 EDT 2008

 gsissh -p 40022
Warning: No xauth data; using fake authentication data for X11 forwarding.
Linux ttwo 2.6.24-etchnhalf.1-686 #1 SMP Mon Jul 21 11:17:43 UTC 2008 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
/usr/bin/X11/xauth:  creating new authority file /home/sr/.Xauthority
sr@ttwo:~$ exit

After you have installed your user and host certs as described above, you need to run the command

grid-proxy-init

This will generate a proxy valid for 12 hours.

Then follow the instructions for testing the ISI grid at the bottom of this page.

To test your own grid server, just change the hostname to your hostname.

User Certs/ Host Certs ( DOE/Open Science Grid Certification Authority (CA) ) DEPRECATED

Do not use these instructions unless instructed to do so specifically

POC: Gaurang Mehta gmehta at isi dot edu:

The procedure for requesting USER and HOST Certificates from the Open Science Grid is described here

The user cert part is a bit more detailed than the host cert part, mostly because it has fewer requirements and less software involved in the process.

Using the Grids

NOTE: you will issue the commands from your machine containing the VDT installation.
(The following is for BASH shells; users of other shells such as CSH/TCSH should change the commands accordingly):

$ export GLOBUS_LOCATION=<path_to_globus>

You need to have the credentials in the right place, and initialized

$ mkdir ~/.globus

$ cp usercert.pem ~/.globus
$ cp userkey.pem ~/.globus
$ grid-proxy-init

You are now ready to issue grid commands.

Test the ISI Grid


  • Check if you can authenticate
    $ globusrun -a -r
  • Check if you can run a job
    $ globus-job-run /bin/date
  • Check if you can transfer a file
    $ globus-url-copy -dbg -vb file:///tmp/sometemp file gsi


Most of the time you will get the infamous error

GRAM Job submission failed because the job manager failed to open stderr (error code 74)

This happens because you are behind a firewall. One way to address it is to specify that your machine expects replies on a specific range of ports:

$ export GLOBUS_TCP_PORT_RANGE=40000,41000

Then you can tell the firewall to open those ports. On my Linksys router, there is an "Applications and Gaming" option where I can set up "Port range forwarding", which maps incoming traffic on the specified ports to the machine from which I issue the Grid commands.
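
A malformed GLOBUS_TCP_PORT_RANGE is easy to miss, so a sanity check before submitting jobs can help. The sketch below is one possible check; check_port_range is a hypothetical helper, and the 100-port minimum follows the "at least a hundred" guidance near the top of this page rather than any Globus requirement.

```shell
#!/bin/sh
# Sanity-check a GLOBUS_TCP_PORT_RANGE value of the form "min,max".
# check_port_range is a hypothetical helper; the 100-port minimum is taken
# from the guidance on this page, not from Globus itself.

check_port_range() {
    range=$1
    # Exactly one comma is allowed
    case "$range" in *,*,*) return 1 ;; *,*) ;; *) return 1 ;; esac
    min=${range%%,*}
    max=${range##*,}
    [ -n "$min" ] && [ -n "$max" ] || return 1
    # Digits only on both sides
    case "$min$max" in *[!0-9]*) return 1 ;; esac
    [ "$max" -le 65535 ] || return 1
    [ $((max - min + 1)) -ge 100 ]
}

if check_port_range "${GLOBUS_TCP_PORT_RANGE:-40000,41000}"; then
    echo "port range looks usable"
fi
```

The check returns a non-zero status for values such as 40000 (no comma) or 40000,40010 (fewer than 100 ports).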

Setting Up Grid Servers

If you installed VDT as root, you are almost ready to make your host a server. All services are ready, installed and started by VDT for you.

Decide on local UNIX User Accounts

On your local server, create accounts for folks to run Globus jobs on, or to login (if you choose to allow gsissh logins). You can create individual unique accounts (more flexibility) (e.g. gmehta, ww-user1, ww-user2 etc.) or a group account (e.g. ww-user)

Grid Mapfile Entries

Edit or create /etc/grid-security/grid-mapfile, which maps distinguished names to local unix accounts. Put each entry on one full line.

If you have created the local accounts "jcournoyer", "sridhar" and "ww-user" on your server, cut and paste the distinguished names from the wiki and add them one per line.

e.g. You could set up:

"/DC=org/DC=doegrids/OU=people/CN=Jason Cournoyer 939022" jcournoyer
"/DC=org/DC=doegrids/OU=People/CN=Sridhar Gullapalli 94604" sridhar
"/DC=org/DC=doegrids/OU=People/CN=Tiberiu Stef-Praun 764752" ww-user
"/DC=org/DC=doegrids/OU=people/CN=Stephen Norris Hookway 665012" ww-user


If you have a firewall then you will have to open access to the following ports from external IP hosts listed on the wiki page above.

Table: Unix Port Numbers used by Grid software

Port 	service
22 	SSH and GSISSH over tcp
80 	HTTP server
443/8443 	HTTPS server
636 	LDAP over SSL
1024 	GridFTP data return
2119 	Globus GRAM resource manager (Gatekeeper, tcp)
2811 	GridFTP contact (tcp)
7512 	MyProxy server
8080 	Web cache server


Run the grid-proxy-init command, followed by the globus-job-run and globus-url-copy tests internally. If they work, have someone external (Manoj, Gaurang) test it. Then report success and update the server column on the wiki.


Getting User and Host Certification

Q: How do I get my certs?

A: Refer to the documentation on the WIKI.

Q: When extracting my hostcert information, it asks me for a password for my keyfiles, how do I avoid this?

A: Globus doesn’t like password-protected private keys. Use the -nodes option: openssl pkcs12 -in ~/doe.p12 -nocerts -out ~/.globus/userkey.pem -nodes

Q: What do I do with my host certs?

A: Hostcerts need to go in /etc/grid-security/hostcert.pem and hostkey.pem. They need to
be owned by root, and hostkey.pem must be readable only by root.

Firewall and DNS Configuration

Q: How do I configure my computer?

A: You will need to get a static fixed Fully Qualified Domain Name which matches your computer's hostname. There are tweaks needed to make it work with NAT.

Q: What holes will I need in my firewall?

A: For a more complete discussion, refer to the firewall documentation.
The short story: you will need to open connections to the hosts you expect to connect to and, if running as a server, from all clients you expect to connect.
Open the following ports to those IPs: 2135,2119,2811,22,7512,8080,8443,80,443,636,1024
Also allow the hosts you are connecting to an extra 1000-port range, i.e. 40000-41000.

Q: Anything else I need to do?

A: In your shell set the GLOBUS_TCP_PORT_RANGE environment variable to reflect the 1000 port range (eg: export GLOBUS_TCP_PORT_RANGE=40000,41000) before running a job. One way to ensure you do this is to add the line in - this needs to be done for the server too.


Q: I’m worried that clock skew may be causing problems, what do I do?

A: Use NTP to make sure you are synched up. This is required.

Q: How do I test everything?

A: Refer to the WIKI. The easiest way is to run a simple job:

% globus-job-run hostname /bin/date

Q: Should I run globus jobs as root?

A: No, run them as another user with your certs in $HOME/.globus