Notes, To Do & Sandbox

If you arrived at this page via a Google search, odds are it won't help you; it's mostly 'notes to self'. Try the admin guide instead.

Notes

The next shoot-node will:

To Do

Sandbox

Kernel panics/system rescue on HN

Insert SL4.5 install disk 1 and type linux rescue at the boot prompt.

You generally don't need the network enabled.

When asked whether to mount the Linux installation under /mnt/sysimage, choose Continue.

chroot /mnt/sysimage

Execute whatever commands are needed to recover. (/boot/grub/grub-orig.conf is a common source of problems that prevent booting.)

exit (leaves the chroot)

exit (leaves the rescue shell)

The system will reboot; remove the CD from the drive while it reboots.
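
Before editing a boot file inside the chroot, it can help to keep a copy so that a bad edit can be undone from the same rescue shell. The following is only a sketch: backup_before_edit is a hypothetical helper name, not part of SL or Rocks, and the timestamped .bak suffix is an arbitrary choice.

```shell
# Hypothetical helper: copy a file to <file>.bak.<timestamp> before editing it.
# Prints the name of the backup on success; fails if the file does not exist.
backup_before_edit() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    backup="$file.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$file" "$backup" && echo "$backup"
}

# Inside the chroot one might run, for example:
#   backup_before_edit /boot/grub/grub-orig.conf
```

If the edit makes things worse, copying the backup over the edited file restores the previous state.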

 

Holding condor jobs

To hold all the jobs running on the Condor batch system, run the following as root from the HN:

condor_status -schedd
For all nodes listed as the scheduler for running or idle jobs (e.g. compute-x-y):
ssh compute-x-y
condor_hold -name compute-x-y -all

To resume jobs:

condor_status -schedd
For all nodes listed as the scheduler for held jobs (e.g. compute-x-y):
ssh compute-x-y
condor_release -name compute-x-y -all

There must be an easier way to do this, but I don't know what it is! cluster-fork "condor_hold -all" will only hold jobs submitted by the root user.
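
One possible way to script the manual loop above is to generate the per-scheduler commands from a list of schedd names and review them before running anything. This is an untested sketch: hold_commands is a hypothetical helper, it assumes root can ssh to each scheduler node, and it assumes the schedd name reported by condor_status is also a hostname that ssh can reach. It emits a command for every name it is given, so filter the list first if only some schedulers have running or idle jobs.

```shell
# Hypothetical helper: read schedd names on stdin, one per line, and print
# (not run) the matching hold command for each. Blank lines are skipped.
hold_commands() {
    while read -r schedd; do
        [ -n "$schedd" ] || continue
        printf 'ssh %s condor_hold -name %s -all\n' "$schedd" "$schedd"
    done
}

# Dry run against a made-up list of scheduler nodes.
# On the HN, the list could instead come from:
#   condor_status -schedd -format "%s\n" Name
printf 'compute-0-0\ncompute-0-1\n' | hold_commands
```

Once the printed commands look right, they can be piped to sh to execute them. Swapping condor_hold for condor_release in the printf line gives the matching resume commands.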

 

Rocks Backup

Doesn't work - seg fault

 

BeStMan (via VDT)

BeStMan:
Note: As before, we are installing BeStMan on the same network mount as the CE, so the CE will handle certificates. We must therefore wait until after the CE install to turn on the BeStMan services. We also run Globus GridFTP via the CE.

  1. Install with starting configuration:
    cd /share/apps/bestman
    pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Bestman

    *Note: As of Oct. 29, VDT has not officially released BeStMan-Gateway. To get VDT's BeStMan-Gateway, we had to use the development version in the test cache:
    pacman -get http://vdt.cs.wisc.edu/test-cache/bestman:Bestman
    I will update this guide once it has been officially released. Alternatively, you can download the BeStMan tarball and follow the BeStMan installation steps in the archived OSG 0.8 Admin guide.
    Answer yall (yes to all) when asked whether to add sites to trusted.caches.

    Agree to license? y
    Update CRLs automatically? n
    Cron rotation of VDT log files? y
    Where to store CA files? l (lowercase L, local)
    Update CA certificates automatically? n
    Run Globus GridFTP automatically? n
    Run BeStMan automatically? y
  2. Additional configuration:
    Note that we copied the default configuration saved in vdt-install.log and modified it to our needs.
    vdt/setup/configure_bestman \
    --vdt-install /share/apps/bestman \
    --http-port 7070 \
    --https-port 8443 \
    --globus-tcp-port-range 20000,25000 \
    --enable-gateway
  3. Enable the BeStMan service:
    vdt-register-service --name bestman --enable
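
Once the BeStMan services have actually been started (after the CE install, per the note above), one way to confirm that the configured ports are open is to look for them in the output of netstat -tln. The helper below is a sketch under assumptions: port_listening is a hypothetical name, and it expects the usual Linux netstat column layout (local address in column 4, state in column 6).

```shell
# Hypothetical helper: succeed if the given port appears as a LISTEN socket
# in `netstat -tln` style output read from stdin.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # The port is the last colon-separated piece of the local address.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example use on the BeStMan host, with the ports configured in step 2:
#   netstat -tln | port_listening 8443 && echo "https port is open"
#   netstat -tln | port_listening 7070 && echo "http port is open"
```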

    --enable-sudofsmng ??? - Need to edit sudoers file???
    --disable-sudofsmng

    vdt-install.log: !!! Important: If this is your only gridmap file, please set it in server-config.wsdd as well.