Trash cleaners

From Genecats

Revision as of 20:55, 17 July 2013

Overview

The trash cleaning system at UCSC has evolved from a simple one-line cron job that removed older files from the /trash/ directory into a complex set of interlocking scripts. This discussion outlines the procedures and lock files that keep the system running safely.

Primary trash directory

The trash directory is currently served over NFS from the server rrnfs1.

You can log in to that machine as the qateam user.

A cron job running under the root user calls the scripts in the qateam directory. It currently runs once every 4 hours, at 00:10, 04:10, 08:10, 12:10, 16:10, and 20:10. The cluster admins maintain this root crontab entry, which is a single command:

 /home/qateam/trashCleaners/hgwbeta/trashCleanMonitor.sh searchAndDestroy

This hgwbeta/trashCleanMonitor.sh script first cleans trash files for hgwbeta custom tracks, and then calls the primary RR trashCleanMonitor.sh to do the big job of cleaning the RR custom tracks.
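The schedule described above could be written as a crontab entry along the following lines. This is only a sketch: the actual root crontab entry is maintained by the cluster admins and is not reproduced on this page, so the field layout here is an assumption derived from the listed run times.

```text
# minute  hour             day-of-month  month  day-of-week  command
10        0,4,8,12,16,20   *             *      *            /home/qateam/trashCleaners/hgwbeta/trashCleanMonitor.sh searchAndDestroy
```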

Cleaner lock file

The trashCleanMonitor.sh script uses a lock file to keep a new run from starting while an earlier instance of these scripts is still running. When this lock file exists, the system will not start a new instance of the cleaners; instead it sends email to hiram as an alert that the cleaners are overrunning themselves. This does not normally happen when everything is OK. If a previous instance failed, the lock file remains in place to keep the cleaners off until the error is recognized and taken care of. The lock file is removed only when the complete cleaner system finishes successfully.
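The lock file behavior described above can be sketched in shell roughly as follows. This is not the actual trashCleanMonitor.sh code: the lock file path, the function name run_cleaners_with_lock, and the alert text are all hypothetical, and the sketch prints an alert where the real script sends email.

```shell
#!/bin/sh
# Sketch only: LOCK_FILE is a hypothetical path, not the real lock file.
LOCK_FILE="${LOCK_FILE:-/tmp/trashCleaner.example.lock}"

# run_cleaners_with_lock CMD [ARGS...]
# Returns 0 when CMD ran and succeeded, 1 when a lock already existed,
# 2 when CMD failed (the lock is deliberately left in place).
run_cleaners_with_lock() {
    # With noclobber set, the redirect fails if the file already exists,
    # so testing for the lock and creating it happen in a single step.
    if ! ( set -o noclobber; echo "$$" > "$LOCK_FILE" ) 2>/dev/null; then
        # The real script sends email here; this sketch only prints.
        echo "ALERT: lock file $LOCK_FILE exists, cleaners not started" >&2
        return 1
    fi
    if "$@"; then
        rm -f "$LOCK_FILE"    # only a fully successful run clears the lock
        return 0
    fi
    echo "ALERT: cleaner failed, leaving $LOCK_FILE in place" >&2
    return 2
}
```

Leaving the lock behind on failure is the point of the design: every later cron run is blocked until someone investigates and removes the lock by hand.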

hgwbeta cleaner

This first script hgwbeta/trashCleanMonitor.sh has a simple job. It scans the namedSessionDb table in hgcentralbeta to take care of the trash files that belong to a saved session on hgwbeta. Trash files that are used by a saved session are moved out of the trash directory into

/export/userdata/ct/beta/

with a symlink left in the primary trash directory:

/export/trash/ct/someFile -> ../../userdata/ct/beta/someFile
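The move-and-symlink step can be illustrated with a short shell sketch. It is not the trashCleaner.csh code: the function name preserve_session_file and the EXPORT_ROOT variable are hypothetical, introduced so the layout can be tried in a scratch directory, and the lookup of session files in namedSessionDb is left out entirely.

```shell
#!/bin/sh
# Sketch only: EXPORT_ROOT is a hypothetical variable; the page uses /export.
EXPORT_ROOT="${EXPORT_ROOT:-/export}"

# preserve_session_file NAME
# Moves trash/ct/NAME to userdata/ct/beta/NAME and leaves a relative symlink.
preserve_session_file() {
    name="$1"
    trash_dir="$EXPORT_ROOT/trash/ct"
    keep_dir="$EXPORT_ROOT/userdata/ct/beta"

    # Already a symlink: an earlier run moved it, nothing to do.
    [ -L "$trash_dir/$name" ] && return 0
    # Not present as a regular file: report failure.
    [ -f "$trash_dir/$name" ] || return 1

    mkdir -p "$keep_dir" || return 1
    mv "$trash_dir/$name" "$keep_dir/$name" || return 1
    # Relative target, resolved from trash/ct/: ../../userdata/ct/beta/NAME
    ln -s "../../userdata/ct/beta/$name" "$trash_dir/$name"
}
```

Because the link target is relative, it keeps resolving correctly as long as the trash and userdata trees stay under the same parent directory.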

The actual script that does the scanning, file moving, and symlinking is called from hgwbeta/trashCleanMonitor.sh so that the monitor can check the result of the called script:

/home/qateam/trashCleaners/hgwbeta/trashCleaner.csh

The trashCleanMonitor.sh script verifies that the called script completed successfully by checking not only its return code but also the last line of the log file written by the script, which must read: SUCCESS. The log file written by this script can be found in:

/export/userdata/betaLog/YYYY/MM/cleanerLog.YYYY-MM-DDTHH.txt

where YYYY is the year, MM the month, DD the day, and HH the hour at the time the script runs.
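The log naming pattern and the two-part success check can be sketched as follows. The function names cleaner_log_path and cleaner_succeeded and the LOG_ROOT variable are hypothetical; the real monitor script may build the path and perform the check differently.

```shell
#!/bin/sh
# Sketch only: LOG_ROOT is a hypothetical variable; the page gives
# /export/userdata/betaLog as the real location.
LOG_ROOT="${LOG_ROOT:-/export/userdata/betaLog}"

# Build the log path for the current hour:
# LOG_ROOT/YYYY/MM/cleanerLog.YYYY-MM-DDTHH.txt
cleaner_log_path() {
    stamp="$(date +%Y-%m-%dT%H)"   # one date call, so all fields agree
    year="${stamp%%-*}"            # text before the first "-"
    rest="${stamp#*-}"             # text after the first "-"
    month="${rest%%-*}"            # text before the next "-"
    echo "$LOG_ROOT/$year/$month/cleanerLog.$stamp.txt"
}

# cleaner_succeeded EXIT_CODE LOG_FILE
# True only when the exit code is 0 AND the last log line is exactly SUCCESS.
cleaner_succeeded() {
    [ "$1" -eq 0 ] || return 1
    [ -f "$2" ] || return 1
    [ "$(tail -n 1 "$2")" = "SUCCESS" ]
}
```

Checking the log line as well as the return code guards against a script that exits with status 0 without having reached its final step.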

Upon successful completion of the hgwbeta/trashCleaner.csh script, the monitor script runs an exec command for the primary RR cleaning script:

exec /home/qateam/trashCleaners/rr/trashCleanMonitor.sh searchAndDestroy
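The hand-off via exec can be sketched as below. The function name hand_off is hypothetical, and the sketch takes the next command as arguments rather than hard-coding the RR monitor path, so it can be tried with a harmless command.

```shell
#!/bin/sh
# Sketch only: hand_off is a hypothetical helper, not part of the real scripts.
# hand_off STATUS CMD [ARGS...]
# Starts CMD in place of the current shell, but only if STATUS is 0.
hand_off() {
    status="$1"; shift
    if [ "$status" -ne 0 ]; then
        echo "first stage failed, not starting next stage" >&2
        return 1
    fi
    # exec replaces the current shell process with CMD, so nothing after
    # this line runs when the exec succeeds, and CMD's exit status becomes
    # the exit status seen by whoever started this script (here, cron).
    exec "$@"
}
```

For example, calling hand_off 0 echo next-stage inside a subshell prints next-stage and ends that subshell. Using exec rather than a plain call means no extra shell process waits around for the duration of the second, larger cleaning job.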