Storage and Backups
• 1 Head/management node
• 1 Interactive node
• 16 compute nodes
    o IBM e326m 1U servers
    o 2 dual-core 2.4GHz AMD Opteron 280 CPUs
    o 2 or 3GB RAM
    o Voltaire InfiniBand PCI-E networking
    o Gigabit Ethernet
• 2 storage nodes
    o IBM x346 2U servers
    o Both are connected to an IBM DS4000 RAID array that is available as /gpfs1 (see below)
• 1 backup server
    o IBM x346 2U server
    o IBM 4350 33-slot LTO-2 tape library
    o Overland ??? 120-slot LTO-4 tape library
    o Tivoli Storage Manager software (IBM)
• 1 Head/management/interactive node
• 4 compute nodes
    o 2 dual-core 2.2GHz AMD Opteron 275 CPUs
    o InfiniPath InfiniBand HTX networking
• 2 Sun graphics workstations
    o 2 x 2.4GHz AMD Opteron CPUs
    o 4GB RAM
    o Nvidia FX3000 AGP graphics card connected to 1 LCD and 1 CRT monitor
    o Stereo graphics on the CRT via NUvision LCD panel/polarized glasses
• This is the main storage area for the Unix computers in the lab.
• It runs GPFS (General Parallel File System), set up by IBM.
• It is backed up to tape every night.
• For performance reasons, any task that involves heavy disk I/O should be performed on one of the Bioinformatics cluster nodes, such as compute00.
1. /gpfs1 runs a hierarchical storage management (HSM) system called Tivoli Storage Manager for Space Management.
2. A 3TB RAID array is used as the first level of storage.
3. As the RAID fills up, large files are migrated to tape:
    1. When filesystem usage exceeds 85%, files are migrated until usage is brought down to 75%.
    2. Files to be migrated are prioritized based on file size and last access time.
    3. Only files larger than 15MB are considered for migration.
4. Migrated files
    1. A file that has been migrated to tape is replaced by a 1MB "stub" file that appears just like the original file on disk but contains only the first 1MB of its content.
    2. As soon as a migrated file is accessed, the software automatically mounts the tape containing the full version of the file and transfers it back to the RAID disk. Other than a slight delay, this process is transparent to the user.
    3. To access a migrated file, all you have to do is wait for it to be retrieved automatically. It then remains on disk until available space runs low again.
    4. You can determine whether a file has been migrated to tape by running 'ls -lsh' on one of the cluster nodes:
        1. The first column shows the size of the file resident on disk, while the complete file size is shown in the column just before the date.
        2. In the example below, all 3 files are 1.1GB in total size, but the second and third files have been migrated to tape, so only 1.0MB of each remains on disk:
# ls -lsh *.dcd
1.1G -rw-r--r-- 1 youngmat young 1.1G Dec 10 14:39 dyn000000-100000-nowat-10ps.dcd
1.0M -rw-r--r-- 1 youngmat young 1.1G Feb 2 20:43 dyn000000-100000-prot-10ps.dcd
1.0M -rw-r--r-- 1 youngmat young 1.1G Feb 2 20:45 test1.dcd
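The migration rules and the 'ls -lsh' check described above can be sketched in Python. This is only an illustration, not how Tivoli Storage Manager actually works internally: the names FileInfo, pick_candidates, resident_and_total and looks_migrated are hypothetical, the ranking (least recently accessed first, then largest first) is an assumption because the real weighting of size and access time is not documented here, and 15MB is taken to mean 15 * 1024 * 1024 bytes.

```python
import os
from dataclasses import dataclass

# Thresholds quoted in the text above.
HIGH_WATER = 0.85           # migration starts above 85% usage
LOW_WATER = 0.75            # and continues until usage is down to 75%
MIN_SIZE = 15 * 1024 ** 2   # only files larger than 15MB (assumed binary MB)


@dataclass
class FileInfo:
    path: str
    size: int      # total file size in bytes
    atime: float   # last access time, seconds since the epoch


def pick_candidates(files, used, capacity):
    """Return the files a policy like the one above might migrate.

    ASSUMPTION: files are ranked by oldest access time, then by largest
    size. The real prioritization formula is not given in this document.
    """
    if used / capacity <= HIGH_WATER:
        return []                       # below the trigger, nothing to do
    target = LOW_WATER * capacity
    eligible = sorted(
        (f for f in files if f.size > MIN_SIZE),
        key=lambda f: (f.atime, -f.size),
    )
    chosen = []
    for f in eligible:
        if used <= target:
            break
        chosen.append(f)
        used -= f.size                  # simplification: ignores the 1MB stub
    return chosen


def resident_and_total(path):
    """Return (bytes resident on disk, total size), like 'ls -s' vs 'ls -l'."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux.
    return st.st_blocks * 512, st.st_size


def looks_migrated(resident, total):
    """Heuristic: big enough to be migrated, and less on disk than its size.

    Sparse files also match, so treat a True result as a hint only.
    """
    return total > MIN_SIZE and resident < total
```

On a cluster node, a call such as looks_migrated(*resident_and_total("test1.dcd")) would be expected to flag the second and third files in the listing above, since only about 1MB of each is resident on disk.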
• /gpfs1/username is the default login/home directory for all users on the clusters.
• On the BI cluster it is mounted via the GPFS protocol over the high-performance InfiniBand network.
• On the Biochem cluster it is mounted using NFS.
• Once again: for performance reasons, any task that involves heavy disk I/O should be performed on one of the Bioinformatics cluster nodes.
• should be mounted as /gpfs1 using the NFS protocol
• Files can be transferred on/off /gpfs1 using ssh or sftp to either
• This "share" should be mounted as the U:/ drive on lab PCs.