Quick User Guide

Disclaimer: This is a reference manual for RUPC users, describing the organization of the Beowulf cluster, the resources available and the Sun Grid Engine queuing system. We encourage users to spend a few minutes reading this quick manual to get the most out of the cluster. The quick user's guide for the previous system can be found here.

Cluster Access:

Cluster Structure (Hardware):

Cluster Structure (Software):

Queuing System

In order to run serial and/or parallel jobs on the cluster, you must prepare a job control file and submit it from any of the submit nodes (rupc03-09). Job control files are nothing more than shell scripts with additional directives specifying the queue, the number of CPUs, etc. We strongly encourage you to use the job script templates given below.
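As an illustration only (not one of the official templates), a minimal SGE job script could look like the sketch below. The queue name, parallel environment name and executable are placeholders and must be adapted to the actual queue list and to your code:

    #!/bin/bash
    #$ -N my_job           # job name
    #$ -cwd                # run the job from the directory it was submitted from
    #$ -q all.q            # queue name (placeholder; pick one from the activated queues below)
    #$ -pe mpi 8           # parallel environment and slot count (names are assumptions; serial jobs can omit this line)
    #$ -j y                # merge stderr into stdout
    #$ -o my_job.log       # output file

    echo "Running on $(hostname) with $NSLOTS slots"
    mpirun -np $NSLOTS ./my_program    # replace with your own executable

Submit the script with "qsub my_job.sh", monitor it with "qstat -u $USER", and remove it with "qdel <jobid>".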

Currently activated queues (2019/09/24):
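The set of activated queues changes over time; you can always query the live configuration from any submit node with standard SGE commands:

    qconf -sql      # list the names of all configured queues
    qstat -g c      # summary of load and used/free slots per cluster queue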

Backup:

Snapshots of the home directory are taken every 5 hours. Hourly snapshots are available on any login node in the directory /snapshot/hour; daily snapshots are available on any login node in the directory /snapshot/day.
Hourly snapshots are complete images of your home directory taken at 7am, 12pm, 5pm and 10pm. Daily snapshots are usually taken at 3am every night.
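If you accidentally delete or overwrite a file in your home directory, you can copy an older version back from a snapshot. A minimal sketch (the file name and the exact layout under /snapshot are illustrative; browse the directories to locate your files):

    ls /snapshot/hour /snapshot/day          # see which snapshots are available
    cp /snapshot/hour/.../myfile ~/myfile    # copy the deleted file back (illustrative path)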

Daily snapshots of the work directory are available on any login server in the directories /mnt/swkXX/, where XX is the number of your work directory folder (to find your XX, list the /work directory with "ls -l /work" and look for your name).
Example: /mnt/wk19/user/ means XX=19, i.e. the backup directory is /mnt/swk19.

Note:
Snapshot directories are read-only, i.e. you cannot modify files there! You can only copy deleted files back from them.
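For example, to recover a file from a work-directory snapshot (the user name, XX=19 and file name are taken from the example above and are illustrative only):

    ls -l /work                                     # find the wkXX entry that carries your user name
    cp /mnt/swk19/user/lostfile /mnt/wk19/user/     # copy the file back into the writable work directory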

Storage Policy:

Home directory usage should not exceed        50 GB.
Work directory usage should not exceed       1.5 TB.
Storage directory usage should not exceed   500 GB  [all data should be in tar and gzipped form].
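Since the storage area only accepts tarred and gzipped data, pack directories before moving them there. A minimal sketch (the directory name and the storage path are placeholders; use your actual storage directory):

    tar -czvf results_2024.tar.gz results_2024/    # create a compressed archive
    mv results_2024.tar.gz /storage/...            # move it to your storage directory (path is a placeholder)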

Cluster Usage Policy:

Common rules:
There are three sub-clusters, which can be used by most users (see the example after this list):
 I) Physics (located at Physics Rm#284, common usage)
     1) can be accessed from rupc04
     2) can also be accessed from other rupc frontend nodes by typing "phys"

 II) CORE  (located at CoRE building)
     1) can be accessed from rupc02, rupc06, rupc08, rupc09
     2) can also be accessed from other rupc frontend nodes by typing "core"

III) CORE2  (located at CoRE building)
     1) can be accessed from rupc05, rupc07
     2) can also be accessed from other rupc frontend nodes by typing "core2"
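For example, from any rupc frontend node you can switch to a sub-cluster and then inspect its queues before submitting (qstat is a standard SGE command; the rest follows the list above):

    phys          # switch to the Physics sub-cluster environment
    qstat -g c    # list the queues and used/free slots of the selected sub-cluster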

Although most users have a very high limit on the number of cores they can use simultaneously, users are expected to be moderate and not consume a disproportionate amount of computer time. If you need more computer time, you should request special permission, which will be granted if resources are available.



There are three groups of users allowed to run jobs on the cluster. Users are grouped according to their supervisor.
  1. Profs. K. Rabe and D. Vanderbilt: jobs can be run on the Physics and CoRE2 sub-clusters. Policy (very important, must read)
  2. Profs. G. Kotliar and K. Haule: jobs can be run on the Physics and CoRE sub-clusters. Policy (very important, must read)
  3. Prof. J. Pixley: jobs can be run on the Physics and CoRE (queue "jed") sub-clusters. Policy (very important, must read)

And finally:

Notes:


Copyright © 2024, HPC, RUPC, Rutgers, The State University of New Jersey.