Using CASPUR
This page is maintained by the software tutors. For errors and/or amendments please contact the current tutor supporting the program/service.
Introduction
CASPUR is a consortium located in Rome providing high performance computing (HPC) facilities. Recently, CASPUR set up a new HPC cluster called “matrix”, to which the EUI members have full access.
Cluster Description
“Matrix consists of a front-end node (accessed as matrix.caspur.it) intended for user login and job management, free interactive nodes (accessed as mrsmith.caspur.it) for users code-testing, and several number-crunching nodes (from neo001 to neo320). All the nodes are 2-way quad-core Opteron 2.1GHz with 16 GB of RAM, except number-crunching nodes from neo233 to neo258 which are equipped with 32GB of RAM. The number of available nodes may change in time for several reasons (reservations for special projects, maintenance and tests, hardware failures, new acquisitions, etc. etc.)” (source: hitchhikers-guide-to-the-matrix).
Available Software (among others)
| Software | Version |
| GAUSS | 8.0 |
| Matlab | 2009b |
| R | 2.10.0 |
| Stata | 11 |
| Compilers for C, C++ and Fortran | Gnu-4.4.3, Intel-11.1.064 and Pgi-10.3, each along with openmpi-1.4.1 |
The HPC portal
A crucial resource for your work with matrix is the HPC portal. In the section “accounting” (which appears only after login) you can see to which project group(s) you are allocated. Usually, EUI users are allocated to the project “aej”. Log in to the HPC portal with your CASPUR credentials.
Further Information
This guide is a rough quick-start reference that gives a first overview of the matrix cluster. It is not a substitute for the thorough guide to the matrix provided on the HPC portal. Before you start your first session, we strongly recommend reading through the hitchhikers-guide-to-the-matrix: http://hpc.caspur.it/guides/the-hitchhikers-guide-to-the-matrix .
Please contact the CASPUR software tutor for any further questions.
Quickstart: How to connect
Windows (XP, Vista, 7, ...)
For Microsoft Windows, things are a bit tricky. These Windows versions do not come with a built-in SSH client, so you have to download a client from the Internet and install it. As a shell client, we recommend PuTTY.
- Download PuTTY
- Execute putty.exe
- Log in to the server matrix.caspur.it
- Click “open” and confirm the security alert
- Type in your user-name and password
- You're in the matrix! You now have a prompt that resembles the one in a Unix/Linux environment. To continue you need some basic knowledge of Linux commands, as provided below.
- Go to your project folder by typing cd /work/aej/yourusername
- To close the session, type exit
Mac
- Open a terminal
- Type ssh -l yourusername matrix.caspur.it
- Enter your password
- You're in the matrix!
- Go to your project folder by typing cd /work/aej/yourusername
- To close the session, type exit
Linux
- Open a terminal
- Type ssh -l yourusername matrix.caspur.it
- Enter your password
- You're in the matrix!
- Go to your project folder by typing cd /work/aej/yourusername
- To close the session, type exit
File Transfer
This section describes how to transfer files between your local computer and the remote server. The matrix cluster has three storage areas:
- /home: a small area (around 200MB) where customisation files are typically stored. It is organised on a per-user basis.
- /work/aej/: a large area where you can organise and run your jobs, store output files etc. This area is organised on a per-project-group basis. EUI users are allocated to the project group aej.
- /scratch: a large area designed to store temporary files generated by your running codes and jobs. This area is organised on a per-user basis but does not have a backup. It is subject to an automatic cleaning policy to free space; the bulldozer script deletes everything older than 14 days.
For details, please see the section “storage areas” in the hitchhikers-guide-to-the-matrix.
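To see which of your files in /scratch are close to being removed by the automatic cleaning, a check along the following lines may help. This is only a minimal sketch: it assumes that the cleaning policy looks at the modification time of a file, which is not stated here, and the function name list_old_files and the example path are hypothetical.

```shell
#!/bin/sh
# List regular files below a directory whose last modification lies more
# than a given number of days in the past (default: 14, the limit quoted above).
# Assumption: the cleaning policy is based on the modification time.
list_old_files() {
    dir=$1
    days=${2:-14}
    find "$dir" -type f -mtime +"$days" -print
}

# Example call (hypothetical path; adjust it to your own scratch directory):
# list_old_files /scratch/yourusername
```

Files that you still need should be copied to your project folder under /work/aej/ in good time.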
Windows
To exchange files, you need a client that supports the SFTP or SCP protocol. We recommend WinSCP.
- Download WinSCP
- Log in to matrix.caspur.it (make sure that SFTP is selected)
- Confirm the security alert with “yes”
- To change directory on the remote server, select “Remote” - “Go to” - “Open Directory”, or select the right panel and press ctrl + o. Type in the directory you want to change to, e.g. /work/aej/yourusername (make sure to use a slash “/” instead of a back-slash “\”)
Mac
As a file transfer client for Mac, we recommend Fugu. The Fugu documentation says:
“Fugu has been tested on Mac OS X 10.2.x. It may work on Mac OS X 10.1. You must also have the BSD subsystem installed, of which OpenSSH's sftp client is a part. This is included with the default installation of Mac OS X. Mac OS X 10.0 is not supported.”
- Download Fugu
- Connect to matrix.caspur.it
- Browse files
Linux (Ubuntu with file-browser Nautilus)
- Open the file browser (Nautilus)
- Switch to the “Go To” view: use the key combination ctrl + L
- In the “Go-to” input box type sftp://matrix.caspur.it
- Insert your username and password; then select “connect”
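On Mac and Linux, files can also be transferred from the terminal with scp, which is part of OpenSSH. The following is a minimal sketch: the user name and file names are placeholders, and remote_target is a hypothetical helper that only builds the user@host:path argument. The actual scp calls are shown as comments because they need a network connection and your password.

```shell
#!/bin/sh
# Placeholders: replace yourusername with your CASPUR user name.
REMOTE_USER=yourusername
REMOTE_HOST=matrix.caspur.it
REMOTE_DIR=/work/aej/$REMOTE_USER

# Print the remote location in the user@host:path form that scp expects.
remote_target() {
    printf '%s@%s:%s\n' "$REMOTE_USER" "$REMOTE_HOST" "$REMOTE_DIR/$1"
}

# Upload a local file to the project folder:
#   scp myjobscript.sh "$(remote_target '')"
# Download a result file into the current local directory:
#   scp "$(remote_target outputfile.out)" .
```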
Quick Intro to basic Linux commands
| Command | Description |
| passwd | Change your user password: 1) type old password, 2) insert new password, 3) confirm new password |
| ls | List folders and files in the current directory |
| cd | Change directory. Examples: cd .. (go to parent directory), cd / (go to root directory), cd ~ (go to home directory) |
| ~ | Home directory of the user; short-cut for your personal directory under /home |
| / | Root directory |
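A short practice session may help to get used to these commands. The sketch below works in a throw-away directory, so nothing in your own folders is touched. Note that mktemp, mkdir, touch and pwd are not part of the table above; they are used here only to set up and inspect the example.

```shell
#!/bin/sh
# Create a temporary practice directory and move into it.
practice=$(mktemp -d)
cd "$practice"

# Create one sub-directory and one empty file to have something to list.
mkdir results
touch notes.txt

ls            # lists notes.txt and results
cd results    # enter the sub-directory
cd ..         # go back to the parent directory
pwd           # prints the path of the practice directory
cd ~          # go to your home directory
```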
Serial vs. multi-threaded vs. parallel jobs
A serial job runs on one node with one processor only. For EUI users who plan to run small standard Matlab, Gauss, R, and Stata jobs, this is the way to go. At present, it is not possible to run parallel jobs with the software mentioned.
A parallel job uses several nodes and processors (CPUs) at the same time. A parallel job requires special parallel modules that manage the distribution of the tasks across the nodes. Parallel jobs can be submitted using Fortran or C++. In case you need to run such a job, please contact the CASPUR software tutor.
A multi-threaded job uses several processors on one node. Some software, such as Matlab, uses multi-threading intrinsically, for example to execute element-wise operations; see also “Which MATLAB functions benefit from multithreaded computation?”. Remember that matrix uses quad-core processors, so if you assign one CPU to a job, Matlab implicitly uses all four cores.
Important
Please contact the CASPUR tutor in order to specify the optimal number of nodes and CPUs for your jobs. As the available hours on CASPUR are measured in CPU hours, the resources consumed increase in proportion to the number of CPUs and nodes you use. An example: a 12-hour job with one node and one CPU consumes 12 CPU hours. A 12-hour job with one node and 2 CPUs consumes 24 CPU hours. Please bear this in mind when choosing the number of nodes and CPUs for your jobs!
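The arithmetic of the example can be reproduced with a few lines of shell. This is a sketch under the assumption, inferred from the example above, that the consumption is the walltime in hours multiplied by the number of nodes and by the number of CPUs per node; the exact accounting rules are those of CASPUR. The function name cpu_hours is hypothetical.

```shell
#!/bin/sh
# CPU hours = walltime in hours x number of nodes x CPUs per node.
# Assumption: this formula is inferred from the example in the text.
cpu_hours() {
    hours=$1
    nodes=$2
    ppn=$3
    echo $(( hours * nodes * ppn ))
}

cpu_hours 12 1 1   # prints 12
cpu_hours 12 1 2   # prints 24
```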
Write a jobscript
- Open an empty text file
- Copy and paste the sample jobscript for your application from the section “Sample Jobscripts (Serial Jobs)”
- Set the options for your job:
  - Set the number of nodes and processors; in a serial job that is one node and one processor.
#PBS -l nodes=1:ppn=1
#PBS -l walltime=24:00:00
The walltime specifies the maximum expected time for the execution of the job. Do not allocate too few hours, but also do not allocate 24 hours when the job clearly takes only five minutes. With time, you will get a good feeling for it. The maximum walltime for a serial job is 72 hours. For a table of the available system queues, see http://hpc.caspur.it/guides/the-hitchhikers-guide-to-the-matrix/resource-manager-and-job-requests#system-queues
  - Specify the project to which you belong; EUI users are allocated to the project “aej”.
#PBS -A aej
  - Insert an email address for notifications about the job.
#PBS -m abe -M youremailaddress@eui.eu
- Save the file with the file extension .sh
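Instead of a text editor, the jobscript can also be written directly from the shell. The following sketch puts together the options described in the steps above. The file name, the email address and the final echo line are placeholders to be replaced with your own values and commands.

```shell
#!/bin/sh
# Write a minimal serial jobscript to the file myjobscript.sh.
# The quoted 'EOF' keeps the shell from expanding anything inside the text.
cat > myjobscript.sh <<'EOF'
#!/bin/sh
#PBS -l nodes=1:ppn=1
#PBS -l walltime=24:00:00
#PBS -A aej
#PBS -m abe -M youremailaddress@eui.eu
echo "replace this line with the commands of your job"
EOF
```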
Submit a serial job
- Write a jobscript (for sample scripts see below), e.g. myjobscript.sh
- Copy the jobscript to your work directory (/work/aej/yourusername/somesubfolder) on matrix as described in the section “File Transfer” above
- Windows: switch to PuTTY (also included in WinSCP) and log in
- Mac/Linux: switch to the terminal and log in
- Change to the directory where the jobscript is located
- Submit the job by typing:
qsub myjobscript.sh
- The job is now submitted. For further commands to control the job (cancel etc.), please see the hitchhikers-guide-to-the-matrix.
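The submission step can be wrapped in a small function that catches two common mistakes: a wrong file name, and running the command on a machine where qsub is not available. This is only a sketch; submit_job is a hypothetical helper and not part of the PBS software.

```shell
#!/bin/sh
# Submit a jobscript after two basic checks.
# Assumption: submit_job is a hypothetical helper, not a PBS command.
submit_job() {
    script=$1
    # Check that the jobscript exists in the given location.
    if [ ! -f "$script" ]; then
        echo "jobscript not found: $script" >&2
        return 1
    fi
    # Check that qsub is available, i.e. that we are on the cluster.
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; are you logged in to matrix?" >&2
        return 2
    fi
    qsub "$script"   # prints the identifier of the submitted job
}

# Example call, run in the directory that contains the jobscript:
# submit_job myjobscript.sh
```

On PBS systems, qstat and qdel are the usual commands for listing and cancelling jobs; please check the guide for how they are used on matrix.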
Sample Jobscripts (Serial Jobs)
Matlab
#!/bin/sh
### set options for qsub ###
#PBS -l nodes=1:ppn=1,walltime=48:00:00 -A aej
### notify options, specify list of email addresses, i.e. xxx1@yyy.zz,xxx2@yyy.zz, ... ###
#PBS -m abe -M youremailaddress@eui.eu
### Script to run a single serial Matlab job ###
#### set your environment ####
module load matlab
## working directory ###
WORKDIR=/work/aej/yourusername/yourdirectory
#### begin commands ####
cd "$WORKDIR"
# Run the Matlab commands in the script file my_matlab_script_file.m
# Note: do not type the file extension (.m)!
# The script should end with exit (or quit) so that Matlab terminates when it has finished
matlab -nojvm -nodisplay -r my_matlab_script_file
GAUSS
#!/bin/sh
#PBS -l nodes=1:ppn=1,walltime=4:00:00 -A aej
#PBS -m abe -M youremailaddress@eui.eu
# Script to run a serial GAUSS job
#### set your environment ####
module load GAUSS/8.0
## working directory ###
WORKDIR=/work/aej/yourusername/yourdirectory
#### begin commands ####
cd "$WORKDIR"
# Run the GAUSS commands in the file myGAUSSscript.g
# Alternative (reads the script from standard input):
# tgauss < myGAUSSscript.g
tgauss myGAUSSscript.g
R
#!/bin/sh
#PBS -l nodes=1:ppn=1,walltime=24:00:00 -A aej
#PBS -m abe -M youremailaddress@eui.eu
# Script to run a single serial R job
#### set your environment ####
module load R
## working directory ###
WORKDIR=/work/aej/yourusername/yourdirectory
#### begin commands ####
cd "$WORKDIR"
# Run R
R --no-save < myRscript.R > outputfile.out
Stata
#!/bin/sh
#PBS -l nodes=1:ppn=1,walltime=24:00:00 -A aej
#PBS -m abe -M youremailaddress@eui.eu
# Script to run a single serial stata job
#### set your environment ####
module load stata
## working directory ###
WORKDIR=/work/aej/yourusername/yourdirectory
#### begin commands ####
cd "$WORKDIR"
# Run do file
stata -b do yourStataInputFile.do
Documentation