2. General Information
2.1. How to log into Svante
To log in to the Svante cluster, from a terminal window on your local computer, type:
ssh -Y «username»@svante-login.mit.edu
A password will be requested; use your athena password (your svante «username» above is the same as your athena «username»).
After you log in, you should receive a terminal prompt with your local working directory set to /home/«username»; you will be logged into the Svante cluster login (or “head”) node svante-login. See Section 6 for discussion of proper uses of the head node, file server nodes, and compute nodes.
2.2. /home spaces
/home/«username»/: we have about 100 terabytes (TB) of total ‘home space’ for general-purpose usage: source code, plots and figures, model builds, etc.
Every svante user is given disk space here; the quota on home space is 500 gigabytes (GB) per user.
/home is backed up daily (offsite) and protected from disk failure via a RAID array. Svante /home space is mounted on all nodes in the cluster. Home space is not intended as a repository for large (or large numbers of) data files for analyses, or to be used as disk space for large model runs.
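To see how much of your 500 GB home quota you are using, one generic check (a simple disk-usage scan, which may take a while on a large directory tree; the cluster may also provide dedicated quota-reporting tools) is:
du -sh /home/«username»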
2.3. File servers
Svante file servers are named fsxx (see Table 2.1), currently fs01-fs11, with a total capacity of roughly 3 PB at present.
To get onto a file server node, for example, typing
ssh fs02
(from svante-login or any other node in the Svante cluster) will give you a shell on file server fs02.
Once you ssh to a file server, local disks should be accessed as /d0, /d1, /d2, and /d3; note that not all partitions are present on all file servers. From all other nodes, this space can be reached through ‘remote’ mounts, in which your access path would be, for example, /net/fs02/d0 to access the /d0 partition on fs02. Storage on these file servers is for runs, experiments, downloaded data, etc. that will be accessed and kept for longer-term periods, and these spaces are backed up (weekly offsite backup, although in periods of heavy use, it may take longer for backups to complete).
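For example, the same space can be reached both ways (the «username» subdirectory under /d0 is illustrative; actual directory names depend on how your group’s space was allocated):
ssh fs02
ls /d0/«username»            # local path while logged into fs02
exit
ls /net/fs02/d0/«username»   # same directory via the remote mount, from any other node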
These machines can also be accessed externally (i.e. from outside svante): fs02 is svante2.mit.edu, fs03 is svante3.mit.edu, etc. (see Table 2.1). User directories of various sizes, organized by research group, will be created and allocated by Jeff on a project-based, as-needed basis. These disk spaces are reserved for research-related work; they are not intended as “personal” storage spaces, and any such use will not be tolerated.
2.4. Compute nodes
Svante includes a large pool of compute nodes, which comprise the main computational engine of the cluster, interconnected through a fast InfiniBand network.
Users cannot directly ssh to log into compute nodes, but instead must go through the SLURM scheduler, see Section 4 (the exception to this is that if the user has a running job on a compute node, ssh to that specific node is permitted).
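For example, to find which node(s) a running job of yours is on before connecting (the node name below is illustrative):
squeue -u «username»    # the NODELIST column shows where your job is running
ssh c061                # permitted only while your job is running on c061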
Users can also access local disk space (/scratch) on any given compute node, as limited by the size of its disk (see Table 2.1).
Every user has unlimited storage in these spaces, the caveat being that there is no safeguard against you or another user filling up a local compute node disk.
These spaces are not backed up, and any files left longer than six months will be deleted without notice.
Local scratch spaces can also be accessed (say, from svante-login or a file server node) through remote mounts /net/«cxxx»/scratch, where «cxxx» denotes the compute node name (see Table 2.1), although not all remote scratch disks may be available to other compute nodes (i.e. on a compute node, use the local scratch disk, not another compute node’s scratch space via a /net mount).
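For example, after a job that wrote output to /scratch on node c045 has finished, you could pull the results back to file server space from svante-login (the run and destination directories here are illustrative):
cp -r /net/c045/scratch/«username»/run01 /net/fs02/d0/«username»/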
2.5. Archive space
Server fs01 contains two archive spaces. /data holds downloaded external data, e.g. reanalysis data, CMIP data, etc. Let us know if you require external data sets to be stored on svante; if we think such data might benefit general users, we would be happy to store them here. /archive holds the file server spaces of users who are no longer actively working at MIT; these spaces may be compressed or uncompressed, depending on the data format and the likelihood that access will be required. Our general policy is to maintain users’ data in perpetuity.
Current svante users do not have individual spaces on fs01, nor can users ssh to fs01 to use this server for computational analysis purposes.
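Assuming fs01’s partitions are exported via the same /net remote-mount scheme as the other file servers (an assumption based on the mount pattern above; check with the administrators if the path does not exist), the shared datasets could be browsed from any node with, e.g.:
ls /net/fs01/data    # path assumes the usual /net/«server»/«partition» mount pattern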
2.6. Svante node specs
Table 2.1: Svante node specs

| Node(s) | External Name | # Cores | RAM (GB) | CPU Arch. | CPU | Speed (GHz) | IB | Partition | Size (TB) | Usage |
|---|---|---|---|---|---|---|---|---|---|---|
| svante-login | svante-login | 16 | 64 | broadwell | E5-2609 v4 | 1.70 | EDR | | | login node |
| c041 - c060 | | 16 | 64 | sandy bridge | E5-2670 | 2.60 | FDR | /scratch | 2 | compute nodes |
| stooges | | 24 | 128 | haswell | E5-2680 v3 | 2.50 | FDR | /scratch | 4 | compute nodes |
| c061 - c096 | | 32 | 128 | broadwell | E5-2697A v4 | 2.60 | EDR | /scratch | 4 | compute nodes |
| fs01 | | 20 | 512 | xeon scalable | Silver 4210 | 2.20 | EDR | /data | 570 | external datasets |
| | | | | | | | | /archive | 570 | old user archive |
| fs02 | svante2 | 16 | 96 | sandy bridge | E5-2660 | 2.20 | FDR | /d0 | 60 | Solomon |
| | | | | | | | | /d1 | 40 | Heald |
| | | | | | | | | /d2 | 40 | CGCS + JP |
| | | | | | | | | /d3 | 40 | Solomon |
| fs03 | svante3 | 24 | 512 | xeon scalable | Silver 4214R | 2.40 | EDR | /d0 | 250 | Selin |
| | | | | | | | | /d1 | 120 | Selin |
| fs04 | svante4 | 16 | 128 | broadwell | E5-2620 v4 | 2.10 | EDR | /d0 | 100 | Land Group |
| | | | | | | | | /d1 | 100 | Land Group |
| | | | | | | | | /d2 | 255 | Land Group + JP |
| fs05 | svante5 | 4 | 24 | nehalem | W3530 | 2.80 | FDR | /d0 | 40 | Land Group |
| | | | | | | | | /d1 | 120 | Land Group |
| fs06 | svante6 | OFFLINE | | | | | | | | |
| fs07 | svante7 | 16 | 96 | sandy bridge | E5-2660 | 2.20 | FDR | /d0 | 80 | Land Group |
| | | | | | | | | /d1 | 70 | Land Group |
| fs08 | svante8 | 16 | 128 | ivy bridge | E5-2640 v2 | 2.00 | FDR | /d0 | 150 | Marshall |
| fs09 | svante9 | 12 | 128 | ivy bridge | E5-2620 v2 | 2.10 | FDR | /d0 | 50 | Marshall |
| | | | | | | | | /d1 | 65 | Marshall |
| | | | | | | | | /d2 | 70 | Marshall |
| fs10 | svante10 | 16 | 128 | ivy bridge | E5-2650 v2 | 2.60 | FDR | /d0 | 110 | Marshall |
| | | | | | | | | /d1 | 110 | JP |
| | | | | | | | | /d2 | 110 | JP |
| fs11 | svante11 | 24 | 512 | broadwell | E5-2650 v4 | 2.20 | EDR | /d0 | 150 | Selin |
| | | | | | | | | /d1 | 150 | Selin |
| geoschem | geoschem | 12 | 32 | ivy bridge | E5-2620 v2 | 2.10 | FDR | /data | 166 | geoschem data |
Notes:

- At present, all svante nodes’ CPUs use Intel architecture.
- Svante’s operating system is Linux and by default terminal sessions use the Bash shell, although we maintain limited legacy support of the C shell.
- Nodes with external names can be accessed from outside the Svante cluster, e.g.
  ssh -Y «username»@svante2.mit.edu
  would log you into fs02.
- From inside Svante, it is possible to ssh to compute nodes as well as file servers (as discussed above), but for compute nodes only if the user has a SLURM job running concurrently on that specific node.
- The stooge nodes – curly, shemp, moe, and larry – do not follow the cxxx compute node naming convention, but are in the same partition as the other FDR nodes c041-c060. Until this is changed, requesting stooge nodes to the exclusion of other FDR nodes is a bit awkward; one can, however, always request a specific node by name using the SLURM -w option (see Section 4.2 and the example after these notes).
- Although clock speeds are fairly similar across all nodes, in terms of general speed, nehalem < sandy bridge < ivy bridge < haswell < broadwell < xeon scalable. Your job might run 40% faster on c070 than on c045, for example. There have been improvements in CPU efficiency and RAM speed over the years, which translate into faster calculations and will typically improve your job’s efficiency.
- InfiniBand (IB) is a super-fast network connection, running in parallel with the standard ethernet connection. Note, however, that although the FDR and EDR switches are interconnected, MPI jobs cannot span FDR and EDR compute nodes.
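For example, to submit a batch job to one of the stooge nodes by name using the -w option mentioned above (myjob.slurm here stands in for your own batch script):
sbatch -w curly myjob.slurm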