Basic Mill Cluster Usage

The first and most basic thing to remember when using the Mill is that you will not be running your workflow on the login node where you first land. The login node is there for you to do basic file operations and construct your resource requests. The actual work, whether interactive or scheduled with a script, happens on the compute nodes. You get from the login node to the compute nodes by using the Slurm scheduler. The following sections explain different ways to use the scheduler, how to check on your work as it runs or waits in the queue, and the basics of keeping an eye on your storage use.

Interactive Usage 

It is recommended that, before submitting a script for a large job, you test your workflow interactively on a reduced problem size. This helps prevent wasted time if your first attempts at writing a batch script fail. Using the cluster "interactively" just means that you are running the commands or software yourself rather than submitting a script that is scheduled and runs unattended. This can be done using a shell on a compute node or using a desktop session in OpenOnDemand (further info below).

The best way to get an interactive session on a compute node is with the “salloc” command. This command has many options to allow you to request exactly the resources you need, but for just the basics you can use the following as a template: 

salloc -c 8 --mem 16G --time 60 -p requeue

This requests 8 CPU cores and 16 GB of RAM for 60 minutes in the requeue partition. You can change the numbers to suit your needs, and explore additional options like GPUs or multiple nodes later if your work requires them. (https://docs.itrss.umsystem.edu/pub/hpc/mill#submitting_a_job)
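As a sketch of how those additional options look, the request above could be extended to ask for a GPU. The `--gres` syntax is standard Slurm, but the GPU partition name used here is an assumption; check the Mill documentation for the actual partition names.

```shell
# Hypothetical GPU request; the "gpu" partition name is an assumption
salloc -c 8 --mem 16G --time 60 -p gpu --gres=gpu:1

# Once the allocation is granted, confirm you are on a compute node:
hostname

# Release the allocation (and the GPU) when you are finished:
exit
```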

Scheduled Usage 

Once you have confirmed that your workflow is functioning properly in an interactive session, you may want to submit a script that will run unattended on a larger problem or dataset. The options for this type of request are very similar to those for an interactive session, but the way you specify them is somewhat different.

When submitting a scheduled job, you put your request into a script called a "batch file" that contains both your resource requests and the commands you wish to run. A few extra options are also useful for this type of job, like giving your request a name and specifying where the output will go. More advanced options allow you to set an email address to notify you when the job starts and finishes, or even to schedule multiple steps that depend on one another. (https://docs.itrss.umsystem.edu/pub/hpc/mill#submitting_a_job)
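As a minimal sketch, a batch file mirroring the earlier interactive request might look like the following. The job name, output filename, and the commands at the bottom are placeholders for your own workflow, not Mill-specific requirements.

```shell
#!/bin/bash
#SBATCH --job-name=myjob        # a name for your request (placeholder)
#SBATCH --output=myjob_%j.out   # where output goes; %j expands to the job ID
#SBATCH -c 8                    # 8 CPU cores, as in the interactive example
#SBATCH --mem=16G               # 16 GB of RAM
#SBATCH --time=60               # 60 minutes
#SBATCH -p requeue              # the requeue partition

# Placeholder commands; replace with your actual workflow
echo "Running on $(hostname)"
./my_analysis --input data/     # hypothetical program and arguments
```

You would submit this with `sbatch myjob.sh` and can watch its progress in the queue with `squeue --me`.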

OpenOnDemand 

Another way you can use the Mill is through a web interface called OpenOnDemand. This provides a somewhat more user-friendly way to do things like file management, job monitoring, or even getting a terminal session on the login node. It also allows you to do some things that you can't do otherwise, like using a remote desktop session for GUI programs. If you want to use tools like VS Code or Jupyter Notebook, OOD is also the easiest way to get started with those. An additional benefit of OOD is that you do not need a VPN connection or SSH keys to use it from off-campus.

OOD Link: https://mill-ondemand-p1.itrss.mst.edu/ 

Managing Your Storage 

Each user gets a 50 GB home directory quota upon being approved for an account on the Mill. This is the folder where your sessions will start by default, and it has the same contents whether you are on the login node or a compute node. Staying within the 50 GB quota is your responsibility. To facilitate this, we have installed a program called "ncdu" on the login node specifically. This program allows you to interactively explore your data usage and see visually which folders are the largest. To use this program in leased storage, simply "cd" to that directory first, then run the "ncdu" command.
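For a quick non-interactive check alongside ncdu, the standard `du` utility can summarize your usage from any node. The leased-storage path in the comment below is a placeholder for your group's actual directory.

```shell
# One-line summary of how much your home directory currently uses
du -sh "$HOME"

# On the login node, ncdu gives an interactive, sortable breakdown instead;
# for leased storage, change into that directory first:
#   cd /path/to/leased/storage && ncdu
```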

There is a second, less obvious storage quota on the number of files in your storage. Due to the way filesystems work, each file you create consumes a resource called an "inode". The entire filesystem has a finite number of these, so we also assign a quota on inodes in your home directory and leased storage directories. Most people will never hit these limits, but if you have many small files, it is possible. If you are under your storage quota but still getting quota-related errors, please submit a ticket (help.mst.edu) and we can see whether the issue is inodes.
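One way to check whether a large file count is the culprit is simply to count the entries under a directory, which is a rough proxy for inode use. This is a generic sketch using standard tools, not a Mill-specific quota command.

```shell
# Count everything under your home directory; each file, directory,
# and symlink consumes one inode
find "$HOME" -xdev 2>/dev/null | wc -l

# df -i reports inode usage at the whole-filesystem level, which may
# differ from your personal quota but shows the same kind of number
df -i "$HOME"
```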