Motivations for Changing the Mill's Cost Model
Pricing Estimator Note:
The pricing estimator provides only an estimate. Double-check the cost of your jobs! Also note that for array jobs, the price estimate covers a single job in the array, not the whole array!
The Status of the Mill
Of the more than 250 physical nodes that make up the Mill, only 26 remain under warranty. Thirty-two of the nodes came from the Forge, which was our production supercomputer two generations ago. Of the 34 GPUs in the Mill, 26 are legacy NVIDIA V100s. The V100s have been EOL for over a year, and NVIDIA is ending driver and CUDA support for them soon. Once this happens, the V100s will be removed from service. This will leave only the 8 H100 GPUs to serve the needs of the entire S&T campus. Our hardware is failing at an unsustainable rate; even when we have the necessary parts available, repairs currently take nearly a full staff member's time. As more hardware fails outside of the warranty period, there will be less hardware for use by the campus community.
What You Can Do to Help
Researcher-Funded Hardware: If the Mill is important to your workflow, please consider including hardware funding in your grant proposals. We can host researcher-funded hardware in the Mill as leased nodes. RSS is happy to partner with you to assist in the development of such proposals. Please reach out to us at itrss@mst.edu.
Citing the Mill: Citing the Mill helps us quantify the need for HPC resources on campus. Proof of utilization of the Mill in publications helps us demonstrate our value to the university and justify internal funding. The Mill can be cited using the following DOI: https://doi.org/10.71674/PH64-N397.
Pricing Model Changes
The Mill's previous cost model was built with the assumption of periodic funding for maintenance or replacement of hardware and facilities. The level of actual financial investment has not been sufficient to sustain the Mill, which necessitates a change.
This new cost model does not solve the problem of funding for replacement hardware. Replacement-level funding would require higher prices than we are charging, and would need to have started when the hardware was originally purchased. This funding, like the previous priority tier funding, is meant to provide a modest budget for replacing failing hardware in an attempt to prolong the usable life of the Mill.
Current Pricing Model
General Access Tier
- No cost
- Requeue or General partitions
- Limited access to all hardware
- Max runtime of 2 days
Lease Tier
- Payment per node per year
- Full access to leased nodes
- Only wait on your lab group's jobs
- Max runtime of 28 days
New Pricing Model - Three Tier
In broad strokes, the new pricing model will convert the general partition to a pay-as-you-go model. Requeue will remain available to all users, with a lower max wall time. Leased nodes will remain largely the same. More details below:
The Mill will have three tiers for pricing:
- Opportunity (partition: requeue)
  - Available at no cost, but with limitations
  - Jobs are subject to preemption and requeueing
  - 24-hour max runtime
  - Similar to AWS Spot instances - filling unused resources
  - Ideal for testing and small-scale or pre-award work
- Curiosity (Pay-as-you-go partitions: std-cpu, std-v100, std-h100)
  - Increased 2-week runtime and elevated priority
  - Price per resource - pay only for what you use
  - Requires an MOU and MOCode on file
  - Similar to AWS On-Demand instances
  - Ideal for short bursts of expedited work
  - Only compete for resources with others in this tier
  - More cost-effective than leasing a node in many cases
- Perseverance (leased partitions)
  - Similar to AWS Reserved Instances
  - Ideal for prolonged, consistent computations
  - Lease entire nodes on a one-year cycle
  - Max runtime of 28 days
  - Only compete for resources with others in your group
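As a quick planning aid, the runtime limits listed above can be written down as a small lookup. This is only an illustrative sketch: the dictionary and the `eligible_tiers` helper are hypothetical names, not part of any Mill tooling, and the check looks at the maximum runtime alone (it ignores cost, preemption, and whether you hold a lease or have an MOU on file).

```python
# Maximum runtime per tier in hours, taken from the tier list above.
# The names below are hypothetical and exist only for this illustration.
MAX_RUNTIME_HOURS = {
    "Opportunity": 24,        # 24-hour max runtime
    "Curiosity": 14 * 24,     # 2-week max runtime
    "Perseverance": 28 * 24,  # 28-day max runtime
}


def eligible_tiers(runtime_hours):
    """Return the tiers whose max runtime can fit a job of the given length."""
    return [
        tier
        for tier, limit in MAX_RUNTIME_HOURS.items()
        if runtime_hours <= limit
    ]


# A 3-day (72-hour) job exceeds the Opportunity limit.
print(eligible_tiers(72))
```

A job that fits within 24 hours is eligible for all three tiers by this check, so the choice then comes down to cost and tolerance for preemption.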
Curiosity Tier Pricing
| Resource | Cost/hour | Hours/$ |
| CPU (Core) | $0.00375 | 266.67 |
| RAM (GB) | $0.000938 | 1066.67 |
| V100 GPU | $0.60 | 1.67 |
| H100 GPU | $1.20 | 0.833 |
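To get a feel for these rates, the sketch below estimates the cost of a job from the table above. It assumes that each resource is billed independently as rate times quantity times hours and that the charges simply add up; that formula is an assumption for illustration, not the Mill's official billing logic, and the function name and parameters are hypothetical. The `array_size` parameter reflects the estimator note: a per-job estimate has to be multiplied by the number of jobs in an array.

```python
# Curiosity tier rates in dollars per resource-hour, copied from the table above.
RATES = {
    "cpu_core": 0.00375,
    "ram_gb": 0.000938,
    "v100": 0.60,
    "h100": 1.20,
}


def estimate_job_cost(hours, cores=0, ram_gb=0, v100=0, h100=0, array_size=1):
    """Rough cost in dollars of a job, or of a whole array of identical jobs.

    Assumption: charges are additive, each being rate * quantity * hours.
    """
    hourly = (
        cores * RATES["cpu_core"]
        + ram_gb * RATES["ram_gb"]
        + v100 * RATES["v100"]
        + h100 * RATES["h100"]
    )
    return hourly * hours * array_size


# One 10-hour job using 16 cores, 64 GB of RAM, and one H100.
single = estimate_job_cost(10, cores=16, ram_gb=64, h100=1)
# The same job submitted as a 50-element array.
whole_array = estimate_job_cost(10, cores=16, ram_gb=64, h100=1, array_size=50)

print(f"single job: ${single:.2f}")
print(f"50-job array: ${whole_array:.2f}")
```

Under these assumptions the single job comes to about $13.20 and the full array to about $660, which shows how quickly a per-job estimate can understate the cost of an array.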
Perseverance Tier Pricing
| Model | Price/yr | Cores | RAM | # Available |
| R6525 | $6,250 | 128 | 512 GB | 10 |
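The lease price can be compared against the Curiosity tier rates to see when pay-as-you-go is cheaper. The sketch below works out how many hours per year you would need to use an entire R6525-sized allocation (128 cores, 512 GB) on the Curiosity tier before the cost matches the lease. It assumes core and RAM charges are additive and that no other fees apply; both are assumptions for illustration, and all variable names are hypothetical.

```python
# Curiosity tier rates, copied from the Curiosity Tier Pricing table.
CPU_CORE_RATE = 0.00375  # dollars per core-hour
RAM_GB_RATE = 0.000938   # dollars per GB-hour

# Perseverance tier lease, copied from the table above.
LEASE_PRICE = 6250.0     # dollars per year for one R6525
NODE_CORES = 128
NODE_RAM_GB = 512

HOURS_PER_YEAR = 365 * 24

# Hourly pay-as-you-go cost of a full node's worth of cores and RAM
# (assumes the two charges simply add up).
full_node_hourly = NODE_CORES * CPU_CORE_RATE + NODE_RAM_GB * RAM_GB_RATE

# Hours of full-node use at which pay-as-you-go equals the lease price.
break_even_hours = LEASE_PRICE / full_node_hourly
break_even_fraction = break_even_hours / HOURS_PER_YEAR

print(f"full node, pay-as-you-go: ${full_node_hourly:.4f}/hour")
print(f"break-even: {break_even_hours:.0f} hours/year")
print(f"break-even utilization: {break_even_fraction:.0%}")
```

Under these assumptions a full node costs roughly $0.96 per hour on the Curiosity tier, so the lease pays off only above roughly 6,500 hours of full-node use per year, or about 74% utilization. Below that level, pay-as-you-go is the cheaper option, before weighing the longer runtime limit and dedicated access that a lease provides.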
Storage Pricing
Storage pricing remains unchanged. Current prices are as follows:
| Tier | Price/TB/yr | Description |
| Ceph | $100 | General-purpose storage |
| Vast | $160 | High-performance storage |