Altering jobs that are in the queue
Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"
Recently a user came by my office to complain that his jobs were not running in the queue. This is a reasonable complaint because many of our nodes are being drained to reboot and …
keepalive script - solution to work around automounter problems
Sometimes the automounter does not properly mount the home directories on our HPCC system. This is not a problem when the job first starts because there is an epilogue script that runs before each …
Files as semaphores
The following is a script that is designed to make it really easy to run a large number of embarrassingly parallel jobs on our scheduling system. The trick to getting this to work is …
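The excerpt is cut off before the trick is explained, so the following is only a minimal sketch of the general idea the title names, not the author's script. It assumes each task is claimed by atomically creating a lock file, so that many independent jobs can share one task list without running the same task twice. The names `try_claim`, `run_unclaimed`, and the `.lock` suffix are hypothetical.

```python
import os
import socket
import tempfile


def try_claim(task_id, lock_dir):
    """Try to claim a task by creating its lock file.

    O_CREAT | O_EXCL makes the call fail if the file already exists,
    so only one caller can succeed for a given task_id.
    """
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another worker already claimed this task
    with os.fdopen(fd, "w") as fh:
        # Record who holds the claim; useful when cleaning up stale locks.
        fh.write(f"{socket.gethostname()}:{os.getpid()}\n")
    return True


def run_unclaimed(task_ids, lock_dir, work):
    """Run work(task_id) for every task this process manages to claim."""
    done = []
    for task_id in task_ids:
        if try_claim(task_id, lock_dir):
            work(task_id)
            done.append(task_id)
    return done


if __name__ == "__main__":
    # Hypothetical demo: a temporary directory stands in for a shared one.
    lock_dir = tempfile.mkdtemp()
    print(run_unclaimed(["t1", "t2", "t3"], lock_dir, lambda t: None))
```

Every job would call `run_unclaimed` with the same task list and the same lock directory. Whether exclusive creation is truly atomic on a shared filesystem depends on that filesystem, so it is worth checking on the target system before relying on it.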
PBS quick submission script
I write a lot of submission scripts for a lot of users on the HPCC, and I find myself using the same tricks over and over again. Recently I came up with the following script …
Makefile Mystery
I was working with a makefile the other day and came across a "feature" that I was not aware of. When running my makefile I saw the following lines appear:
cat build.sh …
On Demand MakeFlow PBS script
We just installed MakeFlow on our system as an easy-to-use workflow manager that uses the familiar "makefile" syntax. MakeFlow uses a master node and schedules all of the work off to worker …
New Powertool to help checkpoint jobs
Blog post edited by Xiaoge Wang
In a previous blog post I shared my script for automatically checkpointing jobs using BLCR, which enables us to run jobs longer than a week:
http://wiki.hpcc.msu.edu/x/eIHT
I didn't like the complexity of the script so I created a …
HFSS script
Blog post edited by [Pat Bills](https://wiki.hpcc.msu.edu/display/~billspat@msu.edu) - "Migration of unmigrated content due to installation of a new plugin"
HFSS is available on the MSU HPCC. You can run interactively or in batch mode.
You may run the GUI interactively using the iCER …
Monitoring Job overutilization
This week I was debugging some user code that was over-utilizing a compute node. The job was intended to use only one CPU, but one of the job's libraries ended up using all the …
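The excerpt stops before the monitoring approach is described, so what follows is just one hedged way to spot this kind of over-utilization, not the method from the post. It assumes the check compares a process's accumulated CPU time against elapsed wall time: a ratio well above the number of requested cores suggests that extra threads are running. The function names and the 10% tolerance are hypothetical.

```python
import os
import time


def average_cores_used(cpu_seconds, wall_seconds):
    """Average number of cores kept busy: CPU time divided by wall time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def is_overutilizing(cpu_seconds, wall_seconds, requested_cores, tolerance=0.1):
    """True if average usage exceeds the request by more than the tolerance."""
    limit = requested_cores * (1 + tolerance)
    return average_cores_used(cpu_seconds, wall_seconds) > limit


if __name__ == "__main__":
    # Measure this process itself as a small demonstration.
    wall_start = time.monotonic()
    cpu_start = os.times()
    sum(i * i for i in range(2_000_000))  # some single-threaded work
    cpu_end = os.times()
    wall = time.monotonic() - wall_start
    cpu = (cpu_end.user + cpu_end.system) - (cpu_start.user + cpu_start.system)
    print("over-utilizing:", is_overutilizing(cpu, wall, requested_cores=1))
```

For a job that asked for one CPU, 80 CPU-seconds accumulated over 10 wall-clock seconds works out to an average of 8 busy cores, which this check would flag.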
Running jobs longer than one week using BLCR
Our submission system is set up with a maximum walltime of one week. This works fine for most users but sometimes it is nice to be able to run a job even longer. The …