Altering jobs that are in the queue

Wed 15 August 2012 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

Recently a user came by my office to complain that his jobs were not running in the queue. This is a reasonable complaint because many of our nodes are being drained to reboot and …

keepalive script - solution to work around automounter problems

Wed 23 May 2012 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

Sometimes the automounter does not properly mount the home directories on our HPCC system. This is not a problem when the job first starts because there is an epilogue script that runs before each …

Files as semaphores

Sun 15 April 2012 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

The following is a script that is designed to make it really easy to run a large number of embarrassingly parallel jobs on our scheduling system. The trick to getting this to work is …

PBS quick submission script

Mon 19 March 2012 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

I write a lot of submission scripts for a lot of users on the HPCC, and I find myself using the same tricks over and over again. Recently I came up with the following script …

Makefile Mystery

Mon 05 March 2012 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

I was working with a makefile the other day and came across a "feature" that I was not aware of. When running my makefile I saw the following lines appear:

cat build.sh …

On Demand MakeFlow PBS script

Fri 11 November 2011 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

We just installed MakeFlow on our system as an easy-to-use workflow manager that uses the familiar "makefile" syntax. MakeFlow uses a master node and schedules all of the work off to worker …

New Powertool to help checkpoint jobs

Thu 06 October 2011 by Dr. Dirk Colbry

Blog post edited by Xiaoge Wang

In a previous blog post I shared my script for automatically checkpointing jobs using BLCR, which enables us to run jobs longer than a week:

http://wiki.hpcc.msu.edu/x/eIHT

I didn't like the complexity of the script so I created a …

HFSS script

Wed 08 June 2011 by Dr. Dirk Colbry

Blog post edited by [Pat Bills](https://wiki.hpcc.msu.edu/display/~billspat@msu.edu) - "Migration of unmigrated content due to installation of a new plugin"

HFSS is available on the MSU HPCC. You can run interactively or in batch mode.

You may run the GUI interactively using the iCER …

Monitoring Job overutilization

Thu 28 April 2011 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

This week I was debugging some user code that was over-utilizing a compute node. The job was intended to use only one CPU, but one of the job's libraries ended up using all the …

Running jobs longer than one week using BLCR

Thu 21 April 2011 by Dr. Dirk Colbry

Blog post edited by Anonymous - "Migration of unmigrated content due to installation of a new plugin"

Our submission system is set up with a maximum walltime of one week. This works fine for most users, but sometimes it is useful to be able to run a job even longer. The …
