The functions in the RMINC library make parallel operations easy. However, jobs can sometimes die on the cluster, and there is no way to resume: a single failed job means the whole function needs to be re-run. This is especially annoying for voxelwise mixed-effects models, which require parallelization to be feasible. RMINC performs its parallelization using the batchtools package. This tutorial uses the batchtools package directly to run voxelwise statistics and resubmit failed jobs. An added benefit of this approach is that it lets you run any statistical test in a voxelwise manner.
Log into hpf and start an interactive session with qlogin.
Load necessary modules.
Start R and load necessary libraries
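A minimal sketch of the libraries used in the rest of this tutorial (the exact package list is an assumption based on the models fit below; adjust to what your analysis needs):

```r
library(RMINC)      # voxelwise statistics and minc file I/O
library(batchtools) # job submission and management on the cluster
library(lme4)       # linear mixed-effects models
```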
Read the data for this tutorial.
Step 1: Write your voxelwise function
In this example, we will run two linear mixed-effects models on the Scanbase data and perform the log-likelihood ratio test to compare the models and obtain significance for each voxel. I am going to call this function 'vxl_func'.
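A sketch of what such a function could look like, using lme4. The column names 'Genotype', 'Age', and 'Subject_ID' are hypothetical stand-ins for whatever your data frame actually contains; 'x' is the vector of values at one voxel across all scans:

```r
vxl_func <- function(x, df) {
  df$voxel <- x
  # Fit both models with ML (REML = FALSE) so the likelihood ratio test is valid
  full    <- lme4::lmer(voxel ~ Genotype + Age + (1 | Subject_ID),
                        data = df, REML = FALSE)
  reduced <- lme4::lmer(voxel ~ Age + (1 | Subject_ID),
                        data = df, REML = FALSE)
  # anova() on two nested lmer fits performs the log-likelihood ratio test;
  # return the chi-square p-value for the Genotype term
  anova(reduced, full)[["Pr(>Chisq)"]][2]
}
```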
Step 2: Write a function that reads voxel-data and applies function
We are eventually going to apply 'vxl_func' to every voxel. But to make the problem more manageable, we will apply this function to a chunk of voxels at a time (indexed by the vector 'idx'). Don't worry if the indexing is unfamiliar; you mostly won't have to deal with it directly.
This function (called 'par_func') is going to read our data (given as minc files in a column 'mncfiles' in the data frame 'df'), mask it (given by a minc file 'mask'), extract only a chunk of voxels (indexed by vector 'idx'), and run a voxelwise function ('vxl_func') on this chunk.
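A sketch of 'par_func', assuming RMINC's mincGetVolume() for reading minc files (the mask threshold of 0.5 is a convention, not a requirement):

```r
par_func <- function(idx, df, mask, vxl_func) {
  library(RMINC)
  mask_vol    <- mincGetVolume(mask)
  mask_voxels <- which(mask_vol > 0.5)   # positions of voxels inside the mask
  keep        <- mask_voxels[idx]        # this job's chunk of masked voxels
  # Read each scan and keep only the chunk; result: voxels x scans
  voxel_mat <- sapply(df$mncfiles, function(f) mincGetVolume(f)[keep])
  voxel_mat <- t(voxel_mat)              # transpose to scans x voxels
  # Run the voxelwise function on every voxel (column) in the chunk
  apply(voxel_mat, 2, vxl_func, df = df)
}
```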
Step 3: Create chunks for voxels
Create chunks of voxels. We will create chunks of 10000 voxels each (i.e. each parallel process will do only 10000 voxels). You can change this as you please. For faster jobs, increase the number of voxels per parallel job (e.g. 100000); for slower jobs, decrease it (e.g. 1000). We will just make a function for chunking, and I'll let you pick how many voxels per job you want.
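The chunking itself needs nothing beyond base R; one possible helper:

```r
# Split 1..n_voxels into consecutive chunks of at most chunk_size voxels
make_chunks <- function(n_voxels, chunk_size = 10000) {
  split(seq_len(n_voxels), ceiling(seq_len(n_voxels) / chunk_size))
}
```

For example, `make_chunks(25000, 10000)` returns a list of three index vectors (two of 10000 voxels and one of 5000).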
Step 4: Request resources for your parallel jobs
Set the resources you want for your jobs. Keep in mind, the more you demand, the longer it takes for your jobs to start running; but of course, if you don't demand enough, your jobs will fail. Below, I request 16G of memory and 2 hours of walltime for each node. I also cap the number of jobs allowed at any particular time to 1000. If you submit more than 1000 jobs, batchtools will automatically wait until some jobs have finished before submitting more. This way you only ever have a maximum of 1000 jobs on the queue. It is not needed for this example, but may come in handy some other time.
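The resources are passed to batchtools as a named list. Which field names the scheduler sees depends on your template file, so the names below are assumptions typical of a PBS/Torque template; `max.concurrent.jobs` is a special resource that batchtools' submitJobs() itself interprets:

```r
resources <- list(
  memory              = "16G",  # per-node memory request
  walltime            = 7200,   # 2 hours, in seconds
  max.concurrent.jobs = 1000    # cap on simultaneously queued jobs
)
```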
You also need a template script. RMINC has one by default you can use if you want. Run this command if you want to find it:
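One way to locate any template files shipped with your installed copy of RMINC (the file name and subdirectory vary between RMINC versions, so this simply searches the whole package):

```r
# List every *.tmpl file inside the installed RMINC package
list.files(system.file(package = "RMINC"),
           pattern = "\\.tmpl$", recursive = TRUE, full.names = TRUE)
```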
Feel free to make your own template.
Step 5: Create registry and run parallel jobs
You are finally ready to create and run the parallel jobs.
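A sketch of the registry setup and submission, assuming a Torque/PBS cluster and a template file called 'pbs.tmpl' (both assumptions; batchtools also provides makeClusterFunctionsSlurm() and others for different schedulers):

```r
# The registry directory stores job definitions, logs, and results on disk
reg <- makeRegistry(file.dir = "vxl_registry", seed = 1)
reg$cluster.functions <- makeClusterFunctionsTorque("pbs.tmpl")

# One job per chunk of voxel indices; df, mask, and vxl_func are shared
batchMap(par_func, idx = chunks,
         more.args = list(df = df, mask = mask, vxl_func = vxl_func),
         reg = reg)

submitJobs(resources = resources, reg = reg)
waitForJobs(reg = reg)
```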
Suppose you made a mistake and want to kill all the jobs, run this:
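For example, killing every job this registry currently has on the system:

```r
# findOnSystem() returns the ids of jobs that are queued or running
killJobs(findOnSystem(reg = reg), reg = reg)
```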
If jobs fail, you can use different functions to help you debug.
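Some batchtools functions that are useful for debugging:

```r
getStatus(reg = reg)          # summary: submitted, done, error, expired, ...
findErrors(reg = reg)         # ids of jobs that raised an R error
getErrorMessages(reg = reg)   # the error messages themselves
getLog(id = 1, reg = reg)     # the full log of a single job (here, job 1)
```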
Sometimes, jobs can fail for no apparent reason because they are dropped by the cluster.
You can easily reset and resubmit jobs.
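A sketch of resubmitting dropped jobs; in batchtools, jobs that vanished from the queue without finishing show up as "expired":

```r
dropped <- findExpired(reg = reg)                      # jobs the cluster dropped
resetJobs(dropped, reg = reg)                          # clear their state
submitJobs(dropped, resources = resources, reg = reg)  # send them back out
```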
You can make this into a loop to automatically resubmit dropped jobs. This is dangerous, though, because jobs could also be failing because of an error, and resubmitting does nothing to fix bugs in YOUR code. You should always check whether you have actual errors before you resubmit.
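One possible shape for such a loop, stopping as soon as a real R error appears so you never blindly resubmit buggy code:

```r
repeat {
  waitForJobs(reg = reg)
  # Real errors in your code: stop and investigate instead of resubmitting
  if (nrow(findErrors(reg = reg)) > 0)
    stop("Some jobs raised errors; fix your code before resubmitting")
  dropped <- findExpired(reg = reg)   # jobs silently dropped by the cluster
  if (nrow(dropped) == 0) break       # everything finished cleanly
  resetJobs(dropped, reg = reg)
  submitJobs(dropped, resources = resources, reg = reg)
}
```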
Step 6: Gather results and create a voxel statistics map
Once the jobs finish, we need to gather the results together. This can be done with the 'reduceResultsList' function. It will gather the results into a list. For this example, I will convert the list into a vector, then correct for multiple comparisons. The vector has the same number of elements as the mask. We can then create a volume with the voxel statistics results. The statistics volume can now be used with RMINC/MRIcrotome visualization tools or written to a minc file.
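A sketch of the gathering step, assuming RMINC's mincGetVolume() and mincWriteVolume() (the FDR correction and the output file name are illustrative choices):

```r
res   <- reduceResultsList(reg = reg)  # one list element per chunk
pvals <- unlist(res)                   # one p-value per masked voxel
qvals <- p.adjust(pvals, method = "fdr")

# Expand back to a full-size volume: zeros outside the mask, q-values inside,
# so the result has the same number of elements as the mask
mask_vol  <- mincGetVolume(mask)
stats_vol <- numeric(length(mask_vol))
stats_vol[mask_vol > 0.5] <- qvals
attributes(stats_vol) <- attributes(mask_vol)  # carry over minc metadata

mincWriteVolume(stats_vol, output.filename = "qvals.mnc",
                like.filename = mask)
```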
Step 7: Delete or load registry
You can delete the registry when you are done to save disk space.
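In batchtools:

```r
# Delete the registry directory and all stored results from disk
removeRegistry(wait = 0, reg = reg)
```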
If you do NOT delete registry, you can load the registry again in a different R session.
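Assuming the registry directory name used earlier in this tutorial:

```r
# Reattach to an existing registry; writeable = TRUE lets you resubmit jobs
reg <- loadRegistry("vxl_registry", writeable = TRUE)
```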