This page describes how to correct for multiple comparisons using the Threshold-Free Cluster Enhancement (TFCE, [1]) method.
A detailed explanation of the steps involved is still to come; for now this page provides a recipe.
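Briefly, the TFCE transform described in [1] replaces the statistic h_p at each voxel p with

    TFCE(p) = integral from h = 0 to h_p of e(h)^E * h^H dh

where e(h) is the extent of the cluster containing p when the map is thresholded at height h; the default exponents proposed in the paper are E = 0.5 and H = 2.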
The TFCE code can be found on GitHub in the minc-stuffs repository.
Calculating the TFCE threshold this way generates a large number of files that are no longer needed once you have your thresholds, so please remove all the MINC files created along the way when you are done.
Overview
- Obtain the t-statistics map that you want to threshold
- Create 1000 permutations of the grouping in your data
- Calculate TFCE thresholds for all the permutations
- Calculate the maximum value for each of the TFCE results, and obtain your 1%, 5% or 10% threshold
- Remove all minc files created along the way
Step 1: Obtain t-statistics map
The example used here concerns relative differences in the brain versus genotype (wild-type (wt) and mutant (mut)).
In R:
library(RMINC)
gf <- read.csv("details_on_your_data.csv")
vs <- mincLm(gf$jacobians ~ gf$genotype, mask="mask.mnc")
mincWriteVolume(vs, "original-tstats.mnc", 3)
Calculate the TFCE map for this file (at the moment on our system at MICe, the script should be called as TFCE.py):
TFCE original-tstats.mnc original-tstats_TFCE.mnc
The output file, original-tstats_TFCE.mnc, is the file that we now have to find the appropriate threshold for.
Step 2: creating the permutations
The next step is to create 1000 permutations and calculate the t-statistics map for each of them. Adapt the following script to your needs. This works on SGE.
If you are running this on hpf, see step 2b below.
#!/axiom2/projects/software/arch/linux-x86_64-eglibc2_11_1/bin/Rscript
library(RMINC)
# load the information about your data
gf <- read.csv("details_on_your_data.csv")
# create a permutation; make sure you permute the measurement you are
# interested in (here, the jacobians)
gf$permuted <- gf$jacobians[sample(1:nrow(gf))]
# create a name for the output file
fname <- paste("permute_", Sys.info()["nodename"], "_", Sys.getpid(), ".mnc", sep="")
# run the linear model
vs <- mincLm(permuted ~ genotype, gf, mask="mask.mnc")
# output the t-stats map
mincWriteVolume(vs, fname, 3)
Copy and paste the code above into a script of your own, and adjust it as necessary (here it is saved as "permute-script.R"). Then submit it to the compute farms:
# First make sure that you can run your script as an executable:
> chmod u+x permute-script.R
> for i in `seq 1000`; do sge_batch ./permute-script.R; sleep 0.2; done
When all the permutations have finished running, make sure that 1000 permutations were actually generated. If you miss a couple, then create the missing ones:
# count how many permutations were created:
> ls permute_*mnc | wc -l
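The counting and resubmission steps above can be combined into a small helper; this is a sketch (the submission line assumes the same sge_batch setup and permute-script.R as above, so it is left commented out):

```shell
# Count the permutation files already created and work out how many
# jobs still have to be submitted to reach 1000.
n=$(ls permute_*.mnc 2>/dev/null | wc -l)
missing=$((1000 - n))
echo "have $n permutations, $missing still to go"
# Submit the remaining jobs (run this on the SGE head node):
# for i in `seq $missing`; do sge_batch ./permute-script.R; sleep 0.2; done
```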
Step 2b: creating the permutations on hpf
To run on hpf, create a script to generate the permutations (e.g. call this permute-script.R)
#!/hpf/tools/centos6/R/3.3.2/bin/Rscript
library(RMINC)
# load the information about your data
gf <- read.csv("details_on_your_data.csv")
# create a permutation; make sure you permute the measurement you are
# interested in (here, the jacobians)
gf$permuted <- gf$jacobians[sample(1:nrow(gf))]
# create a name for the output file
fname <- paste("permute_", Sys.info()["nodename"], "_", Sys.getpid(), ".mnc", sep="")
# run the linear model
vs <- mincLm(permuted ~ genotype, gf, mask="mask.mnc")
# output the t-stats map
mincWriteVolume(vs, fname, 3)
Make sure your script is executable.
# First make sure that you can run your script as an executable:
> chmod u+x permute-script.R
Now create a wrapper script to submit your jobs to hpf.
Modify the vmem and walltime requirements as needed. Call this script submit-script.sh.
#PBS -V
#PBS -j oe
#PBS -l nodes=1
#PBS -l walltime=01:00:00,vmem=16G
R CMD BATCH --no-save --no-restore "permute-script.R" /dev/stdout
Now submit to hpf:
for i in `seq 1000`; do qsub submit-script.sh; sleep 0.1; done
Step 3: Calculate the TFCE maps for all your t-statistics maps
> for file in permute_*mnc; do sge_batch TFCE $file TFCE_on_${file}; done
Step 4: Compute the max for all data sets, and find the right threshold
> for file in TFCE_on_permute_*mnc; do mincstats -quiet -max_buffer_size_in_kb 10066240 -mask mask.mnc -mask_binvalue 1 -max $file; done > TFCE_max.txt
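The percentile indices used in the next step (990, 950, 900) assume exactly 1000 maxima; a quick optional check before moving on:

```shell
# Optional sanity check: make sure TFCE_max.txt contains 1000 values.
nvals=$(wc -l < TFCE_max.txt 2>/dev/null || echo 0)
if [ "$nvals" -eq 1000 ]; then
  echo "OK: 1000 values"
else
  echo "WARNING: found $nvals values, expected 1000"
fi
```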
in R:
tfce <- read.table("TFCE_max.txt")
# 1% threshold
sort(tfce$V1)[990]
# 5% threshold
sort(tfce$V1)[950]
# 10% threshold
sort(tfce$V1)[900]
You can now use those thresholds on your original map: original-tstats_TFCE.mnc.
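For example, one way to apply a threshold is to binarize the TFCE map with mincmath; this is a sketch, where the threshold value and the output filename are placeholders:

```shell
# Sketch: binarize the TFCE map at the chosen threshold.
# 42.0 is a placeholder; substitute the value returned by sort(tfce$V1)[950].
thresh=42.0
if command -v mincmath >/dev/null 2>&1; then
  mincmath -const $thresh -gt original-tstats_TFCE.mnc tfce_thresholded.mnc
else
  echo "mincmath not on PATH; run this where the MINC tools are installed"
fi
```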
Step 5: Remove all temp files
> rm -f permute_*mnc TFCE_on_permute_*mnc
References
[1] S.M. Smith and T.E. Nichols, Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. Neuroimage, 2009 Jan 1; 44(1):83-98.