Each running executor creates its own log file. The log files are placed in your --output-dir, and their names include a timestamp (year, month, day, and the time in high precision). When an executor has finished running a stage, it writes a line to its log file containing the stage number and the return value of the stage. Here is an example:

10:18:22,659 __main__ INFO: Stage 1 finished, return was: 0

The end of the line, "return was: 0", means that this stage finished correctly. If the return value is anything but 0, something went wrong. For instance:

# failed stage, indicated by the non-0 return:
10:18:39,120 __main__ INFO: Stage 58 finished, return was: 1

Here we see that stage 58 finished, but its return value was non-0 (1), so something went wrong. We can now look up stage 58 in the pipeline-stages.txt file to investigate what went wrong. (For MAGeT, this file is called MAGeT-pipeline-stages.txt; for the registration chain, it is called Registration-chain-pipeline-stages.txt; etc.)
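To list every failed stage at once instead of reading the logs by eye, a small shell helper along these lines may be useful. This is only a sketch: the function name list_failed_stages is hypothetical (it is not part of the pipeline tools), and it assumes that every stage result line ends exactly with "return was: " followed by the return value, as in the examples above.

```shell
# Hypothetical helper: print every "Stage N finished" line whose return is not 0.
# The argument is the directory holding the executor logs (default: current directory).
list_failed_stages() {
    log_dir="${1:-.}"
    # -r searches all files under log_dir; -h leaves the file names out of the output.
    # The second grep drops the lines that end in a return value of exactly 0.
    grep -rh "finished, return was: " "$log_dir" | grep -v "return was: 0$"
}
```

Called as list_failed_stages /path/to/your/output-dir (a placeholder path), it prints one line per failed stage and prints nothing if no failures were logged.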

Verify that all stages finished correctly

The file MAGeT-pipeline-stages.txt contains all the stages that should be run. We can use this file in combination with the executor output to verify whether or not all stages have finished as follows:

# determine the number of stages to be run (wc, word count, with -l counts the number of lines in a file; here, the number of stages):
> cat MAGeT-pipeline-stages.txt | wc -l
# now count how many stages have run successfully (run this in the directory that contains the executor log files):
> grep "return was: 0" * | wc -l

When these two numbers are the same, we know that all stages have finished successfully.
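The comparison of the two counts can also be scripted. The following is a minimal sketch, not part of the pipeline tools: the function name check_all_stages_finished and both of its arguments are assumptions for illustration. It relies on the stages file listing one stage per line and on the log lines having the format shown earlier.

```shell
# Hypothetical helper: compare the number of listed stages with the number
# of stages that were logged with a return value of 0.
check_all_stages_finished() {
    stages_file="$1"   # e.g. MAGeT-pipeline-stages.txt
    log_dir="$2"       # directory containing the executor log files
    # tr removes the padding spaces that some wc implementations print
    total=$(wc -l < "$stages_file" | tr -d ' ')
    succeeded=$(grep -rh "finished, return was: 0$" "$log_dir" | wc -l | tr -d ' ')
    if [ "$total" -eq "$succeeded" ]; then
        echo "all $total stages finished successfully"
    else
        echo "$succeeded of $total stages finished successfully"
        return 1
    fi
}
```

For example, check_all_stages_finished MAGeT-pipeline-stages.txt /path/to/your/output-dir (a placeholder path) prints a one-line summary and returns a non-zero status when the counts differ, which makes it usable in other scripts.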
