The following table presents statistics for all the jobs involved in the SIPHT workflow.

The following table presents statistics on the total size of the files generated during the runs. In total, there were 1640 SOIs.

Name    Value
nb      1640
sum     52 GB 405 MB
max     307 MB 851 KB
min     121 KB 304 B
avg     32 MB 732 KB
std     48 MB 275 KB
q1      1 MB 91 KB
q3      43 MB 746 KB
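
As a quick sanity check on the values above, the sketch below converts the reported sizes to bytes and compares sum / nb with avg. It assumes binary units (1 KB = 1024 B), which the source does not state; `to_bytes` is a hypothetical helper introduced only for this illustration.

```python
import re

# Assumption: sizes use binary units (1 KB = 1024 B); the source does not say.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

def to_bytes(text):
    """Convert a size such as '52 GB 405 MB' to a number of bytes."""
    parts = re.findall(r"(\d+)\s*(GB|MB|KB|B)", text)
    return sum(int(number) * UNITS[unit] for number, unit in parts)

nb = 1640
total = to_bytes("52 GB 405 MB")   # 'sum' row
avg = to_bytes("32 MB 732 KB")     # 'avg' row

# The reported average is truncated to whole KB, so a difference
# below 1 KB means the two rows agree.
difference = abs(total / nb - avg)
print(f"sum / nb = {total / nb:.0f} B")
print(f"reported avg = {avg} B")
print(f"difference = {difference:.0f} B")
```

Under the binary-unit assumption the two rows agree to within the 1 KB resolution of the reported average; reading the units as decimal (1 KB = 1000 B) gives a much larger mismatch.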

Range            Count
121 KB-25 MB     1013
25 MB-51 MB      277
51 MB-77 MB      135
77 MB-102 MB     75
102 MB-128 MB    57
128 MB-153 MB    19
153 MB-179 MB    26
179 MB-205 MB    12
205 MB-230 MB    8
230 MB-256 MB    7
256 MB-282 MB    6
282 MB-307 MB    5
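
The range boundaries above appear consistent with 12 equal-width bins spanning min to max, with each edge truncated to whole MB. The sketch below checks that reading and also confirms that the counts add up to nb. The equal-width rule and the binary units (1 MB = 1024 KB) are assumptions inferred from the numbers, not something the source states.

```python
KB, MB = 1024, 1024**2  # assumption: binary units

smallest = 121 * KB + 304       # min row: 121 KB 304 B
largest = 307 * MB + 851 * KB   # max row: 307 MB 851 KB
counts = [1013, 277, 135, 75, 57, 19, 26, 12, 8, 7, 6, 5]

# Upper edge of each equal-width bin, truncated to whole MB.
span = largest - smallest
bins = len(counts)
upper_edges_mb = [(smallest + k * span // bins) // MB for k in range(1, bins + 1)]

print("upper edges (MB):", upper_edges_mb)
print("total count:", sum(counts))
```

The distribution is strongly skewed: the first bin alone holds 1013 of the 1640 SOIs, which matches the gap between q1 (about 1 MB) and the average (about 32 MB).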

How to run a multiple-SOI run:

We have implemented a 'multiple_soi_vs_all.pl' script, which creates a DAX that contains inner DAXs.
The script is quite simple and relies on the single-SOI version, 'pegasus_sipht.pl'.
First, the script calls 'pegasus_sipht.pl' to create only the DAX for each single SOI; then it creates the outer DAX, which references all of these DAXs.
The outer DAX is therefore a bag of DAXs to execute.

Then pegasus-plan and pegasus-submit are simply called on the outer DAX (hdax.xml).

The 'multiple_soi_vs_all.pl' script has 2 options to control the execution:

-maxjobs  maximum number of SIPHT workflows run concurrently (default: 50)
-maxpre   maximum number of pegasus-plan invocations running at the same time (default: 5)

maxpre prevents too many pegasus-plan processes (each one a JVM instance) from running at the same time, which may cause a high load on the submit host.
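
To illustrate how two such limits interact, here is a toy simulation in which each SOI goes through a planning phase and then an execution phase, each guarded by its own counter. It does not call Pegasus or Condor, and it is not how the script enforces the limits; the small limit values, the sleep durations and the SOI names are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_JOBS = 4  # stands in for -maxjobs (the script's default is 50)
MAX_PRE = 2   # stands in for -maxpre (the script's default is 5)

plan_slots = threading.Semaphore(MAX_PRE)
run_slots = threading.Semaphore(MAX_JOBS)
lock = threading.Lock()
active = {"plan": 0, "run": 0}
peak = {"plan": 0, "run": 0}

def enter(phase):
    """Record that one more task is in the given phase."""
    with lock:
        active[phase] += 1
        peak[phase] = max(peak[phase], active[phase])

def leave(phase):
    """Record that one task has left the given phase."""
    with lock:
        active[phase] -= 1

def run_one_soi(soi):
    # Planning phase: at most MAX_PRE at once (simulated pegasus-plan).
    with plan_slots:
        enter("plan")
        time.sleep(0.01)
        leave("plan")
    # Execution phase: at most MAX_JOBS at once (simulated workflow run).
    with run_slots:
        enter("run")
        time.sleep(0.02)
        leave("run")
    return soi

sois = [f"SOI_{i:03d}" for i in range(20)]  # hypothetical identifiers
with ThreadPoolExecutor(max_workers=len(sois)) as pool:
    finished = list(pool.map(run_one_soi, sois))

print("peak concurrent plans:", peak["plan"])
print("peak concurrent runs:", peak["run"])
```

Keeping the planning limit much lower than the execution limit reflects the concern described above: planning is the step that loads the submit host, so it is throttled more tightly than the workflows themselves.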
A typical command line to invoke this script is:

#>source /scratch/auto_sRNAPredict/pegasus/setup-with-pegasus
#>/scratch/auto_sRNAPredict/pegasus/pegasus_multiple_vs_all.pl -all -c /scratch/auto_sRNAPredict/config/default_all_search.config -maxjobs 40

In the last run (sept_24_2009), 1612 DAXs produced a result and 28 failed (1612 + 28 = 1640 SOIs).
There are 3 different causes of failure:

  • missing files (total=6)
      • for 5 of them, the binary ('sRNAPredictMB2_nocondor') of the SIPHT workflow does not produce the output file (OutCandidates).
        Here is the list of SOIs concerned by this issue:
        NC_005090
        NC_010530
        NC_012438
        NC_012440
        NC_012526
      • for 1 of them, the binary ('sRNAPredictMB2_nocondor') of the SIPHT workflow does not produce the output file (Seq_NC_010999).
        Here is the SOI concerned by this issue:
        NC_010999
  • 'rnamotif' failed with signal 11 (SIGSEGV on Linux, i.e. a segmentation fault) (total=22)
    Here is the list of SOIs concerned by this issue:
    NC_009377
    NC_009378
    NC_010175
    NC_010486
    NC_010660
    NC_010688
    NC_010942
    NC_010997
    NC_011003
    NC_011498
    NC_011526
    NC_011602
    NC_011603
    NC_011890
    NC_011901
    NC_011988
    NC_011995
    NC_012125
    NC_012226
    NC_012483
    NC_012489
    NC_012560
    Here is the Condor error associated with this failure:
    09/27 00:40:57 ERROR: the following job(s) failed:
    09/27 00:40:57 ---------------------- Job ----------------------
    09/27 00:40:57       Node Name: RNAMotif_ID000002
    09/27 00:40:57          NodeID: 1
    09/27 00:40:57     Node Status: STATUS_ERROR
    09/27 00:40:57 Node return val: -11
    09/27 00:40:57           Error: Job proc (316302.0) failed with signal 11
    09/27 00:40:57 Job Submit File: RNAMotif_ID000002.sub
    09/27 00:40:57   Condor Job ID: (316302)
    09/27 00:40:57       Q_PARENTS: 30, <END>
    09/27 00:40:57       Q_WAITING: <END>
    09/27 00:40:57      Q_CHILDREN: 26, <END>
    09/27 00:40:57 ---------------------------------------  <END>
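
When many SOIs fail, records like the one above can be summarized automatically. The following is a minimal sketch that extracts the fields of one failure record and maps the signal number to its name; it assumes the "timestamp, key: value" layout shown in the excerpt, and `parse_failure` is a hypothetical helper, not part of Condor or Pegasus.

```python
import re
import signal

# Excerpt of the failure record shown above.
LOG = """\
09/27 00:40:57       Node Name: RNAMotif_ID000002
09/27 00:40:57     Node Status: STATUS_ERROR
09/27 00:40:57 Node return val: -11
09/27 00:40:57           Error: Job proc (316302.0) failed with signal 11
"""

def parse_failure(text):
    """Return the 'key: value' fields of one failure record as a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip the 'MM/DD HH:MM:SS' prefix, then split on the first colon.
        match = re.match(r"\d\d/\d\d \d\d:\d\d:\d\d\s+([^:]+):\s*(.*)", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields

record = parse_failure(LOG)
signum = int(re.search(r"signal (\d+)", record["Error"]).group(1))

# signal.Signals maps a number to its symbolic name on the current platform.
print(record["Node Name"], "failed with", signal.Signals(signum).name)
```

In this record the node return value (-11) is the negated signal number, so either field can be used to detect jobs that were killed by a signal rather than exiting with an error code.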
    