For Pegasus 3.1, we identified the following changes to the Stampede schema:

  1. Ability to connect the jobs in the DAX with the corresponding tasks in the Stampede DB.
    • For this we need a dax_job_id column in the task table. monitord will populate this column from the kickstart record. The derivation entry in each kickstart record will hold the DAX job ID.
  2. The job type should also include the types dax and dag for the Condor jobs.
    • The job type should be an enum in the schema.
  3. The job table needs a subwf_id column in order to determine the Condor job responsible for a particular sub workflow. This is required for the plotting tools.
  4. Contents of the metrics file in the Stampede DB.
    • Pegasus creates a metrics file that has a breakdown count of the jobs.
      The breakdown is by job type. At a minimum, we need to populate the workflow table with the number of jobs/tasks in the DAX and the number of jobs in the executable workflow. We can have a separate metrics table that gives a finer breakdown of the tasks or jobs, e.g. the number of compute jobs, the number of cleanup jobs, etc.
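
The four changes above can be sketched together as a small schema. This is only an illustration under stated assumptions, not the actual Stampede schema: the column names dax_job_id and subwf_id and the job types dax and dag come from the notes above, while every other table name, column name, and job type (wf_id, dax_task_count, exec_job_count, workflow_metrics, compute, cleanup) is hypothetical. SQLite is used only to keep the sketch self-contained; it has no native enum type, so a CHECK constraint stands in for the enum.

```python
import sqlite3

# Only "dax" and "dag" are required by the notes; "compute" and "cleanup"
# are hypothetical placeholders for the already existing job types.
JOB_TYPES = ("compute", "cleanup", "dax", "dag")


def create_schema(conn):
    """Create a hypothetical, simplified version of the proposed tables."""
    type_list = ", ".join(f"'{t}'" for t in JOB_TYPES)
    conn.executescript(f"""
        CREATE TABLE workflow (
            wf_id          INTEGER PRIMARY KEY,
            dax_task_count INTEGER,  -- item 4: jobs/tasks in the DAX
            exec_job_count INTEGER   -- item 4: jobs in the executable workflow
        );
        CREATE TABLE job (
            job_id   INTEGER PRIMARY KEY,
            wf_id    INTEGER NOT NULL REFERENCES workflow(wf_id),
            -- item 2: enum emulated with a CHECK constraint
            type     TEXT NOT NULL CHECK (type IN ({type_list})),
            -- item 3: sub workflow that this Condor job is responsible for
            subwf_id INTEGER REFERENCES workflow(wf_id)
        );
        CREATE TABLE task (
            task_id    INTEGER PRIMARY KEY,
            job_id     INTEGER NOT NULL REFERENCES job(job_id),
            dax_job_id TEXT  -- item 1: DAX job ID taken from the kickstart record
        );
        -- item 4: optional finer breakdown, one row per job type
        CREATE TABLE workflow_metrics (
            wf_id     INTEGER NOT NULL REFERENCES workflow(wf_id),
            job_type  TEXT NOT NULL,
            job_count INTEGER NOT NULL,
            PRIMARY KEY (wf_id, job_type)
        );
    """)


def job_for_dax_id(conn, dax_job_id):
    """Return the job_id whose task carries the given DAX job ID, or None."""
    row = conn.execute(
        "SELECT job_id FROM task WHERE dax_job_id = ?", (dax_job_id,)
    ).fetchone()
    return row[0] if row else None


def job_for_subworkflow(conn, subwf_id):
    """Return the job_id of the job responsible for a sub workflow, or None."""
    row = conn.execute(
        "SELECT job_id FROM job WHERE subwf_id = ?", (subwf_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO workflow VALUES (1, 2, 3)")  # root workflow
    conn.execute("INSERT INTO workflow VALUES (2, 0, 0)")  # sub workflow
    conn.execute("INSERT INTO job VALUES (10, 1, 'compute', NULL)")
    conn.execute("INSERT INTO job VALUES (11, 1, 'dax', 2)")
    conn.execute("INSERT INTO task VALUES (100, 10, 'ID0000001')")
    print(job_for_dax_id(conn, "ID0000001"))
    print(job_for_subworkflow(conn, 2))
```

With a layout like this, the plotting tools could resolve a sub workflow to its Condor job through subwf_id, and a DAX job to its task through dax_job_id, without parsing job names.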