Last Updated March 25th, 2011

Main Focus

  • Add notification support to Pegasus
  • Moving auxiliary tools to Stampede DB
  • Other Stampede Related Changes
  • User Guide Reorganization 
  • Testing Framework and Testing.

Notification Support in Pegasus via monitord

  • Monitord needs to be managed by Condor. What happens if monitord crashes, or if Condor or the system crashes? We want monitord to come up automatically as Condor recovers after a restart. [Fabio,Gaurang]

  • Monitord needs to support notifications [Fabio]

  • Requires changes to Pegasus to generate the input file for monitord [Rajiv,Karan]

  • Come up with default notify scripts in the toolkit that notify the user and generate some status reports. [Gaurang]

  • Changes to DAX Schema
    • Addition of invoke element at the workflow level
    • Changes to Python API [Gideon]

    • Changes to Perl API [Jens]

    • Changes to Java API [Gaurang]

    • Changes to Java parser [Karan]

    • Instead of pegasus-run launching monitord, monitord should appear as an independent job in the workflow with the highest priority [Karan]

    • Make sure exit codes are returned correctly and restarts are handled correctly. [Fabio]
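
The notification flow in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Invoke` record, its `when` and `command` fields, the event names, and the script names are all hypothetical, since the actual invoke element and the monitord input file format are still to be defined.

```python
from dataclasses import dataclass

# Hypothetical event names; the real monitord input format is not defined here.
EVENTS = {"start", "success", "error", "end"}


@dataclass(frozen=True)
class Invoke:
    """Hypothetical stand-in for a workflow-level invoke entry."""
    when: str      # event that triggers the notification, or "all"
    command: str   # notify script to run (hypothetical names below)


def select_commands(invokes, event):
    """Return the commands to run for one event, in declaration order."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event}")
    return [i.command for i in invokes if i.when in (event, "all")]


invokes = [
    Invoke("start", "notify-email --subject started"),
    Invoke("error", "notify-email --subject failed"),
    Invoke("all", "status-report"),
]

# Commands that would be run when the workflow reports an error.
print(select_commands(invokes, "error"))
```

In this sketch monitord would only need the list of entries and the current event; how the commands are actually launched is left open.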

Monitord Management

Open Question

  • Notifications are required at the workflow level, but how does this affect DAX/DAG jobs? In the parent workflow, the DAX and DAG jobs are ordinary jobs, and at the same time they have separate sub workflows associated with them. So notifications for DAX/DAG jobs in the parent workflow will clash with workflow level notifications in the sub workflows.
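
A minimal sketch of the clash described above, using a purely hypothetical model in which a notification is a `(scope, target, event)` tuple. It does not reflect any existing Pegasus or monitord behavior; it only shows why the same sub workflow event could be reported twice.

```python
def fired_notifications(subworkflow, event, parent_job_notify, workflow_notify):
    """List the notifications triggered when a sub workflow reaches an event.

    Hypothetical model: parent_job_notify means the parent workflow has a
    job-level notification on the DAX/DAG job; workflow_notify means the
    sub workflow has its own workflow-level notification.
    """
    fired = []
    if parent_job_notify:
        # The parent workflow sees the sub workflow as an ordinary DAX/DAG job.
        fired.append(("job", subworkflow, event))
    if workflow_notify:
        # The sub workflow also reports on itself at the workflow level.
        fired.append(("workflow", subworkflow, event))
    return fired


# With both notifications configured, one event yields two notifications.
fired = fired_notifications("subwf-1", "end", True, True)
print(fired)
```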

Auxiliary Tools to Stampede DB

  • pegasus-statistics [Prasanth]

  • pegasus-plots [Prasanth]

  • pegasus-analyzer [Fabio]

Monitord Changes [Fabio]

  • Monitord also needs to be able to account for newer versions of Condor DAGMan creating a jobstate.log file.

Stampede Related Changes

  • Improve Rescue DAG semantics [Rajiv]

  • Additional DB schema changes to be able to connect jobs/tasks in the DAX with corresponding kickstart records [Fabio,Monte]

  • Addition of workflow metrics file containing distribution of jobs into the DB [Fabio,Karan]

    • Revive the metrics file created by Pegasus. It should be populated in the submit directory. [Karan]

    • Monitord picks up the metrics file and stores it in the DB [Fabio]

Usability Changes

  • Addition of -conf option
    • Java Clients [Prasanth,Rajiv,Karan]

    • Python Clients [Prasanth,Fabio]

      • pegasus-statistics, pegasus-plots, pegasus-analyzer, monitord
    • Perl Clients [Gaurang]

  • Improvements to pegasus-tc-client [Prasanth]

  • Improvements to pegasus-rc-client [Rajiv]

    • Investigation of RLS compatibility issues
  • Addition of default categories to allow for easier specification of category-based knobs at the DAGMan level
    • cleanup jobs
    • subdax jobs

User Guide Reorganization [Bill]

  • Dependent on Bill

Testing Framework and Testing

People Involved [Jens,Gaurang]

Porting the VM to 3.1.0

People Involved [Karan,Rajiv]

  • Addition of new exercises