Last Updated Feb 28th, 2011
- Improving how we get data to worker nodes, so that worker nodes can retrieve data from various sources.
- Add notification support to Pegasus
- Moving auxiliary tools to the stampede DB
- Other Stampede Related Changes
- User Guide Reorganization
- Testing Framework
Data to Worker Nodes
This is the main change for 3.1. The developers document has the complete details.
Download Developers Document
- Refactor how clustered jobs are handled
- Addition of staging-sites option to pegasus-plan
- New Shell Job Wrapper for jobs when running on worker nodes
- Changes to pegasus-cleanup
- Changes to pegasus-createdir
- Special S3 support in pegasus-transfer, pegasus-cleanup and pegasus-createdir
- Bypassing of staging-site while staging in input data
- Bypassing of staging-site while staging out output data
- Transfer of braindump file to remote workflow execution directories
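The shell job wrapper item above is only a plan here; as a rough illustration of the idea, a wrapper stages inputs into a scratch directory on the worker node, runs the task, and stages outputs back. The sketch below is a minimal Python version under assumed names (it is not the actual Pegasus shell job wrapper, and the staging is shown as plain file copies rather than pegasus-transfer):

```python
#!/usr/bin/env python
# Hypothetical sketch of a worker-node job wrapper: stage in inputs,
# run the task, stage out outputs. All names and the copy-based staging
# are illustrative, not the actual Pegasus implementation.
import os
import shutil
import subprocess
import tempfile

def run_wrapped_job(inputs, command, outputs, staging_dir):
    """Copy inputs from staging_dir, run command, copy outputs back."""
    workdir = tempfile.mkdtemp(prefix="wrapper-")
    # Stage in: pull declared input files from the staging site.
    for f in inputs:
        shutil.copy(os.path.join(staging_dir, f), workdir)
    # Run the actual task in the scratch directory.
    result = subprocess.run(command, cwd=workdir)
    if result.returncode != 0:
        return result.returncode
    # Stage out: push declared outputs back to the staging site.
    for f in outputs:
        shutil.copy(os.path.join(workdir, f), staging_dir)
    return 0
```

A real wrapper would also have to report failures of the staging steps themselves, which is part of why clustered-job handling needed refactoring.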
Notification Support in Pegasus via monitord
- Monitord needs to support notifications
- Requires changes to Pegasus to generate input file for monitord
- Come up with default notify scripts in the toolkit that notify the user and generate some status reports.
- Monitord needs to be managed: what happens if monitord crashes, or Condor/the system crashes? We want monitord to come back up automatically as Condor recovers after a restart.
- Instead of pegasus-run launching monitord, monitord should appear as an independent job in the workflow with the highest priority
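For the default notify scripts mentioned above, one plausible shape is a small script that monitord invokes with event details and that formats a human-readable status message. The sketch below assumes hypothetical PEGASUS_* environment variables for the event interface; these are assumptions for illustration, not the interface monitord actually defines:

```python
# Hypothetical sketch of a default notify script invoked by monitord.
# The event fields (event name, job id, exit status) and the PEGASUS_*
# variable names are assumptions, not monitord's actual interface.
import os

def format_notification(event, job_id, status):
    """Build a one-line, human-readable status message for an event."""
    outcome = "succeeded" if status == 0 else "failed (exit %d)" % status
    return "[pegasus] %s: job %s %s" % (event, job_id, outcome)

def notify_from_env(environ=os.environ):
    """Read the assumed PEGASUS_* variables and emit the message."""
    msg = format_notification(
        environ.get("PEGASUS_EVENT", "unknown"),
        environ.get("PEGASUS_JOBID", "unknown"),
        int(environ.get("PEGASUS_STATUS", "0")),
    )
    print(msg)  # a real script might email this or append to a report
    return msg
```

A script like this could be the basis for both the per-user notifications and the periodic status reports mentioned above.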
Auxiliary Tools to Stampede DB
- Monitord also needs to be able to account for newer versions of Condor DAGMan creating a jobstate.log file.
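Handling the DAGMan-written jobstate.log comes down to parsing its whitespace-separated lines. The sketch below assumes a format of timestamp, job name, state, and trailing fields; this layout is an assumption for illustration, and the file written by newer Condor DAGMan versions may differ:

```python
# Sketch of parsing one jobstate.log line. The assumed layout
# (timestamp, job name, state, extra fields) is illustrative; the
# actual file produced by newer Condor DAGMan versions may differ.
def parse_jobstate_line(line):
    """Split one jobstate.log line into a dict of its fields."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError("malformed jobstate.log line: %r" % line)
    return {
        "timestamp": int(parts[0]),
        "job": parts[1],
        "state": parts[2],
        "extra": parts[3:],  # e.g. scheduler id, site
    }
```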
Stampede Related Changes
- Improve Rescue DAG semantics
- Additional DB schema changes to be able to connect jobs/tasks in the DAX with corresponding kickstart records
- Addition of a workflow metrics file (distribution of jobs) to the DB
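The schema change connecting DAX tasks to kickstart records amounts to adding a join key between the task table and the invocation records. The sqlite sketch below illustrates the shape of that link; all table and column names here are hypothetical, not the actual stampede schema:

```python
# Illustrative sqlite sketch of linking DAX-level tasks to kickstart
# invocation records. Table and column names are hypothetical, not the
# actual stampede schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (
    task_id  INTEGER PRIMARY KEY,
    abs_id   TEXT                  -- task id as it appears in the DAX
);
CREATE TABLE invocation (
    inv_id   INTEGER PRIMARY KEY,
    task_id  INTEGER REFERENCES task(task_id),  -- the new link
    exitcode INTEGER,
    duration REAL                  -- seconds, from the kickstart record
);
""")
conn.execute("INSERT INTO task VALUES (1, 'ID000001')")
conn.execute("INSERT INTO invocation VALUES (10, 1, 0, 12.5)")

# With the link in place, a task joins directly to its kickstart record.
row = conn.execute("""
    SELECT t.abs_id, i.exitcode, i.duration
    FROM task t JOIN invocation i ON i.task_id = t.task_id
""").fetchone()
```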
User Guide Reorganization
- Dependent on Bill