June 2013
June 3rd, 2013
Pegasus Lite Paper
- Breakdown of the runtimes, experiments
- In case of sharedfs, the kickstart runtimes in the breakdown file will be longer
- For the S3 case, we can calculate the S3 transfer time as the difference between the cumulative runtimes
- Doing two experiments: Rosetta (CPU intensive) and Montage (I/O intensive)
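A minimal sketch of the S3 calculation mentioned above. It assumes the two cumulative runtimes are the summed per-job wall times and the summed kickstart runtimes; the notes do not say which two totals are compared, so this pairing, the function name and the sample numbers are all hypothetical.

```python
def estimate_transfer_time(jobs):
    """Estimate total transfer time from per-job runtimes.

    jobs: iterable of (job_wall_seconds, kickstart_seconds) pairs.
    Assumption: whatever a job spends outside kickstart is data transfer.
    """
    jobs = list(jobs)  # allow generators, since we iterate twice
    cumulative_job = sum(job for job, _ in jobs)
    cumulative_kickstart = sum(ks for _, ks in jobs)
    return cumulative_job - cumulative_kickstart


# Made-up sample values, only to show the arithmetic.
runs = [(120.0, 95.0), (80.0, 62.5), (200.0, 170.0)]
print(estimate_transfer_time(runs))
```

In the sharedfs case the same difference would be expected to shrink, since the I/O cost shows up inside the kickstart runtimes instead.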
May 2013
May 20th, 2013
Confluence is running slowly. Mats is going to look into it.
Analytics are set up on Confluence now.
Pegasus Transfer
- Mats committed a new version that has support for 2-stage transfers
Pegasus S3 Client
- Gideon changed .s3cfg to .pegasus/s3cfg
Pegasus Lite Paper
- Mats is working on the experiments
- We have two weeks to the deadline
PMC Paper
- Experiments on Amazon comparing Pegasus, Pegasus w/ Clustering, PMC alone
Pegasus Service
- Finished setting up users and test suite
- Next is a quick-and-dirty ensemble manager implementation
- Gideon is going to commit a change to Pegasus that removes the dashboard components. They will live in the pegasus-service repository from now on.
Summer Student
- Need to think up a project. Needs to be research-oriented and relatively small.
- Cleanup? Precip?
Contacting users
- Find out if they need anything.
Examples
- Simple examples in Perl, Python and Java
- Gideon will add them to the examples in the pegasus Git repo
April 2013
April 22nd, 2013
- monitord prescript handling fixed
- pegasus-analyzer should detect prescript failures, and the prescript exitstatus should be logged in the database
- pegasus-statistics was updated for the job instance report
- pegasus planner
- need to confirm all check-ins are complete
- do we want to get LIGO to do a test or just release?
Pegasus statistics across workflows - Rajiv
Pegasus Lite Paper
- Mats will do the runs on Amazon
- Karan will work on paper when he comes back
pegasus-hold and pegasus-release
- is there any difference between doing a hold on the dagman directly and on pegasus-dagman?
- we need to investigate monitord further
BOSCO
- Mats is trying to run on HPCC
- a single job is running fine.
April 8th, 2013
- Work on it towards this week
- monitord prescript issue to fix
- pegasus statistics extensions
- across root workflows
- https://jira.isi.edu/browse/PM-507
- condor temp file
Pegasus Posters
- One at XSEDE
- joint one with BOSCO team
Pegasus Lite Paper
- Submission to IEEE Big Data
New Programmer Hire
- expanded posting on Confluence
- will send out to HPC Wire, RENCI and USC SC Connect
April 1st, 2013
Pegasus Lite Paper
- Waiting on Ewa
- Not much we can do about the IEEE conference. The page limit is 8, which is the current size of the paper.
XSEDE Poster
- Pegasus Poster. Karan will send update
- Also a joint Pegasus BOSCO poster
- Also as part of that we will get the MPI workflows up and running through Pegasus and BOSCO
Pegasus Development
- Bypass of staging input files for Pegasus Lite Case
- Inplace cleanup bug fixes done.
- pegasus-s3
- Gideon checked in changes for copying from one file to another
- Mats adds a pegasus transfer
- workflow cleanup nodes
- separate cleanup node in the workflow
- for hierarchical workflows we only delete the outermost workflow
- what happens if no output-site specified
- the ligo case!
- backward compatibility for LIGO
- Pegasus Dashboard
- general JavaScript updates
- Generic Pegasus Slides
- 2-3 slides.
March 2013
March 25th, 2013
- Pegasus Lite Paper Submission
- We will try for https://sites.google.com/site/sweetworkshop2013/
- Karan will move the paper to the ACM format
- pegasus-statistics
- Waiting on Scott to get back with the list of metrics
- Rajiv will be working on it
- pegasus-s3 changes
- we want to be able to copy output files from one S3 bucket to another
- requires changes to pegasus-transfer and pegasus-s3
- final node for cleaning up remote directories
- also related is getting the cleanup algorithm working when we bypass first level staging.
March 18th, 2013
- Mats has an RPM almost sorted out for LIGO that does not require us to have PYTHONPATH set. Instead, the libraries go into standard locations.
- Karan is testing this RPM on spice-dev1 and has set up a page with instructions on how to submit a test workflow to VIRGO
- Statistics across root workflows
- earlier, Gaurang had generated statistics for SCEC runs by hand, executing queries on the mysql command line
- he does not have the queries documented anywhere
- this is something we have talked about in context of 4.3 with Rajiv
- will follow up with Scott on Wednesday's call
- 4.2.1 release
- backward compatibility for LIGO is still to be done
- probably next week after the pegasus annual report
- RPM to handle native python installation
- Pegasus Annual Report
- Karan will work on it this week
- Try to follow the same template as earlier.
March 4th, 2013
- Sent the link on DAGMan Metrics Reporting to Ewa
- Metrics for Rob Quick's workflow
- Gideon pushed out kickstart changes
- Rajiv has pushed changes to the queries for the dashboard.
- Set up a meeting with Jaime and Derrick at OSG AHM to discuss
- remote_initialdir
- extra attributes for glite/bosco submissions
- mpi workflows.
- OSG Poster to be made this week. And 4.2 Release slides.
February 2013
February 11th, 2013
Direct submission of workflows to PBS
- Glite submission in Condor. We set up a VM that hosts a PBS scheduler and are using that to test
- Karan prepared an example for 4.2 that can be used to submit directly to local PBS using the glite interfaces in Condor
- the remote_initialdir / +remote_iwd does not work
- problem for MPI codes
- for the time being, the example prepared relies on kickstart to change the directory before launching a job
- there is also an SSH style that allows us to use BOSCO to do remote submissions over SSH to a PBS cluster
- that one also has the remote_initialdir issue
- the remote_initialdir / +remote_iwd does not work
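A rough sketch of the workaround idea noted above: change into the intended working directory before launching the job, rather than relying on remote_initialdir. This is not how kickstart itself is implemented; run_in_dir is a hypothetical helper, and the demo command only creates a marker file to show where the child process ran.

```python
import os
import subprocess
import sys
import tempfile


def run_in_dir(workdir, argv):
    """Run argv with workdir as its working directory; return the exit code."""
    completed = subprocess.run(argv, cwd=workdir)
    return completed.returncode


# Demo: the child process creates a file relative to its working directory.
with tempfile.TemporaryDirectory() as scratch:
    code = run_in_dir(
        scratch,
        [sys.executable, "-c", "open('marker.txt', 'w').close()"],
    )
    created = os.path.exists(os.path.join(scratch, "marker.txt"))

print(code, created)
```

For MPI codes the same wrapper idea would need to sit in front of the MPI launcher, which is one reason the notes flag remote_initialdir as a problem there.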
- jobstate.log refactoring.
- data transfer (support for Globus Online)
- lightweight tracing
- task stats via a netlink socket in pegasus-kickstart: how much memory and I/O the task used
- add task stats to kickstart
- ptrace
- trace: Linux equivalent is SystemTap
- dashboard improvements
- single api for clients
- last week drop down
- performance run on large workflows.
February 4th, 2013
- CCGrid / Pegasus Lite Paper
- Performance section
- remove the experiments section?
- OR
- extra experiments section
- have the squid proxy cache
- find a workshop to submit the paper
- Cloud Paper
- Ewa is working on it.
- GitHub Migration
- - a couple of branches, like monitord, pmc and dang, are branches
- - svn will be made read-only.
- - update the website with all the development information
- - bamboo scripts
- - documentation (long term)
- - nightly builds
- SSH Submission
- - gsissh submission for Blue Waters
- - ssh to Blue Waters is required for OTP
- - passing of parameters to PBS
- - SSH key
- - ssh agent.
- - queue keyword
- - Batch session
- - submit jobs to HPCC
- - Gideon will do that.
- monitord memory explosion
- - long term for monitord
- - pegasus-dagman replacement
- minor release 4.2.1
- - potential monitord bug issue
- - long term dagman replacement
- Response time for metrics page
- - occasionally it is slow