April 2013

April 22nd, 2013

Pegasus 4.2.1 Release
  • monitord prescript handling fixed
    • pegasus-analyzer should detect prescript failures, and the prescript exitstatus should be logged in the database
    • pegasus-statistics was updated for the job instance report
  • pegasus planner
    • need to confirm all check-ins are complete
  • do we want to get LIGO to do a test or just release?

Pegasus statistics across workflows - Rajiv

Pegasus Lite Paper

  • Mats will do the runs on Amazon
  • Karan will work on paper when he comes back

pegasus-hold and pegasus-release

  • is there any difference between doing a hold on the DAGMan job directly and on pegasus-dagman?
  • we need to do more investigations on monitord

BOSCO

  • Mats is trying to run on HPCC
  • a single job is running fine.

April 8th, 2013

Pegasus 4.2.1 Release
  • Work on it this week
  • monitord prescript issue to fix
Pegasus 4.3

Pegasus Posters

  • One at XSEDE
  • joint one with BOSCO team

Pegasus Lite Paper

  • Submission to IEEE Big Data

New Programmer Hire

  • expanded posting on confluence
  • New Programmer Hire
  • will send out to HPC Wire, RENCI and USC SC Connect

April 1st, 2013

Pegasus Lite Paper

  • Waiting on Ewa
  • Not much we can do about the IEEE conference. The page limit is 8, which is the current length of the paper.

XSEDE Poster

  • Pegasus Poster. Karan will send update
  • Also a joint Pegasus BOSCO poster
  • Also as part of that we will get the MPI workflows up and running through Pegasus and BOSCO

Pegasus Development

  • Bypass of staging input files for Pegasus Lite Case
  • In-place cleanup bug fixes done.
  • pegasus-s3
    • Gideon checked in changes for copying from one file to another
    • Mats adds a pegasus transfer
  • workflow cleanup nodes
    • separate cleanup node in the workflow
    • for hierarchical workflows we only delete the outermost workflow
    • what happens if no output site is specified?
      • the LIGO case!
  • backward compatibility for LIGO
  • Pegasus Dashboard
    • general javascript updates
  • Generic Pegasus Slides
    • 2-3 slides.



 

March 2013

March 25th, 2013

  • Pegasus Lite Paper Submission
  • pegasus-statistics
    • Waiting on Scott to get back with the list of metrics
    • Rajiv will be working on it
  • pegasus-s3 changes
    • we want to be able to copy output files from one s3 bucket to another
    • requires changes to pegasus-transfer and pegasus-s3
  • final node for cleaning up remote directories
    • also related is getting the cleanup algorithm working when we bypass first level staging.

March 18th, 2013

  • Mats has an RPM almost sorted out for LIGO that does not require us to have PYTHONPATH set. Instead the libraries go into standard locations
  • Karan is testing this RPM on spice-dev1 and has set up a page with instructions on how to submit a test workflow to VIRGO
  • Statistics across root workflows
    • earlier Gaurang had generated statistics for SCEC runs by hand, executing queries on the mysql command line
    • he does not have the queries documented anywhere
    • this is something we have talked about in context of 4.3 with Rajiv
    • will follow up with Scott on Wednesday's call
  • 4.2.1 release
    • backward compatibility for LIGO: still to be done
    • probably next week after the pegasus annual report
    • RPM to handle native python installation
  • Pegasus Annual Report
    • Karan will work on it this week
    • Try to follow the same template as earlier.

March 4th, 2013

  • Sent link on DAGMan metrics (DAGMan Metrics Reporting) to Ewa
  • Metrics for Rob Quick's workflow
  • Gideon pushed out kickstart changes
  • Rajiv has pushed changes to the queries for the dashboard.
  • Setup meeting with Jaime and Derrick at OSG AHM to discuss
    • remote_initialdir
    • extra attributes for glite/bosco submissions
    • mpi workflows.
  • OSG poster and 4.2 release slides to be made this week.

February 2013

February 11th, 2013

Direct submission of workflows to PBS

  • Glite submission in Condor. We set up a VM that hosts a PBS scheduler and are using that to test
  • Karan prepared an example for 4.2 that can be used to submit directly to local PBS using the glite interfaces in Condor
    • remote_initialdir / +remote_iwd does not work
      • problem for MPI codes
      • for the time being, the example prepared relies on kickstart to change the directory before launching a job
    • there is also an SSH style that allows us to use BOSCO to do remote submissions using SSH to a PBS cluster
      • that one also has the issue of remote initialdir
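
The kickstart workaround noted above amounts to changing into the job's working directory before launching the executable. The following is a minimal Python sketch of that idea only; it is not how pegasus-kickstart is implemented, and the helper name run_in_directory and the scratch directory are hypothetical, chosen purely for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_in_directory(command, workdir):
    """Run `command` with `workdir` as its current working directory.

    Hypothetical helper: it stands in for a launcher that sets the
    working directory itself when the scheduler does not honor a
    remote_initialdir-style setting.
    """
    # Create the directory if it does not exist yet.
    os.makedirs(workdir, exist_ok=True)
    result = subprocess.run(
        command,
        cwd=workdir,  # the child process starts in this directory
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Illustrative use: the child prints its own working directory.
    scratch = tempfile.mkdtemp()
    code, out = run_in_directory(
        [sys.executable, "-c", "import os; print(os.getcwd())"], scratch
    )
    print(code, out.strip())
```

Because the directory change happens in the launcher rather than in the submit description, a wrapper of this kind does not depend on the batch system passing the initial directory through.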

  • jobstate.log refactoring
  • data transfer (support for Globus Online)
  • lightweight tracing
    • task stats: netlink socket in pegasus-kickstart, to report how much memory and I/O the task used
    • add task stats to kickstart
    • ptrace
    • trace: the Linux equivalent is SystemTap
  • dashboard improvements
    • single API for clients
    • last week drop-down
    • performance run on large workflows


February 4th, 2013

  • CCGrid / Pegasus Lite Paper
    • Performance section
    • remove the experiments section?
    • OR
    • extra experiments section
    • have the squid proxy cache
    • find a workshop to submit the paper
  • Cloud Paper
    • Ewa is working on it.

  • GitHub Migration
    • a couple of things, like monitord, pmc and dang, are branches
    • svn will be made read-only
    • update the website with all the development information
    • bamboo scripts
    • documentation (long term)
    • nightly builds
  • SSH Submission
    • gsissh submission for Blue Waters
    • ssh to Blue Waters is required for OTP
    • passing of parameters to PBS
    • SSH key
    • ssh agent
    • queue keyword
    • Batch session
    • submit jobs to HPCC
    • Gideon will do that.

  • monitord memory explosion
    • long term for monitord
    • pegasus-dagman replacement

  • minor release 4.2.1
    • potential monitord bug issue
    • long term dagman replacement

  • Response time for metrics page
    • occasionally it is slow