April 2019
April 12th 2019
- Pegasus 5.0
- Site Catalog Conversion to YAML
- Mukund is mostly done
- pushed out his changes
- trying to make the tests green
- Checkpointing changes to accommodate LIGO use of vanilla universe
- Karan and Mats will explore and see if it is possible
- cumulative stdout|stderr
- what about time and duration values
- since there is no DAG node retry and the job just goes into the HELD state
- Composite Events
- Kibana dashboard needs to be updated
- dropping __ in the event names
- George wants the AMQP library updated
- Will create a JIRA item
- Office Hours video
- Karan will work with Jasmine to upload the video
- Papers
- RACE Paper submitted last week
- PEARC Paper this week
- Proposals
- Army Research
- enabling in-situ support for exascale
- linked with what Tu is doing
- SCEC Proposal Submitted
- have a good chance
- Exascale one with Michigan
- the call will come out soon
- Ewa, Rafael and Deborah
- NSF GCR Proposal
- Modelling wildfires
- Has PRICE school input and also Deborah's postdoc
- EScience
- Pegasus Tutorial Proposal
- May 6, 2019: Tutorial Proposal Deadline
- Also trying for the workflow comparison paper
- Dynamo paper by George
- Pegasus connect discussion
- tabled it for later when Mats is around
- HTCondor Week
- Karan will be doing a Pegasus talk and Pegasus workshop
- Pegasus OLCF Poster
- combine the panda poster
- can also submit to EScience
- Ryan's work
- Loic is moving the Pachyderm setup to AWS
- Loic, Rafael and Tu are working on a paper for Cluster
- Software X
March 2019
March 29th 2019
- 4.9.1 Release
- done and working on 4.9.2
- Site Catalog Conversion to YAML
- Mukund is working on it
- I still need to look at the bamboo tests
- bamboo is failing on the Condor mount scratch issue
- we have to fix this in Pegasus also, to fail on credentials in /tmp
- check and do condor_config_val on the key and check if /tmp is in there
- mainly affects all the users that use x509
- LIGO has also tripped over it, both with Pegasus and without Pegasus
- Condor vanilla checkpointing
- Karan asked him what he is trying to do
- composite events
- check for keys with same values
- also do we need to pad extra keys for all events?
- Extensions to Jupyter Integration
- Pegasus Connect
- will discuss on whiteboard on April 12th
March 1st 2019
- 4.9.1 Release
- moving it to early next week
- Pending Issues
- https://jira.isi.edu/projects/PM/versions/11891
- Execution environment for titan
- service dependencies
- pyOpenSSL
- Rajiv Mayani, please look at that and the Flask dependencies
- HPSS transfer client incorporation
- Set the transfers to do remotely
- Office Hours
- On Friday March 22nd on real time monitoring
- transformation catalog for 5.0
- Mukund will work on it next
- EScience?
- Paper
- pegasus-exitcode test
- success message not parsed correctly
- Programmer
- will interview the
February 2019
February 22nd 2019
- 4.9.1 Release
- Pending Issues
- https://jira.isi.edu/projects/PM/versions/11891
This raises the larger issue of how long we want to support externals packages;
there are some packages we need to ship because of worker package dependencies.
Consensus:
We remove the mysql python externals package for 4.9.1 and 5.0.0, and also remove the dependencies from our deb and RPM builds.
- Transfers within containers
- We are only going to transfer from within the container till people complain
- George Papadimitriou will add to the documentation.
- non-ASCII encoding in the stdout
- Support HPSS storage
The tools we use are htar and hsi
https://docs.nersc.gov/filesystems/archive/
- Office Hours
- George on real time monitoring.
- Date?
- EScience?
- Paper
- Tutorial submission
February 1st 2019
- 4.9.1 Release
- ASCII encoding breaks while parsing monitoring events. The monitor should keep the population working and log a warning.
- but we should ensure that stdout in database still gets populated
- Karan will fix this
- New TC Format
- Shifter Support in Pegasus
- is in 4.9 branch
- Pegasus Annual Report
- will be working on it in coming weeks
- will ask for input
- next year's report will be tricky in terms of effort allocation.
January 2019
January 25th 2019
- 4.9.1 Release
- ASCII encoding breaks while parsing monitoring events. The monitor should keep the population working and log a warning.
- but we should ensure that stdout in database still gets populated
- YAML format for the TC
- the line numbers should be mentioned in the errors
- GitHub commits don't trigger bamboo builds right now
- move to webhooks?
- Slack token in bamboo.yml
- Mats will look into it further
- SCEC for HPC Transfer certificate issue
- Globus Online certificates messed up hpc-transfer.
- Data Storage at NERSC
- almost full
- Singularity container with the entry point.
- Docker → Singularity container conversion does not add the entry point.
January 18th 2019
- 4.9.1
- container execution
- data transfers happen within the container
- python3 issue
- vague rules to discover what python to use
- Singularity Hub URLs updated
- Documentation and tutorials need to be updated
- montage examples
- python stuff: create JIRA item
- LIGO pull requests
- Build pull request
- PAM module
- subprocess package thing
- also related to Python3 movement
- Transformation Catalog Implementation
- Astro Py
- Shifter support at NERSC
- Panda Integration
- XENONnT
- Rucio data pull-in
- fetching data might be easier
- Journal Paper
- need to write something about containers
December 2018
December 13th, 2018
- Pegasus 4.9.1 release
- local site catalog entry creation
- based on the pegasus version on the submit host
- encoding issue in the stdout.
- Pegasus 5.0 Release
- TC YAML implementation
- Mukund will create a YAML schema compatible with the TC
- backwards compatibility
- case by case basis
- definitely for
- catalogs
- dax
- pegasus-transfer
- SWIP Paper
- we are in good shape
- Titan
- under the PBS batch gahp.
- ZTF
- the pipeline is based on docker-compose
- Peter will visit ISI with postdoc Danny in January
- Tutorial at TACC
- Karan has updated pegasus-init to work on Wrangler
- will update the tutorial notes accordingly
- OLCF accounts
- make sure they work
- make sure Karan and Mats can log in
November 2018
Nov 29th, 2018
- Ryan
- working on a comparison paper with George on workflow systems
- Mats and Karan shared neon meeting notes with Ryan
- Pegasus 4.9.1 release
- Due at the end of December
- potential issue in monitord in reference to hierarchical organization of submit directories
- pegasus-submitdir
- ADASS Paper
- due tomorrow
- need to add information about sample run
- SWIP paper
- Mats and Karan will work on it tomorrow afternoon.
- cull out sections
- add information about updated monitoring in 4.9
- OLCF Kubernetes
- Condor is installed and configured as root
- George tried pointing the Condor log directory to Lustre, as Condor in the container has to run as a user, not as root
- LOG_DIR should be /tmp
- volumes can be attached to container to contain workflows etc
- Dynamo
- Do dynamic scheduling
- George thinking of using flocking
- similar to what is done in OSG
- non-sharedfs deployments should work
Nov 1st, 2018
- Pegasus 4.9.0 and 4.8.5 Released
- We released them this week.
- Pegasus Business Card
- Advocate for job postings.
- Postdoc options
- Programmers
- pegasus.isi.edu/jobs
- We should take to conferences with us
- Pegasus Java 8 dependency in RPM
- there is a disconnect between RPM and common.sh
- ADASS
- Karan working on a wlpipe demo example
- New Student
- Mukund
- Duncan started using 4.9.0 and has updated pyCBC to use singularity
- changed our container execution model
- all transfers done within the container now.
October 2018
Oct 12th, 2018
- Rescheduling meetings
- New time is Thursdays 2PM starting from last week of October
- DAX API reporting
- Perl DAX API - Rajiv
- Atlas visit
- Wednesday we have Scientific Computing Seminar
- Will involve writing a Pegasus code generator
- Panda is second biggest after Condor on OSG
- Thursday
- Karan and George will be there.
- Mats might be available remotely
- 4.9.0 Release
- Mats' preference is to skip the beta tag
- Aim for the full release
- Documentation freeze on Oct 26th
- Try and do the builds over the weekend
- Duncan container usecase
- cvmfs hosted container images
- Demo repository
- panorama data and some runs from exogeni / nersc
- Mats has two new Elasticsearch VMs that are part of the Elasticsearch cluster
- the data on these VMs is also backed up
Oct 5th, 2018
- Rescheduling meetings
- Either Tuesdays or Thursdays
- Karan will circulate a Doodle poll
September 2018
September 28th, 2018
- Rescheduling meetings
- Either Tuesdays or Thursdays
- Karan will circulate a Doodle poll
- Pegasus 4.9.0 Release
- transformation selection issue
- Karan has not been able to recreate it yet.
- will look into it more today
- Docker Singularity pulls
- container symlink
- deprecate APIs
- modify DAX generators to indicate version/ DAX API used.
- will look into ways on how to do it
- one way is workflow metadata attributes
- second is attribute to ADAG object.
- Rajiv will check how it gets stored in the metrics server
- ADASS
- will try and do a poster with Mike at ADASS
- deadline is Oct 8th
September 21st, 2018
- Rescheduling meetings
- Either Tuesdays or Thursdays
- Pegasus 4.9 release
- integrity error reporting
- pegasus-statistics reporting information about integrity errors
- the unicorn dashboard for internal SWIP purposes
- errors are appearing in the stream
- more brainstorming required; the data is there
- not clear whether to use Grafana or Kibana
- does not have drill down functionality
- mix of production and test workflows
- create different queues in AMQP exchanges
- container mount point support
- Karan is close to having that implemented
- transferring outputs to multiple locations
- let's say one for portal and the other for
- list of output sites
- good feature to add for 4.9.1
- update --output-site option to pegasus-plan
- pull Docker images for Singularity runs
- we should do for 4.9.0
- planner needs to tell pegasus-transfer an extra attribute.
- add a type attribute
- Papers
- GitHub private papers repo
- Deprecate stuff
- Perl API
- old catalog formats
- pegasus-plots
- Hiring
August 2018
August 24th, 2018
...