To facilitate the evaluation of workflow algorithms and systems on a range of workflow sizes, we have developed a workflow generator. The generator uses information gathered from actual executions of scientific workflows on the Grid, together with our understanding of the processes behind these workflows, to generate synthetic workflows that resemble those used by real-world scientific applications.
These workflows come from a paper by Bharathi et al.
LIGO Inspiral Analysis
The code used to generate these workflows is available here.
A large collection of DAXes similar to the ones listed above is available here. Note that the collection is about 290 MB.
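A DAX file describes a workflow as an XML document listing jobs and their parent/child dependencies. The following is a minimal sketch of how such a file might be read in Python to evaluate a scheduling-related quantity (here, the critical-path length). It assumes the common DAX layout of `job` elements with `id` and `runtime` attributes and `child` elements containing `parent` references; the embedded sample document, the helper names `load_dax` and `critical_path`, and the toy runtimes are all hypothetical and are not taken from the downloadable collection.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical toy document in a DAX-like layout (not one of the real files).
SAMPLE_DAX = """<adag xmlns="http://pegasus.isi.edu/schema/DAX" name="toy">
  <job id="ID00000" name="split" runtime="2.5"/>
  <job id="ID00001" name="work" runtime="10.0"/>
  <job id="ID00002" name="work" runtime="12.0"/>
  <job id="ID00003" name="merge" runtime="1.5"/>
  <child ref="ID00001"><parent ref="ID00000"/></child>
  <child ref="ID00002"><parent ref="ID00000"/></child>
  <child ref="ID00003"><parent ref="ID00001"/><parent ref="ID00002"/></child>
</adag>"""


def local(tag):
    """Strip an XML namespace prefix such as '{uri}job' down to 'job'."""
    return tag.rsplit("}", 1)[-1]


def load_dax(xml_text):
    """Return (runtimes, parents): job id -> runtime, job id -> set of parent ids."""
    root = ET.fromstring(xml_text)
    runtimes = {}
    parents = defaultdict(set)
    for elem in root:
        kind = local(elem.tag)
        if kind == "job":
            # Assumption: a missing runtime attribute is treated as zero.
            runtimes[elem.get("id")] = float(elem.get("runtime", 0.0))
        elif kind == "child":
            for p in elem:
                if local(p.tag) == "parent":
                    parents[elem.get("ref")].add(p.get("ref"))
    return runtimes, dict(parents)


def critical_path(runtimes, parents):
    """Length of the longest runtime-weighted path, via a topological sweep."""
    children = defaultdict(list)
    pending = {job: len(parents.get(job, ())) for job in runtimes}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)
    ready = [job for job, n in pending.items() if n == 0]
    finish = {}
    while ready:
        job = ready.pop()
        start = max((finish[p] for p in parents.get(job, ())), default=0.0)
        finish[job] = start + runtimes[job]
        for c in children[job]:
            pending[c] -= 1
            if pending[c] == 0:
                ready.append(c)
    if len(finish) != len(runtimes):
        raise ValueError("dependency cycle or reference to an unknown job")
    return max(finish.values(), default=0.0)


runtimes, parents = load_dax(SAMPLE_DAX)
print(len(runtimes), "jobs, critical path:", critical_path(runtimes, parents))
```

To try this on a real file, the text could be read with `open(path).read()` and passed to `load_dax`; attributes other than `id` and `runtime` are simply ignored by this sketch.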
These workflows come from a report by Ramakrishnan and Gannon.
|Workflow Type||Figure in Report||Example||DAX|
|LEAD Mesoscale Meteorology||Figure 1|| ||leadmm.xml|
|LEAD ARPS Data Analysis System||Figure 2|| || |
|LEAD Data Mining Workflow||Figure 3|| ||leaddm.xml|
|Storm Surge SCOOP Workflow||Figure 4|| || |
|Floodplain Mapping||Figure 5|| ||floodplain.xml|
|Motif Network||Figure 8|| || |
|Molecular Sciences||Figure 10|| ||molsci.xml|
|Avian Flu||Figure 11|| || |
|Pan-STARRS Load||Figure 13|| || |
|Pan-STARRS Merge||Figure 14|| || |
The code used to generate the above DAX files was written in Python and can be downloaded here.
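As a rough illustration of what generating a DAX file in Python can look like, the sketch below builds a small fork-join workflow and serializes it as XML. It is not the downloadable generator: the function names `make_fork_join` and `to_xml`, the job names, and the uniformly random runtimes are hypothetical choices made for this example, and the element and attribute names assume the usual DAX layout (`adag`, `job`, `child`, `parent`).

```python
import random
import xml.etree.ElementTree as ET


def make_fork_join(width, seed=0):
    """Build a DAX-like tree: one split job, `width` workers, one merge job."""
    rng = random.Random(seed)  # seeded so the same workflow is reproducible
    adag = ET.Element("adag", name="forkjoin", jobCount=str(width + 2))

    ids = [f"ID{i:05d}" for i in range(width + 2)]
    names = ["split"] + ["work"] * width + ["merge"]
    for job_id, name in zip(ids, names):
        # Assumption: runtimes drawn uniformly from 1 to 60 seconds.
        runtime = f"{rng.uniform(1.0, 60.0):.2f}"
        ET.SubElement(adag, "job", id=job_id, name=name, runtime=runtime)

    split, workers, merge = ids[0], ids[1:-1], ids[-1]
    # Every worker depends on the split job.
    for worker in workers:
        child = ET.SubElement(adag, "child", ref=worker)
        ET.SubElement(child, "parent", ref=split)
    # The merge job depends on every worker.
    child = ET.SubElement(adag, "child", ref=merge)
    for worker in workers:
        ET.SubElement(child, "parent", ref=worker)
    return adag


def to_xml(adag):
    """Serialize the tree to an XML string."""
    return ET.tostring(adag, encoding="unicode")


print(to_xml(make_fork_join(3)))
```

Varying `width` gives workflows of different sizes with the same shape, which is the kind of size sweep the synthetic workflows are meant to support.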
S. Bharathi, A. Chervenak, E. Deelman, G. Mehta, M.-H. Su, and K. Vahi, "Characterization of Scientific Workflows," 3rd Workshop on Workflows in Support of Large Scale Science (WORKS 08), 2008.
L. Ramakrishnan and D. Gannon, "A Survey of Distributed Workflow Characteristics and Resource Requirements," Indiana University Technical Report TR671, 2008.