Changes:
job.uses(file, link=Link.INPUT)
is now job.uses(file, type=LinkType.INPUT)
job.uses(file, type=LinkType.INPUT, transfer=True, register=True)
is now job.uses(file, type=LinkType.INPUT, stage_out=True, register_replica=True)
add_PFN(PFN("uri", "site_name"))
is now just add_pfn(*uris, site="local")
add_profile(Profile(namespace, key, value))
is now just add_profile(namespace, key, value)
job.uses(....)
is now split up into job.add_inputs(*inputs)
and job.add_outputs(*outputs, stage_out=True, register_replica=True). stage_out and register_replica apply to every file in *outputs, so add_outputs can be called multiple times when different subsets of the outputs need different values.
transformation.uses(exec)
is now transformation.uses(*execs)
obj.metadata(key, value)
is now obj.add_metadata(key, value)
dag.depends(parent=some_parent, child=some_child)
is now dag.add_dependency(parent, *children)
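To make the variadic signatures above concrete, here is a minimal sketch of how add_inputs, add_outputs and add_dependency could behave. The Job and Workflow classes below are hypothetical stand-ins written only for illustration (they are not the real DAX classes), and the file names are made up. The point is that the keyword flags of each add_outputs call apply to every file passed in that call.

```python
from collections import defaultdict


class Job:
    """Hypothetical stand-in for a job; not the real DAX class."""

    def __init__(self, name):
        self.name = name
        self.inputs = []
        # Maps each output file to the flags of the call that added it.
        self.outputs = {}

    def add_inputs(self, *inputs):
        self.inputs.extend(inputs)
        return self

    def add_outputs(self, *outputs, stage_out=True, register_replica=True):
        # The same flag values are recorded for every file in this call.
        for f in outputs:
            self.outputs[f] = {
                "stage_out": stage_out,
                "register_replica": register_replica,
            }
        return self


class Workflow:
    """Hypothetical stand-in that only tracks parent -> children edges."""

    def __init__(self):
        self.children = defaultdict(set)

    def add_dependency(self, parent, *children):
        self.children[parent.name].update(c.name for c in children)
        return self


job = Job("preprocess").add_inputs("a.txt")
# Two calls, because the two subsets of outputs need different flags.
job.add_outputs("b1.txt", "b2.txt")
job.add_outputs("scratch.tmp", stage_out=False, register_replica=False)

wf = Workflow().add_dependency(job, Job("analyze"), Job("plot"))
```

One call per distinct flag combination keeps the signature flat, at the cost of repeating the call when outputs differ.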
Given what we have in DAX4 (middle column below), we could rewrite it to be something like what we have in the rightmost column below. Functionally it is the same. At least in this example, it looks a little less cluttered and is simpler to read than the middle column, while saving a few lines in the process.
'''
Trying to be declarative, but then workflow.add(...) might need to be
workflow.jobs(Job(...)) to be consistent. Don't like how the function
names aren't verbs.
'''
job = workflow.add(Job(transformation, arguments))\
    .profiles(namespace, "key", "value")\
    .metadata({"time": "60"})\
    .metadata({"key": "value"})\
    .hooks(ShellHook(args))\
    .inputs(a)\
    .outputs(b1, b2, stage_out=True)

'''
Verbose, explicitly stating operations on job. The following two
assignments are the same.
'''
job = workflow.add_job(transformation, arguments...)\
    .add_profile(namespace, "key", "value")\
    .add_metadata(**{"time": "60", "key": "value"})\
    .add_shell_hook(args)\
    .add_inputs(a)\
    .add_outputs(b1, b2, stage_out=True)

job = workflow.add_job(Job(transformation, args))\
    .add_profile(Profile(namespace, "key", "value"))\
    .add_metadata(*[Metadata("time", "60"), Metadata("key", "value")])\
    .add_shell_hook(args)\
    .add_inputs(a)\
    .add_outputs(b1, b2, stage_out=True)

'''
List/Set semantics. Can't chain, as job.<member>.add(something) won't
return a ref to the job (unless we make it, but that would look awkward).
'''
job = workflow.add(Job(transformation, arguments))
job.profiles.add(namespace, "key", "value")
job.metadata.add({"time": "60"})
job.metadata.add({"key": "value"})
job.hooks.add(ShellHook(args))
job.inputs.add(a)
job.outputs.add(b1, b2, stage_out=True)
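The chaining trade-off between the verbose style and the list/set style can be sketched as follows. The classes are hypothetical and exist only to illustrate the mechanism: a method on the job can return self and so supports chaining, whereas add() on a member container returns nothing that leads back to the job.

```python
class MetadataSet:
    """Hypothetical container backing job.metadata in the list/set style."""

    def __init__(self):
        self.items = {}

    def add(self, mapping):
        self.items.update(mapping)
        # Implicitly returns None: the container holds no reference to
        # the job, so there is nothing useful to chain on.


class Job:
    """Hypothetical job offering both styles side by side."""

    def __init__(self, transformation):
        self.transformation = transformation
        self.metadata = MetadataSet()
        self.inputs = []

    def add_metadata(self, **pairs):
        self.metadata.items.update(pairs)
        return self  # returning self is what makes chaining possible

    def add_inputs(self, *files):
        self.inputs.extend(files)
        return self


# Verbose style: every call returns the job, so calls can be chained.
chained = Job("t").add_metadata(time="60", key="value").add_inputs("a")

# List/set style: each statement must start again from the job.
plain = Job("t")
result = plain.metadata.add({"time": "60"})
plain.metadata.add({"key": "value"})
```

Making MetadataSet.add return the owning job would restore chaining, but the container would then need a back-reference to its job, which is the awkwardness noted above.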