cylc

cyclical workflow automation

introduction + demo

NOAA EMC, Nov 2017 - Hilary Oliver

outline

a bit of history

NIWA forecasting systems ~2007+

a (non-cyclical) workflow

https://cylc.github.io/cylc/

who's using cylc?

(as of Sept 2016)

Many of these are UM sites, but cylc is not in any way UM-specific.

NIWA (NZ) *
Met Office (UK) *
Max-Planck-Institut für Meteorologie (DE)
Deutsches Klimarechenzentrum (DE)
Bureau of Meteorology (AU) *
NRL Marine Meteorology Division (US)
557th Weather Wing (US) *
Geophysical Fluid Dynamics Laboratory (US)
Meteorological Service Singapore (SG) *
South African Weather Service (ZA) *
National Centre for Medium Range Weather Forecasting (IN) *
Korea Meteorological Administration (KR) *
National Center for Atmospheric Research - NCAR (US)

* used with Rose, a framework for managing meteorological suites.

what's a cyclical workflow?

you might need...

how to manage cyclical workflows

1. statically?

represent each instance of a cyclical job with a different task

static (.webm vid)


2. fixed cycling?

finish each cycle entirely before starting the next cycle


fixed cycling (.webm vid)


3. a single ongoing workflow composed of individually cycling tasks

may continue indefinitely - cylc generates workflow

cycling (.webm vid)
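
A rough way to see the difference between options 2 and 3 when a workflow is catching up from a delay. This is only a sketch: the three-task chain, the run times, and the function name are made up for illustration, resources are assumed unlimited, and none of it is cylc code.

```python
def cycle_finish_times(n_cycles, durations, fixed_cycling):
    """Return the time at which the last task of each cycle finishes.

    Each cycle runs the same chain of tasks, one after another. The first
    task in the chain also waits for its own previous-cycle instance.
    Unlimited compute resources are assumed.
    """
    last = len(durations) - 1
    finish = {}  # (cycle, task index) -> finish time
    for cycle in range(n_cycles):
        for i, duration in enumerate(durations):
            ready = finish[(cycle, i - 1)] if i > 0 else 0
            if cycle > 0 and fixed_cycling:
                # Option 2: wait for the whole previous cycle to finish.
                ready = max(ready, finish[(cycle - 1, last)])
            elif cycle > 0 and i == 0:
                # Option 3: wait only for the same task in the previous cycle.
                ready = max(ready, finish[(cycle - 1, 0)])
            finish[(cycle, i)] = ready + duration
    return [finish[(cycle, last)] for cycle in range(n_cycles)]


durations = [1, 2, 1]  # hypothetical run times for a three-task chain
print(cycle_finish_times(3, durations, fixed_cycling=True))   # [4, 8, 12]
print(cycle_finish_times(3, durations, fixed_cycling=False))  # [4, 5, 6]
```

With fixed cycling each cycle costs the full chain length; with individually cycling tasks the cycles interleave, so later cycles finish much sooner.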


static (.webm vid)

cylc (.webm vid)

cycling, w/ delay (vid.)

cycling (vid.)

cylc system overview

distributed architecture

Supercomputer as fashion accessory! Cray X-MP, early 80s?

user interfaces

cylc-7 daemons are web servers

basic workflow definition

suite (workflow) definition


 [scheduling]
    [[dependencies]]
       graph = "hello => goodbye"
 [runtime]
    [[hello]]
       script = echo "Hello World!"
    [[goodbye]]
       script = goodbye.exe
  


[scheduling]
determines WHEN tasks can run



[runtime]
determines WHAT to run
(and WHERE and HOW to run it - not shown)


# Hello World! 1
[scheduling]
   [[dependencies]]
      graph = "hello"
[runtime]
   [[hello]]
      script = "echo Hello World!"  # <-- inlined scripting

(what this does...)


# Hello World! 2
[scheduling]
   [[dependencies]]
      graph = "hello"
[runtime]
   [[hello]]
      script = hello-world.sh  # <-- external script or program

(what this does...)

 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = "hello => goodbye & farewell"
 [runtime]
    [[hello]]
       script = echo "Hello World!"
    [[goodbye]]
       script = echo "Goodbye World!"
    [[farewell]]
       script = echo "Farewell World!"
 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = """
               hello => goodbye
               hello => farewell
               """
 # ...
 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = """# a comment
               hello => goodbye
                  # another comment
               hello => farewell
               """
 # ...

task trigger states

hello => goodbye
is short for:
hello:succeed => goodbye

but we can trigger off other task states too, e.g.:
hello:submit
hello:start
hello:fail
hello:finish
hello:file1_done
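
To make the shorthand concrete, here is a minimal sketch that turns a graph string into explicit (upstream, trigger, downstream) edges, filling in "succeed" where no trigger is given. It is an illustration only, not cylc's parser: the function name is invented, and it ignores "|", parentheses, and cycle offsets.

```python
def parse_graph(graph):
    """Return a sorted list of (upstream, trigger, downstream) edges."""
    edges = set()
    for line in graph.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # A chain "a => b => c" is a sequence of pairwise dependencies.
        groups = [group.strip() for group in line.split("=>")]
        for left, right in zip(groups, groups[1:]):
            for upstream in left.split("&"):
                name, _, trigger = upstream.strip().partition(":")
                trigger = trigger or "succeed"  # the default trigger
                for downstream in right.split("&"):
                    target = downstream.strip().partition(":")[0]
                    edges.add((name, trigger, target))
    return sorted(edges)


one_line = parse_graph("hello => goodbye & farewell")
two_lines = parse_graph("""# a comment
    hello => goodbye
       # another comment
    hello => farewell
    """)
print(one_line == two_lines)
print(parse_graph("hello:fail => goodbye"))
```

The one-line and multi-line forms of the graph give the same edges, and "hello:fail => goodbye" keeps its explicit trigger.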

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = echo "Hello World!"
 # ...

inlined scripting

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = hello-world.sh
 # ...

external script or program...

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = hello.sh "World" 10
 # ...

...with command line arguments...

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = hello.sh  # says 'Hello' to $OBJECT
       [[[environment]]]
          OBJECT = World  # set $OBJECT to 'World'
 # ...

... with a custom environment...

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = """
           echo 'Hello World!' > hello.txt
           cat farewell.txt
                """
 # ...

... or multi-lined scripting to do whatever you like

 # Hello World! Plus

 # ...
 [runtime]
    [[hello]]
       script = echo "Hello World!"
       [[[remote]]]
          host = hpc-1.niwa.co.nz
       [[[job]]]
          batch system = pbs
          execution time limit = PT1H 
       [[[directives]]] 
          -q = big_jobs 
          -A = QXZ5W2
 # ...

(plus WHERE and HOW to run the task job. Default is background job on localhost)

 # Hello World! Plus

 # ...
 [runtime]
    [[goodbye]]
       script = echo "Goodbye World!"
    [[farewell]]
       script = echo "Farewell World!"
 # ...
 # Hello World! Plus

 # ...
 [runtime]
    [[goodbye]]
       script = echo "Goodbye ${OBJECT}!"
       [[[environment]]]
          OBJECT = World 
    [[farewell]]
       script = echo "Farewell World!"
 # ...
 # Hello World! Plus

 # ...
 [runtime]
    [[goodbye]]
       script = echo "Goodbye ${OBJECT}!"
       [[[environment]]]
          OBJECT = World  # <---
    [[farewell]]
       script = echo "Farewell ${OBJECT}!"
       [[[environment]]]
          OBJECT = World  # <---
 # ...

if tasks share any configuration...

 # Hello World! Plus

 # ...
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World  # <--
    [[goodbye]]
       inherit = BYE_ALL
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL
       script = echo "Farewell ${OBJECT}!"
 # ...

...factor it out into a task family

 # Hello World! Plus

 # ...
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World
    [[goodbye]]
       inherit = BYE_ALL, HPC-1
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL, HPC-2
       script = echo "Farewell ${OBJECT}!"
 # ...
    [[HPC-1]]
       # ...
    [[HPC-2]]
       # ...

multiple inheritance - avoid all duplication
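
One way to picture what inheritance does: each task ends up with a flattened configuration built from its parents plus its own settings. The sketch below is a simplification and not cylc's implementation. It assumes the first-listed parent wins on conflicts and the task's own settings override everything; the function names and the host value are hypothetical.

```python
def deep_update(target, source):
    """Recursively copy settings from source into target."""
    for key, value in source.items():
        if isinstance(value, dict):
            deep_update(target.setdefault(key, {}), value)
        else:
            target[key] = value


def resolve(name, runtime):
    """Flatten one runtime namespace by merging in its parents' settings."""
    own = runtime[name]
    merged = {}
    # Apply parents right to left so the first-listed parent wins conflicts.
    for parent in reversed(own.get("inherit", [])):
        deep_update(merged, resolve(parent, runtime))
    # The task's own settings override anything inherited.
    deep_update(merged, {k: v for k, v in own.items() if k != "inherit"})
    return merged


runtime = {
    "BYE_ALL": {"environment": {"OBJECT": "World"}},
    "HPC-1": {"remote": {"host": "hpc-1.example"}},  # hypothetical host
    "goodbye": {
        "inherit": ["BYE_ALL", "HPC-1"],
        "script": 'echo "Goodbye ${OBJECT}!"',
    },
}
print(resolve("goodbye", runtime))
```

The goodbye task gets its environment from BYE_ALL and its host from HPC-1 without either being repeated in the task itself.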

 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = hello => goodbye & farewell
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World
    [[goodbye]]
       inherit = BYE_ALL
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL
       script = echo "Farewell ${OBJECT}!"
 # ...
 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = hello => BYE_ALL
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World
    [[goodbye]]
       inherit = BYE_ALL
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL
       script = echo "Farewell ${OBJECT}!"
 # ...
 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = "hello => BYE_ALL => leave"
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World
    [[goodbye]]
       inherit = BYE_ALL
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL
       script = echo "Farewell ${OBJECT}!"
 # ...
 # Hello World! Plus

 [scheduling]
    [[dependencies]]
       graph = """hello => BYE_ALL
                  BYE_ALL:succeed-all => leave"""
 [runtime]
    [[BYE_ALL]]
       [[[environment]]]
          OBJECT = World
    [[goodbye]]
       inherit = BYE_ALL
       script = echo "Goodbye ${OBJECT}!"
    [[farewell]]
       inherit = BYE_ALL
       script = echo "Farewell ${OBJECT}!"
 # ...
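
A family trigger such as BYE_ALL:succeed-all can be read as a condition over every member of the family. A minimal sketch of that reading follows; the function name is invented, and it assumes "-all" joins member triggers with "&" and "-any" joins them with "|".

```python
def expand_family_trigger(trigger, families):
    """Rewrite 'FAMILY:state-all' or 'FAMILY:state-any' in terms of members."""
    name, _, qualifier = trigger.partition(":")
    state, _, scope = qualifier.partition("-")
    # "-all" means every member must reach the state, "-any" means at least one.
    operator = " & " if scope == "all" else " | "
    return operator.join(f"{member}:{state}" for member in families[name])


families = {"BYE_ALL": ["goodbye", "farewell"]}
print(expand_family_trigger("BYE_ALL:succeed-all", families))
print(expand_family_trigger("BYE_ALL:succeed-any", families))
```

So "BYE_ALL:succeed-all => leave" makes leave wait for both goodbye and farewell to succeed.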
 # Hello World! Plus


 [scheduling]
    [[dependencies]]
       graph = """hello

           => goodbye & farewell
          
               """
 [runtime]
    [[hello]]
       script = echo "Hello World!"
    [[goodbye]]
       script = echo "Goodbye World!"
    [[farewell]]
       script = echo "Farewell World!"
 #!Jinja2
 {% set SAY_BYE = true %}

 [scheduling]
    [[dependencies]]
       graph = """hello
 {% if SAY_BYE %} 
           => goodbye & farewell
 {% endif %}
               """
 [runtime]
    [[hello]]
       script = echo "Hello World!"
    [[goodbye]]
       script = echo "Goodbye World!"
    [[farewell]]
       script = echo "Farewell World!"
 #!Jinja2
 {% set SAY_BYE = true %}
 {% set FWTASK = 'farewell' %}
 [scheduling]
    [[dependencies]]
       graph = """hello
 {% if SAY_BYE %} 
           => goodbye & {{FWTASK}}
 {% endif %}
               """
 [runtime]
    [[hello]]
       script = echo "Hello World!"
    [[goodbye]]
       script = echo "Goodbye World!"
    [[farewell]]
       script = echo "Farewell World!"

(also loop constructs, etc.)

families (inheritance)


[scheduling]
   [[dependencies]]
      graph = """
          pre => ENSEMBLE:succeed-all => post
                model_0 => check
              """
[runtime]
   [[ENSEMBLE]]
      # (all shared config here) 
   [[model_0, model_1, model_2]]
      # (note could generate these member tasks automatically!)
      inherit = ENSEMBLE

dependency graph: $ cylc graph SUITE

inheritance graph: $ cylc graph -n SUITE

parameterized tasks

graph = "pre => sim => post => done"
graph = "pre => sim<m> => post<m> => done"  # m = 1..5
graph = "prep => init => sim => post => close => done"
graph = "prep => init<r> => sim<r,m> => post<r,m> => close<r> => done"
# for:    r = 1..3    and:    m = a, b, c
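
To show what a parameterized name stands for, the sketch below expands sim<r,m> into one concrete task per parameter combination. It is not cylc code: the function name is hypothetical, and the suffix scheme (integer values prefixed with the parameter name, string values used as-is) is an assumption of this sketch.

```python
from itertools import product


def expand(task, params):
    """Expand 'sim<r,m>' into one task name per parameter combination."""
    if "<" not in task:
        return [task]  # not parameterized
    base, _, rest = task.partition("<")
    names = [name.strip() for name in rest.rstrip(">").split(",")]
    expanded = []
    for values in product(*(params[name] for name in names)):
        # Assumed naming: "_r1" for integer values, "_a" for string values.
        suffix = "".join(
            f"_{name}{value}" if isinstance(value, int) else f"_{value}"
            for name, value in zip(names, values)
        )
        expanded.append(base + suffix)
    return expanded


params = {"r": [1, 2, 3], "m": ["a", "b", "c"]}
print(expand("init<r>", params))        # ['init_r1', 'init_r2', 'init_r3']
print(len(expand("sim<r,m>", params)))  # 9
```

One graph line with r = 1..3 and m = a, b, c therefore stands for three init tasks and nine sim tasks, each sim depending on the init with the same r.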

cycling configuration

via ISO 8601 date-time (and integer) recurrence expressions

 [scheduling]
    [[dependencies]]
        graph = "foo => bar & baz => qux"


 [scheduling]
    initial cycle point = 2010-01  # <---
    [[dependencies]]
       [[[R/^/P1M]]]               # <---
          graph = """foo => bar & baz => qux"""


 [scheduling]
    initial cycle point = 2010-01
    [[dependencies]]
       [[[R1/^]]]               # <---
          graph = prep => foo   # <---
       [[[R/^/P1M]]]
          graph = """foo => bar & baz => qux"""

 [scheduling]
    initial cycle point = 2010-01
    [[dependencies]]
       [[[R1/^]]]
          graph = prep => foo
       [[[R/^/P1M]]]
          graph = """foo => bar & baz => qux
                     foo[-P1M] => foo"""  # <---
 [scheduling]
    initial cycle point = 2010-01
    [[dependencies]]
       [[[R1/^]]]
          graph = prep => foo
       [[[R/^/P1M]]]
          graph = """foo => bar & baz => qux
                     foo[-P1M] => foo"""
       [[[R2/^+P2M/P1M]]]                 # <---
          graph = baz & qux[-P2M] => boo  # <---
 [scheduling]
    initial cycle point = 2010-01
    [[dependencies]]
       [[[R1/^]]]  # (or just R1)
          graph = prep => foo
       [[[R/^/P1M]]]  # (or just P1M)
          graph = """foo => bar & baz => qux
                     foo[-P1M] => foo"""
       [[[R2/^+P2M/P1M]]]
          graph = baz & qux[-P2M] => boo
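
To check which cycle points the three recurrences above select, here is a small sketch that steps through months from the initial cycle point 2010-01. It only handles whole-month offsets and periods, and the helper names are made up; it is not how cylc evaluates ISO 8601 expressions.

```python
from itertools import islice


def add_months(point, months):
    """Shift a (year, month) pair by a number of months."""
    year, month = point
    index = year * 12 + (month - 1) + months
    return (index // 12, index % 12 + 1)


def cycle_points(initial, offset=0, period=1):
    """Yield initial + offset, then one point every `period` months."""
    point = add_months(initial, offset)
    while True:
        yield point
        point = add_months(point, period)


initial = (2010, 1)  # initial cycle point = 2010-01
once = list(islice(cycle_points(initial), 1))                   # R1/^
monthly = list(islice(cycle_points(initial), 4))                # R/^/P1M, first four
two_later = list(islice(cycle_points(initial, offset=2), 2))    # R2/^+P2M/P1M

print(once)       # [(2010, 1)]
print(monthly)    # [(2010, 1), (2010, 2), (2010, 3), (2010, 4)]
print(two_later)  # [(2010, 3), (2010, 4)]
```

So prep runs once at 2010-01, the main graph repeats every month, and boo runs only at 2010-03 and 2010-04, where qux[-P2M] refers to qux from 2010-01 and 2010-02.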

other features

demo...

(note: two more slides to come)

cylc workflow management

we recommend Rose for this https://github.com/metomi/rose
Rose has several distinct aspects that can be used separately:

acknowledgements

thanks to all cylc code contributors, particularly the Modelling Infrastructure Support Systems team at the UK Met Office

git shortlog -s -n
Hilary Oliver, Matt Shin, Ben Fitzpatrick, Andrew Clark, Oliver Sanders, Declan Valters, Luis Kornblueh, Kerry Day, Prasanna Challuri, Tim Whitcomb, David Matthews, Scott Wales, Bruno P. Kinoshita, Annette Osprey, Jonathan Thomas, Rosalyn Hatcher, Domingo Manubens Gil, Jonny Williams, Milton Woods, Alex Reinecke, Tomek Trzeciak, Chandin Wilson, Kevin Pulo, Martin Dix, Sadie Bartholomew



end