PushMetrics allows you to develop even complex reporting workflows in a simple, straightforward way - all from inside your notebooks.
For us, a (data) workflow means an executable process that performs one or many different tasks in a specific order.
In most cases, this means getting data from someplace, manipulating it, and then - maybe only under certain conditions - sending the data somewhere else.
You might also call this a data pipeline or DAG (Directed Acyclic Graph).
Here's a simple example:
In PushMetrics, every notebook that contains at least one executable block (that is, a SQL query, API call, or message block) is a workflow. You don't need to turn a notebook into a workflow or build workflows in a different place - notebooks are compiled into executable DAGs automatically.
The example graph above could look like this in a notebook:
- a SQL query
- an Email block, using data from the SQL query
On execution, PushMetrics detects that the email needs the results of the query, so it runs the query first and only sends the email afterwards. The order of the blocks on the page is irrelevant.
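How could such dependency detection work? Jinja's own parser can report every top-level name a template references. The sketch below is an illustration of the general technique, not PushMetrics internals; the email body and the `query_1` name are made up.

```python
# Sketch: discovering which blocks a template depends on,
# using Jinja's parser. The email body below is hypothetical.
from jinja2 import Environment, meta

email_body = "Daily revenue was {{ query_1.data()[0]['revenue'] }}."

env = Environment()
ast = env.parse(email_body)

# find_undeclared_variables returns every top-level name the template
# uses - i.e. the blocks this block depends on.
dependencies = meta.find_undeclared_variables(ast)
print(dependencies)  # {'query_1'}
```

With the dependencies of every block known, the full execution graph can be assembled automatically.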
DAG is an acronym for Directed Acyclic Graph, which is a fancy way to say that a workflow runs its tasks in a defined direction and contains no circular dependencies.
The most common example of a DAG is probably a simple spreadsheet: cells can reference each other, but not in a circular way. The spreadsheet then figures out the order in which to calculate all the cells.
Translated to a PushMetrics notebook, this means: if results of `query_1` are used in `query_2`, you can't use results from `query_2` in `query_1`.

The good news is that you don't really need to know or worry about this when working in PushMetrics. Compiling the notebook into a DAG and executing it in the right order happens automatically in the background.
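To make the "right order" concrete, here is a minimal sketch of dependency-ordered execution using a standard topological sort (via Python's stdlib `graphlib`). This is an illustration only, not how PushMetrics is implemented; the block names are invented.

```python
# Minimal sketch: executing blocks in dependency order via a
# topological sort. Block names are hypothetical.
from graphlib import TopologicalSorter

# Each block maps to the set of blocks it depends on.
blocks = {
    "query_1": set(),
    "query_2": {"query_1"},  # query_2 references {{ query_1 }}
    "email_1": {"query_2"},  # the email references {{ query_2.data() }}
}

order = list(TopologicalSorter(blocks).static_order())
print(order)  # ['query_1', 'query_2', 'email_1']
```

A cycle (e.g. `query_1` also depending on `query_2`) would make `static_order()` raise a `CycleError` - exactly the "acyclic" constraint described above.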
In order to build workflows in PushMetrics, we need to define tasks, the dependencies between them, and the execution logic.
In PushMetrics, tasks are defined by blocks in the notebook, while dependencies and execution logic are specified using Jinja templating syntax.
At this point, PushMetrics supports the following executable tasks:

- SQL queries
- API calls
- Message blocks (e.g. email)
You can add such a task simply by adding a block of that type to a notebook.
Task dependencies are created when data from one block is used in another block.
For example:
- A parameter `parameter_1` can be referenced in a SQL query like this: `{{ parameter_1 }}`
- Results of a query `query_1` can be referenced in another query like this: `{{ query_1 }}`
- Results of `query_1` can also be referenced like this: `{{ query_1.data() }}` (which returns a JSON representation of the results table)
- A single value from a column `example` in the results of `query_1` would be called like this: `{{ query_1.data()[0]['example'] }}`
- The data returned by an API call `api_call_1` can be referenced like this: `{{ api_call_1.data() }}`
Execution logic is expressed using Jinja syntax, simply by wrapping the executable blocks inside Jinja expressions.
For example:
{% if some_condition == true %}
do this
{% else %}
do that
{% endif %}
{% for i in range(0,10) %}
do a task using {{ i }}
{% endfor %}
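Both control-flow patterns above are standard Jinja and can be verified with the jinja2 library directly (the template bodies here are simplified stand-ins for executable blocks):

```python
# Runnable sketch of the if/else and for-loop patterns in plain Jinja.
from jinja2 import Template

conditional = Template(
    "{% if some_condition %}do this{% else %}do that{% endif %}"
)
print(conditional.render(some_condition=True))   # do this
print(conditional.render(some_condition=False))  # do that

loop = Template("{% for i in range(0, 3) %}task {{ i }};{% endfor %}")
print(loop.render())  # task 0;task 1;task 2;
```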
In a notebook, it could look like this: