rockstarETL is a one-stop, codeless ETL/ELT orchestration platform.
Pipelines -> Steps -> Jobs
rockstarETL enables you to define pipelines.
A pipeline is a sequence of steps executed in order from Step 1 through to the last step specified.
Each step can have multiple jobs. These will be executed in parallel.
Below, in Step 1, there are two jobs. They will be executed at the same time!
All previous steps have to succeed before subsequent steps are executed.
In the pipeline above, Step 1 must complete successfully before Step 2 executes, and all four jobs in Step 2 must complete successfully before Step 3’s job runs.
This is important for ETL/ELT workloads:
You may want certain jobs to run in parallel. This is possible when the jobs do not depend on each other. (It is also more time-efficient.)
Other steps depend on previous steps having completed successfully first.
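The execution rules above can be illustrated with a short Python sketch. rockstarETL itself is codeless, so this is not its API or its implementation; `run_pipeline`, `make_job` and the job names are hypothetical and exist only to model the described behaviour: jobs within a step run in parallel, and a later step starts only if every job in the earlier step succeeded.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(steps):
    """Run steps in order; run the jobs inside each step in parallel.

    `steps` is a list of steps, and each step is a non-empty list of
    zero-argument callables (the jobs). Returns how many steps succeeded.
    """
    completed = 0
    for jobs in steps:
        # Leaving the `with` block waits until every job in this step is done.
        with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
            futures = [pool.submit(job) for job in jobs]
        # If any job raised, the step failed and later steps never start.
        if any(f.exception() is not None for f in futures):
            break
        completed += 1
    return completed


log = []  # records which jobs actually ran


def make_job(name, fail=False):
    """Build a hypothetical job that either logs its name or raises."""
    def job():
        if fail:
            raise RuntimeError(f"{name} failed")
        log.append(name)
    return job


steps = [
    [make_job("extract_a"), make_job("extract_b")],  # Step 1: two parallel jobs
    [make_job("transform", fail=True)],              # Step 2: fails
    [make_job("load")],                              # Step 3: never runs
]
completed = run_pipeline(steps)
print(completed, sorted(log))  # 1 ['extract_a', 'extract_b']
```

In this sketch both Step 1 jobs run, the Step 2 job fails, and the Step 3 job is never started.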
Multiple pipelines can be defined. For example, one pipeline could run a production ELT process while another serves exploratory, ad-hoc purposes.
Each pipeline is a separate sequence of steps. Each pipeline has its own schedule.
You can schedule a pipeline to run as often as every 5 minutes, provided the pipeline can finish executing within that time frame.
Example: If a pipeline takes 1 hour to complete and you instruct it to run every 30 minutes, it will not be re-executed every 30 minutes. It will effectively skip every second run and run only once per hour (not twice per hour).
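The arithmetic in this example can be checked with a small simulation. This is a sketch only: `actual_start_times` is a hypothetical helper, not part of rockstarETL. It assumes that a scheduled run is skipped whenever the previous run is still in progress, and that a run finishing exactly on a scheduled tick lets that tick start.

```python
def actual_start_times(interval_min, duration_min, horizon_min):
    """Return the scheduled ticks (in minutes) at which a run really starts.

    Assumption: a tick is skipped while the previous run is still executing.
    """
    starts = []
    busy_until = 0  # minute at which the current run finishes
    for tick in range(0, horizon_min, interval_min):
        if tick >= busy_until:
            starts.append(tick)
            busy_until = tick + duration_min
    return starts


# A 60-minute pipeline scheduled every 30 minutes, observed for 4 hours.
print(actual_start_times(30, 60, 240))  # [0, 60, 120, 180]
```

Of the eight scheduled ticks in four hours, only four start a run, which matches the "once per hour" behaviour described above.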