I want to schedule a simple Dagster job that needs to run every day.
Scheduling a Dagster job to run every day is a straightforward process. Below are the steps to create and schedule a simple Dagster job:
1. Define the Job: First, define the job that you want to run. This involves creating the necessary ops (called solids in older Dagster versions) and then combining them into a job.
2. Create a Schedule: Next, create a schedule that specifies when the job should run.
3. Add the Schedule to the Repository: Finally, add the schedule to your Dagster repository so that it can be picked up by the Dagster scheduler.
Here is an example to illustrate these steps:
from dagster import job, op

@op
def my_simple_op(context):
    context.log.info("Hello, Dagster!")

@job
def my_simple_job():
    my_simple_op()
from dagster import schedule

@schedule(cron_schedule="0 0 * * *", job=my_simple_job, execution_timezone="UTC")
def daily_schedule(_context):
    return {}
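In this example, cron_schedule="0 0 * * *" (minute 0, hour 0, every day of every month) combined with execution_timezone="UTC" specifies that the job should run every day at midnight UTC. The function returns an empty dict, which means the job runs with no extra run config.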
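To make the meaning of "0 0 * * *" concrete, here is a minimal standard-library sketch that computes when such a schedule would next fire. It is only an illustration of the cron semantics, not how Dagster evaluates schedules internally, and the helper name next_daily_midnight_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_daily_midnight_utc(now: datetime) -> datetime:
    """Return the next tick of "0 0 * * *" (UTC) strictly after `now`.

    Hypothetical helper for illustration; `now` should be timezone-aware.
    """
    now_utc = now.astimezone(timezone.utc)
    # Truncate to the start of the current UTC day...
    today_midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    # ...and the next tick is always the start of the following day.
    return today_midnight + timedelta(days=1)


if __name__ == "__main__":
    sample = datetime(2024, 3, 10, 13, 30, tzinfo=timezone.utc)
    print(next_daily_midnight_utc(sample))  # 2024-03-11 00:00:00+00:00
```

Changing the hour field, for example to "0 6 * * *", would move the daily run to 06:00 in the schedule's timezone.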
from dagster import repository

@repository
def my_repository():
    return [my_simple_job, daily_schedule]
Here is the complete code in one file:
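from dagster import job, op, schedule, repository

# An op is the unit of computation; this one only logs a message.
@op
def my_simple_op(context):
    context.log.info("Hello, Dagster!")

# A job wires ops together; here it contains a single op.
@job
def my_simple_job():
    my_simple_op()

# Run the job every day at 00:00 UTC with an empty run config.
@schedule(cron_schedule="0 0 * * *", job=my_simple_job, execution_timezone="UTC")
def daily_schedule(_context):
    return {}

# The repository exposes the job and its schedule to Dagster.
@repository
def my_repository():
    return [my_simple_job, daily_schedule]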
To run the scheduler, you need to start the Dagster Daemon. You can do this by running:
dagster-daemon run
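Make sure that the Dagster Daemon keeps running in the background so it can pick up the schedules and launch the jobs at the specified times. Schedules are stopped by default, so you also need to turn the schedule on (for example, from the Dagster UI) before any runs are launched.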
Adjust the execution_timezone parameter in the @schedule decorator to ensure the job runs at the correct time in your desired timezone.
That's it! Your Dagster job is now scheduled to run every day.
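As a follow-up to the timezone note, the sketch below uses only the Python standard library to show why the choice of timezone matters: midnight in a named zone maps to different UTC hours depending on daylight saving time. The zone "America/New_York" and the helper local_midnight_in_utc are illustrative assumptions, not part of the example above; in Dagster you would pass an IANA name like this one as execution_timezone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs the system tz database


def local_midnight_in_utc(year: int, month: int, day: int, tz_name: str) -> datetime:
    """Convert 00:00 on the given date in `tz_name` to UTC (hypothetical helper)."""
    local_midnight = datetime(year, month, day, 0, 0, tzinfo=ZoneInfo(tz_name))
    return local_midnight.astimezone(timezone.utc)


if __name__ == "__main__":
    # Winter (standard time, UTC-5) versus summer (daylight time, UTC-4).
    print(local_midnight_in_utc(2024, 1, 15, "America/New_York"))  # 2024-01-15 05:00:00+00:00
    print(local_midnight_in_utc(2024, 7, 15, "America/New_York"))  # 2024-07-15 04:00:00+00:00
```

A schedule pinned to UTC would therefore drift by an hour relative to local wall-clock time across the year, while one pinned to the local zone stays at local midnight.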