A KubernetesPodOperator is a type of operator in Apache Airflow that allows you to launch a Kubernetes pod as a task in an Airflow workflow. This can be useful if you want to run a containerized workload as part of your pipeline, or if you want to use the power of Kubernetes to manage the resources and scheduling of your tasks.
Here is an example of how you might use a KubernetesPodOperator in an Airflow DAG:
from datetime import timedelta

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'me',
    'start_date': days_ago(2),
}

dag = DAG(
    'kubernetes_sample',
    default_args=default_args,
    schedule_interval=timedelta(minutes=10),
)

# Define a task using a KubernetesPodOperator
task = KubernetesPodOperator(
    namespace='default',
    image="python:3.6-slim",
    cmds=["python", "-c"],
    arguments=["print('hello world')"],
    labels={"foo": "bar"},
    name="test-pod",
    task_id="test-pod",
    is_delete_operator_pod=True,
    dag=dag,
)
In this example, we define a task that launches a Kubernetes pod in the default namespace, using the python:3.6-slim Docker image. The pod runs a single command, print('hello world'), with the Python interpreter. The task is given a label of foo: bar and a name of test-pod, and is_delete_operator_pod=True tells Airflow to delete the pod once it finishes, so completed pods do not accumulate in the cluster.
There are many other parameters that you can use to customize the behavior of the KubernetesPodOperator, such as setting resource limits and requests, specifying environment variables, and mounting volumes. You can find a full list of available parameters in the Airflow documentation.
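As a rough sketch of those customizations (assuming the cncf.kubernetes provider and the kubernetes Python client are installed; the specific values and the "data" volume name here are illustrative, not from the original example), a task with resource requests and limits, an environment variable, and a mounted scratch volume might look like this:

```python
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
from kubernetes.client import models as k8s

task = KubernetesPodOperator(
    task_id="customized-pod",
    namespace="default",
    image="python:3.6-slim",
    cmds=["python", "-c"],
    arguments=["print('hello world')"],
    # Environment variables passed into the container (illustrative names/values)
    env_vars={"MY_SETTING": "value"},
    # Resource requests and limits for the pod's container
    container_resources=k8s.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "512Mi"},
        limits={"cpu": "1", "memory": "1Gi"},
    ),
    # An emptyDir scratch volume mounted at /data inside the container
    volumes=[k8s.V1Volume(
        name="data",
        empty_dir=k8s.V1EmptyDirVolumeSource(),
    )],
    volume_mounts=[k8s.V1VolumeMount(name="data", mount_path="/data")],
)
```

Older provider versions take a plain dict for the resources argument instead of the k8s model objects, so check the documentation for the provider version you have installed.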