Airflow Xcom Exclusive 💯 Official
@dag(start_date=datetime(2023,1,1), schedule=None, catchup=False) def xcom_exclusive_pipeline():
Think of XCom as a high-speed messenger service between your tasks—perfect for sharing metadata, file paths, status updates, and small result sets, but not designed for heavy payloads like DataFrames or large JSON blobs.
Mastering Airflow XCom Exclusive Data Sharing: A Comprehensive Guide airflow xcom exclusive
# Task A and Task B run in parallel task_a >> task_c task_b >> task_c
Example:
By configuring a custom backend, you can instruct Airflow to intercept every XCom push and pull. Instead of saving data to PostgreSQL or MySQL, Airflow writes the payload to external cloud storage (like AWS S3, Google Cloud Storage, or Azure Blob Storage) and saves only the object URI reference in the metadata database. Implementing an AWS S3 Custom XCom Backend
: It excels at generating complex, code-driven pipelines using Python. Common Criticisms Steep Learning Curve : Onboarding is often described as non-intuitive. Operational Overhead Implementing an AWS S3 Custom XCom Backend :
: Data is only available to tasks you explicitly link in your Python code. 2. Manual Scoping via xcom_pull
def try_claim(session, claim_id, worker_id): row = session.execute(update(claim_xcom) .where(claim_xcom.c.id==claim_id) .where(claim_xcom.c.status=='available') .values(status='claimed', claimed_by=worker_id, claimed_at=func.now()) .returning(claim_xcom)).first() return row # None if already claimed and small result sets
# Pushing context['ti'].xcom_push(key='my_key', value='my_value') # Pulling value = context['ti'].xcom_pull(task_ids='push_task', key='my_key') Use code with caution. Crucial XCom Best Practices & Constraints
The modern TaskFlow API simplifies data passing. When you return a value from a decorated @task , Airflow creates an implicit connection. : You don't manually call xcom_pull .