HousingAnywhere is an online rental accommodation platform with more than 8 million users in over 60 countries and 400+ cities. The team happily uses Apache Airflow, through Google Cloud's fully managed Cloud Composer service, to schedule and execute its data pipeline, all within a fully automated CICD framework.
Data pipeline artifacts are versioned on GitHub, tested automatically with pytest, and deployed into Airflow by a self-triggered Cloud Build job.
Join Massimo Belloni, Team Lead of Data Engineering at HousingAnywhere, as he discusses:
1. The CICD architecture for Airflow using GCP and pytest
2. The thinking and processes that led to the data pipeline design
3. Insights gleaned from implementation challenges
4. Hands-on examples with code
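As a taste of the kind of check that pytest can run on every commit before a pipeline is deployed, here is a minimal sketch. It is not HousingAnywhere's code: the task names and the dict-based pipeline description are hypothetical, and it uses only the Python standard library (graphlib, Python 3.9+) so it runs without an Airflow installation. In a real Cloud Composer project, a comparable test would more likely load the DAG files with Airflow's DagBag and assert that its import_errors is empty.

```python
# Illustrative sketch only: the task names and the dict-based pipeline
# description are hypothetical, not taken from the talk.
from graphlib import CycleError, TopologicalSorter

# Maps each task to the set of tasks it depends on (its upstream tasks).
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(pipeline):
    """Return a valid run order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(pipeline).static_order())


def test_pipeline_runs_in_dependency_order():
    # pytest collects functions named test_*; a wrong order fails the build.
    order = execution_order(PIPELINE)
    assert order.index("extract") < order.index("transform") < order.index("load")


def test_cycle_is_detected():
    # A pipeline whose tasks depend on each other can never be scheduled.
    broken = {"a": {"b"}, "b": {"a"}}
    try:
        execution_order(broken)
    except CycleError:
        return
    raise AssertionError("expected a CycleError for a cyclic pipeline")
```

Saved as a file whose name starts with test_, this is picked up by running pytest in the repository, which is the step a build job would execute before copying pipeline files to the Airflow environment.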
---
BIO
Massimo Belloni is currently Team Lead - Data Engineering at HousingAnywhere, with a mixed focus on machine learning projects (RentRadar, Penelope) and data engineering. He is a data scientist with a solid computer engineering background (MSc in Computer Science and Engineering from Politecnico di Milano, Italy), a funny human being, and a burger and kebab evangelist.
---
TLDR: Join this online event on the CICD design and operation of a GCP-hosted data pipeline using Cloud Composer (Airflow) and pytest.
#Data_Engineering #Airflow #Composer #CICD #pytest