Data Engineering on Azure
by Vlad Riscutia. A comprehensive guide to data engineering principles on Azure, focusing on cloud integration and best practices.
Designing Data-Intensive Applications
by Martin Kleppmann. Explores the architecture of data systems, emphasizing data quality and error handling in complex workflows.
Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
by Tyler Akidau, Slava Chernyak, and Reuven Lax. An in-depth look at stream processing architectures, crucial for building responsive data pipelines.
The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
by Ralph Kimball and Margy Ross. A foundational text on data warehousing and quality, essential for understanding data pipeline architecture.
Building Data Streaming Applications with Apache Kafka
by Manish Kumar and Chanchal Singh. Focuses on integrating streaming data into pipelines, enhancing data quality and error handling strategies.
Airflow in Action
by Denny Lee and James Densmore. A practical guide to Apache Airflow, detailing advanced features for effective workflow management.
Data Quality: The Accuracy Dimension
by Jack E. Olson. A critical examination of data quality principles, vital for ensuring reliable data workflows.
Cloud Data Management and Storage
by Gurpreet Singh and Ramesh Raghunandan. Covers cloud integration techniques, essential for modern data engineering practices.