Comprehensive Data Engineering Documentation Resources
This document links to the official documentation for technologies and platforms commonly used by data engineers.
1. Cloud Platforms
AWS
- AWS Data Engineering Docs
- AWS Glue
- AWS Redshift
- AWS Kinesis
- AWS EMR
Google Cloud (GCP)
- Google Cloud Data Engineering
- BigQuery
- Dataflow (Apache Beam)
- Dataproc (Spark)
- Pub/Sub
Microsoft Azure
- Azure Data Engineering
- Azure Data Factory
- Azure Databricks
- Azure Synapse Analytics
- Azure Stream Analytics
2. Big Data & Processing Frameworks
Apache Spark
- Spark Official Docs
- PySpark API
Apache Kafka
- Kafka Documentation
Apache Flink
- Flink Docs
Apache Beam
- Beam Documentation
Apache Hadoop
- Hadoop Docs
Apache Airflow
- Airflow Docs
3. Databases & Data Warehousing
SQL & NoSQL Databases
- PostgreSQL
- MySQL
- MongoDB
- Cassandra
Data Warehouses
4. Data Pipelines & ETL Tools
ETL & Workflow Tools
- Apache NiFi
- Talend
- Informatica
- dbt (Data Build Tool)
5. Programming & Querying
Python for Data Engineering
- Python Official Docs
- Pandas
- PyArrow
6. DevOps & Orchestration
Docker & Kubernetes
- Docker Docs
- Kubernetes Docs
CI/CD
- GitHub Actions
- GitLab CI/CD
7. Monitoring & Logging
Monitoring Tools
- Prometheus
- Grafana
- ELK Stack (Elasticsearch, Logstash, Kibana)