01 Scope of responsibilities
- Build the Conformed Silver Layer: Design and implement the layer hands-on, ensuring all data is unified, schema-enforced, and ready for high-stakes business logic.
- Implement Refined Data Models: Translate complex business rules into technical DWH structures (Star Schemas, Medallion Gold tables).
- Feature Engineering: Develop and operationalize specific feature sets within the data platform to support downstream Data Science and ML workflows.
- Develop Generic Frameworks: Build and maintain reusable, metadata-driven ETL/ELT frameworks in PySpark to accelerate onboarding of new data sources.
- Optimize for Downstream Delivery: Tune Spark jobs (partitioning, bucketing, caching) so that refined data layers are delivered with minimal latency and maximal reliability.
- Data Quality Engineering: Implement automated reconciliation and validation gates at the Silver-to-Gold boundary to ensure data integrity before the data reaches end users or models.
