Description
Position Name – Advanced Data Engineer
Type of hiring – Full-time
Location – Remote (Canada)
Job Description:
Python & PySpark
- Proficient in both, with a strong understanding of software engineering best practices.
Data Exploration & Troubleshooting
- Ability to investigate data quality issues, debug pipelines, and explore datasets beyond surface-level analysis.
CI/CD & GitHub
- Experience with GitHub and GitHub Actions for version control and automation; familiarity with CI/CD practices for testing and deployment.
Azure & Databricks
- Hands-on experience with Azure cloud services and Databricks, including Databricks Jobs, Clusters, and Unity Catalog.
Data Pipeline Development
- Design, build, and maintain robust, scalable, and automated data pipelines for both batch and streaming data ingestion using Databricks Workflows.
- Implement data quality checks, profiling, validation, and root cause analysis to ensure data accuracy and consistency.
- Design and implement data models and architectures that align with business needs and support efficient processing, analysis, and reporting.
Orchestration & Monitoring
- Use workflow orchestration tools (e.g., Airflow) to automate pipeline execution and manage dependencies.
- Integrate monitoring and alerting mechanisms to track pipeline health and proactively address issues.
Agile & Collaboration
- Work in Agile environments, participating in sprint planning, stand-ups, reviews, and retrospectives.
- Collaborate across teams and adapt to rotating responsibilities and changing priorities.