Senior Site Reliability Engineer (Founding Role)
Location: Canada
About the Role
This team is building a brand-new fintech platform from the ground up and is looking for an experienced Senior Site Reliability Engineer to join as a founding member. This is a high-impact, staff-level position where you’ll shape the early architecture, define reliability practices, and directly influence how the platform scales.
If you love solving hard problems, building things that last, and having your fingerprints all over a greenfield project, this one’s for you.
What You’ll Do
- Build and own the SRE program from scratch, including processes, tools, and best practices
- Lead incident response, on-call, alerting, escalation, and post-incident reviews
- Partner closely with engineering and infrastructure to design reliable, scalable systems
- Define and meet uptime SLAs using metrics, tracing, and observability tools
- Own the workflow orchestration platform (Airflow, Prefect, or similar)
- Develop automation to improve deployment speed, reliability, and performance
- Strengthen CI/CD pipelines (GitHub Actions, ArgoCD, or similar)
- Contribute to architectural decisions that shape the product’s long-term success
What You Bring
- 10+ years of overall technical experience, including at least 5 years in SRE, DevOps, or Systems Engineering roles in high-volume, 24×7 production environments
- Proficiency with AWS and Linux systems
- Deep experience with containers and container orchestration (Docker, Kubernetes, EKS)
- Strong understanding of observability (logs, traces, metrics)
- Programming/scripting in Python or Bash for automation
- CI/CD experience (GitHub Actions, ArgoCD, or similar)
- Excellent problem-solving skills and attention to detail
- Clear communication skills and a collaborative mindset
Nice to Have
- Fintech or startup experience
- Familiarity with TypeScript/Node.js
- Experience with Kafka, Redis, and RDS
The Challenge
This is a “build it from the ground up” role. You’ll initially take full ownership of reliability and on-call, which means a heavier on-call load until the team grows. As additional hires are made, on-call will move to a more balanced rotation. You’ll be instrumental in designing the playbooks, tools, and culture that keep the platform reliable.
Compensation
- Salary: $180K–$200K (CAD)
- 100% remote work
- Unlimited vacation
- Comprehensive health and dental benefits





