Dev Ops Lead / Site Reliability Engineer
CoupDog is looking for an experienced addition to our platform engineering team, with strong Dev Ops and Site Reliability Engineering skills. Strong Django & Python skills are an important asset.
The candidate will be responsible for the ultimate design and engineering of the production environment for the CoupDog platform. Specifically, this means creating a scalable, secure, highly available and fast transaction processing system.
The candidate will work closely with our Software Product Engineering team on our testing, release and deployment strategy, and will be closely involved in the overall platform strategy for CoupDog.
Responsibilities:
Lead overall Site Reliability Engineering (SRE) efforts
Design technical infrastructure - including hosting architecture, testing, deployment and monitoring platforms
Define CoupDog’s Dev Ops support team structure, training & resourcing strategy
Lead data and platform security
Must-have skill sets:
Expertise leading the engineering of a high-volume, fast and secure transaction processing environment.
Hands-on experience with:
Python and Django web framework
Working with version control systems (Git, Bitbucket) and continuous deployment systems (e.g. Bitbucket Pipelines)
Performance monitoring tools (e.g. Datadog, Sentry)
Experience designing a highly available and fault-tolerant environment
Experience with cloud computing environments (e.g. AWS, Azure, Google Cloud)
Experience with load balancing across multiple data centres
Nice-to-have skills:
In addition, it would be ideal if the candidate has experience in some of the following areas:
MongoDB database experience
Containerization - e.g. Docker & Kubernetes
Experience leading information and data security audits and reviews
Strong experience in writing code
Degree in Computer Science, Software Engineering or similar
Competitive salary + equity package
Full-time role
Ideally based in Toronto, Canada - but we are open to candidates working from other locations