Join a leading enterprise as an SRE Tech Lead. This is a crucial leadership role focused on driving system uptime, establishing Site Reliability Engineering (SRE) best practices, and ensuring a seamless digital experience for our customers. We are seeking a hands-on technical leader who can stabilize critical production environments and mentor a high-performing support team.
Technology Accountabilities & Focus
The successful candidate will be a pivotal player in ensuring the stability and performance of business-critical applications.
- System Reliability & SRE: Uphold system uptime and help develop/implement SRE best practices, working to build a culture of blameless fault finding, robust documentation, and uniform incident response.
- Monitoring & Observability: Use logging and APM tools such as Datadog, New Relic, and Splunk to monitor business-critical applications and services.
- Incident Leadership: Oversee and execute the incident response process and postmortem reviews, providing technical summaries and ensuring timely communication with stakeholders and product owners during outages.
- Team Leadership & Mentoring: Mentor team members and enable partnering teams to deliver robust solutions into production efficiently. Support the project backlog by breaking down complex tasks into achievable pieces.
- Service Metrics: Assist in establishing SLIs and SLOs based on key business metrics and objectives.
- Risk & Compliance: Play a pivotal role in managing risk and regulatory compliance, ensuring team processes are sustainable and compliant.
- On-Call & Escalation: Participate as an escalation point for the existing on-call rotation team.
- Leadership Experience: Experience as a delivery lead or technical senior in an SRE Tech Lead or similar role.
- Incident Management: Experience managing production incident response or solving high-severity production issues.
- Alerting & Monitoring: Prior experience developing, implementing, and managing an alerting/monitoring framework or error logging system.
- Cloud & Infrastructure: Comprehensive knowledge of leading cloud platforms (certifications are a plus) and familiarity with Infrastructure as Code (IaC) using tools such as Terraform.
- Scripting Proficiency: Ability to create scripts in PowerShell, Python, or other scripting languages.
- Methodology: Working knowledge of Agile and Project Management Methodologies.
- Beneficial: Experience with ITSM tools is advantageous.
- Technical Design: Contribute to developing architecture principles and overseeing the execution of detailed technical design decisions.
- Duration: 6-Month Contract, with a chance of extension
- Location: Auckland, New Zealand (Hybrid Working Model)
- Start Date: ASAP
If you are a seasoned SRE Tech Lead ready to lead reliability initiatives in a fast-paced enterprise environment, apply now, or contact Amaan for more details at amaan.kazmi@randstadigital.co.nz or on 0220607986.
At Randstad, we are passionate about providing equal employment opportunities and embracing diversity to the benefit of all. We actively encourage applications from any background.
...