Senior DevOps / SRE Engineer
Own observability, reliability, and FinOps in a complex AWS/GCP environment where you can shape platform standards and drive measurable impact.
About the Company
Avaron helps you find assignments that match your skills and ambitions. As a permanently employed consultant with us, you get competitive terms – combined with the variety and growth that a consulting career offers.
About the Assignment
You will step into a senior technical role with clear ownership of observability, reliability engineering, and cloud cost optimization in a large-scale multi-cloud environment. The assignment spans AWS, GCP, Kubernetes, and distributed production systems, with a strong focus on turning operational complexity into measurable improvements.
You will help move the organization from reactive incident handling to a more systematic SRE way of working. That means shaping how services are monitored, how incidents are governed, how risks are reduced, and how cloud spend is optimized without sacrificing performance or user experience.
This is a great opportunity for you if you enjoy combining deep hands-on engineering with platform thinking, and want to influence technical standards across a modern cloud landscape.
Job Description
- You will design and implement an end-to-end observability setup across infrastructure, platform services, applications, and business-critical flows.
- You will define observability architecture, data flows, and standards for metrics, logs, traces, structured logging, and visualization.
- You will build dashboards for infrastructure health, platform services, and key API and business KPIs.
- You will establish alerting models with clear severity levels, reduced alert noise, and automated routing by team or service.
- You will define and track SLI, SLO, and SLA frameworks, including error budgets for critical services.
- You will lead FinOps initiatives across AWS and GCP, creating cost visibility and driving optimization across compute, storage, databases, and network traffic.
- You will introduce cost-aware ways of working, including CI/CD checks, ownership models, budget limits, and recurring cost reviews.
- You will improve production reliability through incident frameworks, RCA practices, rollback mechanisms, and gradual rollout strategies.
- You will identify systemic risks and technical debt, and turn them into a prioritized remediation roadmap.
- You will optimize EKS and GKE platforms to improve cluster stability, resource utilization, and overall platform resilience.
- You will take part in on-call rotations and travel occasionally as part of the assignment.
Requirements
- 5+ years of DevOps, SRE, or Cloud Platform experience.
- At least 3 years in a Staff, Principal, or Tech Lead role.
- Experience operating large-scale distributed systems in production.
- Deep expertise in both AWS and GCP.
- Ability to design cross-cloud architectures.
- Strong experience with Terraform, Pulumi, or CDK.
- Proven experience designing and implementing observability from scratch.
- Deep hands-on experience with Prometheus, Grafana, Loki, Elastic, and Kibana.
- Deep understanding of Kubernetes internals, including Scheduler, Controllers, etcd, CNI, and CRI.
- Experience managing large-scale production Kubernetes clusters.
- Proficiency in Java, Python, or Go.
- Exceptional verbal and written English.
- A methodical approach to troubleshooting and root cause analysis.
Nice to have
- Background in Google SRE or deep practical SRE experience.
- Experience with Chaos Engineering.
- Documented FinOps success cases.
- Knowledge of eBPF and performance profiling.
- Open-source contributions.
- Experience designing multi-cloud disaster recovery, including Active-Active or Active-Passive setups.
- Professional working proficiency in Mandarin for collaboration with development teams in China.
What We Offer
- Permanent employment at Avaron AB
- Occupational pension
- Wellness allowance of SEK 5,000 per year
Application
Selections are made on an ongoing basis – apply as soon as you can.
Location
- Göteborg
About Avaron AB
Avaron supplies companies throughout Sweden with technical consultants and specialists. We focus on IT, software development, engineering, project management, and other technical domains. The company was founded in 2018 by a software engineer who had grown tired of staffing agencies that did not understand the roles they were recruiting for.