Site Reliability Engineer - Multicloud Platform
Your work days are brighter here.
At Workday, it all began with a conversation over breakfast. When our founders met at a sunny California diner, they came up with an idea to revolutionize the enterprise software market. And when we began to rise, one thing that really set us apart was our culture, a culture driven by our value of putting our people first. Ever since, the happiness, development, and contribution of every Workmate have been central to who we are. Our Workmates believe a healthy, employee-centric, collaborative culture is the essential mix of ingredients for success in business. That’s why we look after our people, communities, and the planet while still being profitable. Feel encouraged to shine, however that manifests: you don’t need to hide who you are. You can feel the energy and the passion; it's what makes us unique. Inspired to make a brighter work day for all and transform with us to the next stage of our growth journey? Bring your brightest version of you and have a brighter work day here.
About the Team
The primary function of the SRE team is to ensure the reliability and availability of the platform to meet the desired SLAs, reduce operational load, and scale sustainably in alignment with business growth.
Be a key member of a team of versatile SREs responsible for software engineering and operations, with an emphasis on reducing operational toil.
Automation and improvement work is planned following scrum practices with two-week sprints.
The scrum team is autonomous; the on-call function is follow-the-sun.
Tech stack is Cloud Native (Kubernetes, Istio, OPA, GoLang, Prometheus, Grafana, etc.).
Responsible for the safe change and reliability of customer environments with SLO-gated, multi-stage deployment automation. The mission is to improve platform reliability, observability, and overall customer happiness.
Develop and launch effective SLIs to ensure that SLOs are achieved by building an extendable Observability architecture, automating runbooks, and establishing new processes.
Partner with platform service teams to craft and implement a range of SRE standards for their respective services to meet. Define benchmarks and automation to qualify services to move to production environments.
About the Role
Are you a Site Reliability Engineer who loves the challenge of automating, operating, and improving innovative cloud native service platforms? Do you love digging into a production problem and seeing it through to resolution and follow-through? We’re the team that deploys, operates, and supports our cloud native technology platform, which was designed from scratch for the cloud. We own the reliability of the complete stack and tools that deliver and support Workday products across public clouds (e.g. AWS, GCP, Azure). The platform is built using Cloud Native technologies (CNCF) on a foundation of Kubernetes in Public Cloud environments. This provides a secure platform on which Workday service teams and Platform development teams can build and test their pre-release code through deployment to production on a continuous basis. Engineers from this team have shared their experiences at Cloud Native conferences, including KubeCon.
About You
You have a passion for identifying and solving problems in distributed environments, scaling across configuration, the Linux operating system, and the network. You have hands-on experience handling distributed environments (Kubernetes experience is a big plus). You are interested in improving operational efficiency and believe that automation is the key to operating large-scale systems. You are driven to ensure customer success.
Basic Qualifications:
BS in Computer Science or related field, or equivalent years of experience
3 years handling and solving problems in distributed systems in a public cloud
3+ years of SRE experience in a distributed systems environment
Experience with AWS, GCP, or Azure
Strong experience with Kubernetes
Experience with Linux
Proficiency with a programming language such as GoLang, Python, or Ruby (preferably GoLang (Go))
Experienced with software development standard methodologies such as code management, CI/CD, and testing
Other Qualifications:
Passionate about automation, with a track record of referenceable examples
Can work independently and with the demeanor that everything can be automated
Skills to operate, maintain, support, and sustain the platform
Energised by working in a fast-paced environment
Experience collaborating with multi-functional, global, and remote teams with a diverse set of backgrounds
Excellent documentation skills; experience with developing detailed runbooks and processes
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
Our Approach to Flexible Work
With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote 'home office' roles also have the opportunity to come together in our offices for important moments that matter.
Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!
valid through: 2024-01-29