• Azure Cloud Operations Site Reliability Engineer in Swindon/Open to other locations
• £42,370 - £61,201 a year on a 35-hour contract.
• Send in your application by 18th April 2019.
Want to get involved in cloud-based technical challenges rather than just reading blog articles? Are you someone who lives and breathes Azure, AWS and Google Cloud? Are you someone who appreciates the similarities and differences between DevOps and SRE?
Our adoption of cloud is fundamental to the success of our material investment in technology, now and in the future. The creation of first-class cloud products and capabilities that our tech teams can harness is critical to achieving our strategy and delighting our members, and is the foundation of our digital capability. Our level of ambition is set high; that’s why we need Cloud Operations Site Reliability Engineers to drive forward our ambitions of Cloud adoption at scale.
As part of Nationwide’s exciting and growing team of Cloud Operations Site Reliability Engineers (SRE), you’ll use your Azure expertise to create technical artefacts that deliver against Cloud Operations SRE objectives aligned to our engineering principles and architectural strategy. As part of the IT Operations and Service Delivery community, and working within our Cloud Centre of Excellence, we are looking for talented people with the ability to act like a Developer and think like a Systems Operator.
Who we're looking for
As a minimum requirement you’ll have:
• Exposure to and experience in Azure Public Cloud
• Hands-on experience in automating the deployment and monitoring of services in Azure Public Cloud
• Ability to design and implement Azure Cloud platforms for elasticity and scalability
• Hands-on experience in infrastructure as code and post-deployment configuration management using tools such as Terraform, ARM, DSC, Puppet, Chef, Ansible etc.
• Understanding of continuous deployment using CI/CD tools such as Jenkins and SCM tools (Git, SVN), along with code reviews
• Experience in making changes to production & live systems while ensuring service uptime and adhering to SDLC best practices
• Exposure to Docker, Kubernetes and cloud native development frameworks (Serverless & PaaS)
• Good scripting & development skills
• A strong understanding of core network protocols and services (TCP/IP, DNS)
• Experience in system administration including configuration and troubleshooting
• A strong understanding of IAM roles and policies
• Ability to analyse network behaviour, performance and application issues using standard tools
• Knowledge of database and replication methodologies
• Experience with distributed systems design, maintenance & disaster recovery
It would be nice if you also had:
• Ability to make design decisions that minimise and optimise infrastructure cost
• Experience of scripting against APIs
• Azure Developer and / or Administrator certifications
What you'll be doing
As a core member of the team responsible for the reliability, availability and performance of the Azure cloud platforms and services that underpin Nationwide, you will:
• Influence architectural & design decisions with regard to the operational reliability of Azure cloud platforms
• Collaborate with architecture, security and engineering teams and the wider Cloud Centre of Excellence to set up best-in-class Azure cloud platforms and subsequently provide guidance on their consumption
• Drive efficiency, automation, and cost reduction by automating manual and repetitive tasks
• Develop features and codified artefacts for platform improvements leveraging automation and infrastructure as code
• Provide technical expertise at some or all stages of the delivery lifecycle
• Enforce standards, provide governance and contribute to process improvement
• Identify and escalate dependencies, risks and exceptions that will affect implementation of technical artefacts
• Share knowledge with peers to contribute to the growth of knowledge within Cloud Operations SRE