Jobs | Jobs Hiring near me

Proactive Appointments Guildford, Surrey

Are you an experienced DevOps Systems Administrator or Senior DevOps Engineer looking to take control of complex cloud projects? Do you want to join a growing, innovative business where your ideas genuinely shape the way things are done?

Our client is a leading value-added reseller and systems integrator, partnering with top technology providers and specialising in cloud infrastructure, DevOps, business communications, contact centres, networking, AI, automation, and systems integration. They are known for delivering innovation, reducing risk, and building trusted client relationships, all while investing in the growth and development of their team. This is your chance to work on impactful infrastructure challenges where your input truly matters.

Key Responsibilities
Design, deploy, and support AWS and private cloud infrastructure
Architect and maintain robust hybrid cloud solutions
Automate infrastructure using Terraform and Ansible
Build and maintain CI/CD pipelines with GitHub Actions
Implement monitoring and observability tools (Grafana, Prometheus, CloudWatch)
Improve system reliability, performance, and security across teams
Manage IAM, networking, firewalls, VPNs, and cloud security
Participate in the on-call rota and respond to incidents
Create and maintain technical documentation and best practices
Mentor colleagues and contribute to long-term infrastructure strategy

Essential Skills & Experience
Required:
5+ years' experience in DevOps or Systems Administration
Hands-on experience with AWS
Experience with VMware or Proxmox
Strong Linux administration skills
Infrastructure as Code with Terraform
Configuration management using Ansible
Experience with Kubernetes (e.g. EKS)
Scripting in Bash, Python, or Go
Solid understanding of networking and cloud security
Desirable:
AWS certifications
Experience with container and security tools such as Trivy

Benefits
Career development and certification support
24 days' holiday plus bank holidays
Pension and life insurance
Private medical insurance
Birthday leave and volunteering day
Cycle to Work scheme
High street and retail discounts
Hybrid role (one day a month required onsite in Guildford)

Due to the volume of applications received for positions, it will not be possible to respond to all applications, and only applicants who are considered suitable for interview will be contacted. Proactive Appointments Limited operates as an employment agency and employment business and is an equal opportunities organisation. We take our obligations to protect your personal data very seriously. Any information provided to us will be processed as detailed in our Privacy Notice, a copy of which can be found on our website.

Jan 09, 2026

Full time

Senior DevOps Systems Administrator

Proactive Appointments Guildford, Surrey

Senior DevOps Systems Administrator
Salary: £60,000 - £65,000 DOE
Location: Guildford (Hybrid Working, one day a month required onsite)
Job Type: Permanent, Full-Time

Are you an experienced DevOps Systems Administrator or Senior DevOps Engineer looking to take control of complex cloud projects? Do you want to join a growing, innovative business where your ideas genuinely shape the way things are done?

Our client is a leading value-added reseller and systems integrator, partnering with top technology providers and specialising in cloud infrastructure, DevOps, business communications, contact centres, networking, AI, automation, and systems integration. They are known for delivering innovation, reducing risk, and building trusted client relationships, all while investing in the growth and development of their team. This is your chance to work on impactful infrastructure challenges where your input truly matters.

Key Responsibilities
Design, deploy, and support AWS and private cloud infrastructure
Architect and maintain robust hybrid cloud solutions
Automate infrastructure using Terraform and Ansible
Build and maintain CI/CD pipelines with GitHub Actions
Implement monitoring and observability tools (Grafana, Prometheus, CloudWatch)
Improve system reliability, performance, and security across teams
Manage IAM, networking, firewalls, VPNs, and cloud security
Participate in the on-call rota and respond to incidents
Create and maintain technical documentation and best practices
Mentor colleagues and contribute to long-term infrastructure strategy

Essential Skills & Experience
Required:
5+ years' experience in DevOps or Systems Administration
Hands-on experience with AWS
Experience with VMware or Proxmox
Strong Linux administration skills
Infrastructure as Code with Terraform
Configuration management using Ansible
Experience with Kubernetes (e.g. EKS)
Scripting in Bash, Python, or Go
Solid understanding of networking and cloud security
Desirable:
AWS certifications
Experience with container and security tools such as Trivy

Benefits
Career development and certification support
24 days' holiday plus bank holidays
Pension and life insurance
Private medical insurance
Birthday leave and volunteering day
Cycle to Work scheme
High street and retail discounts
Hybrid role (one day a month required onsite in Guildford)

Due to the volume of applications received for positions, it will not be possible to respond to all applications, and only applicants who are considered suitable for interview will be contacted. Proactive Appointments Limited operates as an employment agency and employment business and is an equal opportunities organisation. We take our obligations to protect your personal data very seriously. Any information provided to us will be processed as detailed in our Privacy Notice, a copy of which can be found on our website.

Jan 09, 2026

Full time

Splunk Site Reliability Engineer

Flint UK Technology Services

Job Title: Splunk Site Reliability Engineer/Migration Specialist (Contract)
Location: Birmingham (Hybrid/On-site, required 3 days per week)
Contract Type: Contract
Duration: 3 months rolling

Job Summary:
We are seeking an experienced Splunk SME/Migration Specialist to lead and support the migration of observability workloads from Splunk to Elasticsearch (ELK Stack). The ideal candidate will bring hands-on expertise in Splunk architecture, data ingestion, alerting, and dashboarding, along with experience migrating workloads to Elasticsearch. In addition to migration duties, the candidate will maintain and enhance existing Splunk infrastructure, provide incident support, manage upgrades, and ensure observability platforms remain secure and performant. This role demands a technically strong individual with excellent stakeholder communication and problem-solving skills.

Key Responsibilities:

Migration:
Develop and implement a comprehensive migration strategy from Splunk to Elasticsearch (ELK Stack).
Assess existing Splunk configurations (dashboards, alerts, saved searches, data models) and recreate them in Kibana.
Collaborate with Elastic teams to configure alerting and monitoring using Kibana, Elasticsearch Watcher, or third-party tools.
Ensure migration plans include validation, rollback procedures, and knowledge transfer.

Platform Operations & Incident Response:
Maintain Splunk infrastructure in both Production and Non-Production environments.
Support Splunk SRE and Application teams in incident investigation and resolution.
Proactively monitor system health and performance metrics.

Upgrades and Change Management:
Plan and execute upgrades to Splunk components.
Perform pre- and post-upgrade checks and validations.
Prepare documentation and submit Change Requests following organizational procedures.

Security and Compliance:
Work with Puppet and other automation tools to ensure timely patching of vulnerabilities.
Implement and verify security best practices for observability platforms.
Support compliance initiatives and audits.

Documentation and Knowledge Sharing:
Maintain accurate and up-to-date technical documentation, including architecture diagrams, configurations, procedures, and troubleshooting guides.
Review and update support articles and take ownership of relevant assets.
Support knowledge transfer across teams as needed.

Troubleshooting and Support:
Identify and resolve issues in Splunk and ELK environments.
Assist teams with Splunk-related queries and optimization efforts.

Skills and Qualifications:

Essential:
Proven expertise with Splunk architecture, data ingestion, dashboarding, alerting, and administration.
Experience migrating Splunk workloads to Elasticsearch (ELK Stack).
Solid understanding of Kibana, Elasticsearch Watcher, and observability tooling.
Proficiency in Linux/Unix systems and networking protocols.
Hands-on experience with scripting (e.g., Python, Shell/Bash).
Experience supporting or working alongside DevOps/SRE teams.
Strong analytical, troubleshooting, and communication skills.

Desirable:
Experience with containerized environments such as Docker or Kubernetes.
Industry certifications such as Splunk Certified Power User/Admin/Architect.
Knowledge of automation tools (e.g., Puppet, Ansible).
Bachelor's degree in Computer Science, Information Systems, or related field.

Key Attributes:
Independent and proactive problem-solver.
Collaborative and able to work cross-functionally with infrastructure, security, and application teams.
Able to work under pressure and prioritize tasks effectively.
Strong communicator, both written and verbal.

Jan 09, 2026

Contractor

Azure CloudOps Engineer

Adecco Croydon, London

Interim Azure Cloud Operations Engineer
Location: Croydon (Hybrid), remote opportunities possible
Contract: Interim, Competitive Day Rate, 6 months, Inside IR35

About the Role
Our client is seeking an Azure Certified Specialist to cover the design, implementation, and support of security, cost, and usage monitoring and automation within our Azure environment. This role underpins the Digital Foundations stream of our transformation programme, ensuring our cloud operations are secure, reliable, and cost-efficient.

What You'll Do
Design and implement secure, automated Azure solutions aligned with best practices.
Maintain high availability and reliability of Azure-hosted systems in line with UK Government Digital Service (GDS) standards.
Automate infrastructure using Infrastructure-as-Code (IaC) tools like Bicep or Terraform, and scripting with PowerShell/Python.
Monitor and optimise cloud performance using Azure Monitor, Service Health, and advanced query tools (KQL).
Implement FinOps principles to control costs and ensure efficient use of public funds.
Integrate security best practices with Microsoft Defender for Cloud and maintain compliance with UK public sector regulations.
Drive rapid incident resolution and improve MTTR through proactive monitoring and automation.

What We're Looking For
Azure Certification (e.g., AZ-104, AZ-305, or similar).
Proven experience in Azure Cloud Operations within complex environments.
Strong skills in automation, DevOps, and Site Reliability Engineering (SRE) principles.
Knowledge of GDS standards, compliance frameworks, and ITSM integration.
Expertise in IaC (Bicep/Terraform), scripting (PowerShell/Python), and observability tools.
Commercial awareness and ability to manage cloud costs effectively.
Solid understanding of networking, Active Directory, DNS, and hybrid cloud scenarios.

Why Join Us?
This is a unique opportunity to play a pivotal role in modernising Croydon Council's cloud operations, ensuring secure, reliable, and efficient digital services for our community.

Apply Now
If you're an Azure expert with a passion for automation, reliability, and cost optimisation, we'd love to hear from you.

Adecco acts as an employment agency for permanent recruitment and an employment business for the supply of temporary workers. The Adecco Group UK & Ireland is an Equal Opportunities Employer. By applying for this role your details will be submitted to Adecco. Our Candidate Privacy Information Statement explains how we will use your information - please copy and paste the following link into your browser (url removed)

Jan 09, 2026

Contractor

Senior Site Reliability Engineer

Stratospherec Ltd

Senior DevOps Engineer / Senior Site Reliability Engineer
Fully remote working for candidates based in the UK
Salary to £100k + Benefits

We are looking for a Senior DevOps Engineer who has strong C# code knowledge combined with strong knowledge of DevOps tools like Kubernetes (EKS or AKS) and the Azure or AWS cloud platforms. The successful candidate will also have experience of monitoring tools like DataDog, Grafana and Prometheus, and will join a growing global Cloud Infrastructure team supporting SaaS products.

Our client, a global digital SaaS software company, has a fantastic fully remote opportunity for an experienced Senior DevOps Engineer to join their UK Cloud Infrastructure team. Site Reliability Engineers at this company are responsible for keeping the SaaS products running properly. Using concepts of software and systems engineering, they work to improve the reliability of all cloud systems while keeping levels of manual work low. DevOps engineers are expected to be experienced in software engineering principles, operational discipline, and automation. The Cloud and DevOps team works on a fully remote basis, in conjunction with the US and Australian teams.

The company is a market leader in student community management software. Its unique SaaS platform is an essential platform in the lives of millions of university students across the globe.

In this role, you will apply your software engineering experience to enhance system performance and reliability, as well as building internal systems and capabilities that eliminate manual work through automation. You'll be joining our Platforms teams with globally dispersed Site Reliability and Platform Engineers in a "follow the sun" model to operate our products on a multi-region cloud platform.

Role Responsibilities:
Provide technical leadership and mentoring within the team through knowledge sharing sessions, pair programming, code reviews and solution design
Identify and implement technical solutions to improve platform reliability, including the creation of mitigation strategies and operational playbooks
Implement and maintain monitoring/alerting/logging systems to identify and respond to incidents
Ensure scalability and efficiency of cloud infrastructure and systems to handle traffic and data growth
Conduct performance tests to identify and remediate bottlenecks
Develop and maintain platform solutions, and automate infrastructure provisioning, configuration, and management tasks using Infrastructure as Code
Monitor, review and tune databases to ensure high availability and performance
Collaborate with product engineering teams to design/build fit-for-purpose and observable software

Required Skills and Experience:
Proven experience in a senior DevOps / Site Reliability Engineering role, with strong code development experience in C# or a similar OO development language
Experience of supporting .NET applications as a DevOps Engineer is a big bonus in this role
Production experience operating containerization technologies, ideally with Kubernetes and/or Docker. Strong preference for AKS or EKS experience as well
Proficiency with one or more public cloud providers such as Azure, AWS or GCP
Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation
Experience with monitoring, observability and logging tools such as DataDog, Prometheus, Grafana, or similar
Proven track record of maintaining highly available and performant production environments
Ability to identify and implement effective mitigation strategies and operational playbooks

Useful / Bonus Skills to have:
Experience in CI/CD tooling: Azure DevOps/GitHub Actions, Octopus Deploy
Relevant certifications in cloud platforms (e.g., Microsoft Certified: Azure Solutions Architect) and DevOps practices (e.g., Certified Kubernetes Administrator) are a plus
Experience in database management/performance tuning, particularly MSSQL

Employee benefits:
Opportunity to be a part of a 30+ year well-established, high-performance SaaS company
Excellent company pension scheme and life insurance, excellent holiday allowance
A supportive team environment with emphasis on learning and development opportunities
Working with a team of caring, high-performing, and passionate people who have fun supporting our vision, innovation, and continuous improvement

This Senior Site Reliability Engineer role is with a market-leading global software company, and the job is part of a large programme of change and improvement in their Cloud SaaS products over the coming years. If you are looking for an interesting SRE role with a forward-thinking global organisation, then this would be a tremendous career opportunity to consider. Please apply with your CV to find out more.

Jan 07, 2026

Full time

Site Reliability Engineer (SRE) - Defence

Talent Locker Farnborough, Hampshire

Site Reliability Engineer (SRE) - Defence / National Security - 75k - Farnborough - Hybrid A permanent opportunity for an experienced Site Reliability Engineer who enjoys building secure, automated, and highly reliable platforms. This role sits within a defence and national security environment, working on modern infrastructure where automation, resilience, and secure-by-design principles are fundamental. You'll work closely with platform engineers, infrastructure teams, and operational stakeholders to take requirements from early design and proof-of-concept through to production. The role blends hands-on engineering with technical design, offering real influence over tooling, standards, and DevOps ways of working. It suits someone curious, detail-oriented, and comfortable working in complex, regulated environments. What you'll be doing Designing, delivering, upgrading, and maintaining core platforms, services, and automations Building and improving monitoring, alerting, and observability platforms Designing secure infrastructure using automation-first approaches Creating and productionising proofs of concept for new tools and technologies Diagnosing and resolving performance, reliability, and availability issues Supporting architecture, documentation, and non-functional requirements Mentoring engineers and helping improve DevOps and SRE practices Essential experience Strong experience with Linux (Ubuntu) and Windows Server environments Hands-on scripting skills (Bash, Python, PowerShell or similar) Proven experience with automation and DevOps tooling (Ansible, Terraform, CI/CD, Git) Experience working with Azure or similar cloud platforms Solid understanding of infrastructure reliability, monitoring, and incident response Strong problem-solving skills and ability to work across multiple priorities Willingness to work in secure, regulated environments (SC eligibility required) Desirable experience Infrastructure-as-Code lifecycle and best practices Containerisation 
and orchestration (Docker, Kubernetes) Configuration management and desired state tooling Application and platform monitoring tools (Splunk, Nagios or similar) Experience hardening systems and conducting security assessments Understanding of Agile and DevOps principles in practice A collaborative, inclusive culture with strong benefits including competitive pay, bonus, pension, private healthcare, generous leave, professional development, wellbeing perks, and modern on-site facilities.

Jan 06, 2026

Full time

Site Reliability Engineer (Remote)

Rullion Ltd

Key Responsibilities: Design, implement, and maintain scalable, highly available infrastructure and services. Develop automation scripts and tools to improve system reliability and operational efficiency. Monitor and troubleshoot system performance, identifying and resolving issues to minimise downtime. Implement and maintain CI/CD pipelines to support efficient software delivery. Develop and enforce best practices for security, monitoring, and incident management. Collaborate with development teams to enhance application performance and stability. Create detailed documentation and conduct post-incident reviews to identify root causes and implement long-term solutions. Essential Skills and Experience: Proven experience in Site Reliability Engineering, DevOps, or similar roles. Strong understanding of cloud platforms (AWS, Azure, or GCP) and containerisation technologies (Kubernetes, Docker). Proficiency in scripting languages such as Python, Bash, or Go. Hands-on experience with monitoring and observability tools like Prometheus, Grafana, and the ELK stack. Familiarity with infrastructure-as-code tools like Terraform or Ansible. Solid understanding of networking concepts and system security best practices. Excellent problem-solving skills and a passion for automation and continuous improvement. Desirable: Certifications in cloud platforms or DevOps tools. Experience with large-scale distributed systems. This role offers the opportunity to work on mission-critical projects in a fast-paced and collaborative environment, driving innovation and reliability in our technology ecosystem. Rullion celebrates and supports diversity and is committed to ensuring equal opportunities for both employees and applicants.

Jan 06, 2026

Contractor

Embedded Systems Help Desk Engineer

NMS Recruit Limited Chester, Cheshire

NMS Recruit are seeking an experienced Embedded Systems Help Desk Engineer to join a global energy consultancy based in Cheshire. This is an exciting opportunity to join a rapidly growing business. You will be required to work a 50/50 split between site and home, and sponsorship is available. This is an exciting opportunity for a talented Embedded Systems Reliability Engineer with proficiency in modern C++ (C+ or newer). Responsibilities Investigate and resolve complex bugs across embedded and desktop systems, implementing fixes and systemic quality improvements Develop and maintain tools for automated testing, diagnostics and release validation using Python and Bash Enhance and maintain CI/CD pipelines for embedded firmware (Buildroot/make) and desktop applications (CMake/Qt), integrating quality gates and static analysis Define, monitor and drive improvements against key reliability metrics (e.g. crash frequency, memory stability, startup success) Improve diagnostic visibility through structured logging, crash data capture and telemetry via MQTT Collaborate with hardware, software and test engineers to embed quality and reliability throughout the development lifecycle Experience Degree in Software Engineering, Computer Science, Electronics or equivalent working experience Proficiency in modern C++ (C+ or newer) for embedded and cross-platform desktop development Strong scripting experience in Python and Bash for tooling and test automation Experience with CMake, make, and CI/CD systems (e.g., GitLab CI, Azure Pipelines) Familiarity with Docker for embedded software builds and containerised testing Confident in debugging across firmware, OS and application layers Deep understanding of Embedded Linux (Buildroot), system configuration and device-level development Familiarity with MQTT and messaging protocols used in distributed systems Experience with Qt and GUI development for Windows and Linux environments Working knowledge of observability concepts, incident
response and long-term reliability strategies Exposure to hardware-in-the-loop (HIL) testing and embedded diagnostics Benefits Up to £60,000 DOE Career development opportunities Holidays: 25 days of annual leave (FTE), plus bank holidays, with an extra day for every three years completed (up to a maximum of 30 days). Ability to buy an additional 5 days Pension contributions of 8% from the employer Group Life Insurance, Income Protection, and Critical Illness cover Private Medical Insurance Important Information: We endeavour to process your personal data in a fair and transparent manner. In applying for this role, NMS Recruit will be acting within your interest and will contact you in relation to the role, either by email, phone or text message. For more information see our Privacy Policy on our website. It is important you are aware of your individual rights and the provisions the company has put in place to protect your data. If you would like further information on the policy or GDPR please get in touch with us here.

Oct 08, 2025

Full time

Network Engineer

Experis

Role Title: Network Engineer 6 months Rate: 506 Hybrid - 3 days on site in London Role Summary The UK ITS AS (ENET) Engineer will be a member of the team that is responsible for the support and management of the infrastructure that supports the Bank's electronic business (known within other banks as Electronic Trading or Pre-Trade). This will be achieved by leveraging the technical expertise of other teams within UK ITS. Team members will work closely with core Infrastructure teams, development and the business, responding to requests and fault reports, and will often be required to resolve issues quickly, under pressure and sometimes out of hours. In addition to business-as-usual activity there are a great number of infrastructure projects and tasks that must be completed. Role Description Deliver and support low latency connectivity and monitoring solutions for the Global Markets business, aligned with front-office trading and regulatory needs. Apply SRE principles to improve availability, latency, performance, and capacity planning across trading infrastructure. Collaborate with network and platform engineers to design reliable, self-healing systems and reduce manual intervention through automation. Own and execute the Global Markets connectivity roadmap, from project delivery through to operational handoff and lifecycle management. Partner with business stakeholders and platform owners to ensure infrastructure and observability tooling meets evolving trading requirements. Monitor and manage capacity and performance of global connectivity systems, working with regional teams to aggregate local intelligence. Develop and maintain automated alerting, health checks, and dashboards, supporting proactive detection of issues and system degradation. Lead lab-based testing and benchmarking initiatives for new connectivity solutions, ensuring readiness for production deployment.
Produce operational and performance KPIs, SLO/SLI metrics, and executive summaries to support senior decision-making. Requirements Strong background in low latency network engineering, including TCP/IP, multicast, traffic shaping, and performance analysis. Proven experience with packet capture and analysis tools (Wireshark, tcpdump, Corvil/PICO); ability to build custom decoders is highly advantageous. Familiarity with SRE tools and practices, including infrastructure as code (IaC), CI/CD pipelines, error budgets, and reliability-focused SLIs/SLOs. Strong working knowledge of messaging middleware (Solace, 29West, Tibco, LBM, or equivalent) in performance-sensitive environments. Proficient in scripting and automation using Python, Bash, or PowerShell to streamline monitoring, alerting, and recovery workflows. Knowledge of FIX, market data, and order routing protocols in a trading environment. Exposure to observability platforms such as ITRS Geneos, Prometheus, Grafana, or custom telemetry stacks. Comfortable working across Linux systems, hybrid infrastructure, and global production environments. Excellent communication and reporting skills, with ability to translate technical data into actionable business insights.

Oct 07, 2025

Contractor

Embedded Systems Reliability Engineer

NMS Recruit Limited Chester, Cheshire

NMS Recruit are seeking an experienced Embedded Systems Reliability Engineer to join a global energy consultancy based in Cheshire. This is an exciting opportunity to join a rapidly growing business. You will be required to work a 50/50 split between site and home, and sponsorship is available. This is an exciting opportunity for a talented Embedded Systems Reliability Engineer with proficiency in modern C++ (C+ or newer). Responsibilities Investigate and resolve complex bugs across embedded and desktop systems, implementing fixes and systemic quality improvements Develop and maintain tools for automated testing, diagnostics and release validation using Python and Bash Enhance and maintain CI/CD pipelines for embedded firmware (Buildroot/make) and desktop applications (CMake/Qt), integrating quality gates and static analysis Define, monitor and drive improvements against key reliability metrics (e.g. crash frequency, memory stability, startup success) Improve diagnostic visibility through structured logging, crash data capture and telemetry via MQTT Collaborate with hardware, software and test engineers to embed quality and reliability throughout the development lifecycle Experience Degree in Software Engineering, Computer Science, Electronics or equivalent working experience Proficiency in modern C++ (C+ or newer) for embedded and cross-platform desktop development Strong scripting experience in Python and Bash for tooling and test automation Experience with CMake, make, and CI/CD systems (e.g., GitLab CI, Azure Pipelines) Familiarity with Docker for embedded software builds and containerised testing Confident in debugging across firmware, OS and application layers Deep understanding of Embedded Linux (Buildroot), system configuration and device-level development Familiarity with MQTT and messaging protocols used in distributed systems Experience with Qt and GUI development for Windows and Linux environments Working knowledge of observability concepts, incident
response and long-term reliability strategies Exposure to hardware-in-the-loop (HIL) testing and embedded diagnostics Benefits Up to £60,000 DOE Career development opportunities Holidays: 25 days of annual leave (FTE), plus bank holidays, with an extra day for every three years completed (up to a maximum of 30 days). Ability to buy an additional 5 days Pension contributions of 8% from the employer Group Life Insurance, Income Protection, and Critical Illness cover Private Medical Insurance Important Information: We endeavour to process your personal data in a fair and transparent manner. In applying for this role, NMS Recruit will be acting within your interest and will contact you in relation to the role, either by email, phone or text message. For more information see our Privacy Policy on our website. It is important you are aware of your individual rights and the provisions the company has put in place to protect your data. If you would like further information on the policy or GDPR please get in touch with us here.

Oct 02, 2025

Full time

Principal Site Reliability Engineer

LA International Computer Consultants Ltd Wokingham, Berkshire

Our client is looking for a Principal Site Reliability Engineer to join their team on an initial three-month contract with good scope for extension. Candidates must be able to attend site in Wokingham twice a week and work remotely the rest of the time. This role is Inside IR35 and requires active SC clearance. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. Due to the nature and urgency of this post, candidates who hold or have previously held high-level security clearance are most welcome to apply. Please note that successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks.
LA International is an HMG-approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency, depending upon the precise nature of the work. For security-cleared jobs or non-clearance vacancies, LA International welcomes applications from all sections of the community and from people with diverse experience and backgrounds. Award-winning LA International, winner of the Recruiter Awards for Excellence, Best IT Recruitment Company, Best Public Sector Recruitment Company and overall Gold Award winner, has now secured the most prestigious business award that any business can receive, The Queen's Award for Enterprise: International Trade, for the second consecutive period.

Oct 01, 2025

Contractor

Senior Site Reliability Engineer

LA International Computer Consultants Ltd Wokingham, Berkshire

Our client is looking for a Senior Site Reliability Engineer to join their team on an initial three-month contract with good scope for extension. Candidates must be able to attend site in Wokingham twice a week and work remotely the rest of the time. This role is Inside IR35 and requires active SC clearance. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. Due to the nature and urgency of this post, candidates who hold or have previously held high-level security clearance are most welcome to apply. Please note that successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks.

Oct 01, 2025

Contractor

Site Reliability Engineer

LA International Computer Consultants Ltd Wokingham, Berkshire

Our client is looking for a number of hands-on Site Reliability Engineers to join their team on an initial three-month contract with good scope for extension. Candidates must be able to attend site in Wokingham twice a week and work remotely the rest of the time. This role is Inside IR35 and requires active SC clearance. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. Due to the nature and urgency of this post, candidates who hold or have previously held high-level security clearance are most welcome to apply. Please note that successful applicants will be required to be security cleared prior to appointment, which can take a minimum of 10 weeks.

Oct 01, 2025

Contractor

Senior SRE - SC Cleared

Eteam Workforce Limited Reading, Berkshire

We are a global recruitment specialist supporting clients across EMEA, APAC, the US and Canada. We have an excellent job opportunity for you. Location: Wokingham (Reading) | Hybrid - 60% remote and 40% onsite. Duration: 30/01/2026 - possible extension. CONTRACTOR MUST HOLD ACTIVE SC CLEARANCE. Role Description: Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM, AMQ). Conduct security reviews and implement controls aligned with national infrastructure standards. Mentor junior engineers and promote SRE best practices. Collaborate with vendors and IT teams for incident resolution and platform improvements. Required Skills: Strong communication skills (written and verbal). Experience in remote team collaboration. Deep expertise in OpenShift/Kubernetes and RedHat Linux. Proficiency in scripting (Bash, Python) and templating (Helm, Kustomize). Experience with CI/CD automation and IaC strategies. Security-first mindset with experience in regulated environments. Experience with VMware vSphere virtualization. If you are interested in this position and would like to learn more, please send through your CV and we will get in touch with you as soon as possible. Please note that candidates are often shortlisted within 48 hours.

Oct 01, 2025

Contractor

15 jobs found
