DevOps Engineer - Observability & AIOps

Reference: GCS/GP 1247_1774390380

Location: Philadelphia
Department: Software Development & Engineering - Observability & AIOps

About the Role

We're looking for a skilled DevOps Engineer to join our Observability & AIOps team. This role is at the heart of ensuring that our client's large-scale distributed systems are reliable, observable, and intelligently automated. You'll design, build, and maintain platforms that provide deep visibility into the client's services, leverage AI/ML for operational insights, and drive automated incident response.

Key Responsibilities

  • Build & Maintain Observability Infrastructure
    • Deploy, configure, and manage tools for metrics, logs, and traces (e.g., Prometheus, Grafana, ELK stack, OpenTelemetry, Jaeger, Datadog, Splunk).
    • Ensure telemetry data is complete, accurate, and accessible across systems and environments.
  • Automation & CI/CD Integration
    • Integrate observability tools with CI/CD pipelines (Jenkins, GitLab CI, ArgoCD).
    • Automate deployment and scaling of monitoring agents using infrastructure-as-code (Terraform, Ansible, Helm).
  • AIOps & Intelligent Alerting
    • Collaborate with data scientists and platform engineers to feed clean observability data into AI/ML pipelines.
    • Implement anomaly detection, alert deduplication, and predictive maintenance solutions.
  • Incident Management & SRE Practices
    • Partner with SRE teams to define and monitor SLIs/SLOs.
    • Reduce mean time to detect (MTTD) and mean time to resolve (MTTR) through automation and intelligent alerting.
    • Contribute to incident response playbooks and post-incident reviews.
  • Dashboards & Developer Experience
    • Build and maintain custom dashboards that visualize service health and performance.
    • Provide self-service observability tools that empower development and operations teams.
    • Treat observability as a product, focusing on usability, reliability, and scalability.

Qualifications

  • 3+ years of experience as a DevOps Engineer, SRE, or in a similar role in large-scale, cloud-based environments.
  • Solid knowledge of observability concepts (metrics, logs, traces) and tools (Prometheus, Grafana, ELK stack, OpenTelemetry, etc.).
  • Hands-on experience with cloud platforms (AWS, GCP, or Azure), Kubernetes, and Docker.
  • Proficiency with automation and IaC tools (Terraform, Ansible, Helm).
  • Familiarity with incident management tools (PagerDuty, OpsGenie, ServiceNow).
  • Strong scripting skills (Python, Bash, or similar).
  • Excellent problem-solving and communication skills, with an ability to work across teams.

Preferred Skills

  • Experience applying AI/ML techniques to IT operations or monitoring.
  • Knowledge of SRE practices (SLIs/SLOs, error budgets).
  • Background in high-scale, distributed systems.

Example Projects You Might Work On

  • Rolling out distributed tracing with OpenTelemetry across microservices.
  • Building anomaly detection models for latency and error rates.
  • Automating remediation for common failure modes to enable self-healing systems.
  • Reducing alert fatigue with intelligent noise suppression and correlation.

GCS is acting as an Employment Business in relation to this vacancy.

Salary: Competitive
Type: Contract
Added: 24/03/2026
