This is a fully remote position, open to applicants in Ireland.

📋 Description

• Triage and investigate production issues by querying logs, correlating events across services, and identifying root causes rather than surface symptoms.

• Navigate both the Angular front-end and Java back-end codebases to trace issues end to end, from reported UI behavior through service logic to the data layer.

• Implement precise code fixes for confirmed bugs, ensuring that all modifications are covered by tests, submitted via pull request, and reviewed prior to merging.

• Escalate fixes that necessitate architectural changes or involve high-risk areas of the codebase to the Senior Engineer before proceeding.

• Produce concise post-incident summaries detailing what occurred, the root cause, resolution, and the measures being taken to prevent recurrence.

• Build and maintain internal tools that assist the team in investigating and managing recurring production issues more effectively.

• Identify manual or repetitive investigation steps that are suitable for tooling and prioritize the development of solutions that save substantial time during incidents.

• Maintain existing tools to ensure they remain accurate and beneficial as the platform evolves.

• Proactively research and assess new tools and technologies that could enhance operational efficiency.

• Build and maintain a library of debugging playbooks and how-to guides that address common production issues.

• Update playbooks following every significant incident to integrate new insights.

• Identify gaps within the playbook library and prioritize addressing them based on the frequency and impact of incidents.

⛳️ Requirements

• Approximately three years of professional software engineering experience with hands-on exposure to both Java and Angular.

• Experience debugging issues in a cloud-hosted or SaaS environment, and comfort working with incomplete information under time constraints.

• Working knowledge of AWS and confidence in navigating cloud infrastructure for investigative purposes.

• Strong log analysis capabilities — able to construct queries, correlate events across services, and derive diagnostic conclusions.

• Clear written communication skills — able to articulate a production issue and its resolution to both technical and non-technical stakeholders.

• Familiarity with Git-based workflows and standard code review practices.

• Experience in building internal tooling or automation to support operational workflows.

• Exposure to observability or incident management platforms such as Datadog, PagerDuty, or Grafana.

• Understanding of relational database query analysis — capable of identifying slow or problematic queries as part of an investigation.

• Familiarity with containerized deployment environments.

🏝️ Benefits

• Comprehensive health, dental, and vision insurance.

• Flexible work hours and remote working options.

• Professional development opportunities and support for ongoing education.

• A collaborative and inclusive company culture.

• Access to wellness programs and resources.

Software Engineer – Support, Operations
