
Software Engineer – Support, Operations
Posted May 20

This is a fully remote position, open to applicants in Ireland.
Responsibilities
• Triage and investigate production issues by querying logs, correlating events across services, and identifying root causes instead of just surface symptoms.
• Navigate both the Angular front-end and Java back-end codebases to trace issues end to end, from reported UI behavior through service logic to the data layer.
• Implement precise code fixes for confirmed bugs, ensuring that all modifications are covered by tests, submitted via pull request, and reviewed prior to merging.
• Escalate fixes that necessitate architectural changes or involve high-risk areas of the codebase to the Senior Engineer before proceeding.
• Produce concise post-incident summaries detailing what occurred, the root cause, resolution, and the measures being taken to prevent recurrence.
• Build and maintain internal tools that assist the team in investigating and managing recurring production issues more effectively.
• Identify manual or repetitive investigation steps that are suitable for tooling and prioritize the development of solutions that save substantial time during incidents.
• Maintain existing tools to ensure they remain accurate and beneficial as the platform evolves.
• Proactively research and assess new tools and technologies that could enhance operational efficiency.
• Build and maintain a library of debugging playbooks and how-to guides that address common production issues.
• Update playbooks following every significant incident to integrate new insights.
• Identify gaps within the playbook library and prioritize addressing them based on the frequency and impact of incidents.
Requirements
• Approximately three years of professional software engineering experience with hands-on exposure to both Java and Angular.
• Experience debugging issues in a cloud-hosted or SaaS environment, and comfort working with incomplete information under time constraints.
• Working knowledge of AWS and confidence in navigating cloud infrastructure for investigative purposes.
• Strong log analysis capabilities — able to construct queries, correlate events across services, and derive diagnostic conclusions.
• Clear written communication skills — able to articulate a production issue and its resolution to both technical and non-technical stakeholders.
• Familiarity with Git-based workflows and standard code review practices.
• Experience in building internal tooling or automation to support operational workflows.
• Exposure to observability or incident management platforms such as Datadog, PagerDuty, or Grafana.
• Understanding of relational database query analysis — capable of identifying slow or problematic queries as part of an investigation.
• Familiarity with containerized deployment environments.
Benefits
• Comprehensive health, dental, and vision insurance.
• Flexible work hours and remote working options.
• Professional development opportunities and support for ongoing education.
• A collaborative and inclusive company culture.
• Access to wellness programs and resources.