Lead Site Reliability Administrator


Responsibilities
•Design and develop CI/CD pipelines using industry-standard CI/CD technologies
•Develop shared libraries using GitLab or AWS CodeCommit to enable build and deploy
•Build and deploy applications to secure government cloud (FedRAMP) infrastructure on AWS
•Work with cross-functional teams to deliver infrastructure-as-code solutions for infrastructure configuration management
•Troubleshoot build- and deployment-related issues on cloud platforms
•Collaborate with application, infrastructure, tools, and operations teams to develop integrated solutions that enable CI/CD pipelines (both on-premises and public cloud)
•Act as Scrum Leader or Lead when needed

 

Required Skills 
•5+ years of overall deployment/build/support/architecture experience, with a minimum of 2 years in DevOps or cloud technologies
•General knowledge of UNIX/Linux & Windows 
•Experience working with AWS in a solution and deployment role
•Experience with scripting (Shell, Python, etc.)
•Experience with build and deployment processes for applications built on different technologies, such as Java, .NET, PHP, Node.js, Angular, and Python
•Experience with at least one CI/CD build and deployment system (Jenkins, GitLab, GitHub or Azure DevOps, Maven)
•Experience on a Scrum team in an SRE or senior DevOps role, from start to finish (build/deploy/support)
•Strong understanding of cloud-native and container-based distributed systems such as Kubernetes
•Ability to collaborate with Engineering, Architecture, Infrastructure, and Operations teams to design, develop, and deliver solutions that drive infrastructure provisioning on AWS using Terraform, Helm, and Ansible
•Ability to work with AppDev/Engineering to develop release mechanisms that improve the product release cycle (enable hooks for APM, develop DR and automated redeployment strategies)

 

Desired Skills
•Experience deploying and configuring APM and Application Observability tools
•Exposure to open-source technologies
•Experience with Docker, Docker Compose, and Dockerfiles
•Experience with container orchestration technologies on cloud platforms (AKS/EKS/PKS/GKE instances)
•Infrastructure automation using Ansible/Terraform/Helm
•Ability to provision infrastructure using Terraform both in a pipeline and outside a pipeline
•Working experience provisioning and managing container clusters
•Experience managing container-based applications
•Good understanding of Docker networking, volumes, and registries


Company: Opentext
Job Posted: a year ago
Job Type: Full-time
Work Mode: On-site
Experience Level: 3-7 years
Locations: Canberra, Australian Capital Territory, Australia
Qualification: Bachelor


Related Jobs


Infrastructure Security Administrator

Accenture

Melbourne, Victoria, Australia

+4 more

Posted: a year ago

Seeking a Hardware and Software Asset Manager and Configuration Management Database Expert to manage and maintain accurate records of the organization's assets. The role involves collaborating with cross-functional teams to ensure secure authentication, identity management, and data protection. Must have expertise in AD, ADFS, and PKI. Preferred location is Melbourne.


Network Engineer - Routing/DCI/AWS DirectConnect/Encryption

Accenture

Sydney, New South Wales, Australia

+3 more

Posted: a year ago

Role Details
As a Network Engineer specializing in routing, Data Center Interconnect (DCI), AWS Direct Connect, and encryption, you will be responsible for managing advanced network solutions that ensure seamless connectivity, secure data transfer, and optimized performance between on-premises environments and cloud resources, particularly Amazon Web Services (AWS). Your role will involve collaborating with cross-functional teams to build resilient, high-performance network infrastructures.

Skills required:
•5-8 years of experience
•Experience in network security: routing, DCI, AWS Direct Connect, encryption
•Clear grasp of networking and switching concepts
•CCNA/CCNP certification preferred
•Ability to work with minimal supervision
•Excellent troubleshooting skills
•Diagnosis and investigation of incidents
•Participation in MIM (Major Incidents)
•Coordination if multiple teams are involved during investigation and service restoration (e.g., infrastructure application support teams and/or multiple vendors or applications support teams)
•Regular case/ticket updates (at agreed intervals)
•Communications with the customer during the Incident Management process
•Escalation (Technical and Management) to Service Measures
•Resolution & Service Recovery to Service Measures, including organizing (equipment) faulty parts replacement
•Closure to Service Measures
•Excellent customer focus
•Proven ability to manage multiple projects
•Proactive work ethic
•Available to work after hours as required

Eligibility requirements:
•You must have resided in Australia for a minimum of 3 continuous years

Location: Melbourne preferred. Adelaide, Brisbane, Canberra and Sydney considered


Data Center Switching and Wireless Network Engineer

Accenture

Canberra, Australian Capital Territory, Australia

+3 more

Posted: a year ago

Role Responsibilities
As a Data Center Switching and Wireless Network Engineer, you will be responsible for managing the data center switching infrastructure and wireless networks to ensure high availability, performance, and security. Your role will involve working with cross-functional teams to support network operations, troubleshoot issues, and optimize network performance within the data center environment.

Skills required:
•5-8 years of experience
•Network experience: DC switching (Arista/ACI), campus switching/wireless (Cisco/Aruba), routing
•Knowledge of Google Cloud Platform with network operations
•Knowledge of F5 and ACE load balancers (nice to have)
•CCNA/CCNP certification preferred
•Excellent troubleshooting skills
•Diagnosis and investigation of incidents
•Participation in MIM (Major Incidents)
•Coordination if multiple teams are involved during investigation and service restoration (e.g., infrastructure application support teams and/or multiple vendors or applications support teams)
•Regular case/ticket updates (at agreed intervals)
•Communications with the customer during the Incident Management process
•Escalation (Technical and Management) to Service Measures
•Resolution & Service Recovery to Service Measures, including organising (equipment) faulty parts replacement
•Closure to Service Measures
•Excellent customer focus
•Proven ability to manage multiple projects
•Proactive work ethic
•Available to work after hours as required
•Ability to work with minimal supervision

Location: Melbourne preferred. Adelaide, Canberra and Sydney considered


Lead Site Reliability Engineer

Opentext

Waterloo, Ontario, Canada

+2 more

Posted: a year ago

What You Are Great At
•Applying a broad range of knowledge, skills, and experiences with an area of expertise to assignments that are received in the form of objectives
•Determining how to use resources to meet schedules and goals
•Providing guidance to peers within the latitude of established company policy
•Using broad knowledge of the organization to impact strategy, policy, and process development as a technical authority and leader with vision for positive business outcomes
•Leading multi-functional strategic and tactical efforts
•Providing leadership by assisting in triage for escalated production incidents
•Being a change agent able to develop, implement, and maintain policies and processes
•Collaborating with peer technology organizations, business, clients, and management to review application, systems, and infrastructure functionality and develop plans for improvement
•Leading development and implementation of strategies focused on greater efficiencies to deliver systems
•Identifying and implementing strategies to reduce platform Mean-Time-To-Resolution (MTTR) through Site Reliability Engineering (SRE) practices and automation principles
•Managing continuous improvement of service engineering, delivery, and operational practices
•Reducing expenses by eliminating unnecessary downtime and disruptions
•Understanding current business and technology trends to find opportunities for improving services and reducing risk
•Adopting and promoting an SLO mindset with disaster recovery best practices in mind
•Effectively navigating organization structure and culture to achieve positive outcomes
What It Takes
•10+ years of related experience, or equivalent
•Intermediate and advanced level certifications that demonstrate knowledge of cloud and security concepts
•Extensive knowledge of CaaS technologies including Kubernetes, Google Anthos/Google Kubernetes Engine (GKE), Ingress and PaaS technologies
•Knowledge of IaaS technologies including hypervisor (VMware ESX), routing (VMware NSX-T), and load balancing (F5, etc.)
•Knowledge of monitoring and logging technologies including VMware Tanzu Observability/Wavefront, Dynatrace, and Splunk
•In-depth knowledge of network and infrastructure security best practices, including governance
•Experience in CI/CD pipeline implementation
•Automation of build, packaging, and release management activities (build automation, CI/CD, Git, Jenkins)
•Experience with tools like JIRA, Git/Bitbucket, Confluence, etc.
•Build self-healing and automated systems
•Design and build systems to collect, visualize, and store service health indicators
•Demonstrated ability to achieve successful outcomes in handling difficult situations and to work with various customers and management levels
•Demonstrated experience working in Agile teams using Scrum and Kanban formats
•Communicate effectively with technical and non-technical audiences
•A self-starter with the ability to work independently and in a collaborative team environment