DevOps Engineer - III - Engineering

Smart Summary
We are looking for a candidate with extensive experience in managing high-traffic, large-scale microservices and infrastructure. You will own the end-to-end availability, performance, and capacity of applications and their infrastructure. You will also provide 24x7 infrastructure and application support, mentor and train junior engineers, and automate work to improve development and release processes.

Job Requirement

What you’ll do:

  • Bridging the gaps between the core infra, security, QA and development teams.
  • Owning the end-to-end availability, performance and capacity of applications and their infrastructure, and creating and maintaining the corresponding observability with Prometheus/New Relic/ELK/Loki.
  • Providing 24x7 infra and app support, building processes and documenting the related “tribal” knowledge.
  • Mentoring and training L1 engineers and continually improving app and infra support processes.
  • Managing application deployment and GKE platforms; automating and improving development and release processes.
  • Creating, managing and maintaining datastores & data platform infra using IaC.
  • Owning and onboarding new applications with the production readiness review process.
  • Managing the SLO/Error Budgets/Alerts and performing root cause analysis for production errors.
  • Working with Core Infra, Dev and Product teams to define SLO/Error Budgets/Alerts.
  • Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
  • Identifying observability gaps in application & infrastructure and working with stakeholders to fix them.
  • Managing outages, doing detailed RCA with developers and identifying ways to prevent a recurrence.
  • Automating toil and repetitive work.

What We're Looking For:

  • 6+ years of experience in managing high-traffic, large-scale microservices and infrastructure, with excellent troubleshooting skills.
  • Experience in troubleshooting, managing and deploying containerized environments using Docker/containerd and Kubernetes is a must.
  • Must be proficient with Helm and have experience with a service mesh such as Istio or Linkerd.
  • Must be very hands-on in managing and troubleshooting the Kubernetes environment.
  • Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
  • Extensive experience in DNS, TCP/IP, UDP, gRPC, routing and load balancing.
  • Expertise in GitOps, Infrastructure as Code tools such as Terraform, and configuration management tools such as Chef, Puppet, SaltStack or Ansible.
  • Expertise in Google Cloud (GCP) and/or other relevant Cloud Infrastructure solutions like AWS or Azure.
  • Experience in building CI/CD pipelines with tools such as Jenkins, GitLab, Spinnaker, Argo, etc.
  • Experience with multiple datastores is a plus (Kafka/RabbitMQ, Redis, Elasticsearch).
  • Must be proficient in at least one DevOps scripting language, such as Python or Go.
  • A collaborative spirit with the ability to work across disciplines to influence, learn and deliver.
  • A deep understanding of computer science, software development, and networking principles.

Company: Groww
Job Posted: a year ago
Job Type: Full-time
Work Mode: Remote
Experience Level: 3-7 years
Category: Software Engineering
Locations: Pune, Maharashtra, India; Gurgaon, Haryana, India; Bengaluru, Karnataka, India
Qualification: Bachelor

