Senior DevOps Engineer

Smart Summary
We are seeking a Kubernetes System Administrator to join our team. Your main responsibilities will include designing and implementing Kubernetes clusters, configuring auto provisioning and scaling, and deploying pods and containers. You will also be responsible for monitoring applications and system health, automating workflows, and participating in on-call support. The ideal candidate will have a strong background in CI/CD systems, Kubernetes administration, and data analytics/visualization tools. In addition, proficiency in Python, experience maintaining cloud infrastructure, and familiarity with configuration management tools are required. A Bachelor's or Master's degree in computer science or equivalent experience is preferred.

What you'll be doing:

  • Kubernetes system administration in a large-scale DevOps CI/CD environment: designing and implementing clusters, cluster segmentation, and internal/external networking for 4+ CI/CD deployment environments (dev, test, staging, production).
  • Implementing Kubernetes architectures (configuration, hardening, networking, sizing, scaling, etc.) to support a CI/CD pipeline for NVIDIA products.
  • Configuring Kubernetes auto-provisioning and auto-scaling of CI/CD job/build agents, runners, and nodes.
  • Implementing high-availability clusters and disaster recovery solutions.
  • Managing large-scale pod/container deployments across multiple Kubernetes clusters to support CI/CD pipelines for NVIDIA products.
  • Designing and implementing monitoring solutions to gain more insight into application and system health. Implementing critical metrics using various analytics methods and dashboards.
  • Designing and developing tools for automating workflows. Using AI techniques to extract useful signals about machines and jobs from the data generated.
  • Taking part in prototyping, designing, and developing cloud infrastructure for NVIDIA.
  • Participating in on-call support and critical issue coverage as an SRE.

 

What we need to see:

  • Strong background with GitLab, Jenkins, and/or other CI/CD systems.
  • Proficient with Kubernetes administration, Docker, and virtualization. Knowledge of standard security practices.
  • Proficient with data analytics/visualization & monitoring tools like Kibana, Grafana, Splunk, Zabbix, Prometheus and/or similar systems.
  • Solid programming background in Python and/or similar scripting languages.
  • Experience maintaining cloud infrastructure and highly available production environments.
  • Strong background in Docker, containerization, and managing large-scale container/pod deployments for Kubernetes clusters.
  • Excellent debugging, problem solving and analytical skills.
  • Strong understanding of architectural requirements and development processes involved in building reliable, robust, scalable data products and pipelines.
  • Experience with both SQL (MySQL) and NoSQL (MongoDB, AstraDB) databases.
  • Proficient with configuration management tools like Ansible, Chef, and Puppet, and with source code management and binary repository systems like GitLab, GitHub, and Artifactory.
  • Demonstrable experience working in large-scale enterprise production systems.
  • 8+ years of proven experience.
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or equivalent experience.



 


Company: NVIDIA
Job Posted: a year ago
Job Type: Full-time
Work Mode: On-site
Experience Level: 8-12 Years
Category: Engineering
Location: Pune, Maharashtra, India
Qualification: Bachelor's or Master's degree
