Senior DevOps Engineer

What you'll be doing:

  • Monitor and support critical high-performance, large-scale services running on a farm of 10,000+ hosts.
  • Ensure more than 95% availability for the build and test farms.
  • Participate in the triage and resolution of complex build and test infrastructure issues.
  • Collaborate with our other engineering teams to expose any defects and constraints.
  • Collaborate with software development teams to deliver reliable, robust, high-performance underlying infrastructure.
  • Perform root cause analysis and implement corrective actions for persistent and user-impacting issues.
  • Implement high-availability infrastructure and disaster recovery solutions.
  • Run large-scale deployments across multiple Kubernetes and ESXi clusters to support CI/CD pipelines for NVIDIA products.
  • Design and implement monitoring solutions to gain more insight into application and system health. Implement critical metrics using various analytics methods and dashboards.
  • Craft and develop tools needed for automating workflows.
  • Take part in prototyping, crafting, and developing cloud infrastructure for NVIDIA.
  • Participate in on-call support and critical issue coverage as an SRE.

What we need to see:

  • Solid programming background in Python/Tcl and/or similar scripting languages.
  • Strong background in CI/CD workflows using GitLab, Jenkins, or other CI/CD tools.
  • Proficient with configuration management tools such as Ansible, Chef, and Puppet, and with source code management and binary repository systems such as GitLab, GitHub, and Artifactory.
  • Demonstrable experience working in large-scale enterprise production systems.
  • Proficient with Kubernetes administration, Docker, and virtualization. Knowledge of standard methodologies related to security.
  • Proficient with data analytics/visualization and monitoring tools such as Kibana, Grafana, Splunk, Zabbix, Prometheus, and/or similar systems.
  • Strong background in Docker, containerization, and managing large-scale container/pod deployments for Kubernetes clusters.
  • Excellent debugging, problem solving and analytical skills.
  • Strong understanding of architectural requirements and development processes involved in building reliable, robust, scalable data products and pipelines.
  • Experience writing complex queries for MySQL or similar databases.
  • 5+ years of proven experience.
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or equivalent experience.

Ways to stand out from the crowd:

  • Experience with or knowledge of supporting Java-based applications, web servers, etc. is a plus.
  • Thrives in a multi-tasking environment with constantly evolving priorities.
  • Ability to break complex problems into simple subproblems and reuse available solutions to solve most of them. Ability to design simple systems that work efficiently without needing much support.
  • Prior experience with a large-scale operations team.
  • Outstanding interpersonal skills and communication with all levels of management.

Company: NVIDIA
Job Posted: a year ago
Job Type: Full-time
Work Mode: On-site
Experience Level: 3-7 Years
Category: Software Engineering
Locations: Pune, Maharashtra, India
Qualification: Bachelor

Related Jobs

Senior DevOps Engineer

NVIDIA

Pune, Maharashtra, India

Posted: a year ago

We are seeking a Kubernetes System Administrator to join our team. Your main responsibilities will include designing and implementing Kubernetes clusters, configuring auto provisioning and scaling, and deploying pods and containers. You will also be responsible for monitoring applications and system health, automating workflows, and participating in on-call support. The ideal candidate will have a strong background in CI/CD systems, Kubernetes administration, and data analytics/visualization tools. In addition, proficiency in Python, experience maintaining cloud infrastructure, and familiarity with configuration management tools are required. A Bachelor's or Master's degree in computer science or equivalent experience is preferred.

DevOPS Developer

Zensar Technologies

Pune, Maharashtra, India

Posted: a year ago

Description

Roles and Responsibilities:

  • Administer and manage our Azure DevOps environment.
  • Work with developers to implement pipelines and releases for Power Platform and Azure Services using Azure DevOps.
  • Help implement the bank’s strategic aims, promote operational efficiencies, decrease time to market, while increasing environment consistency and resilience.
  • Support the adoption of cloud technology and automated code driven deployments.
  • Raise awareness of operational risks by regularly evaluating and escalating them via the bank's risk framework.
  • Understand common best practice working methods, processes and tools, across the Power Platform and Azure Services, in the ongoing development of the bank's cloud services.
  • Ensure cloud services meet the agreed non-functional requirements, such as service levels and availability.
  • Troubleshoot and resolve technical problems.
  • Stay current with new technologies, development methods and trends relating to cloud technology and information technology more broadly.
  • Produce technical documentation supporting the design and operation.
  • Ensure the Microsoft platforms are running in a cost-efficient way, without reducing resilience or increasing risk of service disruption.
  • Support pre-production acceptance testing to help assure the quality of the bank’s cloud based technology services.

Required Experience:

  • In-depth experience in working and supporting CI/CD pipelines in Azure DevOps.
  • Understanding of Power Platform, specifically Power Apps, Power Automate and Dataverse.
  • Experience of Azure services including but not limited to Azure SQL, Azure Data Factory, Logic Apps, Azure Functions.
  • In-depth experience in deploying applications and services using immutable methodologies.
  • In-depth practical experience in supporting applications deployments.
  • In-depth practical experience in developing scripts in PowerShell.
  • In-depth experience in using a common version control system like Git in a team environment.
  • Practical experience in managing Azure Services using Terraform.
  • Understanding of network topologies and common network protocols and services.
  • Experience in traditional and agile development/project methodologies.
  • Experience in automated testing.
  • Experience of leading and growing an Azure team function within an organisation

Primary Location: India-Maharashtra-Pune
Experience Required (In Years): Minimum 5, Maximum 10

Associate Devops Engineer

Amber

Pune, Maharashtra, India

Posted: 2 months ago

About Amber (amberstudent.com): Long-term accommodation booking platform for students (think booking.com for student housing). Amber helps 80M+ students worldwide, find and book full-time accommodations near their universities, without the hassle of negotiation, non-standardized and cumbersome paperwork, and broken payment process. We are the largest and fastest-growing student housing platform globally, with 1M+ units listed in 6 countries and across 100+ cities. We are growing rapidly and targeting $2B in annual gross bookings value by 2024. If you are passionate about making international mobility and living, seamless and accessible, then - Join us in building the future of student housing! (We are among the fastest growing companies in the Asia Pacific as per Financial Times https://www.ft.com/high-growth-asia-pacific-ranking-2022.)

About the role: We are looking for a talented Associate Devops Engineer to join us in building best-in-class DevOps infrastructure for scaling to the next 100M users. The ideal candidate should be excited to work on a multitude of different problems while having full ownership.

Key Responsibilities:

  • Architect, Build, Maintain, and Upgrade highly available software systems
  • Run a highly available Cloud-based software product on AWS
  • Monitor system events to ensure health, maximum system availability, and service quality
  • Work very closely with the Software Development team to design and implement new systems
  • Set up and maintain CI/CD systems.
  • Develop, maintain, and optimize data pipelines for extracting, transforming, and loading (ETL) data from various sources.
  • Manage and maintain databases and data warehouses, including performance tuning and schema design.
  • Continuously improve the security/cost posture of the Amber platform
  • Continuously improve the efficiency of day-to-day operations at Amber

What we are looking for?

  • 1+ years experience in DevOps / SRE role
  • Hands on experience with Amazon Web Services, Google Cloud Platform, or Azure
  • Experience with Infrastructure as Code tools, such as Terraform, Ansible, and Pulumi
  • Strong scripting skills in Python, Bash, Groovy, etc
  • Strong working knowledge of source control branching in Git.
  • Experience in an Agile software development environment is a plus
  • Experience with operability tools, such as Grafana, Prometheus
  • Familiarity with Ruby, React is a plus

What will you get from Amber:

  • Fast-paced growth (can skip intermediate levels)
  • Total freedom and authority (everything under you, just get the job done!)
  • Open and Inclusive Environment
  • Competitive salary and opportunities for professional growth