Manager Site Reliability Engineer


What is the Job like?

  • Assign and monitor the work of technical personnel, ensure that application development and deployment follow best practices, and implement quality control and review systems throughout the development and deployment processes
  • Manage the design and development of custom tools and their integration with existing tools to increase engineering productivity
  • Take responsibility for the architecture and technical leadership of the entire DevOps infrastructure
  • Provide technical leadership and oversight of implementation and deployment planning, system integration, quality assurance, delivery, operations, and sustainability of technical solutions
  • Manage the operational aspects of production and development servers, including developing, training on, and validating compliance with procedures and checklists for disk space usage, monitoring solutions, deployment, conventions, access to production and development sources, source control access and usage, performance monitoring, validation of code modifications, scheduling, and more
  • Maintain a high-level understanding of web application programming, APIs, databases, and system design
  • Provide process improvement recommendations based on best practices and industry standards
  • Mentor the team and foster collaboration to drive projects within the organisation
  • Help solve business needs with technology by evaluating technology options and implementing newer technology stacks
  • Resolve conflicts by demonstrating leadership and sound decision-making
  • Drive toward big-picture goals and milestones while maintaining strong attention to detail

 

What do we look for?

  • 8-12 years of relevant work experience
  • Must have knowledge of Kubernetes, Docker, ELK, Prometheus, Nagios, Chef/Ansible, Terraform, SparkleFormation, GitLab, Jenkins, etc.
  • Must be comfortable with Unix Administration
  • Deep understanding of kernel, networking, and OS fundamentals
  • Strong experience with web technologies: Nginx, HAProxy, Apache, Node.js
  • Proven record of infrastructure automation and programming skills in any of these languages: Python, Ruby, Perl, JavaScript
  • Experience with the Java stack and its intricacies
  • Good understanding of databases
  • Strong experience in managing both development and operations
  • Good communication and interpersonal skills
  • Must be comfortable working with a distributed team
  • Must have worked in an Agile development environment
  • Ability to drive toward big-picture goals and milestones while maintaining strong attention to detail
  • Ability to quickly identify and drive to the optimal solution when presented with a series of constraints
  • Demonstrated ability in people management, strategic planning, risk management, change management, and project management

Company

Zeta

Job Posted

a year ago

Job Type

Full-time

WorkMode

On-site

Experience Level

8-12 Years

Category

Engineering

Locations

Bengaluru, Karnataka, India

Qualification

Bachelor
