Lead Site Reliability Engineering, Cloud Networking

Be a part of our global team responsible for Continuous Cloud Operations and Continuous Cloud Innovation for the webMethods.io iPaaS platform. Solve problems, automate processes, and optimize the security, performance, and availability of our cloud services for maximum reliability.

The Role:

Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions. You have deep expertise in analyzing complex systems, anticipating problems, and finding ways to mitigate risk. You apply your knowledge of SRE processes to maximize availability, reliability, security, and performance for Software AG cloud services.

You will be an integral part of our webMethods.io iPaaS platform Cloud Engineering Operations global team, which is responsible for providing Continuous Cloud Operations and Continuous Cloud Innovation for the webMethods.io product portfolio.

Responsibilities:

  • Good knowledge of virtualization technologies and container technologies
  • Solid experience in networking, cloud networking, Tenant Isolation SaaS Architecture models and VPN-related technologies
  • Design, write and deliver software to improve the availability, scalability, latency, and efficiency of webMethods.io iPaaS cloud services
  • Expertise in observability platforms (application telemetry, tracing, and log aggregation)
  • Influence and create new designs, architectures, standards and methods for large-scale distributed systems
  • Collaborate with a world-class engineering team to propose features that solve recurring patterns of customer complaints.
  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning
  • Find scalability bottlenecks and areas for performance improvements
  • Deep technical knowledge in Cloud Infrastructure, Operations, Support, Networking, Systems, IaC, Automated Deployments, Cloud Platforms and DevOps
  • Work with our architects to design a secure network setup, both in-house and at customer sites
  • Experience with Kubernetes, Docker, Azure and AWS
  • Experience with EKS/AKS and CNI plugins (Flannel, Calico, Cilium), and their integration with cloud providers' VPCs
  • Proficiency with network monitoring and troubleshooting tools
  • Experience with Unix/Linux OS internals and administration (e.g. filesystems, inodes, system calls) or networking (e.g. TCP/IP, routing, network topologies and hardware, SDN)
  • Participate in the on-call rotation; collaborate and provide guidance in retrospectives

 

Qualifications

Requirements:

  • Bachelor’s degree in software engineering, computer science, computer engineering, or related technical field
  • Experience with Amazon Web Services and/or any other public cloud
  • Experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) technology stacks.
  • Experience with containers and HA clusters; experience with Docker and Amazon ECS/Kubernetes is mandatory
  • Good knowledge of virtualization technologies and container technologies
  • Solid experience in networking, cloud networking, Tenant Isolation SaaS Architecture models and VPN-related technologies
  • Experience with Kubernetes, Docker, Azure and AWS
  • Experience with EKS/AKS and CNI plugins (Flannel, Calico, Cilium), and their integration with cloud providers' VPCs
  • Proficiency with network monitoring and troubleshooting tools
  • Experience with Unix/Linux OS internals and administration (e.g. filesystems, inodes, system calls) or networking (e.g. TCP/IP, routing, network topologies and hardware, SDN)
  • Experience in Cloud Software Engineering, Cloud Site Reliability Engineering, & Cloud Operations
  • Implement secure networking, key management, user management, access management, and process management
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system reliability.

 

Additional Information

  • Familiarity with cloud availability patterns (SLI, SLO, SLA, etc.)
  • Expertise in designing, analyzing and troubleshooting large-scale distributed systems
  • Firm grasp of at least one modern programming language (Java/Go/Python/Ruby), beyond basic scripting (Shell, Perl, Bash)

Company: Software AG

Job Posted: a year ago

Job Type: Full-time

Work Mode: On-site

Experience Level: 3-7 years

Category: Software Engineering

Locations: Chennai, Tamil Nadu, India; Bengaluru, Karnataka, India

Qualification: Bachelor
