Staff Engineer - Site Reliability
Freshworks
Chennai, Tamil Nadu, India
Job Description

SRE at Freshworks

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Freshworks' services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to customers' needs, and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance. Much of our SRE work focuses on optimizing existing systems, building infrastructure, and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale while using your expertise in coding, algorithms, complexity analysis, and large-scale system design.

Responsibilities:
- Design, write, and deliver software to improve the availability, latency, and efficiency of Freshworks' products and platforms.
- Manage the availability, latency, and performance of mission-critical services, and build automation to prevent problem recurrence.
- Independently determine and develop architectural approaches and infrastructure solutions.
- Define the strategy, vision, and roadmap for developing CI/CD, application hosting, security, and compliance standards and guidelines across Freshworks.
- Drive blameless postmortems for large-scale incidents.
- Define and drive automation and orchestration strategies.
- Strategize cost optimization across the Freshworks cloud environment.

Qualifications

Requirements:
- 12+ years of software engineering and coding experience in C# / Python / JavaScript / Golang (one or more).
- 12+ years of experience handling Linux and Windows systems at a very large scale.
- 6+ years of hands-on experience with containers and container orchestration tools.
- 10+ years of proven experience designing, building, supporting, and observing large-scale distributed systems, services, and infrastructure.
- Strong experience in microservices architecture, service mesh implementation, and instrumenting XaaC (Infrastructure, Software, Network, Policy, Security) across global-scale systems.
- Hands-on experience in defining and driving disaster recovery across Freshworks products and platforms.
- Proficiency in implementing FinOps and cloud cost optimization strategies.
- Experience with and knowledge of incorporating testing, compliance, and security requirements within code release pipelines.
- Proficiency in algorithms, data structures, complexity analysis, and software design.
- Ability to do technical deep dives into code, networking, operating systems, and storage, along with the ability to participate in executive strategy discussions.
- Data mining and data analytics experience using big data and/or relational database technologies.
- Excellent experience in designing and architecting solutions using open source software (OSS).
- Intellectual curiosity, problem-solving ability, and storytelling presentation skills.