Senior Site Reliability Engineer

Smart Summary
Join a high-performing team supporting global enterprise applications by managing operations, driving strategies, ensuring infrastructure reliability, and fostering innovation. Collaborate with diverse stakeholders to enhance service quality and compliance with ITIL processes. Opportunity at Thomson Reuters in Bengaluru, Karnataka, India. Full-time Hybrid role.

Job description 

About the Role 

This role is part of a high-performing team of talented SRE specialists who provide world-class support for Commercial Engineering. You will be responsible for the team's day-to-day operations, work as part of a larger global team, and help develop and drive strategies for supporting and continuously improving our global enterprise. The team manages ongoing incident detection and resolution, change planning and implementation, and compliance for a portfolio of applications and infrastructure built on a variety of technologies such as Boomi, Java, Linux, Microsoft, relational databases, message queuing, AWS cloud services, and more.
 

Deliver reliable 24x7 infrastructure and application operations according to business expectations across the application portfolio. 

Partner with application development teams to deliver operational readiness for new applications and features. 

Collaborate with stakeholders such as business teams, product owners, and project management in defining roadmaps for applications and processes. 

Drive continual service improvement and innovation in productivity, software quality, and reliability, including meeting/exceeding SLAs. 

Responsible for developing, monitoring and analyzing business operational and technical key metrics. 

Effectively articulate complex problems, concepts, and solutions to varied audiences. 

Contribute to the strategy of the department and drive implementation of department goals that support the company’s core values. 

Participate in complex initiatives such as large-scale upgrades. 

Partner with security, data center, and service governance teams to deliver compliance with internal and external standards, expectations, and certifications. 

Ensure documentation, processes, and procedures are updated regularly. 

Participate in a continuous learning culture and stay curious about emerging technologies.



 About You 

You’re a fit for the role if your background includes: 

5+ years of experience in software development and/or technology infrastructure.

Bachelor’s degree or equivalent required; Computer Science or related technical degree preferred. 

Fluent in speaking and writing English. 

Thorough understanding of ITIL processes related to incident management, problem management, application life cycle management, operational health management. 

Experience in supporting applications built on modern application architecture and cloud infrastructure, JavaScript frameworks and libraries, Boomi, HTML/CSS/JS, Node.js, TypeScript, jQuery, Docker, AWS/Azure.

Broad understanding of the technologies used to build and operate distributed application systems including experience managing data center systems/infrastructure. 

Proven track record of successfully driving projects and initiatives, even when the details provided are ambiguous.

Strong customer service, problem solving, organizational and conflict management skills. 

Strong IT Service Management and standards experience. 

Excellent critical thinking, communication, presentation, documentation, troubleshooting, and collaborative problem-solving skills. 

Proven ability to learn new technologies quickly. 

Hands-on experience with programming and scripting languages. 

Comfortable in a fast-paced environment and motivated by complex technical and business challenges. 

ITIL Certification preferred. 

Company

Thomson Reuters

Job Posted

a year ago

Job Type

Full-time

Work Mode

Hybrid

Experience Level

3-7 Years

Category

Software Engineering

Locations

Bengaluru, Karnataka, India

Qualification

Bachelor or Master

Related Jobs

Senior Site Reliability Engineer

Thomson Reuters

Hyderabad, Telangana, India

Posted: a year ago

As a Senior Site Reliability Engineer at Thomson Reuters, you will manage cloud environments, troubleshoot application issues, conduct end-to-end application testing, maintain system documentation, and collaborate with cross-functional teams. This hybrid full-time role in Hyderabad requires 6+ years' experience in cloud with Windows systems, AWS DevOps expertise, ITIL knowledge, .Net application experience, scripting skills, database knowledge, and strong communication and analytical abilities.

Senior Site Reliability Engineer

NVIDIA

Bengaluru, Karnataka, India

Posted: 2 years ago

What you will be doing:

Design, implement and support large scale Kubernetes clusters with monitoring, logging and alerting.

Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement.

Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity management and launch reviews.

Maintain services once they are live by measuring and monitoring availability, latency and overall system health.

Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.

Practice sustainable incident response and blameless postmortems.

Be part of an on-call rotation to support production systems.

What we need to see:

A minimum of 3 years of hands-on experience in setup, administration and maintenance of multiple large (100+ nodes) Kubernetes clusters on-prem and on Cloud Service Providers like AWS, Azure, GCP, OCI.

Strong coding experience in one or more of the following languages: Go, Python, Perl, Java, C, C++, Ruby.

Hands-on system administration experience of at least 2 years on large scale UNIX production environments, with validated debugging and troubleshooting skills.

Ability to maintain platform SLAs through accurate resolutions.

Outstanding teammate who can collaborate and influence in a multifaceted environment.

Demonstrable experience in handling algorithms, data structures, complexity analysis and software design.

BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics).

Ways to stand out from the crowd:

Experience in using or running large private and public cloud systems based on Kubernetes, OpenStack and Docker.

Demonstrated ability to automate routine tasks, debug and optimize existing code.

Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.

Hands-on experience in network and storage administration.

Unit testing and benchmarking are an integral part of your code.

Ability to reason and choose the best possible algorithm to meet scaling and availability challenges.

Ability to decompose complex requirements into simple tasks and reuse available solutions to implement most of those.

Senior Site Reliability Engineer

Criteo

Barcelona, Barcelona, Spain

+2 more

Posted: 2 years ago

What You'll Do:

The Sr. SRE acts as an expert in operations on GNU/Linux systems and cloud providers, as well as in automation tools and practices. The main responsibilities are creating, supporting and improving the infrastructure.

Key Responsibilities:

Set up and maintain projects using Infrastructure as Code (IaC) principles.

Investigate issues.

Provide application developers with the ability to deploy and update applications in the production environment.

Support the integration with services such as log collection, metric collection and monitoring.

Participate in the development of configuration management, the deployment and monitoring of infrastructure, and the automation of processes.

Participate in the architecture design of new software components or their parts.

Explore and apply modern technologies and practices where practical.

Consider the cost effectiveness of the production infrastructure.

Work with other teams to ensure that commonly used technical components created by Iponweb integrate well into the production infrastructure.

Maintain up-to-date documentation on processes and code utilised by the team.

Who You Are:

You have good Linux and Unix shell knowledge, particularly of Ubuntu/Debian-based Linux systems.

You have experience with cloud providers (AWS, GCP).

Programming skills are required, including familiarity with languages such as Python, Go or similar.

You have a good understanding of TCP/IP networking principles.

You have experience with monitoring and metric systems such as Zabbix, Prometheus, Graphite and similar.

You are able to set priorities and take responsibility.

You prefer to solve production problems through automation rather than manual operations.

You have experience with container technologies such as Kubernetes and Docker, and with the tools used in conjunction with them.

You have experience with modern configuration management tools (Puppet, Ansible).

You have experience managing databases such as MongoDB and PostgreSQL.

You have decent communication skills.

You have decent English skills.