Principal Site Reliability Engineer

Smart Summary
Join our team and apply your experience in designing, implementing, debugging, and launching commercial software products or web services. We are looking for someone with a strong SRE background in the cloud, particularly Azure. As part of our team, you will collaborate with customers, define SLOs and SLIs, develop automated solutions, and optimize system performance in a customer-centric and inclusive work environment.

Job description 

Qualifications

• 10+ years of experience designing, implementing, debugging, and launching commercial software products or web services.
• 3+ years of SRE experience in the cloud: Azure (or AWS/GCP).
• Degree: Bachelor’s or master’s degree in computer engineering (or equivalent)
• Customer Obsession: Passion for customers and focus on delivering the right customer experience.
• Growth Mindset: Openness and ability to learn new skills and technologies in a fast-paced environment.
• Excellent Communication: Must be able to empathize with customers and convey confidence, explain highly technical issues to varied audiences, and prioritize and advocate for customers' needs through the proper channels. Takes ownership and works towards a resolution.
• Technical Skills:
o Proven expertise in implementing and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for cloud customers. 
o Extensive experience with SLO monitoring tools and platforms
o Advanced certifications in SRE or related fields.
o Experience in observability and SRE: OpenTelemetry, Prometheus, Grafana, Dynatrace, Datadog, Azure Monitor, AI, ML
#AZCXP #AZCXPACE #ACES500 #AZCXPSUPPORT, #AzureCXP

Responsibilities

• Collaborate with customers to jointly define and establish SLOs and SLIs that align with their business goals and expectations.
• Instrument code to measure SLOs and develop solutions to detect SLO breaches.
• Develop automated solutions and troubleshooting guides to remediate or mitigate SLO breaches.
• Collaborate closely with service engineering teams to develop solutions for correlating customer-defined SLOs with relevant platform SLOs and signals, in order to pinpoint, address, and resolve customer-impacting issues.
• Ensure customer-centric SLOs are consistently exceeded through cross-functional collaboration. 
• Analyze SLO data for trends, improvements, and reliability risks, proposing remediation plans.
• Proactively engage customers on SLO performance, addressing concerns and offering insights.
• Lead optimization efforts for system performance, scalability, and efficiency to exceed SLOs.
• Develop and maintain documentation related to customer-specific SLOs, SLIs, and monitoring processes.
• Exemplify Microsoft culture and foster a diverse, inclusive work environment.

Company: Microsoft
Job Posted: 10 months ago
Job Type: Full-time
Work Mode: Hybrid
Experience Level: 8-12 Years
Category: Software Engineering
Locations: Hyderabad, Telangana, India
Qualification: Bachelor or Master

Related Jobs

C Modeling Principal Engineer

Microsoft

Hyderabad, Telangana, India

Posted: 9 months ago

Join our team at Microsoft as a C Modeling Principal Engineer. This is a full-time opportunity for individuals with a Bachelor's or Master's degree in Electrical, Computer Engineering, or Computer Science and at least 10 years of experience. You will be responsible for IP block hardware model development in C/C++ and test platform development. Preferred qualifications include machine learning/neural network experience and silicon architecture/microarchitecture experience. This position is located in Hyderabad, Telangana, India. Apply now and be a part of our diverse and inclusive workplace.

Site Reliability Engineer III

JPMorgan Chase & Co.

Hyderabad, Telangana, India

Posted: a year ago

JOB DESCRIPTION

There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Consumer and Community Banking of Infrastructure and Production Management, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure to independently decompose and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform.

Job responsibilities

• Guides and assists others in the areas of building appropriate level designs and gaining consensus from peers where appropriate
• Collaborates with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
• Collaborates with other software engineers and teams to design, develop, test, and implement availability, reliability, scalability, and solutions in their applications
• Implements infrastructure, configuration, and network as code for the applications and platforms in your remit
• Understands service level indicators and utilizes service level objectives to proactively resolve issues before they impact customers
• Develops, tests, and debugs automated tasks (Apps, Systems, Infrastructure)
• Troubleshoots priority incidents and facilitates blameless post-mortems

Required qualifications, capabilities, and skills

• Minimum 7 years of overall experience in the IT industry
• Formal training or certification on site reliability engineering concepts and 3+ years applied experience
• Proficient in at least one programming language such as Python, Java/Spring Boot
• Proficient in site reliability culture and principles and familiarity with how to implement site reliability within an application or platform
• Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., Cloud, artificial intelligence, Android, etc.)
• Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
• Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
• Familiarity with container and container orchestration such as ECS, Kubernetes, and Docker

Preferred qualifications, capabilities, and skills

• Proficiency in one or more technology domains, may be a cross-domain expert able to solve complex and mission critical problems within a business or across the firm
• Adept in the development of automated tools, systems, and services in multiple technology domains
• Working knowledge of infrastructure components (e.g., routers, load balancers, cloud products, container systems, compute, storage, and networks)
• Excellent debugging and troubleshooting skills

ABOUT US

JPMorgan Chase & Co., one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as any mental health or physical disability needs.

ABOUT THE TEAM

Our Consumer & Community Banking division serves our Chase customers through a range of financial services, including personal banking, credit cards, mortgages, auto financing, investment advice, small business loans and payment processing. We’re proud to lead the U.S. in credit card sales and deposit growth and have the most-used digital solutions – all while ranking first in customer satisfaction.

Principal Product Security Engineer

Microsoft

Hyderabad, Telangana, India

Posted: 10 months ago

Job description

Qualifications

Required/Minimum Qualifications
• 7+ years experience in software development lifecycle, large scale computing, modeling, cyber security, anomaly detection OR Bachelor's Degree in Statistics, Mathematics, Computer Science, Risk Management, Cyber Security, or related field OR equivalent experience.
• Experience with code scanning tools such as Veracode, SonarQube, Checkmarx, Netsparker, etc.
• Software engineering SDLC experience
• Experience with at least one programming language.
• An understanding of architectural or security architecture principles

Additional or Preferred Qualifications
• Java or C# experience
• Leadership experience
• Experience with Container Security
• Security certification – Kubernetes, Docker, AZ-500
• Knowledge of objective frameworks – e.g. NIST 800-53, ISO 27002, HITRUST etc.
#MSRC #DSR #NuanceSecurity #MSFTSecurity

Responsibilities

• Support the Nuance Global Security Systems Security Engineering team
• Be able to lead software vulnerability triage engagements.
• Be able to lead security architecture review engagements.
• Be able to lead Threat Modeling engagements
• Document security standards, as well as reports
• Communicate/document implementation approaches and patterns for standards-based information security objectives (NIST 800-53, ISO 27002 etc.)
• Coach and support junior personnel.
• Coordinate with other Global Security Service teams to ensure operational consistency and effectiveness