Principal Site Reliability Developer - Oracle Exadata Cloud

Smart Summary (Powered by Roshi)
Hiring for Principal Site Reliability Developer at Oracle with a focus on security, resiliency, scale, and performance of mission-critical services. Responsibilities include designing, delivering, and improving service architecture. Role involves troubleshooting, automation, and performance optimization. Full-time, On-site position in Bengaluru, India.

Job Description

Detailed Description:

Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Hold authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add world-class capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for sophisticated or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.

A BS or MS in Computer Science, or equivalent. Identifies and implements complex solutions, drawing on knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Experience running large-scale customer-facing web services. Identifies and implements complex solutions, drawing on an understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies.

Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 8 years of experience running large-scale customer-facing web services.

Career Level - IC4


Responsibilities

In this role you will:

  • Build new monitoring/administration solutions including architecture, provisioning, configuration, deployment, and patching of network components
  • React to production deficiencies by continuously implementing automation, self-healing, and real-time monitoring for production systems
  • Conduct periodic on-call duties
  • Solve complex and difficult problems and build automation to prevent problem recurrence
  • Participate in cloud service capacity planning and demand forecasting, software performance analysis, and system tuning
  • Partner with distributed teams in prototyping new solutions
  • Stay informed of new technologies
  • Innovate

 

Required Qualifications:

  • 7+ years of software development experience in a distributed systems environment, preferably in the cloud
  • BS or MS degree in Computer Science, or equivalent experience
  • Proficient in scripting (Shell, Perl, and Python) and in programming languages (C, C++, Java, Python)
  • Strong experience with Continuous Integration and Continuous Deployment (CI/CD) using tools like Git/Bitbucket, TeamCity, Artifactory, Jira, Phabricator, and Octopus, or equivalent
  • Strong knowledge of different development environments (Git; Atlassian tools: Jira, Confluence, Bitbucket)
  • Good knowledge of containerization using Docker/Kubernetes
  • Experience with configuration management tools
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Systematic problem-solving approach, combined with a strong sense of ownership and drive
  • Possess a passion for technical leadership and mentoring
  • Possess strong verbal and written communication skills

 


Company

Oracle

Job Posted

4 months ago

Job Type

Full-time

WorkMode

On-site

Experience Level

8-12 Years

Category

Software Engineering

Locations

Bengaluru, Karnataka, India

Qualification

Bachelor or Master
