AppD Senior DevOps Engineer with Distributed Systems & Microservices (exp 10-14 Yrs)
Cisco
Bengaluru, Karnataka, India
DevOps Engineer (Data Platform) – Site Reliability – Web Services Engineering | Bengaluru

About AppDynamics

AppDynamics is an Application Intelligence company. With AppDynamics, enterprises have real-time insights into application performance, user performance, and business performance to move faster in an increasingly sophisticated, software-driven world. Our integrated suite of products is built on our innovative, full-stack observability platform that enables our customers to make faster decisions that enhance customer engagement and improve operational and business performance. AppDynamics is uniquely positioned to enable enterprises to accelerate their digital transformations by actively monitoring, analyzing, and optimizing complex application environments at scale, which has led to proven success and trust with the world’s most innovative companies.

About You

We are looking for a talented, motivated engineer to join our engineering team and help us continue to build and scale metrics platforms. In this role, you are expected to work on distributed systems that handle real-time ingestion and analytics at a massive scale. You are passionate about data engineering, scalability, availability, and performance. You also have:

- Minimum of a bachelor’s degree in CSE, EE, CSM, or a related technical discipline
- Minimum of a combined 8-11 years of Site Reliability, DevOps, and/or Software Development experience, ideally in a growth-stage environment
- Experience operating within, and supporting, complex SaaS production or revenue-critical 24/7 web services environments
- Experience developing and operationalizing system installations and upgrades (required)
- Experience with Unix/Linux system administration, especially in Red Hat Linux (CentOS)
- Experience running and administering services in AWS or other cloud platforms (Azure, GCP)
- Significant experience with one or more scripting/coding languages, ideally with Ansible, Terraform, or Python
- Experience with big data platform engineering
- Experience with scaling and operationalizing distributed data stores, file systems, and services (Kafka, Elasticsearch, HBase, Druid, etc.)
- Experience with virtualization and containerization platforms (Docker), container orchestration tools (Kubernetes), and aspects of Kubernetes to facilitate ease of delivery (Istio/Helm/Kube2Iam)
- Availability for occasional on-call after-hours support

Day-to-day responsibilities include:

- Building systems that ensure the reliable operation of distributed data stores
- Helping to build infrastructure to facilitate rapid service deployments
- Documenting findings and recommendations for improvement
- Maintaining and enhancing deployment tools and methodologies; leading in advancing our 'Infrastructure as code' architecture
- Improving the monitoring systems that support our service reliability
- Creating repeatable, efficient, and scalable artifact deployment pipelines
- Making recommendations to and interfacing with engineering to ensure 100% application uptime
- Monitoring the SaaS environment and working with QA, Developers, and Ops to identify and tackle problems
- Ensuring that failover mechanisms are in place and are working correctly
- Responding to and resolving technical emergencies

About the Role

The data platform powers AppDynamics’ Application Intelligence Platform. It handles billions of requests and massive amounts of metrics, events, and other data. It is real-time, very scalable, and highly available. It is the data source for performance monitoring and troubleshooting, policy evaluation, workflow automation, data visualization, and slice & dice data analysis.

Come join the Data Platform team and build the world’s next phenomenal Application Intelligence Platform. The engineers on this team are passionate about big data and analytics, and about infinitely scalable and highly available platforms. They understand the importance of data collected from every application and component in a software-defined business environment (web, mobile, server, infrastructure, and hardware) in enabling the most advanced and effective business and IT decision-making.