Job description
In one sentence
Cloud Data Engineer with Data Analyst and Data Modelling Experience
What will your job look like?
We are looking for a highly skilled Cloud Data Engineer with an extensive track record in data engineering. The ideal candidate will be central to designing, deploying, and maintaining our cloud-based data pipeline infrastructure. This role demands strong proficiency in cloud computing solutions and data engineering methods, along with a thorough understanding of data analysis and data migration processes.
All you need is...
Site Reliability Engineer.
What you’ll own:
• Monitor & analyse production systems for reliability
• Scale production systems up or out to keep performance stable while remaining cost-efficient
• Respond to production incidents and, where needed, perform triage, escalate to the relevant team, and implement mitigations
• Collaborate with developers to set up monitoring, alerting and scaling capabilities as part of the product delivery life cycle
• Increase system reliability and reduce manual intervention through R&D of automated processes
• Share knowledge by writing documentation and running sessions with team members
• Support key business peak periods, such as Saturdays, by being on call or in the office, depending on need and rotation
• Communicate with a variety of stakeholders in a timely and accurate manner
• Automation, quality and effectiveness of the monitors, alerts, and autoscaling configured in production environments
• Availability and reliability target of 99.9% for production systems
• Incident response times within the defined SLA
• Flexibility at work and willingness to support team members
Skills required:
• Strong knowledge of Linux/Unix systems and command line tools.
• Hands-on scripting in languages like Python, Shell, Perl, PowerShell, JavaScript, SQL, etc.
• Experience with configuration management tools like Ansible, Puppet, or Chef.
• Expert knowledge of cloud platforms such as AWS, Azure, or Google Cloud.
• Understanding of networking principles and protocols (TCP/IP, HTTP, DNS, etc.).
• Knowledge of containerization technologies (Docker, Kubernetes) and orchestration tools.
• Expertise in monitoring and logging tools such as Prometheus, Grafana, the ELK stack, Splunk, Datadog, Azure Application Insights, PerfMon, and Redgate.
• Strong problem-solving and troubleshooting skills, with the ability to analyze and resolve complex technical issues.
• Excellent communication and collaboration skills to work effectively with cross-functional teams.
• Strong attention to detail and ability to work in a fast-paced, dynamic environment.
• 3-5 years’ experience in an SRE or similar role.
Why you will love this job: