Senior Software Engineer - Big Data
Freshworks
Chennai, Tamil Nadu, India
Job Description

The primary responsibilities of the role include:
Design and develop a real-time data pipeline for data ingestion for real-time business use cases.
Develop complex and efficient functions to transform raw data sources into powerful, reliable components of our data lake.
Grow our analytics capabilities with faster, more reliable data pipelines and better tools, handling petabytes of data every day.
Brainstorm and create new platform features that help make data available to cluster users in all shapes and forms, with low latency and horizontal scalability.
Make changes to our data platform, refactoring or redesigning as needed and diagnosing problems across the entire technical stack.
Think outside the box to implement solutions with new components and emerging technologies in AWS and open source for the successful execution of various projects.
Optimize and improve existing features or data processes for performance and stability.
Write unit tests and support continuous integration.
Be obsessed with quality and ensure minimal production downtime.
Mentor peers, share information and knowledge, and help build a great team.
Monitor job performance, file system and disk-space management, cluster and database connectivity, log files, and backup and security management, and troubleshoot user issues.
Collaborate with cross-functional and business teams.

Qualifications

We are looking for a candidate with proven experience in a Big Data Engineering role:
Hands-on expertise in Apache Spark™ (Scala or PySpark preferred) and associated performance optimization.
Advanced working knowledge of SQL and working familiarity with a variety of databases.
Working knowledge of various API interfaces for bulk or stream-based data extraction and load processes is a must.
Experience building and deploying a range of data engineering pipelines into production, including using automation best practices for CI/CD.
Experience performing root cause analysis on all data and processes to answer specific questions and identify opportunities for improvement.
Build processes supporting data transformation, data structures, metadata, dependency and workload management.
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Working knowledge of Kafka, Spark, stream processing, and scalable 'big data' data stores.
Experience with cloud solutions on top of AWS.
Good to have: MLOps knowledge.

Preferred Experience: 3-5 Years