Job description
The ideal candidate will be comfortable in a dynamic environment that is constantly trying new technologies and approaches to address business challenges. The person must be open to learning and willing to teach the team. Importantly, this individual is creative, wants to have fun, and works well with colleagues.
Responsibilities
· Implement logical and physical data models according to business needs.
· Select and integrate the Big Data tools and frameworks required to provide the requested capabilities.
· Implement ETL processes.
· Implement a testing framework for the ETL processes.
· Tune performance across the Spark, Hive and Hadoop stack.
Qualifications (Must have):
· 7+ years’ experience.
· Strong coding skills in Python are necessary.
· Experience in updating and optimising the local and metadata models.
· Ability to evaluate the implemented data system for variances, discrepancies and efficiencies.
· Ability to troubleshoot and optimise existing data flows, models and processing jobs through modularisation.
· Ability to explore ways to enhance data quality and reliability.
· Strong skills in writing UDFs (user-defined functions).
· Prior experience implementing a data quality (DQ) framework on Spark would be an added advantage.
· Good programming skills in Python/Scala/PySpark are a must.
· Strong knowledge of the Spark framework.
· Understanding of the dynamics of a Hadoop cluster ecosystem and its included services, such as Spark jobs.
· Enthusiasm for solving ongoing issues with operating the cluster.
· Good knowledge of Big Data querying tools, such as Hive and Spark SQL.
· Experience with Spark is required; coding in PySpark and Scala is beneficial.
· Experience with integration of data from multiple data sources.
· Knowledge of various ETL techniques and frameworks, such as AWS Glue and AWS Lambda, is preferred.
Good to have:
· Experience with Big Data ML toolkits, such as scikit-learn, MLlib, Spark ML, or H2O, is a big plus.
· Knowledge of Salesforce (SFDC) and Oracle Eloqua data models is a plus.
· Proficiency in one or more scripting/programming languages such as Scala, R or Java.
· Front-end development experience with JavaScript frameworks is advantageous.