Looking for a candidate with expertise in PySpark, Spark SQL, and working with DataFrames. Must have knowledge of advanced data transformations and a good understanding of SQL. The candidate should be experienced in error handling, logging, and monitoring. Knowledge of Unix and a scheduling tool is necessary, and performance tuning and generic process development skills are required. Familiarity with the banking domain is preferred; experience in ETL estimation and Teradata BTEQ is a plus.
Job Description
Competencies Required (Technical/Behavioral Competency)
Essentials:
- PySpark and Spark SQL
- Working with DataFrames using different APIs (see the sketch after this list)
- Spark functions
- Advanced data transformations
- Good knowledge of SQL
- Applying UDFs
- Error handling, logging, and monitoring
- Unix and a scheduling tool
- Performance tuning and generic process development
- Banking domain knowledge
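For illustration only, a minimal PySpark sketch touching several of the skills above (DataFrame APIs, Spark SQL, a UDF, and error handling with logging). All table names, columns, and paths are hypothetical; this is a sketch of the kind of work involved, not a production job.

```python
import logging

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("txn_job")

spark = SparkSession.builder.appName("txn_job").getOrCreate()

# Hypothetical UDF: mask all but the last four characters of an account number.
@F.udf(returnType=StringType())
def mask_account(acct):
    return None if acct is None else "*" * (len(acct) - 4) + acct[-4:]

try:
    # DataFrame API: read, transform, aggregate (source path is hypothetical).
    txns = spark.read.parquet("/data/txns")
    daily = (
        txns.withColumn("acct_masked", mask_account(F.col("account_no")))
            .groupBy("txn_date")
            .agg(F.sum("amount").alias("total_amount"))
    )

    # Spark SQL over the same data via a temporary view.
    daily.createOrReplaceTempView("daily_totals")
    spark.sql("SELECT * FROM daily_totals WHERE total_amount > 0").show()
except Exception:
    # Error handling and logging, as listed above; re-raise so the
    # scheduling tool sees the failure.
    log.exception("Job failed")
    raise
finally:
    spark.stop()
```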
Desirable:
- ETL Estimation
- Teradata BTEQ