Lead Data Engineer

Summary

5-9 years of experience designing technological solutions to complex data problems, and developing and testing modular, reusable, efficient, and scalable code. Strong proficiency in Scala/PySpark is required, along with experience in distributed computing frameworks, cloud computing platforms, and Linux environments. Strong problem-solving skills are essential.

Job description

5-9 years of demonstrable experience designing technological solutions to complex data problems, and developing and testing modular, reusable, efficient, and scalable code to implement those solutions.

Ideally, this would include work on the following technologies:

Expert-level proficiency in Scala/PySpark is a strong advantage; experience in at least one of Java, Scala, or Python (Python preferred)
Strong understanding of and experience with distributed computing frameworks, particularly Apache Hadoop (YARN, MapReduce, HDFS) and associated technologies such as Hive, Sqoop, Avro, Flume, Oozie, ZooKeeper, and Impala
Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage; a short PySpark sketch follows below
Working knowledge of cloud computing platforms (AWS/Azure/GCP)
Experience working within a Linux computing environment and with command-line tools, including Shell/Python scripting to automate common tasks
Ability to work in a team in an agile setting, familiarity with JIRA, and a clear understanding of Git or another version control tool

In addition, the ideal candidate would have strong problem-solving skills and the ability and confidence to hack their way out of tight corners.
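
For illustration only, a minimal PySpark sketch of the kind of modular, reusable, testable transformation code the description calls for; the input/output paths and the column names (event_ts, event_type) are hypothetical, not taken from the posting.

# Minimal PySpark sketch. Paths and column names are hypothetical.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def daily_event_counts(events: DataFrame) -> DataFrame:
    """Aggregate raw events into per-day, per-type counts.

    Written as a pure DataFrame-in/DataFrame-out function so it can be
    unit-tested against a small local SparkSession.
    """
    return (
        events
        .withColumn("event_date", F.to_date("event_ts"))
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("n_events"))
    )

if __name__ == "__main__":
    spark = SparkSession.builder.appName("daily-event-counts").getOrCreate()
    events = spark.read.parquet("hdfs:///data/events/")  # hypothetical input
    daily_event_counts(events).write.mode("overwrite").parquet(
        "hdfs:///data/daily_event_counts/"  # hypothetical output
    )
    spark.stop()

Keeping the transformation as a pure function, separate from the I/O at the edges, is what makes Spark code modular and testable in practice.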

Must-have (hands-on) experience:

Scala or Python/PySpark expertise
Distributed computing frameworks (Hadoop ecosystem and Spark components)
Cloud computing platforms (AWS/Azure/GCP)
Linux environment, SQL, and shell scripting (a small scripting example follows this list)
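
As a small, hypothetical example of the shell/Python automation the last item refers to: a maintenance script that compresses application logs older than a week. The log directory and retention period are assumptions for illustration.

# Hypothetical maintenance script: compress application logs older than
# seven days. Directory and retention period are assumptions.
import gzip
import shutil
import time
from pathlib import Path

LOG_DIR = Path("/var/log/myapp")  # hypothetical log directory
MAX_AGE_DAYS = 7

def compress_old_logs(log_dir: Path, max_age_days: int) -> None:
    cutoff = time.time() - max_age_days * 86_400
    for log_file in log_dir.glob("*.log"):
        if log_file.stat().st_mtime < cutoff:
            # app.log -> app.log.gz, then remove the original.
            with log_file.open("rb") as src, gzip.open(f"{log_file}.gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            log_file.unlink()

if __name__ == "__main__":
    compress_old_logs(LOG_DIR, MAX_AGE_DAYS)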

 


Company: GlobalLogic
Job Posted: a year ago
Job Type: Full-time
Work Mode: Hybrid
Experience Level: 3-7 Years
Category: Software Engineering
Location: Bengaluru, Karnataka, India
Qualification: Bachelor's or Master's degree

