The Job logo

What

Where

Data Engineer

ApplyJoin for More Updates

You must Sign In before continuing to the company website to apply.

Smart SummaryPowered by Roshi
Implement data pipelines that are efficient, scalable, and maintainable. Implement best practices including source control, code reviews, validation and testing. Act as mentor and help junior data engineers. Experience in designing big data processing pipelines. Familiarity with tools like Hadoop, Spark, Kafka, and databases like PostgresSQL, MYSQL. Strong programming skills in Python, Java, Scala. Familiarity with data visualization tools like Tableau, Power BI. Experience in working with AWS, Azure, or GCP. Knowledge of CI/CD, machine learning, deep learning, and optimization in automotive domain.

Job/Position Summary

 

Responsibilities:

  • Implement data pipelines that meet design and are efficient, scalable, and maintainable
  • Implement best practices including proper use of source control, participation in code reviews, data validation and testing
  • Timely deliveries while working on projects
  • Act as advisor/mentor and helps junior data engineers in their deliverables

Must Have Skills:

  • Should have experience of at least 4+ years with Data Engineering
  • Strong experience of design, implementation and fine-tuning big data processing pipelines in production environment
  • Experience with big tools like Hadoop, Spark, Kafka, Hive, Databricks
  • Experience in programming at least one of with Python, Java, Scala, Shell Script
  • Experience with relational SQL and NO SQL databases like PostgresSQL, MYSQL, Cassandra etc.
  • Experience with any data visualization tool (Plotly, Tableau, Power BI, Google Data Studio, Quick sight etc.)

Good To Have Skills:

  • Should have Basic Knowledge of CI/CD Pipeline
  • Experience in working on at least one Cloud (AWS or Azure or GCP)
  • For AWS: - Experience with AWS Cloud services like EC2, S3, EMR, RDS, Athena, Glue, Lambda, EMR
  • For Azure: -Experience with Azure Cloud services like Azure Blob/Data Lake GEN2, Delta Lake, Databricks, Azure SQL, Azure DevOps, Azure Data Factory, Power BI
  • For GCP: - Experience with GCP Cloud services Big Query, Cloud Storage bucket, DataProc, Dataflow, Pub Sub, Cloud Function, Data Studio
  • Sound familiarity in Versioning tools (Git, SVN etc.)
  • Experience Mentoring students is desirable
  • Knowledge of latest developments in Machine Learning, Deep Learning, Optimization in Automotive domain.
  • Open minded approach to explore multiple algorithms to design optimal solution.
  • History of contribution to articles/blogs/whitepapers etc. in Analytics
  • History of contribution to Open Source.

 

Requirement

ESSENTIAL SKILLS /COMPETENCIES

  • Data Engineering
  • Hadoop
  • Kafka
  • CI/CD
  • Cloud
Set alert for similar jobsData Engineer role in Bengaluru, India
KPIT Logo

Company

KPIT

Job Posted

a year ago

Job Type

Full-time

WorkMode

On-site

Experience Level

3-7 Years

Category

Data & Analytics

Locations

Bengaluru, Karnataka, India

Qualification

Bachelor

Applicants

Be an early applicant

Related Jobs

Caterpillar Inc. Logo

Data Engineer

Caterpillar Inc.

Bengaluru, Karnataka, India

Posted: 3 months ago

Job Description: Your Work Shapes the World at Caterpillar Inc. When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other.  We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it. Your Impact Shapes the World at Caterpillar Inc When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.   Job Summary We are seeking a skilled Data Scientist (Data Engineer) to join our Pricing Analytics Team . The incumbent would be responsible for building scalable, high-performance infrastructure and data driven analytics applications that provide actionable insights. The position will be part of Caterpillar’s fast-moving Global Parts Pricing organization, driving action and tackling challenges and problems that are critical to realizing superior business outcomes. The data engineer will work with data scientists, business intelligence analysts, and others as part of a team that assembles large, complex data sets, pipelines, apps, and data infrastructure that provide competitive advantage.   The preference for this role is to be based out of Bangalore – Whitefield office   What you will do Job Roles and Responsibilities Build infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using AWS tools  Design, develop, and maintain performant and scalable applications   Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability   Perform debugging, troubleshooting, modifications and testing of integration solutions  Operationalize developed jobs and processes  Create/Maintain database infrastructure to process data at scale  Create solutions and methods to monitor systems and processes  Automate code testing and pipelines  Engage directly with business partners to participate in design and development of data integration/transformation solutions.  Engage and actively seek industry best practice for continuous improvement  What you will have    BS in Computer Science, Data Science, Computer Engineering, or related quantitative field  Development experience, preferably using Python and/or PySpark  Understanding of data structures, algorithms, profiling & optimization   Understanding of SQL, ETL/ELT design, and data modeling techniques  Great verbal and written communication skills to collaborate cross functionally and drive action  Thrive in a fast-paced environment that delivers results . Passion for acquiring, analyzing, and transforming data to generate insights  Strong analytical ability, judgment and problem analysis techniques  This position may require 10% travel.  Shift Timing-01:00PM -10:00PM IST(EMEA Shift)  Desired Skills:  MS in Computer Science, Data Science, Computer Engineering, or related quantitative field  Experience with AWS cloud services (e.g. EMR, EC2, Lambda, Glue, CloudFormation, CloudWatch/EventBridge, ECR, ECS, Athena, Fargate, etc.)  Experience administering and/or developing in Snowflake  Masters Degree in Computer Science, Data Science, Computer Engineering, or related quantitative field  Strong background working with version control systems (Git, etc.)  Experience managing continuous integrations systems, Azure pipelines is a plus  Advanced level of experience with programming, data structures and algorithms   Working knowledge of Agile Software development methodology  Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement   A successful history of manipulating, processing and extracting value from large disconnected datasets  Skills desired: Business Statistics : Knowledge of the statistical tools, processes, and practices to describe business results in measurable scales; ability to use statistical tools and processes to assist in making business decisions. Level Working Knowledge: • Explains the basic decision process associated with specific statistics. • Works with basic statistical functions on a spreadsheet or a calculator. • Explains reasons for common statistical errors, misinterpretations, and misrepresentations. • Describes characteristics of sample size, normal distributions, and standard deviation. • Generates and interprets basic statistical data. Accuracy and Attention to Detail : Understanding the necessity and value of accuracy; ability to complete tasks with high levels of precision. Level Working Knowledge: • Accurately gauges the impact and cost of errors, omissions, and oversights. • Utilizes specific approaches and tools for checking and cross-checking outputs. • Processes limited amounts of detailed information with good accuracy. • Learns from mistakes and applies lessons learned. • Develops and uses checklists to ensure that information goes out error-free. Analytical Thinking : Knowledge of techniques and tools that promote effective analysis; ability to determine the root cause of organizational problems and create alternative solutions that resolve these problems. Level Working Knowledge: • Approaches a situation or problem by defining the problem or issue and determining its significance. • Makes a systematic comparison of two or more alternative solutions. • Uses flow charts, Pareto charts, fish diagrams, etc. to disclose meaningful data patterns. • Identifies the major forces, events and people impacting and impacted by the situation at hand. • Uses logic and intuition to make inferences about the meaning of the data and arrive at conclusions. What you will get: Work Life Harmony Earned and medical leave. Flexible work arrangements Relocation assistance Holistic Development Personal and professional development through Caterpillar ‘s employee resource groups across the globe Career developments opportunities with global prospects Health and Wellness Medical coverage -Medical, life and personal accident coverage Employee mental wellness assistance program   Financial Wellness Employee investment plan Pay for performance -Annual incentive Bonus plan.

Amazon Logo

Senior Data Engineer

Amazon

Bengaluru, Karnataka, India

Posted: a year ago

DESCRIPTION North America Consumer is one of Amazon’s largest businesses, including North America retail and third party sellers, as well as an array of innovative new programs including Alexa Shopping and Amazon Business. The NA Consumer Finance organization is looking for an outstanding Data Engineer to build a robust and scalable data platform to support analytical and data capability for the team. As a Data Engineer, you will be working in one of the world's largest and most complex data warehouse environments using the latest suite of AWS toolsets. You should have deep expertise in the design, creation, management, and business use of extremely large datasets. You should be expert at designing, implementing, and maintaining stable, scalable, low cost data infrastructure. In this role, you will build datasets that analysts and BIE use to generate actionable insights. You should be passionate about working with huge data sets and someone who loves to bring datasets together to answer business questions and drive change. The successful candidate will be an expert with SQL, ETL (and general data wrangling) and have exemplary communication skills. The candidate will need to be a self-starter, comfortable with ambiguity in a fast-paced and ever-changing environment, and able to think big while paying careful attention to detail. Responsibilities You know and love working with data engineering tools, can model multidimensional datasets, and can understand how to make appropriate data trade-offs. You will also have the opportunity to display your skills in the following areas: · Design, implement, and support a platform providing access to large datasets · Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL · Own, maintain and scale a Redshift cluster and manage other AWS Resources · Model data and metadata for ad hoc and pre-built reporting · Collaborate with Business Intelligence Engineers and Analysts to deliver high quality data architecture and pipelines · Recognize and adopt best practices in engineering and operational excellence: data integrity, validation, automation and documentation · Create coherent Logical Data Models that drive physical design We are open to hiring candidates to work out of one of the following locations: Bangalore, KA, IND BASIC QUALIFICATIONS - 5+ years of data engineering experience - Experience with data modeling, warehousing and building ETL pipelines - Experience with SQL - Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS - Experience mentoring team members on best practices PREFERRED QUALIFICATIONS - Experience with big data technologies such as: Hadoop, Hive, Spark, EMR - Experience operating large data warehouses

Capgemini Logo

Data Engineer

Capgemini

Bengaluru, Karnataka, India

Posted: 3 months ago

At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same. Job Description:                Expert knowledge in Python                Expert knowledge in popular machine learning libraries and frameworks, such as TensorFlow, Keras, scikit-learn.                Proficient understanding and application of clustering algorithms (e.g., K-means, hierarchical clustering) for grouping similar data points.                Expertise in classification algorithms (e.g., decision trees, support vector machines, random forests) for tasks such as image recognition                Natural language Processing and recommendation systems.                Proficiency in working with databases, both relational and non-relational like MySQL with experience in designing database schemas                And optimizing queries for Efficient data retrieval.                Strong knowledge in areas like Object Oriented Analysis and Design, Multi-threading, Multi process handling and Memory management.                Good knowledge model evaluation metrics and techniques.                Experience in deploying machine learning models to production environments.                Currently working in an Agile scrum team and proficient in using version control systems (e.g., Git) for collaborative development.  Primary Skill:              Excellent in Python Coding              Excellent in Communication Skills              Good in Data modelling, popular machine learning libraries and framework