
Principal Data Engineer



Responsibilities:

  • Lead a team of data engineers specializing in data crawling, providing technical guidance, mentoring, and performance feedback.
  • Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and develop scalable data crawling solutions.
  • Design, develop, and maintain data crawling pipelines, ensuring efficient and timely acquisition of data from various sources (a minimal spider sketch follows this list).
  • Evaluate and implement appropriate data crawling technologies and tools to optimize the crawling process and ensure data quality and integrity.
  • Develop and enforce data engineering best practices, standards, and processes related to data crawling.
  • Identify and resolve issues related to data crawling, such as handling complex data structures, mitigating crawling bottlenecks, and addressing website-specific challenges.
  • Collaborate with stakeholders to define data engineering project requirements, timelines, and deliverables related to data crawling.
  • Perform data extraction, transformation, and loading (ETL) tasks to convert crawled data into usable formats for downstream analysis and processing.
  • Monitor data crawling performance and implement mechanisms to ensure the reliability and scalability of crawling pipelines.
  • Stay up to date with the latest trends and advancements in data crawling techniques, web scraping frameworks, and related technologies.
  • Apply hands-on experience in data modelling when structuring crawled data.
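
To give a concrete flavour of the crawling pipelines described above, here is a minimal Scrapy spider sketch. It is illustrative only: the spider name, target domain, and CSS selectors are hypothetical placeholders rather than details of this role.

    # Minimal Scrapy spider sketch; "example.com" and every selector below
    # are hypothetical placeholders.
    import scrapy

    class ProductSpider(scrapy.Spider):
        name = "product_spider"
        start_urls = ["https://example.com/products"]

        def parse(self, response):
            # Yield one record per product card on the listing page.
            for card in response.css("div.product"):
                yield {
                    "title": card.css("h2::text").get(),
                    "price": card.css("span.price::text").get(),
                    "url": response.urljoin(card.css("a::attr(href)").get()),
                }
            # Follow pagination so the crawl covers every listing page.
            next_page = response.css("a.next::attr(href)").get()
            if next_page:
                yield response.follow(next_page, callback=self.parse)

Running it with "scrapy runspider spider.py -o products.jl" writes one JSON record per line, while Scrapy's scheduler handles request de-duplication, politeness delays, and retries behind the scenes.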

 

Requirements:

  • Bachelor's or master's degree in computer science, data engineering, or a related field.
  • Proven experience (6-10 years) working as a data engineer, with a specialization in data crawling and web scraping.
  • Strong programming skills in languages such as Python, Java, or Scala, with expertise in web scraping frameworks like Scrapy, Beautiful Soup, or Selenium.
  • Solid understanding of web protocols (HTTP, HTTPS), HTML, CSS, and JavaScript to effectively crawl and extract data from websites.
  • Experience with distributed crawling frameworks such as Apache Nutch or Apache Storm is a plus.
  • Proficiency in SQL and database technologies (e.g., PostgreSQL, MySQL, or Oracle) for data storage and retrieval.
  • Familiarity with cloud platforms (e.g., AWS, Azure, or Google Cloud) and related data services for scalable and reliable data crawling.
  • Knowledge of data modeling, data warehousing, and ETL processes (a short ETL sketch follows this list).
  • Strong analytical and problem-solving skills, with a focus on data quality and accuracy.
  • Excellent leadership and team management abilities, with a proven track record of leading data engineering teams.
  • Effective communication and collaboration skills, with the ability to explain complex technical concepts to non-technical stakeholders.
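
As a rough sketch of the ETL step called out in the requirements, the snippet below normalises crawled records and loads them into PostgreSQL. The connection string, table name, and input file are hypothetical placeholders; it assumes the JSON-lines output of the spider sketch above.

    # Minimal ETL sketch: clean crawled records and load them into
    # PostgreSQL. The DSN, table name, and input file are hypothetical.
    import json
    import psycopg2

    def transform(record):
        # Normalise one raw crawled record into a typed row.
        return (
            record["title"].strip(),
            float(record["price"].lstrip("$")),
            record["url"],
        )

    def load(rows):
        conn = psycopg2.connect("dbname=crawl user=etl")
        with conn, conn.cursor() as cur:  # the connection commits on clean exit
            cur.executemany(
                "INSERT INTO products (title, price, url) VALUES (%s, %s, %s)",
                rows,
            )
        conn.close()

    if __name__ == "__main__":
        with open("products.jl") as f:  # one JSON record per line
            load([transform(json.loads(line)) for line in f])

In production this step would also deduplicate rows and validate types before loading, which is where the data-quality focus in the requirements comes in.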

Company: Ai Palette
Job Posted: 10 months ago
Job Type: Full-time
Work Mode: On-site
Experience Level: 3-7 Years
Category: Technology
Locations: Bengaluru, Karnataka, India
Qualification: Bachelor

