Databricks Data Engineer
Job Description:
We are looking for a highly experienced Data Engineer to join our product organization in a full-time role. This position focuses on Big Data systems design and data architecture, with special emphasis on Spark and Delta Lake / data lake technologies.
In this role, you will help design, develop, enhance, and maintain complex data pipeline and API products that manage business-critical operations and large-scale analytics. We are looking for applicants who learn new concepts quickly and bring a robust background in data engineering or software engineering.
Responsibilities
- Collaborate: Work collaboratively with other engineers to architect and implement complex systems with minimal oversight, while partnering with team leadership to identify how best to improve and expand platform capabilities.
- Build & Design: Design, develop, and maintain complex data pipeline products that support business-critical operations and large-scale analytics applications.
- Support the Team: Partner with analytics, data science, and engineering teams to understand and solve their unique data needs and challenges.
- Continuous Learning: Dedicate time to staying current with the latest developments in the space and embrace new concepts to keep up with fast-moving data engineering technology.
- Autonomy: Enjoy a role that offers strong independence and autonomy while contributing to the technical maturity of the organization.
Qualifications & Skills
- Experience: 5+ years of data engineering experience is essential, including 4+ years designing and building Databricks data pipelines. While Azure cloud experience is preferred, we are happy to consider experience with AWS, GCP, or other cloud platforms.
- Technical Stack:
- 4+ years of hands-on experience with Python, PySpark, or Spark SQL is key. Experience with Scala is a plus.
- 4+ years of experience with Big Data pipeline or DAG tools (such as Airflow, dbt, Data Factory, or similar).
- 4+ years of Spark experience, especially with Databricks Spark and Delta Lake.
- 4+ years of hands-on experience implementing Big Data solutions in a cloud ecosystem, including Data/Delta Lakes.
- 5+ years of relevant software development experience, including Python 3.x, Django/Flask frameworks, FastAPI (or another industry-standard API framework), and relational databases (SQL/ORM).
- Additional Skills (Great to have):
- Experience with Microsoft Fabric, specifically PySpark on Fabric and Fabric Pipelines.
- Experience with conceptual, logical, and/or physical database designs.
- Strong SQL experience, specifically writing complex, highly optimized queries across large volumes of data.
- Strong data modeling/profiling capabilities using Kimball/star schema methodology as well as medallion architecture.
- Professional experience with Kafka, Azure Event Hubs, or other event-streaming technologies.
- Familiarity with database deployment pipelines (e.g., dacpac or similar).
- Experience with unit testing or data quality frameworks.
Beware of scams
Our recruiting team may communicate with candidates regarding your application and interview requests via our @hitachisolutions.com domain email address and/or via our SmartRecruiters (applicant tracking system) [email protected] domain email address.
All offers will originate from our @hitachisolutions.com domain email address. If you receive an offer or information from someone purporting to be an employee of Hitachi Solutions from any other domain, it may not be legitimate.