PySpark - Palantir Core - Data Engineer
Role & responsibilities
- Min. 5 years in a Software Engineering / Backend Engineering role, in at least one of Python, Java, Scala, or Rust (current focus is Python)
- Practice writing high-quality code (unit testing, integration testing, etc.) and working with code metrics
- Ability to quickly understand an unfamiliar repository, ask the right technical questions about the codebase and code quality, and treat CI/CD as the default way of working (automated tasks instead of manual ones)
- Min. 5 years in a data-focused role such as Data Engineer or Integrations Engineer
- Strong understanding of data pipelines built with Apache Spark and orchestrated with Apache Airflow or similar tools
- Intermediate understanding of Databricks or other modern big-data platforms and their internals (e.g. data catalogs, compute engines, SQL engines, observability layers)
- Strong focus on delivering high-quality data pipelines from a data-product delivery perspective
- Practical experience with data streams and petabyte-scale datasets
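To illustrate the testing expectations above, here is a minimal sketch of a pipeline transformation kept as a pure, unit-testable function. The function name and the sample schema (`user_id`, `amount`) are assumptions for illustration only; in a real job the same logic would be expressed with PySpark DataFrame operations.

```python
# Hypothetical sketch: a pipeline transformation written as a pure
# function so it can be unit tested without a Spark cluster. The
# record schema (user_id, amount) is assumed for illustration.
from typing import Iterable


def total_spend_per_user(events: Iterable[dict]) -> dict:
    """Aggregate spend per user, skipping malformed records."""
    totals: dict = {}
    for event in events:
        user = event.get("user_id")
        amount = event.get("amount")
        if user is None or not isinstance(amount, (int, float)):
            continue  # drop malformed records rather than failing the job
        totals[user] = totals.get(user, 0) + amount
    return totals


# Unit-test style check, matching the posting's emphasis on testing:
events = [
    {"user_id": "a", "amount": 10},
    {"user_id": "b", "amount": 5},
    {"user_id": "a", "amount": 2.5},
    {"user_id": None, "amount": 99},  # malformed, should be skipped
]
assert total_spend_per_user(events) == {"a": 12.5, "b": 5}
```

Keeping transformation logic in plain functions like this is one common way to make Spark jobs testable in CI without spinning up a cluster.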
Preferred candidate profile
- Min. 1 year of experience with data integrations (data pipelines, connections to different types of data sources and sinks)
- Basic AWS knowledge (Lambda, SQS, CloudWatch, Powertools)
- Soft skills for discussing possible solutions, limits, and properties of technical integrations and different data architectures with vendors/partners
- Ability to make strong technical arguments about technologies in the data-integration and big-data areas (with a focus on open-source tools)
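As a hedged sketch of the AWS Lambda/SQS items above: the handler below processes a standard SQS-triggered Lambda event locally, with no AWS calls. The message fields and return shape are assumptions for illustration.

```python
# Hypothetical sketch of an AWS Lambda handler triggered by SQS.
# The event shape follows the standard SQS -> Lambda event format
# (a "Records" list whose items carry a JSON string in "body");
# the routing logic and return shape are assumed for illustration.
import json


def handler(event: dict, context=None) -> dict:
    """Process each SQS record; return counts of processed/failed."""
    processed, failed = 0, 0
    for record in event.get("Records", []):
        try:
            body = json.loads(record["body"])
            # ... real integration logic would route `body` to a sink here
            processed += 1
        except (KeyError, json.JSONDecodeError):
            failed += 1
    return {"processed": processed, "failed": failed}


# Local usage with a minimal fake SQS event:
fake_event = {"Records": [{"body": json.dumps({"id": 1})},
                          {"body": "not json"}]}
assert handler(fake_event) == {"processed": 1, "failed": 1}
```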