GCP Data Engineer
Job Summary
We are looking for a skilled GCP Data Engineer to design, build, and optimize scalable data pipelines and analytics solutions on Google Cloud Platform. The ideal candidate has strong experience with SQL, Python, and Apache Spark, along with hands-on expertise in GCP data services.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines on Google Cloud Platform (GCP)
- Build and optimize ETL/ELT workflows using Python, SQL, and Apache Spark
- Work with large structured and unstructured datasets for data ingestion, processing, and analytics
- Develop and manage data models for BigQuery and ensure performance optimization
- Integrate data from multiple sources including databases, APIs, and cloud storage
- Collaborate with data scientists, analysts, and application teams to support data needs
- Implement data quality checks, validation, and monitoring
- Ensure data security, governance, and best practices across pipelines
- Support performance tuning, troubleshooting, and production issue resolution
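To give a flavor of the data-quality work described above, a validation step might be sketched as follows. This is a minimal, illustrative example in plain Python; the record fields (`id`, `amount`) and rules are hypothetical, not a real schema from this role:

```python
# Minimal data-quality validation sketch (illustrative only).
# Field names and rules below are hypothetical examples.

def validate_records(records):
    """Split records into valid rows and (index, problems) error tuples."""
    valid, errors = [], []
    for i, row in enumerate(records):
        problems = []
        if not row.get("id"):
            problems.append("missing id")
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append("amount must be a non-negative number")
        if problems:
            errors.append((i, problems))
        else:
            valid.append(row)
    return valid, errors

rows = [
    {"id": "a1", "amount": 10.5},
    {"id": "", "amount": 3.0},   # fails: missing id
    {"id": "a3", "amount": -1},  # fails: negative amount
]
valid, errors = validate_records(rows)
```

In a production pipeline, checks like these would typically run as a pipeline stage (e.g., in Dataflow or a Composer task) with the error rows routed to a quarantine table for monitoring.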
Required Skills
- Strong experience with Google Cloud Platform (GCP) services such as:
  - BigQuery
  - Cloud Storage
  - Dataflow
  - Dataproc
- Excellent knowledge of SQL for data querying and optimization
- Strong programming experience in Python
- Hands-on experience with Apache Spark (PySpark preferred)
- Experience in building batch and/or streaming data pipelines
- Understanding of data warehousing concepts and schema design
- Familiarity with CI/CD pipelines and version control systems (Git)
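As a flavor of the day-to-day SQL querying this role involves, a typical warehouse-style aggregation might look like the following. It is shown against an in-memory SQLite database purely for illustration; the `orders` table and its columns are made up, and real queries here would run on BigQuery:

```python
import sqlite3

# Illustrative only: a tiny in-memory table standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 120.0), ("bob", 80.0), ("alice", 50.0)],
)

# Typical aggregation: total spend per customer, largest first.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
"""
totals = dict(conn.execute(query).fetchall())
```

On BigQuery, the optimization side of this skill means writing such queries against partitioned and clustered tables so scans stay narrow and cost stays predictable.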
Good to Have
- Experience with Airflow / Cloud Composer
- Exposure to streaming technologies (Kafka, Pub/Sub)
- Knowledge of data governance, metadata management, and security
- Experience working in Agile/Scrum environments
- Google Cloud certification (Professional Data Engineer) is a plus