#vacancy #kazakhstan #remote #workITkz #analyst
Position: Data Engineer / Junior–Middle
Company: Data Never Lies
https://data-never-lies.com/
Location: Kazakhstan
Employment: remote
Pay: $20-30 per hour
Data Never Lies is a UK-based team that believes in the power of facts over guesswork. For years, we’ve been helping businesses dig into the darkest corners of their data (and yes, data does have dark corners, usually where no one bothers to clean up) to turn insights into action.
Job description:
We are looking for a Junior/Middle Data Engineer with a strong focus on Python, SQL, and building data pipelines.
The main responsibility of this role is connecting to various data sources (REST APIs, databases, files, and third-party platforms), extracting and processing the data, and loading it into a data warehouse for further analytics and BI reporting.
We are looking for someone who understands that a data pipeline is not just a script, but a stable process with logging, error handling, retries, monitoring, and data quality checks.
Location: Kazakhstan/Remote
Required Skills & Experience:
Strong knowledge of Python and practical experience using it in data engineering tasks.
Experience building data pipelines for loading, processing, and transforming data.
Experience working with various data sources: REST APIs, databases, CSV/Excel/JSON files, cloud storage, and third-party platforms.
Hands-on experience integrating with REST APIs: authentication, pagination, rate limits, retries, timeout handling, and error handling.
Understanding of how to build fault-tolerant pipelines.
Experience setting up incremental data loading and handling partial loads.
Ability to work with JSON and semi-structured data.
Strong SQL knowledge: JOINs, CTEs, aggregates, and window functions.
Experience loading data into databases or data warehouses such as PostgreSQL, BigQuery, Snowflake, Redshift, MS SQL, or similar systems.
Understanding of ETL/ELT approaches.
Experience with logging, monitoring, and basic troubleshooting of pipelines.
Experience working with Git.
Nice to Have:
Experience working with dbt: models, sources, tests, documentation, incremental models.
Experience with Spark / PySpark.
Experience using orchestration tools such as Airflow, Prefect, Dagster, or similar.
Experience implementing data quality checks: freshness, duplicates, completeness, consistency.
Experience working with cloud storage: AWS S3, Google Cloud Storage, Azure Blob Storage.
Experience with Docker.
Understanding of dimensional modelling principles: fact/dimension tables, star schema, data marts.
Experience optimizing SQL queries and pipelines.
Bonus Points:
Experience working with BI tools such as Power BI, Tableau, Looker, QuickSight, Domo, or similar.
Experience preparing datasets for BI reporting and analytical data marts.
Basic understanding of cloud platforms such as GCP, AWS, or Azure.
Experience with CI/CD for data projects.
Ability to document pipeline logic, data sources, and transformations clearly.
Benefits:
- A variety of projects: trust us, you won’t be bored.
- A sane schedule: we focus on tasks, not hours—but showing up at noon every day isn’t exactly smiled upon.
- A team that values expertise and humor: yes, we occasionally crack jokes about SQL—don’t worry if you don’t laugh right away.
- Choose your adventure: Dive deep into a single, large-scale project or opt for a “discovery” mode, collaborating with multiple global clients across different domains. You can get hands-on with cutting-edge data stacks for anything from gaming and dating to skyscraper construction and nuclear energy. If variety is what you crave, you’ll find it here.
Contacts:
Telegram Anna.Kononova@data-never-lies.com
WhatsApp
Skills
Airflow
AWS
Azure Blob Storage
BigQuery
CI/CD
Cloud Technologies
Dagster
dbt
GCP
JSON
manage cloud data and storage
manage ICT virtualisation environments
PostgreSQL
Prefect