- Integrity: Doing the right things for the right reasons
- Agility: Adapting and thriving in a dynamic environment
- Teamwork: Combining our strengths to do amazing things
- Passion: Channeling enthusiasm to drive excellence
- Creativity: Unleashing curiosity to defy the norm
About the role:
As a Data Engineer at 1010data, you will design, maintain, and optimize large-scale automated ELT processes. Working closely with analysts and customer teams specializing in enterprise data warehousing, you will use industry-standard data orchestration tools as well as in-house proprietary scheduling and automation tools to create efficient, reliable ELT jobs that support 1010data’s product offerings and our customers’ data warehousing needs. As we incorporate more cloud technologies into our processes, you will be at the forefront of exploring and defining best practices and helping us make our products more scalable.
As part of the onboarding process, you will learn about 1010data’s proprietary technology stack. Our query engine, query language, database, and data storage layer were all developed and fine-tuned in-house over the lifetime of the company. ELT processes rely heavily on these components, whether they are written in Python with Airflow or in our proprietary data orchestration tools, and you will be formally trained in the latter as a new 1010data employee. The concepts should be familiar to anyone with exposure to database techniques such as normalization, indexing, and partitioning, as well as MapReduce, columnar database architecture, and distributed systems.
This role is not eligible for visa sponsorship.
What you will take on:
- Taking end-to-end ownership of data pipelines and custom solutions for our clients
- Coordinating with the systems, core, CX, and analytics teams to build and maintain data products and custom solutions for our clients
- Designing and writing automated scripts to preprocess terabytes of data from our partners/clients
- Designing and writing new enterprise-scale ELT/ETL workflows from scratch in Python using Airflow, Docker, AWS, etc.
- Modifying/redesigning legacy ELT/ETL processes to leverage cutting-edge open source and proprietary technologies
- Ensuring quality, reliability, and uptime for critical automated processes
- Migrating our products and processes into the cloud while drastically reducing our in-house data center footprint
What you already have:
- 1-2 years of professional experience programming in Python
- Exposure to ETL/ELT pipeline automation
- Exposure to basic database concepts
- Good understanding of data engineering, NoSQL databases and database design, distributed systems, and/or information retrieval
- Experience working with SaaS products
- Knowledge of Apache Airflow
- Database administration (DBA) experience
- Ability to plan projects, gather requirements, and interact with the analyst and data science teams
- Bachelor’s degree in a STEM field required; a graduate degree is a big plus
For more than 20 years, 1010data has helped financial, retail and consumer goods customers monitor shifts in consumer demand and market conditions and rapidly respond with highly targeted strategies. The 1010data Insights Platform combines market intelligence, data management, granular enterprise analytics, and collaboration capabilities to empower better business outcomes. More than 900 of the world’s foremost companies partner with 1010data to power smarter decisions.
You can learn more on the 1010data Company page at https://1010data.com/company/
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.