Data Engineer

Birchbox
Published: December 4, 2020
Location: New York, New York

Description

Please apply here: https://www.birchbox.com/about/openings?gh_jid=2386003

Birchbox is seeking an ambitious and experienced Data Engineer to help evolve our data systems. In addition to fueling our BI platform, recommender systems, and email marketing, our data services inform decision-making company-wide. This role will be highly impactful as we build out the next generation of our data services, and will have considerable leeway in driving future strategy and architecture.

You'll be joining a lean Agile team supporting Data Infrastructure and Machine Learning. Primary upcoming initiatives include retooling our event data collection systems, supporting the next generation of our A/B testing machinery, and iterating on our data pipelines. Each of these projects will primarily involve data extraction, transformation, and loading, as well as DevOps work and vendor management. You will train junior developers and work with the technical leadership of the company to drive best practices for data.

Our Stack: Fivetran/dbt/Redshift/Metabase. Airflow (astronomer.io), Databricks. Python and a little bit of Scala.

*We are looking for teammates who are willing to work out of our New York, NY office on a full-time basis. Due to COVID, our team is working remotely until at least Q1 2021; however, we do expect to return to the office on a full-time basis.*

Responsibilities:

• Build and maintain fault-tolerant, scalable batch data pipelines

• Architect and maintain soft real-time event analytics pipelines

• Design and implement cloud-based data and machine learning pipelines (Databricks, S3 data lake)

• Implement fault-tolerant data integrations between internal systems and with third-party APIs, supporting product and marketing needs

• Create and integrate internal A/B testing tooling for engineering and product teams

• Work with the Business Intelligence team, advising on data provenance and reliability and integrating data sources for analysts

• Manage vendor interactions related to our data stack

• Design and build components of internal tooling used by our subscription operations team (Python)

• Tool our data systems for observability, including logging, metrics monitoring, and dashboarding

• Train and upskill future data engineers

• Contribute to an open, empowering, responsible, and proactive engineering culture

Requirements:

• 3+ years of professional software experience (or equivalent).

• Degree in Computer Science or related field (or equivalent experience).

• Expert in Python, and comfortable with at least two other languages (e.g., Java, Ruby, PHP).

• Strong command line skills for working within virtualized machines (bash, tmux / screen, vim / emacs).

• Experience orchestrating data infrastructure (Spark, S3, Kafka/Kinesis, Redshift).

• Experience with modern workflow management systems (e.g., Airflow, Luigi).

• Advanced SQL skills (MySQL and/or Postgres), familiarity with data warehousing.

• Familiarity with non-relational data stores and/or indexes (e.g., MongoDB, DynamoDB, Elasticsearch)

• Experience working on teams using distributed version control (e.g., Git, Mercurial).
