Senior Data Engineer

  • Location:

    Richmond, Virginia - United States

  • Job ref:

    1395

  • Contact:

    Zack Cheatham

  • Expiry date:

    2023-06-29

  • Published:

    5 months ago

  • Type: Direct Hire

JOB DESCRIPTION

The Data Science & Data Engineering team has a broad set of responsibilities. We develop and maintain the key data stores that power our applications, build internal workflows and tooling that support security researchers, and support end-to-end machine learning (ML) model development and deployment. We also provide data service support to other groups within the company.

We are looking for a highly motivated Data Engineer to help build and maintain the tools, services and pipelines that ingest and persist data from varied sources. We process terabytes of security telemetry per day and use the data to enable applications in our platform. An ideal candidate will have previous experience building data pipelines that serve diverse data consumers, ranging from data-intensive applications to ML pipelines to ad-hoc and scheduled queries. Experience with parsing cybersecurity logs is a plus. You will collaborate closely with multiple teams to understand business needs and translate those into technical requirements.

What You’ll Do:

  • Work within the data engineering team to develop pipelines and tools to support multiple use cases and applications
  • Define how data is stored and how it is consumed by users, IT systems, and applications
  • Optimize underperforming databases, queries and pipelines to ensure timely access to the right data by our applications and people
  • Document data sources, data structures, data flows and data infrastructure
  • Develop monitoring and alerting capabilities to ensure data pipelines are working properly
  • Collaborate with internal customers to understand their business needs and translate those into technical requirements
  • Participate in feasibility studies and cost/benefit analyses to determine technology solutions that are right for our company
  • Enable analysts to interact with data in a secure manner using tools like a data catalog
  • Define roles and responsibilities with regard to data governance
  • Stay on top of the latest technology trends, coding standards, libraries and frameworks to constantly grow your skillset

Requirements

  • Minimum of a BS in Computer Science or a related technical field
  • 5+ years of experience performing data engineering activities
  • 3+ years building data pipelines in production and ability to work across structured, semi-structured and unstructured data
  • Experience building large scale distributed data stores and data warehouses at petabyte scale and supporting machine learning and analytic workloads
  • Experience using Apache Hadoop and the Hadoop ecosystem. Experience with Spark, Flink and Kafka is desirable
  • Hands-on experience with ETL tools (Apache NiFi is a plus)
  • Strong experience with DevOps practices and common tooling (e.g., Docker, Kubernetes)
  • Experience with technologies like Hive, Impala, Spark SQL and Presto/Trino
  • Prior experience with Databricks and/or Snowflake is a plus
  • Strong communication skills
  • Ability to work with loosely defined requirements
  • Ability to work in an iterative, agile development environment
  • Strong experience developing in Java and Python 3
  • Ability to pick up, work with and explore new analytical tools