Senior Data Engineer (North America)
Hello! Thank you for your interest in Sigma Ratings.
About Sigma Ratings
Sigma Ratings is a mission-driven company born at MIT. We are looking for talented engineers to join a fast-growing, tight-knit team and help our clients detect and react to risk and financial crime.
We are an early but fast-growing, remote-first company based in New York City that is working to quantify company-level risks in emerging market countries. You'll be working with a dynamic group that has deep experience in governance, law, technology, and countering illicit finance.
Sigma is VC-backed and has received numerous investments from well-known angels in the United States and from around the world. The company also boasts a growing advisory board and a world-class independent rating committee.
You can read more here: www.sigmaratings.com
We are looking for a Senior Data Engineer to help us process terabytes of data from dozens of sources into actionable insights for detecting risks and financial crime. We work with graphs of billions of data points, quickly changing data sets, and numerous statistical models. Deep experience with large and sometimes messy datasets is key.
You will be working closely with the Executive, Product and Engineering teams to feed this data into our main product, a risk management application.
What You'll Do
- Partner with our product and backend teams to create and maintain data pipelines and ETL jobs, and to perform analyses
- Write Python and PySpark in Databricks
- Write SQL, Cypher, or Elasticsearch queries when needed
- Help us ship new features and draw new insights from new and existing data sources
- Develop and maintain the tests needed to ensure that data is flowing correctly and regularly
- Help schedule and maintain automated jobs related to ETL processes
- Mentor and share knowledge with the rest of the team
Our Stack
- Python / PySpark (on Databricks)
- Postgres / Neo4J / Elasticsearch
- Amazon: ECS, RDS (Postgres), and more...
- Golang Microservices
Required Experience
- Python / Spark (3+ years) for ETL, data pipelines, and data manipulation
- At least one backend or systems language
- Knowledge of common data architectures and best practices
- Dealing with diverse and often messy data
- Working with cross-functional stakeholders (product, sales, ops, etc.) in an Agile-ish environment
Strongly Preferred Experience
Not all qualified candidates will have all of these, so if you have some or most of them, we’d love for you to apply!
- Databricks / Delta Tables
- Web Scraping / Beautiful Soup
- Neo4J + Cypher / Graph Databases
- Version Control and the Unix command line
- Approximately 1 or more years of experience working with cross-functional stakeholders (product, sales, ops, etc.) in an Agile-ish environment