Polymarket
is the world's largest prediction market platform.
We enable individuals to express views on real-world events by trading on outcomes across politics, economics, sports, culture, and current affairs.
Built as a peer-to-peer marketplace with no centralized "house," Polymarket aggregates diverse opinions into transparent, market-based probabilities that reflect collective expectations about the future.
We're growing fast, both in trading volume ($21B traded in 2025) and in adoption as an alternative news source.
Our ambition is to become a ubiquitous beacon of truth in global media, and we need your help adding fuel to the fire.
About the Role

Polymarket is looking for a Senior Data Engineer to help scale our data platform to support the next phase of the company's growth.
Every trade, position, and resolution flows through a data layer that powers our entire user-facing product – leaderboards, PnL, activity feeds, positions, volume analytics – with sub-100ms latency targets at crypto scale.
The team is small and high-ownership.
We need a senior operator who can drive entire workloads end-to-end, not just execute on a defined roadmap.
You'll be hands-on with design, architecture, and delivery.
Your mandate: make Polymarket's analytics and serving data layer something the rest of the company can build on without thinking.
Schema drift, parity gaps under migration, latency regressions, and cost blowouts all land on this role.
When a product team asks "can the data layer answer this query under X ms at Y $/month," you're the one who can say yes confidently, or redesign the slice that makes it possible.
What We're Looking For

- 5+ years of data engineering on production systems serving real users at scale
- Deep knowledge of OLTP/OLAP split architectures: you know when a row store wins, when a column store wins, and when to use both
- Columnar warehouse expertise: ClickHouse strongly preferred; Snowflake, BigQuery, Redshift, or Apache Pinot accepted if the fundamentals are solid
- Data lake experience: Parquet, Iceberg (or Delta/Hudi), compaction strategies, S3 layout discipline
- Streaming pipeline experience: Kafka, exactly-once vs. at-least-once reasoning, backpressure, consumer-group patterns, schema evolution
- Strong data modeling fundamentals: star/snowflake schemas, SCD patterns, CDC, idempotent event sourcing, dimensional vs. event-log tradeoffs
- PostgreSQL at scale: partitioning, index design, autovacuum/bloat remediation, query planning, CDC triggers vs. logical replication
- SQL fluency at warehouse scale: window functions, CTEs, dictionary-based enrichment, dialect specifics
- Distributed systems reasoning: consistency models, event ordering, replay semantics, write-once vs. mutable state, reorg handling
- (Plus) EVM indexing experience: rindexer, subgraphs, or comparable – this shortens ramp considerably
- (Plus) Rust: you'll touch indexer and validation tooling codebases; comfortable reading and contributing
- (Plus) Domain knowledge in DeFi, prediction markets, or order-book systems
- (Plus) Observability and SLO thinking: Prometheus metrics design, dashboard discipline, alert-fatigue avoidance
- (Plus) Python for SQL tooling, ad-hoc analysis, and one-off migrations
- (Plus) Track record shipping a platform migration or greenfield data stack under a hard deadline
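To give a flavor of the SQL fluency described above, here is a minimal, purely illustrative sketch of a leaderboard-style query combining a CTE with a window function. The `trades` table, its columns, and SQLite itself are stand-in assumptions so the example is self-contained and runnable; they are not Polymarket's actual schema or warehouse.

```python
import sqlite3

# Hypothetical stand-in data: a tiny "trades" table. The real platform's
# schema and warehouse (e.g. ClickHouse) will differ; SQLite is used only
# to keep this example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (trader TEXT, market TEXT, usd REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [
        ("alice", "election", 500.0),
        ("alice", "sports", 250.0),
        ("bob", "election", 900.0),
        ("bob", "sports", 100.0),
        ("carol", "election", 300.0),
    ],
)

# A CTE aggregates per-trader volume; a window function ranks traders --
# the shape of query behind a volume leaderboard.
rows = conn.execute(
    """
    WITH totals AS (
        SELECT trader, SUM(usd) AS volume
        FROM trades
        GROUP BY trader
    )
    SELECT trader,
           volume,
           RANK() OVER (ORDER BY volume DESC) AS rnk
    FROM totals
    ORDER BY rnk
    """
).fetchall()

for trader, volume, rnk in rows:
    print(rnk, trader, volume)
# 1 bob 1000.0
# 2 alice 750.0
# 3 carol 300.0
```

At warehouse scale the same shape of query runs over billions of rows, which is where partitioning, columnar layout, and dialect-specific tuning start to matter.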
Benefits

- Competitive salary & equity
- Unlimited PTO
- Full health, vision, & dental coverage
- 401(k) match
- Hardware setup: new MacBook Pro, big display, & accessories