Solidus Labs
About the Role
You will design and build robust data pipelines on cloud environments at scale. You will own ClickHouse availability and performance while collaborating with downstream analytics, ML, and product teams. You will monitor data quality and help evolve schemas and data formats to support diverse client needs.
Requirements
- BSc in Computer Science
- Strong background as a software engineer, with 5+ years of hands-on experience with Java, Rust, or Python
- 8+ years in data engineering and data pipeline development in high-volume, low-latency production environments
- Experience working in low-latency, real-time systems processing billions of events a day
- Deep hands-on ClickHouse expertise, including cluster architecture, table engine selection, replication, sharding, and query optimization
- Proficiency across the broader data engineering stack: Apache Kafka, Spark, Airflow, Kubernetes, Redis, Snowflake, and caching technologies
- Expert-level SQL and query optimization skills, with emphasis on ClickHouse-specific patterns (materialized views, projections, TTLs, and MergeTree tuning)
- Experience with monitoring and observability tools (Prometheus, Grafana, or similar)
- Excellent verbal and written communication skills, and the ability to coach and influence engineers across teams in a remote environment
Responsibilities
- Design and optimize the ClickHouse data layer, including table engines, partition strategies, materialized views, and storage policies, to ensure high performance at billions-of-events scale
- Own ClickHouse cluster sizing, topology decisions, and capacity planning across both real-time ingestion and T1 batch workloads, balancing cost, latency, and throughput
- Drive data reliability and deduplication strategies within ClickHouse, leveraging engine-level features like ReplacingMergeTree and CollapsingMergeTree and pipeline-level controls to guarantee data completeness and consistency
- Establish and continuously improve monitoring, alerting, and observability for the ClickHouse layer, covering replication health, merge performance, query latency, and resource utilization
- Serve as the internal ClickHouse authority, coaching engineering teams across the organization on query optimization, data modeling best practices, and efficient use of ClickHouse-specific constructs
- Act as the primary liaison with the ClickHouse vendor team: triaging issues, incorporating product feedback, evaluating new features, and translating vendor guidance into actionable improvements for our deployment
- Collaborate with downstream consumers (analytics, ML, product) to understand access patterns and continuously refine how data is stored and served, improving query performance, schema design, and data formats for diverse client needs
- Define and enforce schema versioning and governance standards within the ClickHouse environment, ensuring schema evolution does not compromise pipeline reliability or consumer compatibility