How it Works

A Lagoon is a blend of popular open-source tools, managed serverless infrastructure (AWS), wiring, and simplified interfaces for modularizing data productization.

Lagoon is built for data teams with a lot of data, a limited budget, and a need to continuously deliver clean, structured data. We've selected a handful of popular, powerful open-source data tools, added IaC (infrastructure as code), wired everything together, and added friendly interfaces to create a modular stack for productizing big datasets.

A Lagoon lets your team load datasets either asynchronously or in bulk, automate transformations, enforce compliance, and deliver data in a structured, queryable format without managing complex infrastructure or learning specialty tools. In short: 1) periodically add your data (in a number of formats), 2) define your cleaning, preparation, and compliance workflows, and 3) use (query) the results.
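To make the three steps concrete, here is a minimal sketch in plain Python. It is not Lagoon's interface: the sample data, the function names (`load`, `clean`, `apply_compliance`, `run_pipeline`), and the `orders` table are all hypothetical, and an in-memory SQLite database stands in for the queryable output.

```python
import csv
import io
import sqlite3

# Hypothetical sample input standing in for an uploaded file.
RAW_CSV = """id,email,amount
1,Ada@Example.com,10.5
2,,3
3,bob@example.com,not-a-number
4,cy@example.com,7
"""


def load(raw_text):
    """Step 1: add data (here, parse a CSV string into dict rows)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Step 2a: cleaning rule, drop rows whose amount is not numeric."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip malformed rows
        cleaned.append({"id": int(row["id"]), "email": row["email"], "amount": amount})
    return cleaned


def apply_compliance(rows):
    """Step 2b: compliance rule, keep only the email domain, not the address."""
    result = []
    for row in rows:
        email = row["email"]
        domain = email.split("@")[-1].lower() if "@" in email else None
        result.append((row["id"], domain, row["amount"]))
    return result


def run_pipeline(raw_text):
    """Step 3: store the prepared rows and query them with SQL."""
    prepared = apply_compliance(clean(load(raw_text)))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (id INTEGER, email_domain TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", prepared)
        return conn.execute(
            "SELECT email_domain, SUM(amount) FROM orders "
            "WHERE email_domain IS NOT NULL "
            "GROUP BY email_domain ORDER BY email_domain"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

In this sketch the cleaning and compliance rules are ordinary functions applied in sequence, so each one can be changed or replaced without touching how data is added or queried.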

Lagoon's costs are driven predominantly by compute consumption, with storage separated into S3 to keep costs in line with your data volume. We use an asynchronous event architecture to automate and process data as it flows from raw application assets to production-ready datasets.
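The event-driven flow can be pictured with a small in-process sketch. This is an illustration only, not Lagoon's implementation and not AWS code: the `EventBus` class, the event names (`raw.added`, `clean.ready`), the key prefixes, and the dictionary standing in for S3 are all assumptions made for the example.

```python
from collections import defaultdict, deque


class EventBus:
    """Toy queue-backed bus: publishers enqueue events, handlers run on drain()."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Publishing only enqueues; no work happens until drain() is called.
        self._queue.append((event_type, payload))

    def drain(self):
        """Process queued events (including ones emitted by handlers); return the count."""
        processed = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(payload)
            processed += 1
        return processed


def build_pipeline():
    store = {}  # stands in for object storage, kept separate from compute
    bus = EventBus()

    def on_raw_added(key):
        # Cleaning stage: trim, lowercase, and drop empty records.
        cleaned = [r.strip().lower() for r in store[key] if r.strip()]
        clean_key = key.replace("raw/", "clean/", 1)
        store[clean_key] = cleaned
        bus.publish("clean.ready", clean_key)

    def on_clean_ready(key):
        # Production stage: deduplicate and sort the cleaned records.
        prod_key = key.replace("clean/", "production/", 1)
        store[prod_key] = sorted(set(store[key]))

    bus.subscribe("raw.added", on_raw_added)
    bus.subscribe("clean.ready", on_clean_ready)

    def add_raw(key, records):
        store[key] = records
        bus.publish("raw.added", key)

    return bus, store, add_raw


if __name__ == "__main__":
    bus, store, add_raw = build_pipeline()
    add_raw("raw/users.txt", [" Ada ", "", "bob", "ADA"])
    bus.drain()
    print(store["production/users.txt"])
```

Adding data only records an event, and each stage runs when its event is processed, so compute is used when there is work to do while the stored data sits independently of the handlers.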


What’s Next

Dig in!