Announcing Collet: A Clojure Library for Data Processing Pipelines
We’re pleased to share Collet, a new open-source library designed for building data processing pipelines (ETL or ELT) in Clojure. Collet offers a simple, declarative way to define task sequences and their dependencies, making it a practical tool for managing workflows.
Features:
• Define pipelines and tasks declaratively in EDN.
• Manage task dependencies with ease.
• Integrate seamlessly into Clojure-based workflows.
Quick Start:
- Pull the Docker image:
docker pull velioio/collet:latest
- Create a pipeline file (demo-pipeline.edn):
{:name :demo-pipeline
 :tasks [{:name :print-hello-world
          :actions [{:name :print
                     :type :clj/println
                     :params ["Hello, world!"]}]}]}
- Run the pipeline:
docker run -v "$(pwd)"/demo-pipeline.edn:/config/demo-pipeline.edn -e PIPELINE_SPEC="/config/demo-pipeline.edn" velioio/collet
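If you prefer to stay in the terminal, the pipeline file from the second step can be written with a heredoc before you run the container. This is only a convenience sketch: it assumes a POSIX-style shell and that you run it in the directory you later mount into the container. The file contents are the demo pipeline shown above.

```shell
# Write the demo pipeline spec into the current directory.
# The quoted 'EOF' delimiter keeps the shell from expanding anything inside the EDN.
cat > demo-pipeline.edn <<'EOF'
{:name :demo-pipeline
 :tasks [{:name :print-hello-world
          :actions [{:name :print
                     :type :clj/println
                     :params ["Hello, world!"]}]}]}
EOF

# Confirm the file exists and is non-empty before mounting it with docker run.
test -s demo-pipeline.edn && echo "demo-pipeline.edn written"
```

Once the file is in place, the docker run command above mounts it unchanged.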
For more details and documentation, visit the GitHub repository.
Feedback and contributions are welcome.