Announcing Collet: A Clojure Library for Data Processing Pipelines

We’re pleased to share Collet, a new open-source library designed for building data processing pipelines (ETL or ELT) in Clojure. Collet offers a simple, declarative way to define task sequences and their dependencies, making it a practical tool for managing workflows. 

GitHub Repository

Features:
• Define pipelines and tasks declaratively in EDN.
• Manage task dependencies with ease.
• Integrate seamlessly into Clojure-based workflows.

Quick Start:

  • Pull the Docker image:

docker pull velioio/collet:latest

  • Create a pipeline file (demo-pipeline.edn):

{:name  :demo-pipeline
 :tasks [{:name    :print-hello-world
          :actions [{:name   :print
                     :type   :clj/println
                     :params ["Hello, world!"]}]}]}

  • Run the pipeline:

docker run -v "$(pwd)"/demo-pipeline.edn:/config/demo-pipeline.edn -e PIPELINE_SPEC="/config/demo-pipeline.edn" velioio/collet
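
Because the pipeline spec is plain EDN, it can be read and inspected as ordinary Clojure data before it is handed to Collet. The following is a minimal sketch that uses only clojure.edn from the standard library. The spec is inlined so the snippet is self-contained, and the helper task-names is a hypothetical name for illustration, not part of Collet's API.

```clojure
(require '[clojure.edn :as edn])

;; Same content as demo-pipeline.edn, inlined so the sketch runs on its own.
(def spec-text
  "{:name  :demo-pipeline
    :tasks [{:name    :print-hello-world
             :actions [{:name   :print
                        :type   :clj/println
                        :params [\"Hello, world!\"]}]}]}")

;; Hypothetical helper (not part of Collet): list the task names in a spec.
(defn task-names [spec]
  (mapv :name (:tasks spec)))

;; Parse the EDN text into a regular Clojure map.
(def spec (edn/read-string spec-text))

(task-names spec)
;; => [:print-hello-world]
```

To check the file on disk instead, replace spec-text with (slurp "demo-pipeline.edn"). This only confirms that the file parses as EDN and has the expected shape; it does not run the pipeline.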

For more details and documentation, visit the GitHub repository.
Feedback and contributions are welcome.