Hi r/Python! I’m the developer of Flowfile and wanted to share FlowFrame, a component I built that bridges the gap between code-based and visual ETL tools.
Source code: https://github.com/Edwardvaneechoud/Flowfile/
What My Project Does
FlowFrame lets you write Polars-like Python code for data pipelines while automatically generating a visual ETL graph behind the scenes. You write familiar code, but get an interactive visualization you can debug, share, or use to explain your pipeline to non-technical colleagues.
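To make the idea of "a graph behind the scenes" more concrete, here is a minimal conceptual sketch in plain Python of how a frame wrapper could record one graph node per chained call and only run the work when asked. This is not Flowfile's actual implementation: `RecordingFrame`, `Node`, and all of their methods are hypothetical names made up for this illustration, and a real tool would track far more (schemas, multiple inputs, settings per node).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Node:
    """One step in the pipeline graph (hypothetical structure)."""
    node_id: int
    op: str                 # kind of step, e.g. "filter"
    description: str        # human-readable label for a visual editor
    parent: Optional[int]   # id of the upstream node, None for the source


class RecordingFrame:
    """Toy frame: each method records a node and defers the actual work."""

    def __init__(self, rows, nodes=None, steps=None):
        self._rows = rows
        self._nodes = nodes or [Node(0, "source", f"{len(rows)} rows", None)]
        self._steps = steps or []

    def _extend(self, op: str, description: str, fn: Callable):
        # Return a new frame with one more node and one more deferred step.
        parent = self._nodes[-1].node_id
        node = Node(parent + 1, op, description, parent)
        return RecordingFrame(self._rows, self._nodes + [node], self._steps + [fn])

    def filter(self, description: str, predicate: Callable):
        return self._extend(
            "filter", description,
            lambda rows: [r for r in rows if predicate(r)],
        )

    def with_column(self, name: str, fn: Callable):
        return self._extend(
            "with_column", name,
            lambda rows: [{**r, name: fn(r)} for r in rows],
        )

    @property
    def graph(self):
        """The recorded nodes, in order; a UI could draw these as a flow."""
        return list(self._nodes)

    def collect(self):
        """Run the deferred steps in order and return the resulting rows."""
        rows = self._rows
        for step in self._steps:
            rows = step(rows)
        return rows


rows = [
    {"id": 1, "category": "A", "value": 100},
    {"id": 2, "category": "B", "value": 200},
    {"id": 3, "category": "A", "value": 150},
    {"id": 4, "category": "C", "value": 300},
    {"id": 5, "category": "B", "value": 250},
]

result = (
    RecordingFrame(rows)
    .filter("value > 150", lambda r: r["value"] > 150)
    .with_column("double_value", lambda r: r["value"] * 2)
)

# Inspect the recorded graph, then execute the pipeline.
for node in result.graph:
    print(node.node_id, node.op, node.description, "<-", node.parent)
print(result.collect())
```

The point of the sketch is only that the code you write and the graph you look at can come from the same chained calls, so the two cannot drift apart.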
Here’s a simple example with FlowFrame:
```python
import flowfile as ff
from flowfile import col, open_graph_in_editor

# Create a dataset
df = ff.from_dict({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [100, 200, 150, 300, 250],
})

# Filter, transform, group by and aggregate
result = (
    df.filter(col("value") > 150)
    .with_columns((col("value") * 2).alias("double_value"))
    .group_by("category")
    .agg(col("value").sum().alias("total_value"))
)

# Open the visual graph in a browser
open_graph_in_editor(result.flow_graph)
```
When you run this code, it launches a web interface showing your entire pipeline as a visual flow diagram.

Target Audience
FlowFrame is designed for:
- Data engineers who want to build pipelines in code but need to share and explain them to others
- Data scientists who prefer coding but need to collaborate with less technical team members
- Analytics teams who want to standardize on a single tool that works for both coders and non-coders
- Anyone working with data pipelines who wants better visibility into their transformations
It’s production-ready and can handle real-world data processing needs, but also works great for exploration, prototyping, and educational purposes.
Comparison
Compared to existing alternatives, FlowFrame takes a unique approach:
Vs. Pure Code Libraries (Pandas/Polars):
- Adds visual representation with no extra work
- Makes debugging complex transforms much easier
- Enables non-coders to understand and modify pipelines

Vs. Visual ETL Tools (Alteryx, KNIME, etc.):
- Maintains the flexibility and power of Python code
- No vendor lock-in or proprietary formats
- Easier version control through code
- Free and open-source

Vs. Notebook Solutions:
- Shows the entire pipeline as a connected flow rather than isolated cells
- Enables interactive exploration of intermediate data at any point
- Creates reusable, production-ready pipelines
Key Features
- Built on Polars for fast data processing with lazy evaluation
- Web-based UI launches directly from your Python code
- Visual ETL interface that updates as you code
- Flows can be saved, shared, and modified visually or programmatically
- Extensible architecture for custom nodes
You can install it with: pip install Flowfile
I’d love feedback from the community on this approach to data pipelines. What do you think about combining code and visual interfaces?
submitted by /u/Proof_Difficulty_434 to r/Python