Data IDE
Open Source Web Data IDE
After playing with Rill Developer, DuckDB, Vega, WASM, Rath, and other modern Data IDEs, I think we have all the pieces for an awesome web-based BI/data exploration tool. Some of the features it could have:
- Let me add local and remote datasets. Not just one, as I'd like to join them later.
- Let me plot it using Vega-Lite. Guide me through alternatives like Vega's Voyager2 does.
- Might be as simple as surfacing Observable Plot with DuckDB WASM…
- Use LLMs to improve the datasets and offer next steps:
- Get suggested transformations for certain columns. If it detects a date, extract the day of the week. If it detects a string, `lower()` it…
- Get suggested plots. Given that it'll know both the column names and the types, it should be possible to create a prompt that returns some plot ideas and another that takes those and writes the Vega-Lite code to make them work.
- Make it easy to query the data via Natural Language.
- Let me transform them with SQL (DuckDB) and Python (JupyterLite). Similar to Neptyne but in the browser (WASM).
- Let me save the plots in a separate space and give me a shareable, URL-encoded link.
- Local datasets could be shared using something like Magic Wormhole or a temporary storage service.
- Let me grab the state of the app (YAML/JSON), version control it, and generate static (to publish in GitHub Pages) and dynamic (hosted somewhere) dashboards from it.
- Similar to evidence.dev or portal.js.
- It could also have "smart" data checks. Similar to deepchecks, alerting about anomalies, outliers, noisy variables, …
- Given a large amount of Open Data, it could offer a way for people to upload their datasets and get them augmented.
- E.g., upload a CSV with year and country, and the tool could suggest GDP per capita or population.
Could be an awesome front-end to explore Open Data.
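
To make the "suggested transformations" idea a bit more concrete, here is a minimal sketch that works without any LLM: given column names and types, it returns candidate DuckDB SQL expressions. The function names (`suggest_transformations`, `quote_ident`) and the output aliases are hypothetical; `dayname()` and `lower()` are existing DuckDB functions. A real tool would likely cover many more types and rules.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def suggest_transformations(columns: dict) -> list:
    """Map {column name: SQL type} to candidate DuckDB SELECT expressions.

    Hypothetical rule set: only the two rules mentioned in the list above.
    """
    suggestions = []
    for name, col_type in columns.items():
        kind = col_type.upper()
        if kind in ("DATE", "TIMESTAMP"):
            # Dates: extract the day of the week.
            alias = quote_ident(name + "_day_of_week")
            suggestions.append(f"dayname({quote_ident(name)}) AS {alias}")
        elif kind == "VARCHAR":
            # Strings: normalize the case.
            alias = quote_ident(name + "_lower")
            suggestions.append(f"lower({quote_ident(name)}) AS {alias}")
    return suggestions
```

Each returned string could be surfaced in the UI as a one-click "add this column" suggestion and appended to a `SELECT *, ...` query.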
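
The "suggested plots" step could also start from plain heuristics before involving a prompt. The sketch below is an assumption about how that might look, not the LLM flow described in the list: it maps SQL types to Vega-Lite field types and picks a mark from the pair. `to_vega_type` and `suggest_spec` are hypothetical names, and the type mapping is deliberately incomplete.

```python
VEGA_LITE_SCHEMA = "https://vega.github.io/schema/vega-lite/v5.json"

NUMERIC_TYPES = {"TINYINT", "SMALLINT", "INTEGER", "BIGINT", "FLOAT", "DOUBLE", "DECIMAL"}
TEMPORAL_TYPES = {"DATE", "TIMESTAMP"}


def to_vega_type(sql_type: str) -> str:
    """Translate a SQL column type into a Vega-Lite field type."""
    kind = sql_type.upper()
    if kind in TEMPORAL_TYPES:
        return "temporal"
    if kind in NUMERIC_TYPES:
        return "quantitative"
    return "nominal"


def suggest_spec(data_url: str, x: tuple, y: tuple) -> dict:
    """Build a Vega-Lite spec from two (column name, SQL type) pairs."""
    x_name, x_type = x[0], to_vega_type(x[1])
    y_name, y_type = y[0], to_vega_type(y[1])
    if x_type == "temporal" and y_type == "quantitative":
        mark = "line"   # a value over time
    elif x_type == "nominal" and y_type == "quantitative":
        mark = "bar"    # a value per category
    else:
        mark = "point"  # fallback: scatter plot
    return {
        "$schema": VEGA_LITE_SCHEMA,
        "data": {"url": data_url},
        "mark": mark,
        "encoding": {
            "x": {"field": x_name, "type": x_type},
            "y": {"field": y_name, "type": y_type},
        },
    }
```

The resulting dict can be dumped with `json.dumps` and handed to any Vega-Lite renderer. An LLM prompt could then be used to rank or refine these specs rather than write them from scratch.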
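
For the shareable link, one possible approach is to pack the app state into the URL fragment, so nothing needs to be stored server-side. This is only a sketch under assumptions: the base URL, the `#state=` fragment key, and the function names are all made up for illustration, and large states would eventually hit browser URL length limits.

```python
import base64
import json
import zlib


def encode_state(state: dict, base_url: str = "https://example.com/app") -> str:
    """Serialize, compress, and base64url-encode the state into a URL fragment."""
    raw = json.dumps(state, separators=(",", ":"), sort_keys=True).encode("utf-8")
    token = base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")
    # Padding is dropped to keep the link shorter; decode_state restores it.
    return f"{base_url}#state={token.rstrip('=')}"


def decode_state(url: str) -> dict:
    """Recover the state dict from a link produced by encode_state."""
    token = url.split("#state=", 1)[1]
    padded = token + "=" * (-len(token) % 4)
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return json.loads(raw)
```

The same JSON payload is what could be version controlled and fed to a static dashboard generator.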
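
As for the "smart" data checks, even a very simple rule can be useful as a first pass. The sketch below flags outliers in a numeric column with the classic 1.5 × IQR rule. It is not how deepchecks works, just a stand-in to show the shape of such a check; `find_outliers` is a hypothetical name.

```python
import statistics


def find_outliers(values: list, k: float = 1.5) -> list:
    """Return the values lying more than k * IQR outside the quartiles."""
    if len(values) < 4:
        return []  # too few points for quartiles to mean much
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

Running a check like this per numeric column on load would let the tool show a small warning next to the column name.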