Exploring the Chicago crimes dataset with DuckDB, Malloy Data, and, soon, new Panel/PyScript data and dashboard tools.
Data from: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2
Note: The Chicago crimes data is too large for a GitHub repository. You can download it from the link above.
Raw dataset view in VSCode with Tabular Data Viewer and Rainbow CSV:
Crimes CSV data imported in DBeaver for comparison:
Crimes CSV data imported into Tad Viewer:
Quick Chicago crimes CSV data scan and Arrests query with Polars in one cell code block:
Loading Chicago crimes .parquet data file with polars.read_parquet():
Loading Chicago crimes raw CSV data with PyArrow CSV:
Writing and reading Chicago crimes PyArrow Table data in Feather and Parquet data file formats:
Loading Chicago crimes CSV data into a blank in-memory DuckDB instance:
Loading Chicago crimes .parquet data via DuckDB read_parquet():
Loading Chicago crimes CSV data into a blank in-memory DuckDB with ipython-sql SQLMagic in VSCode Jupyter Notebook:
Loading Chicago crimes 2022 parquet data with Malloy Data tools via DuckDB parquet data table source, with queries, data schema, Malloy queries outline, data preview, and query results displayed in VSCode Malloy extension query editor and views:
Loading Chicago crimes CSV data with Pandas:
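A minimal Pandas sketch is shown below. It reads made-up sample rows from an in-memory string so it runs without the real file; pass the path of the downloaded CSV to read_csv() instead. The arrest-rate summary is only an illustrative query.

```python
import io

import pandas as pd

# Made-up sample rows that mimic a few columns of the Chicago crimes CSV.
SAMPLE_CSV = """ID,Date,Primary Type,Arrest,Year
1,01/01/2022 12:00:00 AM,THEFT,false,2022
2,01/02/2022 01:30:00 PM,BATTERY,true,2022
3,01/03/2022 09:15:00 AM,THEFT,true,2022
"""

# Replace the StringIO object with the path of the downloaded CSV.
crimes = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Share of reports that led to an arrest, per primary crime type.
arrest_rate = crimes.groupby("Primary Type")["Arrest"].mean()

print(crimes.dtypes)
print(arrest_rate)
```

Unlike the lazy Polars scan or the DuckDB queries, read_csv() loads the whole file into memory, so the usecols parameter can help on the full dataset.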
Loading Chicago crimes CSV data with DBI R library and DuckDB R API in RStudio:
Reading Chicago crimes CSV data with DuckDB Julia Package in VSCode Julia lang, and running it in Julia REPL:
Loading Chicago crimes CSV data via Julia CSVFiles into native Julia DataFrames:
Reading Chicago crimes CSV data with SBCL + cl-duckdb in Emacs + SLY:
Collection of Jupyter notebooks and data apps visualizing Chicago crimes data from above.
Visualizing Chicago crimes data loaded with Pandas using Matplotlib:
2001-2022 Chicago crimes data loaded from a parquet file and summarized with Pandas and Altair charts:
2022 Chicago crimes data loaded from a CSV file with data summary Altair charts in a browser, using Pyodide runtime and Pandas:
https://randomfractals.github.io/chicago-crimes/apps/pyscript/
Displaying Chicago crimes 2022 parquet data with Malloy Charts using Malloy Import with table source, measures, and data queries defined in Malloy Data section above:
View and query 2022 Chicago crime reports data loaded from parquet file with Malloy Composer app in your browser:
https://randomfractals.github.io/chicago-crimes/apps/malloy-composer/
View 2022 Chicago crime reports data, schema, Malloy model, and queries with Malloy Fiddle app in your browser:
https://randomfractals.github.io/chicago-crimes/apps/malloy-fiddle
Loading and querying 7,687,725 Chicago crime reports recorded from 2001 through the end of November 2022 from a large 1.68 GB CSV data file with the new VSCode DuckDB Sql Tools extension:
Exporting in-memory DuckDB instance with DuckDB Sql Tools:
Exporting DuckDB instance in Parquet data format and importing it into new test DuckDB memory instance:
Links to our prior work on Chicago Crimes EDA circa 2017/2018:
🔗 Chicago Crimes EDA Summary on LinkedIn
📚 Chicago Crimes EDA 2017 Jupyter Notebooks
📚 Observable JS Chicago Crimes Notebook Collection 2018
📚 Chicago Homicides Observable JS Notebook Collection 2018
🐦 Chicago Crimes EDA, Data Preview and Tabular Data Viewer tweets