Exploring Chicago crimes dataset with Jupyter notebooks, DuckDB, Malloy and new Panel/PyScript data and dashboard tools.

RandomFractals/chicago-crimes

chicago-crimes

Exploring Chicago crimes dataset with DuckDB, Malloy Data, and soon new Panel/PyScript data and dashboard tools ...

Data Source

Data from: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2

Note: The Chicago crimes data is too large for a GitHub repository. You can download it from the link above.

Chicago Crimes 2001 to Present Info ...

Raw Data Views

In VSCode

Raw dataset view in VSCode with Tabular Data Viewer and Rainbow CSV:

Chicago Crimes 2001 to Present Data ...

In DBeaver

Crimes CSV data imported in DBeaver for comparison:

Crimes Data in DBeaver

In Tad Viewer

Crimes CSV data imported into Tad Viewer:

Crimes Data in Tad Viewer

With Polars

Quick Chicago crimes CSV data scan and arrests query with Polars in a single notebook code cell:

Chicago Crimes with Polars

With Polars Parquet

Loading Chicago crimes .parquet data file with polars.read_parquet():

Chicago Crimes with Polars Parquet

With PyArrow

Loading Chicago crimes raw CSV data with PyArrow CSV:

Chicago Crimes with PyArrow

With PyArrow Feather and Parquet

Writing and reading Chicago crimes PyArrow Table data in Feather and Parquet data file formats:

Chicago Crimes with PyArrow Feather and Parquet

With DuckDB

Loading Chicago crimes CSV data into a blank in-memory DuckDB instance:

Chicago Crimes with DuckDB

With DuckDB Parquet

Loading Chicago crimes .parquet data via DuckDB read_parquet():

Chicago Crimes with DuckDB Parquet

With DuckDB SQLMagic

Loading Chicago crimes CSV data into a blank in-memory DuckDB with ipython-sql SQLMagic in VSCode Jupyter Notebook:

Chicago Crimes with DuckDB SQLMagic

With Malloy Data

Loading 2022 Chicago crimes parquet data with Malloy Data tools, using a DuckDB parquet table as the data source. The VSCode Malloy extension query editor and views display the queries, data schema, Malloy queries outline, data preview, and query results:

Chicago Crimes with Malloy

With Pandas

Loading Chicago crimes CSV data with Pandas:

Chicago Crimes Pandas Notebook
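A short sketch of the Pandas route, assuming the Date column uses the 12-hour month/day/year timestamp format of the portal's CSV export. The sample rows are made up and read from an in-memory string instead of the real file.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real Crimes CSV export.
SAMPLE_CSV = """ID,Date,Primary Type,Arrest
1,01/15/2022 08:30:00 PM,THEFT,false
2,02/03/2022 11:05:00 AM,BATTERY,true
3,02/20/2022 01:45:00 AM,THEFT,true
"""

# For the real dataset, pass the CSV file path instead of the StringIO buffer.
crimes = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Parse timestamps with an explicit format; this is much faster than inference.
crimes["Date"] = pd.to_datetime(crimes["Date"], format="%m/%d/%Y %I:%M:%S %p")

# Share of reports that led to an arrest, and report counts per month.
arrest_rate = crimes["Arrest"].mean()
by_month = crimes.groupby(crimes["Date"].dt.month).size()
print(arrest_rate)
print(by_month)
```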

In R Studio

Loading Chicago crimes CSV data with DBI R library and DuckDB R API in R Studio:

Chicago Crimes with DuckDB in R

With Julia REPL

Reading Chicago crimes CSV data with the DuckDB Julia package in VSCode with Julia language support, and running it in the Julia REPL:

Chicago Crimes CSV to DuckDB in Julia REPL

With Julia CSVFiles and DataFrame

Loading Chicago crimes CSV data via Julia CSVFiles into native Julia DataFrames:

Chicago Crimes CSV in Julia DataFrame

In Emacs

Reading Chicago crimes CSV data with SBCL + cl-duckdb in Emacs + SLY:

Chicago Crimes CSV to DuckDB in Emacs

Visualizations

Collection of Jupyter notebooks and data apps visualizing Chicago crimes data from above.

With Matplotlib

Visualizing Chicago crimes data loaded with Pandas using Matplotlib:

Chicago Crimes Matplotlib Chart
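One possible shape for such a chart is a horizontal bar chart of report counts per crime type. This is a sketch with made-up rows, not necessarily the chart in the screenshot; the output file name is hypothetical.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows in place of the data frame loaded from the Crimes CSV.
crimes = pd.DataFrame(
    {"Primary Type": ["THEFT", "BATTERY", "THEFT", "NARCOTICS", "THEFT"]}
)

# Count reports per crime type, most frequent first.
counts = crimes["Primary Type"].value_counts()

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(counts.index, counts.values)
ax.invert_yaxis()  # put the most frequent type at the top
ax.set_xlabel("Reports")
ax.set_title("Chicago crimes by primary type (sample)")
fig.tight_layout()

# Hypothetical output file name.
chart_path = Path(tempfile.mkdtemp()) / "crimes-by-type.png"
fig.savefig(chart_path)
```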

With Altair Charts

2001-2022 Chicago crimes data loaded from a parquet file and summarized with Pandas and Altair charts:

Chicago Crimes Altair Charts Notebook

With PyScript

2022 Chicago crimes data loaded from a CSV file with data summary Altair charts in a browser, using Pyodide runtime and Pandas:

https://randomfractals.github.io/chicago-crimes/apps/pyscript/

Chicago Crimes PyScript Page

With Malloy Charts

Displaying 2022 Chicago crimes parquet data with Malloy Charts, using a Malloy import of the table source, measures, and data queries defined in the With Malloy Data section above:

Chicago Crimes Malloy Charts Summary

With Malloy Composer

View and query 2022 Chicago crime reports data loaded from parquet file with Malloy Composer app in your browser:

https://randomfractals.github.io/chicago-crimes/apps/malloy-composer/

Chicago Crimes Malloy Composer App

With Malloy Fiddle

View 2022 Chicago crime reports data, schema, Malloy model, and queries with Malloy Fiddle app in your browser:

https://randomfractals.github.io/chicago-crimes/apps/malloy-fiddle

Chicago Crime Reports Malloy Fiddle App

With DuckDB Sql Tools

Loading and querying 7,687,725 Chicago crime reports recorded from 2001 through the end of November 2022 from a large 1.68 GB CSV data file with the new VSCode DuckDB Sql Tools extension:

Chicago Crime Reports with DuckDB Sql Tools

Exporting in-memory DuckDB instance with DuckDB Sql Tools:

Chicago Crime Reports DuckDB Database Export

Exporting DuckDB instance in Parquet data format and importing it into new test DuckDB memory instance:

Chicago Crime Reports DuckDB Database Export

Prior Works

Links to our prior works on Chicago Crimes EDA circa 2017/2018:

🔗 Chicago Crimes EDA Summary on LinkedIn

📚 Chicago Crimes EDA 2017 Jupyter Notebooks

📚 Observable JS Chicago Crimes Notebook Collection 2018

📚 Chicago Homicides Observable JS Notebook Collection 2018

🐦 Chicago Crimes EDA, Data Preview and Tabular Data Viewer tweets
