This project was completed during an internship at Terranexum. See Terranexum's website.
In this project, I learned to use several data visualization and analysis libraries to graph the greenhouse gas emissions occurring on the power grid. To do so, I researched data sources for major emission locations, power line locations, and emission quantities; the sources I found are listed below under Data. From these I built static and interactive maps that show major emission sources and their proximity to the power grid, and I calculated the percentage of emissions that occur within 2.3 miles of power infrastructure.
I made a presentation explaining the whole project: Presentation
To use the notebooks, download the data each notebook requires (each notebook lists its required data in a header, and the download links are below). You may also need to install the libraries listed below.
- OpenStreetMap (through OSMnx)
- eGRID
- HIFLD
- EPA FLIGHT Data
- Canadian Open Source Data
- USGS State Border Data
A major part of this project was learning how to obtain, handle, and graph data, and one of the main libraries that let me do so was GeoPandas. GeoPandas lets a user build maps efficiently from shapefile and GeoJSON data by loading them into GeoDataFrames.
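As a minimal sketch of what loading data into a GeoDataFrame looks like, the example below builds one from two GeoJSON-style features. The feature names, the `kind` property, and the coordinates are made up for illustration and do not come from the project's data.

```python
import geopandas as gpd

# Hypothetical GeoJSON-style features standing in for a downloaded file.
features = [
    {
        "type": "Feature",
        "properties": {"name": "Example plant", "kind": "emitter"},
        "geometry": {"type": "Point", "coordinates": [-104.99, 39.74]},
    },
    {
        "type": "Feature",
        "properties": {"name": "Example line", "kind": "power line"},
        "geometry": {
            "type": "LineString",
            "coordinates": [[-105.10, 39.70], [-104.90, 39.80]],
        },
    },
]

# Build a GeoDataFrame; EPSG:4326 is plain longitude/latitude.
gdf = gpd.GeoDataFrame.from_features(features, crs="EPSG:4326")

# Each row carries a geometry, and ordinary pandas filtering still works.
geom_types = list(gdf.geom_type)
emitters = gdf[gdf["kind"] == "emitter"]

print(gdf[["name", "kind"]])
print(geom_types)
```

For files on disk, `gpd.read_file(path)` returns the same kind of object for both shapefiles and GeoJSON, and `gdf.plot()` draws it with Matplotlib.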
-
In the graph below, I learned how to obtain data from OpenStreetMap and graph it with a different color for each type of feature. (See notebook.ipynb for more information.)
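One way this could look with OSMnx is sketched below. The choice of the OSM `power` key, the helper names, and the place string are assumptions for illustration; the notebook may query different tags. Recent OSMnx releases expose `features_from_place`, while older ones call the same operation `geometries_from_place`.

```python
def power_tags(kinds=None):
    """Build an OSM tag filter for the 'power' key (hypothetical helper).

    kinds=None matches every feature tagged with 'power'; a list such as
    ["line", "substation"] restricts the query to those values.
    """
    return {"power": True if kinds is None else list(kinds)}


def fetch_power_features(place, kinds=None):
    """Download matching OpenStreetMap features as a GeoDataFrame.

    Needs network access and OSMnx, so the import is kept local to this call.
    """
    import osmnx as ox

    return ox.features_from_place(place, tags=power_tags(kinds))


def plot_by_power_type(gdf):
    """Color each feature by the value of its 'power' tag, with a legend."""
    return gdf.plot(column="power", categorical=True, legend=True, figsize=(8, 8))


# Example usage (requires network access; the place name is illustrative):
# gdf = fetch_power_features("Denver, Colorado, USA", ["line", "substation"])
# plot_by_power_type(gdf)
```

Passing `categorical=True` lets GeoPandas assign one color per distinct tag value, so no manual color table is needed.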
-
In the next graph, I learned how to overlay multiple data sources and use an additional data source to filter the data. (See overLayingData.ipynb for more information.)
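A minimal sketch of this pattern follows, assuming the extra data source is a boundary polygon used to keep only the points inside it. The rough bounding box and the two points are invented for the example, and the notebook may filter differently.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers: emission sources (points) and a boundary (polygon).
sources = gpd.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(-105.0, 39.0), Point(-100.0, 39.0)],
    crs="EPSG:4326",
)
boundary = gpd.GeoDataFrame(
    {"name": ["rough box around Colorado"]},
    geometry=[box(-109.05, 37.0, -102.05, 41.0)],
    crs="EPSG:4326",
)

# Filter: keep only the sources that fall inside the boundary layer.
filtered = gpd.clip(sources, boundary)


def plot_overlay(boundary, points):
    """Overlay: draw both layers on the same Matplotlib axes."""
    ax = boundary.plot(color="white", edgecolor="black", figsize=(8, 6))
    points.plot(ax=ax, color="red", markersize=20)
    return ax


print(filtered["name"].tolist())
```

Both layers must share a coordinate reference system before clipping or plotting together; `to_crs` converts one to match the other.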
-
Using the graph above, I used pandas to create a DataFrame containing only the emission sources within 2.3 miles of the power grid, then calculated what percentage of emissions came from within 2.3 miles of the power grid.
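The distance filter can be sketched as follows. This is one possible approach under stated assumptions, not necessarily the notebook's: the line, the three sources, the `co2_tons` column, and the choice of EPSG:5070 as the metric projection are all illustrative.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point
from shapely.ops import unary_union

MILES_TO_METERS = 1609.344
THRESHOLD_M = 2.3 * MILES_TO_METERS  # about 3,701 m

# Hypothetical data in longitude/latitude (EPSG:4326).
lines = gpd.GeoDataFrame(
    geometry=[LineString([(-106.0, 39.0), (-104.0, 39.0)])], crs="EPSG:4326"
)
sources = gpd.GeoDataFrame(
    {"co2_tons": [100.0, 300.0, 600.0]},
    geometry=[Point(-105.0, 39.01), Point(-105.5, 39.01), Point(-105.0, 39.2)],
    crs="EPSG:4326",
)

# Distances measured in degrees are not meaningful, so reproject first.
# EPSG:5070 (NAD83 / Conus Albers, in meters) is an assumed choice for the US.
lines_m = lines.to_crs("EPSG:5070")
sources_m = sources.to_crs("EPSG:5070")

# Merge all lines into one geometry and measure each source's distance to it.
grid = unary_union(list(lines_m.geometry))
sources_m["dist_m"] = sources_m.geometry.distance(grid)

# Keep only the sources within 2.3 miles of the grid.
near = sources_m[sources_m["dist_m"] <= THRESHOLD_M]

pct_emissions_near = 100 * near["co2_tons"].sum() / sources_m["co2_tons"].sum()
print(f"{pct_emissions_near:.1f}% of emissions are within 2.3 miles of a line")
```

For a national dataset, buffering the lines and using a spatial join such as `gpd.sjoin` may scale better than measuring against one merged geometry.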
-
I overlaid EPA FLIGHT data on major national CO2 emission sources onto HIFLD data showing all US transmission lines to find only the EPA FLIGHT emission sources that fell within 2.3 miles of the power grid. *This map is interactive, so when the program is loaded the user can zoom into different sections of the map.*
-
Using that overlay, I found the percentage of EPA FLIGHT reported emissions that fell within 2.3 miles of the power grid and the percentage of EPA FLIGHT emission sources that fell within 2.3 miles of the power grid.
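These two percentages answer different questions and can differ sharply when a few large emitters dominate. The sketch below shows the arithmetic on made-up numbers; the `co2_tons` and `near_grid` column names are hypothetical.

```python
import pandas as pd

# Hypothetical sources, each flagged by whether it lies within 2.3 miles.
df = pd.DataFrame(
    {
        "co2_tons": [900.0, 50.0, 30.0, 20.0],
        "near_grid": [True, False, False, True],
    }
)


def shares_near_grid(df):
    """Return (percent of emissions, percent of sources) near the grid."""
    near_total = df.loc[df["near_grid"], "co2_tons"].sum()
    pct_emissions = 100 * near_total / df["co2_tons"].sum()
    # The mean of a boolean column is the fraction of True values.
    pct_sources = 100 * df["near_grid"].mean()
    return pct_emissions, pct_sources


pct_emissions, pct_sources = shares_near_grid(df)
print(f"emissions: {pct_emissions:.1f}%, sources: {pct_sources:.1f}%")
```

Here half of the sources are near the grid, but they account for most of the emissions because one of them is much larger than the rest.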
-
I also found the percentage of greenhouse gas emissions within 2.3 miles of a transmission line for each state and graphed each state's percentage on a choropleth map.
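A per-state breakdown like this can be sketched with a pandas `groupby`. The column names, the sample values, and the `plot_choropleth` helper are assumptions for illustration; the helper expects a state-border GeoDataFrame with a matching `state` column.

```python
import pandas as pd

# Hypothetical flagged sources with a state code per row.
df = pd.DataFrame(
    {
        "state": ["CO", "CO", "CO", "WY", "WY", "NM"],
        "co2_tons": [100.0, 300.0, 600.0, 50.0, 150.0, 80.0],
        "near_grid": [True, True, False, False, True, False],
    }
)


def pct_near_by_state(df):
    """Percent of each state's emissions that lie near the grid."""
    total = df.groupby("state")["co2_tons"].sum()
    near = df[df["near_grid"]].groupby("state")["co2_tons"].sum()
    # States with no nearby sources are absent from `near`; count them as 0.
    near = near.reindex(total.index, fill_value=0.0)
    return (100 * near / total).rename("pct_near")


def plot_choropleth(states_gdf, pct, key="state"):
    """Join the percentages onto state polygons and shade by value."""
    merged = states_gdf.merge(pct.reset_index(), on=key)
    return merged.plot(column="pct_near", legend=True, cmap="viridis")


pct = pct_near_by_state(df)
print(pct)
```

The `reindex` step matters: without it, a state with no sources near the grid would show up as missing rather than as 0%.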
I also experimented with Plotly, another library that allows for more interactive data visualization. In the graph below, the data shown can be switched among three different data sets based on a user selection.
- I began experimenting with SRAI, an AI Python library that supports geospatial data analysis. I used SRAI to regionalize and embed spatial data about transmission lines in Denver, CO, to create a map showing the spatial distribution of transmission lines in Denver.
- I repeated the same process to show the density of transmission lines across Colorado.