Replace many Pandas operations with NumPy #198

Open
wants to merge 4 commits into base: main

Conversation

JCGoran
Contributor

@JCGoran JCGoran commented Jul 15, 2024

Describe your changes

  • Use NumPy instead of pandas to avoid pandas overhead
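
To illustrate the general idea only (a minimal sketch, not the actual diff in this PR): the function names `summary_stats_pandas` and `summary_stats_numpy` and the random test data are hypothetical. It assumes the statistics of interest are the means, sample standard deviations, and Pearson correlation of two columns named `x` and `y`.

```python
import numpy as np
import pandas as pd


def summary_stats_pandas(df: pd.DataFrame) -> tuple:
    """Hypothetical helper: compute the statistics with pandas methods."""
    return (
        df["x"].mean(),
        df["y"].mean(),
        df["x"].std(),  # pandas defaults to ddof=1 (sample standard deviation)
        df["y"].std(),
        df["x"].corr(df["y"]),  # Pearson correlation by default
    )


def summary_stats_numpy(x: np.ndarray, y: np.ndarray) -> tuple:
    """Hypothetical helper: the same statistics on plain NumPy arrays."""
    return (
        x.mean(),
        y.mean(),
        x.std(ddof=1),  # NumPy defaults to ddof=0, so request ddof=1 explicitly
        y.std(ddof=1),
        np.corrcoef(x, y)[0, 1],  # off-diagonal entry is the Pearson correlation
    )


# Made-up data purely for the comparison
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({"x": rng.normal(size=100), "y": rng.normal(size=100)})
x, y = df["x"].to_numpy(), df["y"].to_numpy()

# Both approaches should agree up to floating-point tolerance
print(np.allclose(summary_stats_pandas(df), summary_stats_numpy(x, y)))
```

The one detail that is easy to miss when swapping one for the other is the `ddof` default: pandas computes the sample standard deviation, while NumPy computes the population one unless told otherwise.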

Perf before:

320361688 function calls (316213771 primitive calls) in 116.715 seconds

Perf after:

79419311 function calls (78769517 primitive calls) in 43.085 seconds

That is roughly a 2.7x speedup (with about 4x fewer function calls), which is more or less in line with the circular shapes.
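
For anyone wanting to reproduce numbers in this format: the "function calls ... in ... seconds" summary is what the standard library's `cProfile`/`pstats` prints. The sketch below shows one way such a report might be collected; `workload` is a hypothetical stand-in, not the code that was actually profiled for this PR.

```python
import cProfile
import io
import pstats


def workload() -> int:
    """Hypothetical stand-in for the code being profiled."""
    return sum(i * i for i in range(100_000))


# Profile only the call of interest
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the report to a string buffer instead of stdout
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries

report = buffer.getvalue()
print(report)
```

Running the same script before and after a change, on the same inputs, gives two summary lines that can be compared directly.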

Checklist

  • Test cases have been modified/added to cover any code changes.
  • Docstrings have been modified/created for any code changes.
  • All linting and formatting checks pass (see the contributing guidelines for more information).

@github-actions github-actions bot added the testing (relating to the testing suite), shapes (work relating to shapes module), data (work relating to data module), plotting (work relating to plotting module), and morpher (work relating to morpher module) labels Jul 15, 2024

@github-actions github-actions bot left a comment

Congratulations on making your first pull request to Data Morph! Please familiarize yourself with the contributing guidelines, if you haven't already.

@stefmolin
Owner

Thanks for the PR, @JCGoran! As I'm sure you've seen, I have a backlog to get through 😄 I hope to get to this in the next few weeks.

codecov bot commented Jul 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.53%. Comparing base (e440ee7) to head (0a25272).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #198   +/-   ##
=======================================
  Coverage   98.53%   98.53%           
=======================================
  Files          58       58           
  Lines        1907     1915    +8     
  Branches      114      114           
=======================================
+ Hits         1879     1887    +8     
  Misses         25       25           
  Partials        3        3           
Files with missing lines             Coverage Δ
src/data_morph/data/dataset.py       74.07% <100.00%> (+0.65%) ⬆️
src/data_morph/data/stats.py         100.00% <100.00%> (ø)
src/data_morph/morpher.py            100.00% <100.00%> (ø)
src/data_morph/plotting/static.py    100.00% <100.00%> (ø)
tests/data/test_stats.py             100.00% <100.00%> (ø)
tests/test_morpher.py                100.00% <100.00%> (ø)

@stefmolin stefmolin added this to the 0.3.0 milestone Jul 16, 2024
Owner

@stefmolin stefmolin left a comment

Let's start by pulling the LineCollection changes into a separate PR.

src/data_morph/shapes/bases/line_collection.py Outdated
src/data_morph/shapes/bases/line_collection.py Outdated
@JCGoran JCGoran mentioned this pull request Jul 22, 2024
3 tasks
@JCGoran JCGoran force-pushed the jelic/feature/vectorize branch from 298870b to e708f6c July 22, 2024 22:08
@github-actions github-actions bot removed the shapes (work relating to shapes module) label Jul 22, 2024
@JCGoran JCGoran changed the title from "Refactor to use more numpy functions internally" to "Replace many Pandas operations with NumPy" Jul 22, 2024
@JCGoran JCGoran requested a review from stefmolin July 30, 2024 18:04
@JCGoran
Contributor Author

JCGoran commented Sep 24, 2024

Bump, this is more or less ready for review as-is.

@stefmolin
Owner

I haven't forgotten 😄 I'm going to work through the PyCon Taiwan sprint PRs first since I couldn't get to them all at the event, and I want to think more about the design of the internals here. I'm traveling right now and will have very limited time for the next couple of weeks.

Labels
data, morpher, plotting, testing

2 participants