Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pickleable particles #1122

Merged

Conversation

StanczakDominik
Copy link
Member

Closes #1011!

I do not fully understand why this works. But it seems that it does.

I was tinkering with slicing of IonizationState as part of #1059 (@namurphy I have actual arguments for that, and it's just the first of many travesties I want to do to IonizationState). I then somehow stumbled upon this, that got me thinking, I started looking at Particle.__dict__, noticed that json.dumps(particle) complains about _PeriodicTable contained within that __dict__, realized that's a namedtuple, thought hey, if namedtuples have trouble pickling (I don't understand why they would, that's my whole problem with this solutin!), maybe dataclasses wouldn't. And guess what, they don't. And the xfailing tests started passing.

@pheuer could you look at that issue you had previously with scipy.optimize, maybe even add a smaller test case of that somewhere as a failing (pytest.mark.xfail)? I'd like to check again. Maybe this got fixed.

@github-actions github-actions bot added the plasmapy.particles Related to the plasmapy.particles subpackage label Apr 21, 2021
@codecov
Copy link

codecov bot commented Apr 21, 2021

Codecov Report

Merging #1122 (740daed) into master (a23a160) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1122      +/-   ##
==========================================
- Coverage   96.91%   96.90%   -0.02%     
==========================================
  Files          70       70              
  Lines        6930     6936       +6     
==========================================
+ Hits         6716     6721       +5     
- Misses        214      215       +1     
Impacted Files Coverage Δ
plasmapy/particles/elements.py 100.00% <100.00%> (ø)
plasmapy/particles/ionization_state.py 93.18% <0.00%> (-0.33%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a23a160...740daed. Read the comment docs.

@StanczakDominik
Copy link
Member Author

Having had time to think this through on the evening doggo walk,

Nope, I got nothing.

@pheuer
Copy link
Member

pheuer commented Apr 21, 2021

I can write a somewhat contrived test using scipy.optimize (I'd have to make up a fitting function that uses a particle instance), but I'm pretty sure that if your new version is pickleable (which it seems like it is, if @namurphy 's test is passing now) then I think it should work with scipy too?

@StanczakDominik
Copy link
Member Author

It should! If it's more complex than necessary, don't bother :)

@pheuer
Copy link
Member

pheuer commented Apr 22, 2021

So, my attempts at writing a test function that would demonstrate the issue with differential_evolution in lmfit, parallelization, and particles didn't work. I did try just merging this over onto the thomson_fitting branch. That did clear the old error, which was :

PicklingError: Can't pickle <class 'plasmapy.particles.elements.periodic_table'>: attribute lookup periodic_table on plasmapy.particles.elements failed

But resulted in this new error

    ret = differential_evolution(self.penalty, _bounds, **fmin_kws)
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\site-packages\scipy\optimize\_differentialevolution.py", line 307, in differential_evolution
    constraints=constraints) as solver:
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\site-packages\scipy\optimize\_differentialevolution.py", line 501, in __init__
    self._mapwrapper = MapWrapper(workers)
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\site-packages\scipy\_lib\_util.py", line 373, in __init__
    self.pool = Pool()
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\pvheu\Anaconda3\envs\plasmapy\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

In summary, I think this fixes the pickling issue, although I'm not sure if the new error is related to your fix or was just always waiting after the first error.

@StanczakDominik
Copy link
Member Author

I'd still appreciate the code for that! If it's doing complex setup beforehand, that's fine; so long as it reproduces. Maybe I can pinpoint the reason why it's crashing.

@pheuer
Copy link
Member

pheuer commented Apr 22, 2021

To test in-vivo in the thomson fitting, just set fit_kws={'workers':-1} in run_fit() in test_thomson.py. That turns on the built-in parallelization.

But, I think I got a test working! The code below fits two models: one with a Particle in it and one without. On the main branch version, only the first fit executes, then the same "Not pickleable" error is raised during the second one. But, on your branch, both execute! This could be rearranged into a test, but it's computationally intensive and I think redundant with the pickling tests that already exist so I'd be inclined to leave it out of the official tests?

Don't worry about that second error above: that was my fault (not a problem with the code).

from lmfit import Parameters, Model

def _mock_model(x, settings=None, **params):
    particle = settings['particle']
    a = params['a']
    b = params['b']
    return particle.integer_charge*np.exp(-(x-a)**2/b**2)


def _mock_model2(x, settings=None, **params):
    a = params['a']
    b = params['b']
    return np.exp(-(x-a)**2/b**2)


def test_differential_evolution_oarallelization_with_particles():
    x = np.linspace(1, 5, num=1000)
    particle = Particle('H+')
    
    a = 3
    b = 1
    
    ydata = particle.integer_charge*np.exp(-(x-a)**2/b**2)*np.random.normal(loc=1.0, scale=0.01, size=x.size)
    
    
    params = Parameters()
    params.add('a', 5, min=0.1, max=10, vary=True)
    params.add('b', 5, min=0.1, max=10, vary=True)
    
    fit_kws = {'workers':-1, 'updating':'deferred'}

    
    # TEST 1: Create model without particle. This should work in parallel
    model = Model(_mock_model2, intependent_vars=["x"],
                  settings=None)
    
    
    res = model.fit(ydata, params,  x=x, fit_kws=fit_kws, method='differential_evolution')
    print(res.best_values)
    
    
    # TEST 2: Create model with the particle. Should fail if particles cannot
    # be pickled
    model = Model(_mock_model, intependent_vars=["x"],
                  settings={'particle':particle})
    
    res = model.fit(ydata, params,  x=x, fit_kws=fit_kws, method='differential_evolution')
    print(res.best_values)

I was inserting this test at the bottom of test_pickling.py

@StanczakDominik
Copy link
Member Author

Yeah, I suspect it's fine to leave it out. I can confirm that both pass on my branch.

Shall we merge it, then? :)

@pheuer
Copy link
Member

pheuer commented Apr 24, 2021

Sounds good to me! What does @namurphy think?

@StanczakDominik StanczakDominik merged commit 8e55f58 into PlasmaPy:master May 4, 2021
@StanczakDominik StanczakDominik deleted the pickleable_particles branch May 4, 2021 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plasmapy.particles Related to the plasmapy.particles subpackage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pickle-able Particles (and other objects, for parallelization)
3 participants