"Hooks" for PathSampling#911
Conversation
|
Great to see this! Sorry that I'm not going to be very responsive in the next week or two. Somehow I'm significantly busier on lockdown than at normal times. The current test failure is a known issue with the most recent conda-forge release of svgwrite. It should be fixed very soon -- in principle it is fixed; needs me to reply to someone though (one of the many things on my list). Quick comments on your questions/todos:
This is a good point. I'm being quick here, so I'm not even checking the code, but what should probably happen is to treat the What should not happen is, hook is added, implicitly takes the simulator's value. Then user changes simulator's value, and it has no effect. This would be a bug.
This seems fine; but is there also a way to ensure that the 0th step is visualized? (Maybe a The caveat on refreshing output is completely fine. There's no way around that, so as long as the docstrings are clear on the topic, all good.
Good question. As I recall, the idea was for Is the time info actually used by anything other than the output hook? You can probably move the time info into the hook -- get initial time at |
|
No problem. Concerning the points:
|
|
First off, the error you have now should be fixed as soon as you merge in master. It's a problem with the miniconda install (they moves the web page we use for checksums). Point-by-point:
Let me sketch out in code what I meant, which I think will address the issue. Let's pretend that the simulator has a property called class MyHook(PathSimulatorHook):
def __init__(self, frequency=None):
self._simulation = None
self._frequency = frequency
def before_simulation(self, sim):
self._simulation = sim
@property
def frequency(self):
if self._freq_val is not None:
return self._frequency
elif self._simulation is not None:
return self._simulation.frequency
else:
raise RuntimeError("`frequency` has not been set") # maybe return None?That isn't how I initially implemented it, but I think it (mostly) makes more sense. It solves the problem you're thinking of, right? If you ever set it in the hook, then the hook's value is used (even if you set it in the simulator later). If you haven't set it in the hook, the simulation's value is used. The only question is what to return when
If the first step can be visualized, great, but no big deal if it can't easily be done.
I'd say move timing into the hook. For any other hook that needs it, I'm not worried about the overhead of calling
I think this sounds like a very good idea. In the long run, I want to have mid-trajectory checkpointing, because there are cases where a single trajectory might be as long as your allowed wall time. However, this would be a very useful improvement in the meantime. If you implement this, please do a separate PR for it. It deserves a separate mention in the release notes. |
|
Yes, that solves the problem I had in mind. I went ahead and implemented it for all hooks. I moved the timing to the PathSamplingOutputHook, which has the added benefit that the test now also test the actual timing (and not only that we are able to print the correct string given the numbers). |
779c257 to
6e0f50a
Compare
Codecov Report
@@ Coverage Diff @@
## master #911 +/- ##
==========================================
+ Coverage 81.40% 81.47% +0.07%
==========================================
Files 140 140
Lines 15237 15362 +125
==========================================
+ Hits 12403 12516 +113
- Misses 2834 2846 +12
Continue to review full report at Codecov.
|
|
@dwhswenson I would say this is ready for review. I cleaned up a bit and added the |
…dified storage hook to also suit pathsampling
…ties to ensure a smooth user experience; PathsimulatorOutputHook now uses regex matching to allow for some leeway in the timing output
This will make it easier to implement other approaches to running path sampling. For example, this should make it possible to: * clean up run_until_decorrelated * allow running until a certain number of accepted trajectories * use algorithms that use a replica reservoir
7fbc388 to
b06666c
Compare
|
@dwhswenson Not morning anymore, but the branch is rebased on the current master and updated. But it looks like I will have to write some more test tomorrow... :) |
openpathsampling/beta/hooks.py
Outdated
| if self._output_stream is not None: | ||
| return self._output_stream | ||
| elif self._simulation is not None: | ||
| return self._simulation.output_stream | ||
| else: | ||
| raise SimulationNotFoundError("'output_stream' has not " |
There was a problem hiding this comment.
(Starting to glance over it, but a thought before you spend much time on remaining tests.) You might to create something like
def _self_or_sim_property(obj, attr_name):
if getattr(obj, "_" + attr_name) is not None:
return getattr(obj, "_" + attr_name)
... # etc.then use it as
@property
def output_stream(self):
return _self_or_sim_property(self, 'output_stream')This would allow you to test the underlying logic once, and not need to worry about creating multiple tests for each property. More complete integration test are also welcome, but this definitely seems like a motif that is being (and will continue to be) repeated.
There was a problem hiding this comment.
(You could add a return_none=False to that private method to make it return None instead of erroring when that it appropriate)
dwhswenson
left a comment
There was a problem hiding this comment.
Mostly looks good, and thanks for the contribution!
A few mostly minor changes. In particular:
- probably best to move
sanity_checkto its own hook - I don't think
LiveVisualizerHookshould be a default hook - please patch
time.timein your unit tests -- this should make a big difference on performance and reproducibility
BTW, also thanks for the TODO & other comments with questions. It makes very easy to find points that should be considered.
|
@dwhswenson All done (I think).
if self.storage is not None:
self.storage.sync_all()This should work in all cases, namely if the storage set only in the hook and also if the storage is taken from the simulation (because then it will be at |
|
So I just realized that I did not address the |
|
Looks good, as soon as the try/except clause for SimStore vs netcdfplus saving is added!
I agree. I'm not sure why there was a thing to avoid doing a second |
Done.
That was me beeing confused from what we previously had if self.storage is not None:
sim.storage.sync_all()which did not make sense to me (checking |
Ah, I understand. The Anyway, all looks good now! Merging -- thanks again! |
This builds on #755 and should only be merged after that.
It adds hook support to the
PathSampling, thePathSamplingOutputHookand theLiveVisualizerHookincluding tests.Please let me know if htere is anything to improve!
I also have some open questions/TODOs, in particular:
if a user sets any attribute that controls the behaviour of the hook on the
PathSimulatorobject it is only read once at the start of the simulation and only if the hooks value for that attribute isNone(see e.g.output_stream,save_frequencyandstorage). This is potentially confusing, as I would expect to be able to set these attributes on thePathSimulatorto influence the behaviour of the currently attached hooks. I am mot really sure how to adress this. Potential solutions include removing these attributes from thePathSimulatorand only set them on hook creation. Another option is to make them properties to either (re)set the respective attributes on the hooks (or at least warn the user that it could not work as intended) if he/she sets the attributes on thePathsimulatorPreviously the LiveVisualizer was called before the MCstep but visualizing the previous step. I moved the LiveVisualization to after step. This enables visualization of the step that just finished and avoids that the beginning of the next step depends on the previous one. But it also means that visualization only works if the serial OutputHook has
allow_refresh=False, otherwise the plot will get refreshed away.The
step_infopassed to theafter_stephooks from thePathSamplingis specifically tailored for thePathSamplingOutputHook, i.e. it only takes the variables the OutputHook needs (current step, total steps, total elapsed time). @dwhswenson any ideas what else we could want/need in there in the future?