I used to live in Jupyter. Every project started with a notebook, every analysis stayed in a notebook, every model got trained from a notebook. It felt productive. It wasn't.
Here is what actually happened to every notebook I started, in order:
- Cell 1: imports
- Cells 2–10: load data, look at it, look at it again
- Cells 11–30: try things, comment them out, try other things
- Cell 31: it works
- Three weeks later: nothing works, no idea why, kernel won't restart
The problem isn't the notebook. The problem is that I confused exploring with building. Notebooks are great for the first and terrible for the second. The cell-as-unit-of-work model encourages you to keep state in your head and in your kernel, which are exactly the two worst places to keep state.
## What I do now
I still open a notebook for the first thirty minutes of any project. Look at the data, get a feel for it, run a few queries. Then I close it and never open it again. Everything that needs to live longer than thirty minutes goes into a `.py` file with an `if __name__ == "__main__":` block at the bottom. Everything that needs to be reproducible goes into a script I can run from the command line.
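The skeleton I reach for looks roughly like this. A minimal sketch, not a prescription: the file name, function names, and sample data are all hypothetical, and the point is only the shape, logic in plain functions, a single entry point behind the main guard, and no side effects at import time.

```python
# analyze.py -- hypothetical sketch of the notebook-to-script pattern:
# all logic in importable functions, entry point behind the main guard.
import argparse
import json

# Stand-in data so the script runs with no arguments; a real project
# would always pass a file path.
SAMPLE = [{"x": 1}, {"x": 2}]


def load_data(path):
    """Load records from a JSON Lines file, one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f]


def summarize(records):
    """Toy stand-in for real analysis: just count the records."""
    return {"n_records": len(records)}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Reproducible analysis entry point")
    parser.add_argument("path", nargs="?",
                        help="input .jsonl file (defaults to built-in sample data)")
    args = parser.parse_args(argv)
    records = load_data(args.path) if args.path else SAMPLE
    print(summarize(records))


if __name__ == "__main__":
    main()
```

Because nothing runs at import time, the same functions can still be pulled into a notebook for poking around (`from analyze import load_data`), while `python analyze.py data.jsonl` stays the reproducible path.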
It's slower. It's also the only way I've found to ship anything that still works a month later.
## The exception
The only place notebooks still earn their keep is in communication. If I need to show someone how a model behaves, a notebook with charts beats a script every time. But that's the last step, not the first.