jupyter-cache 1.0.0
A defined interface for working with a cache of jupyter notebooks.
pip install jupyter-cache
Requires Python: >=3.9
Dependencies
- attrs
- click
- importlib-metadata
- nbclient >=0.2
- nbformat
- pyyaml
- sqlalchemy >=1.3.12,<3
- tabulate
- click-log ; extra == "cli"
- pre-commit >=2.12 ; extra == "code_style"
- nbdime ; extra == "rtd"
- ipykernel ; extra == "rtd"
- jupytext ; extra == "rtd"
- myst-nb ; extra == "rtd"
- sphinx-book-theme ; extra == "rtd"
- sphinx-copybutton ; extra == "rtd"
- nbdime ; extra == "testing"
- coverage ; extra == "testing"
- ipykernel ; extra == "testing"
- jupytext ; extra == "testing"
- matplotlib ; extra == "testing"
- nbformat >=5.1 ; extra == "testing"
- numpy ; extra == "testing"
- pandas ; extra == "testing"
- pytest >=6,<8 ; extra == "testing"
- pytest-cov ; extra == "testing"
- pytest-regressions ; extra == "testing"
- sympy ; extra == "testing"
Why use jupyter-cache?
Use it if you have a number of notebooks whose execution outputs you want to keep up to date without re-executing them every time (particularly for long-running code, or for text-based formats that do not store the outputs).
The notebooks must have deterministic execution outputs:
- You use the same environment to run them (e.g. the same installed packages)
- They run no non-deterministic code (e.g. random numbers)
- They do not depend on external resources (e.g. files or network connections) that change over time
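One common way to meet the second condition is to fix the seed of any random number generator a notebook uses. The following is a minimal sketch using only the Python standard library; `simulate` is a hypothetical stand-in for a notebook cell and is not part of jupyter-cache.

```python
import random


def simulate(seed: int) -> list:
    """Return three pseudo-random numbers drawn from a seeded generator.

    Hypothetical stand-in for a notebook cell; not part of jupyter-cache.
    """
    # A dedicated, seeded generator avoids relying on global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


# Two runs with the same seed produce identical values,
# so the stored outputs stay valid between executions.
first_run = simulate(seed=0)
second_run = simulate(seed=0)
print(first_run == second_run)
```

Without a fixed seed, each execution would produce different outputs, and previously cached outputs would no longer match what the code produces.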
For example, jupyter-book uses it to allow fast document rebuilds.
Install
pip install jupyter-cache
For development:
git clone https://github.com/executablebooks/jupyter-cache
cd jupyter-cache
git checkout develop
pip install -e ".[cli,code_style,testing]"
See the documentation for usage.
Development
Some desired requirements (not yet all implemented):
- Persistent
- Separates out "edits to content" from "edits to code cells". Cell rearrangements and code cell changes should require a re-execution; content changes should not.
- Allow parallel access to notebooks (for execution)
- Store execution statistics/reports
- Store external assets: notebooks being executed often require external assets (imported scripts, data, etc.), which are prepared by the users.
- Store execution artefacts: created during execution
- A transparent and robust cache invalidation: imagine the user updating an external dependency or a Python module, or checking out a different git branch.
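The requirement to separate content edits from code cell edits can be illustrated by hashing only the ordered code cell sources of a notebook. This is a hedged sketch of the idea, not jupyter-cache's actual implementation: `code_hash` is a hypothetical helper, and the notebooks are plain dictionaries that loosely follow the notebook JSON layout.

```python
import hashlib
import json


def code_hash(notebook: dict) -> str:
    """Hash the ordered sources of code cells only (hypothetical helper)."""
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells do not affect the hash
        source = cell.get("source", "")
        if isinstance(source, list):
            # Notebook JSON may store a cell's source as a list of lines.
            source = "".join(source)
        sources.append(source)
    # Serialising the list keeps cell boundaries and cell order significant.
    payload = json.dumps(sources).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


original = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "a = 1"},
    {"cell_type": "code", "source": "print(a)"},
]}
text_edit = {"cells": [
    {"cell_type": "markdown", "source": "# A better title"},
    {"cell_type": "code", "source": "a = 1"},
    {"cell_type": "code", "source": "print(a)"},
]}
code_edit = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "a = 2"},
    {"cell_type": "code", "source": "print(a)"},
]}
reordered = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "print(a)"},
    {"cell_type": "code", "source": "a = 1"},
]}

print(code_hash(original) == code_hash(text_edit))   # content edit: hash unchanged
print(code_hash(original) == code_hash(code_edit))   # code edit: hash differs
print(code_hash(original) == code_hash(reordered))   # rearrangement: hash differs
```

Under this scheme, a changed hash signals that re-execution is needed, while edits to markdown cells leave the hash untouched. It does not address the last requirement, since changes to external dependencies are invisible to a hash of the notebook alone.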
Contributing
jupyter-cache follows the Executable Book Contribution Guide. We'd love your help!
Code Style
Code style is tested using flake8, with the configuration set in .flake8, and code is formatted with black.
Installing with jupyter-cache[code_style] makes the pre-commit package available, which ensures this style is met before commits are submitted by reformatting the code and testing for lint errors.
It can be set up by:
>> cd jupyter-cache
>> pre-commit install
Optionally, you can run black and flake8 separately:
>> black .
>> flake8 .
Editors like VS Code also have automatic code reformat utilities, which can adhere to this standard.