nblite

A lightweight wrapper around nbdev for streamlined notebook-driven development

nblite simplifies the workflow between Jupyter notebooks, Python scripts, and module code, enhancing the notebook-driven development process.

Note: nblite is merely a wrapper around nbdev, with some adjustments and additions adapted to the needs of the Autonomy Data Unit. Full credit for the concept and implementation of notebook-driven development with Jupyter notebooks goes to the creators of nbdev.

Installation

pip install nblite

Core Concepts

Code locations

Code locations are directories that hold your code in a particular format (notebooks, scripts, or modules). Each code location is defined in the nblite.toml configuration file and stores one representation of your code. The available formats are:

Format            Format key  File Extension
Python module     module      py
Jupyter notebook  ipynb       ipynb
Percent           percent     pct.py
Light             light       lgt.py
Sphinx            sphinx      spx.py
MyST              myst        myst.md
Pandoc            pandoc      pandoc.md

In nblite.toml you define the code locations and the format of the code in each of them:

[cl.nbs]
format="ipynb"
path="notebooks"

[cl.pts]
format="percent"
path="percent_notebooks"

[cl.lib]
format="module"
path="nblite"

Here we have defined three code locations (nbs, pts and lib) and specified their paths (relative to the project root) and their formats. Read more about plaintext notebook formats here.

Export pipeline

Defines the flow of code conversion between different code locations. For example, a typical pipeline might be:

nbs -> pts
pts -> lib

This means:

  1. Start with notebooks (.ipynb) as the source
  2. Convert them to percent scripts (.pct.py)
  3. Finally, export to Python library modules (.py)
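Given the example configuration above, a notebook in the nbs code location propagates through the following files (the notebook name is illustrative):

notebooks/my_notebook.ipynb             # source notebook (cl.nbs)
percent_notebooks/my_notebook.pct.py    # percent-script twin (cl.pts)
nblite/my_notebook.py                   # exported Python module (cl.lib)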

Notebook ‘twins’

Corresponding versions of the same content in different formats. When you write a notebook my_notebook.ipynb, nblite can create twins like:

  • my_notebook.pct.py (percent script)
  • my_notebook.lgt.py (light script)
  • my_module/my_notebook.py (Python module)

These twins contain the same logical content but in different formats, allowing you to use the format that’s most appropriate for the task at hand.
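As a rough illustration, a notebook with one markdown cell and one code cell might look like this in its percent-script twin (the sketch follows the common # %% cell-marker convention; the exact header metadata nblite writes may differ, and the function is purely illustrative):

# %% [markdown]
# # My notebook
# A short description of what this notebook does.

# %%
def add(a, b):
    """Add two numbers."""
    return a + b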

Why store plaintext versions?

While Jupyter notebooks (.ipynb) are excellent for interactive development, they pose challenges for version control systems like Git:

  1. Git-Friendly: Plaintext formats (.pct.py, .lgt.py, .py) are better handled by Git, making diffs and merge conflicts easier to resolve.
  2. GitHub UI: GitHub’s interface more effectively displays changes in plaintext Python files compared to JSON-formatted notebook files.
  3. Code Review: Reviewing code changes is more straightforward with plaintext formats.
  4. Cleaner History: By cleaning notebook outputs before committing, you avoid polluting your Git history with large output cells and changing execution counts.
  5. Collaboration: Team members can work with the format they prefer—notebooks for exploration, Python files for implementation.

The export pipeline ensures that changes made in one format are propagated to all twins, maintaining consistency across representations.

Key Features

  • Export Pipeline: Convert notebooks between different formats (.ipynb, percent scripts, light scripts, and Python modules)
  • Documentation: Generate documentation from notebooks using Quarto
  • Git Integration: Clean notebooks and enforce consistent git commits
  • Parallel Execution: Execute notebooks in parallel for a faster workflow
  • Export as Functions: Notebooks can be exported as functions

Quick Start

Initialize a project

# Create a new nblite project
nbl init --module-name my_project
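This creates an nblite.toml at the project root along with the configured code locations. The exact layout depends on the options you pass, but as a rough, illustrative sketch:

my_project/
├── nblite.toml      # project configuration
├── nbs/             # notebook code location (ipynb)
└── my_project/      # exported module code location (py)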

Set up Git hooks

# Install pre-commit hooks for automatic notebook cleaning
nbl install-hooks

Git hooks ensure that notebooks are properly cleaned before committing. The pre-commit hook automatically:

  • Validates that notebooks are clean (no stray metadata or outputs)
  • Ensures that all notebook twins are consistent
  • Prevents accidental commits of unclean notebooks
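You can run the same check manually at any time with the staging validation command described under Git Integration below:

nbl validate-staging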

Create a new notebook

# Create a new notebook in a code location
nbl new nbs/my_notebook.ipynb
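Since nblite wraps nbdev, a new notebook typically starts with directives that control the export; the cells below assume that nblite supports nbdev's #| default_exp and #| export directives, and the function is a placeholder for illustration:

#| default_exp my_notebook

#| export
def greet(name):
    """Example function exported to the module twin."""
    return f"Hello, {name}!"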

Fill notebooks with outputs

The nbl fill command executes all cells in every .ipynb notebook and writes the resulting outputs back into the notebook files.

nbl fill

Because it runs every notebook from start to finish, nbl fill also serves as a basic test that your notebooks execute without errors.
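To run the same execution purely as a check, without writing outputs back into the notebooks, use the dry-run variant:

nbl test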

Prepare your project

# Export, clean, and fill notebooks in one command
nbl prepare

Configuration

nblite uses a TOML configuration file (nblite.toml) at the project root:

export_pipeline = """
nbs -> pts
pts -> lib
"""
docs_cl = "nbs"
docs_title = "My Project"

[cl.lib]
path = "my_module"
format = "module"

[cl.nbs]
format = "ipynb"

[cl.pts]
format = "percent"

Common Commands

Run nbl to see all available commands.

Export and Conversion

  • nbl export: Export notebooks according to the export pipeline
  • nbl convert <nb_path> <dest_path>: Convert a notebook between formats (see the example after this list)
  • nbl clear: Clear downstream code locations
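For example, to convert a single notebook from its .ipynb form to a percent script (the paths are illustrative and follow the code locations defined earlier):

nbl convert notebooks/my_notebook.ipynb percent_notebooks/my_notebook.pct.py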

Notebook Management

  • nbl clean: Clean notebooks by removing outputs and metadata
  • nbl fill: Execute notebooks and fill with outputs
  • nbl test: Test that notebooks execute without errors (dry run of fill)

Documentation

  • nbl readme: Generate README.md from index.ipynb
  • nbl render-docs: Render project documentation using Quarto
  • nbl preview-docs: Preview documentation

Git Integration

  • nbl git-add: Add files to git staging with proper cleaning
  • nbl validate-staging: Validate that staged notebooks are clean
  • nbl install-hooks: Install git hooks for the project

Development Workflow

  1. Write code in Jupyter notebooks (.ipynb)
  2. Run nbl export to convert to other formats
  3. Run nbl clean before committing to git
  4. Use nbl fill to verify that your notebooks execute correctly and to store their outputs (or nbl test if you don't want outputs written back)
  5. Use nbl render-docs to generate documentation, or nbl preview-docs to preview it
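Put together, one iteration of this workflow might look like the following on the command line (using only the commands described above; nbl prepare bundles the export, clean, and fill steps into a single command):

# after editing notebooks in the nbs code location
nbl export                          # propagate changes to all twins
nbl test                            # check that every notebook runs without errors
nbl clean                           # strip outputs and metadata
git add .
git commit -m "Update notebooks"    # the pre-commit hook validates the staged notebooks
nbl render-docs                     # regenerate the project documentation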