NOTE: This text has been automatically extracted from a Colab/Jupyter notebook. If you have any questions, feel free to leave a comment there (requires sign in with a Google account).


NOTE: If you are on Colab, you can download the code below as a single script using File ▶ Download ▶ Download .py. The script can then be run with python3 ipynbtomarkdown.py <notebook.ipynb>

Given Jupyter Notebooks are JSON files, we need to import the json module. The sys module will be used to get the notebook file name from arguments.

import json
import sys

For Markdown cells, print as is:

def emit_markdown(src):
    return '\n{}\n'.format(''.join(src))

If a title is detected, print it as a YAML front matter:

def extract_front_matter(idx, src):
    if idx == 0 and src[0].startswith('#'):
        return src[1:], '---\ntitle: {}\n---'.format(src[0][1:].strip())
    else:
        return src, ''

For code cells, wrap in a code block with Python syntax highlighting, except for those that start with a %%bash cell magic, for which we print without any syntax highlighting.

def emit_code(src):
    return '\n```{}\n{}\n```\n'.format('' if src[0] == '%%bash\n' else 'python',''.join(src))

Parse the JSON from the notebook and iterate on each cell, printing it based on the type:

with open(sys.argv[1], 'r') as f:
    data = json.load(f)
    assert data['metadata']['language_info']['name'] == 'python'
    markdown = ''
    for idx, c in enumerate(data['cells']):
        if c['cell_type'] == 'markdown':
            src, front_matter = extract_front_matter(idx, c['source'])
            markdown += front_matter + emit_markdown(src)
        elif c['cell_type'] == 'code':
            markdown += emit_code(c['source'])
    print(markdown + '\n<!-- Generated with ipynbtomarkdown.py -->')

Known issues

  • Plain URLs (e.g. https://example.com/) are not converted to links automatically. The workaround is to use the [...](...) Markdown syntax.