tabula-py: Read tables in a PDF into DataFrame

tabula-py is a simple Python wrapper of tabula-java, which can read tables in a PDF. You can read tables from a PDF and convert them into pandas DataFrames. tabula-py can also convert a PDF file into a CSV, TSV, or JSON file.
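As a rough illustration of the two capabilities described above, the sketch below reads tables into DataFrames and optionally writes them to a CSV file. It relies on `tabula.read_pdf` and `tabula.convert_into` from tabula-py. The helper name `extract_tables` and its `backend` parameter are hypothetical and exist only for this example. It also assumes a recent tabula-py release in which `read_pdf` returns a list of DataFrames, and a working Java runtime for tabula-java.

```python
def extract_tables(pdf_path, csv_path=None, pages="all", backend=None):
    """Return the tables found in pdf_path, as reported by the backend.

    If csv_path is given, also ask the backend to write the tables to CSV.
    `backend` is a hypothetical hook for this sketch; it defaults to tabula.
    """
    if backend is None:
        # Imported lazily so this function can be defined even where
        # tabula-py (and Java) are not installed.
        import tabula as backend

    # With tabula as the backend, this is assumed to return a list of
    # pandas DataFrames, one per detected table.
    tables = backend.read_pdf(str(pdf_path), pages=pages)

    if csv_path is not None:
        # Writes the extracted tables directly to a file, without
        # going through DataFrames.
        backend.convert_into(
            str(pdf_path), str(csv_path), output_format="csv", pages=pages
        )

    return tables
```

Calling `extract_tables("example.pdf", csv_path="example.csv")` (both paths are placeholders) would return the detected tables and write `example.csv` alongside. Passing `output_format="tsv"` or `"json"` to `convert_into` is how the other output formats would be selected.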

We highly recommend looking at the example notebook and trying it on Google Colab.

For the high-level API reference, see High level interfaces.
