PaddleOCR is a multilingual OCR toolkit, implemented in PaddlePaddle, that helps you train models and apply them in production. It supports a range of OCR algorithms and includes industrial solutions. PaddleOCR also includes a W&B integration that logs training and evaluation metrics along with model checkpoints and their metadata.

This page shows you how to enable the W&B integration in PaddleOCR so that your OCR training runs automatically stream metrics, validation results, and checkpoint metadata to a W&B dashboard. Use this integration to compare experiments, monitor training in real time, and keep a versioned history of your OCR models.

Example blog and Colab

See the PaddleOCR and W&B training tutorial to learn how to train a model with PaddleOCR on the ICDAR2015 dataset. The tutorial comes with a Google Colab and the corresponding live W&B dashboard. A Chinese version of the blog post is also available: W&B对您的OCR模型进行训练和调试.

Sign up and create an API key

An API key authenticates your machine to W&B. You can generate an API key from your user profile:
  1. Click your user profile icon in the upper right corner.
  2. Select User Settings, then scroll to the API Keys section.
For a more streamlined approach, create an API key by going directly to User Settings. Copy the newly created API key immediately and save it in a secure location such as a password manager.

Install the wandb library and log in

To install the wandb library locally and log in:
  1. Set the WANDB_API_KEY environment variable to your API key.
    export WANDB_API_KEY=[YOUR_API_KEY]
    
  2. Install the wandb library and log in.
    pip install wandb
    
    wandb login
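
Before starting a long training run, it can help to confirm that both steps above took effect. The following is a minimal sketch using only the Python standard library. The function name `wandb_preflight` is hypothetical and is not part of wandb or PaddleOCR. It checks only the environment-variable route, so if you authenticated solely through `wandb login`, the API key warning may not apply to you.

```python
import importlib.util
import os


def wandb_preflight(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # The install steps above export the key as WANDB_API_KEY.
    if not env.get("WANDB_API_KEY", "").strip():
        problems.append("WANDB_API_KEY is not set or is empty")
    # find_spec locates the package without importing it.
    if importlib.util.find_spec("wandb") is None:
        problems.append("the wandb package is not installed (pip install wandb)")
    return problems


if __name__ == "__main__":
    for problem in wandb_preflight():
        print("warning:", problem)
```

Run the script in the same shell and environment you plan to train in, since the environment variable and the installed packages are specific to that session.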
    

Add wandb to your config.yml file

PaddleOCR reads its configuration variables from a YAML file. To enable W&B logging, set use_wandb to True under the Global section of that file. If the file already has a Global section, add the key there rather than declaring Global a second time. This setting configures PaddleOCR to automatically log all training and validation metrics to a W&B dashboard along with model checkpoints:
Global:
    use_wandb: True
You can also add any additional, optional arguments that you want to pass to wandb.init() under the wandb header in the YAML file:
wandb:
    project: CoolOCR  # (optional) this is the wandb project name
    entity: my_team   # (optional) if you're using a wandb team, you can pass the team name here
    name: MyOCRModel  # (optional) this is the name of the wandb run
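
To make the relationship between the two YAML fragments concrete, here is a small Python sketch of how the settings could translate into keyword arguments for wandb.init(). It is an illustration under stated assumptions, not PaddleOCR's actual logger code: the `config` dictionary stands in for the parsed YAML file, and the helper `wandb_init_kwargs` is a hypothetical name.

```python
# Hypothetical in-memory equivalent of the two YAML fragments above.
config = {
    "Global": {"use_wandb": True},
    "wandb": {"project": "CoolOCR", "entity": "my_team", "name": "MyOCRModel"},
}


def wandb_init_kwargs(config):
    """Return keyword arguments for wandb.init(), or None if W&B is disabled."""
    if not config.get("Global", {}).get("use_wandb", False):
        return None
    # A missing or empty wandb section means wandb.init() uses its defaults.
    return dict(config.get("wandb") or {})


kwargs = wandb_init_kwargs(config)
print(kwargs)
# With wandb installed, the equivalent call would be: wandb.init(**kwargs)
```

Because every key in the wandb section is optional, leaving the section out should still produce a run, with W&B choosing the project and run name for you.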

Pass the config.yml file to train.py

With W&B logging configured, start training by passing the YAML file as an argument to the training script available in the PaddleOCR repository.
python tools/train.py -c config.yml
When you run train.py with W&B enabled, PaddleOCR outputs a link to your W&B dashboard, where you can monitor training and validation metrics in real time.
[Figures: PaddleOCR training dashboard; PaddleOCR validation dashboard; text detection model dashboard]

Feedback or issues

If you have feedback on the W&B integration or run into issues with it, open an issue on the PaddleOCR GitHub repository or email support@wandb.com.