This page shows how to integrate Weights & Biases (W&B) with Simple Transformers so you can visualize and track Transformer model training. By the end, you’ll know how to enable W&B logging from a Simple Transformers model and where to find examples for common NLP tasks.

Simple Transformers is based on the Transformers library by Hugging Face and lets you train and evaluate Transformer models. You need only three lines of code to initialize, train, and evaluate a model. It supports sequence classification, token classification (NER), question answering, language model fine-tuning, language model training, language generation, T5 models, Seq2Seq tasks, multi-modal classification, and conversational AI.

To use W&B for visualizing model training, set a W&B project name in the wandb_project key of the args dictionary. This logs all hyperparameter values, training losses, and evaluation metrics to the given project.
from simpletransformers.classification import ClassificationModel

model = ClassificationModel('roberta', 'roberta-base', args={'wandb_project': 'project-name'})
To pass additional arguments to wandb.init(), set them in the wandb_kwargs key of the same args dictionary.
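As a minimal sketch of the wandb_kwargs option, the snippet below builds an args dictionary that sets a project and forwards a few keyword arguments to wandb.init(). The helper name build_wandb_args, the run name, and the tags are hypothetical and chosen only for illustration; name and tags are standard wandb.init() parameters.

```python
def build_wandb_args(project, **init_kwargs):
    """Return a Simple Transformers args dict with W&B logging enabled.

    build_wandb_args is a hypothetical helper, not part of either library.
    """
    return {
        "wandb_project": project,
        # Per the text above, these are passed on to wandb.init().
        "wandb_kwargs": dict(init_kwargs),
    }


args = build_wandb_args(
    "project-name",
    name="roberta-baseline",       # illustrative run name
    tags=["roberta", "baseline"],  # illustrative tags
)
print(args)

# Assumed usage (requires simpletransformers to be installed):
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel("roberta", "roberta-base", args=args)
```

Keeping the W&B settings in one dictionary makes it easy to reuse the same project and run metadata across several models.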

Structure

This section outlines how Simple Transformers organizes its classes, so you know which module to import for a given task. The library provides a separate class for every NLP task, and classes that provide similar functionality are grouped into the same module.
  • simpletransformers.classification - Includes all classification models.
    • ClassificationModel
    • MultiLabelClassificationModel
  • simpletransformers.ner - Includes all named entity recognition models.
    • NERModel
  • simpletransformers.question_answering - Includes all question answering models.
    • QuestionAnsweringModel
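The module layout above can be captured as a small lookup table. This is only a sketch: the task keys and the load_model_class helper are hypothetical, while the module and class names are the ones listed above. Calling the helper requires simpletransformers to be installed.

```python
import importlib

# Task key -> (module, class). Keys are illustrative; module and class
# names come from the list above.
TASK_CLASSES = {
    "classification": ("simpletransformers.classification", "ClassificationModel"),
    "multilabel": ("simpletransformers.classification", "MultiLabelClassificationModel"),
    "ner": ("simpletransformers.ner", "NERModel"),
    "question_answering": ("simpletransformers.question_answering", "QuestionAnsweringModel"),
}


def load_model_class(task):
    """Import and return the model class for a task key (hypothetical helper)."""
    module_name, class_name = TASK_CLASSES[task]
    module = importlib.import_module(module_name)  # needs simpletransformers installed
    return getattr(module, class_name)


# Example (requires simpletransformers):
# NERModel = load_model_class("ner")
```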
The following sections describe minimal examples for two common tasks, demonstrating how to enable W&B logging through the wandb_project argument.

Multi-label classification

from simpletransformers.classification import MultiLabelClassificationModel

# epochs, learning_rate, train_df, and eval_df are assumed to be defined earlier.
model = MultiLabelClassificationModel(
    "distilbert", "distilbert-base-uncased", num_labels=6,
    args={
        "reprocess_input_data": True, "overwrite_output_dir": True,
        "num_train_epochs": epochs, "learning_rate": learning_rate,
        "wandb_project": "simpletransformers",
    },
)

# Train the model
model.train_model(train_df)

# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(eval_df)

Question answering

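import wandb
from simpletransformers.question_answering import QuestionAnsweringModel

# wandb.config.learning_rate assumes an active W&B run (for example, one
# started by a sweep) whose config defines learning_rate.
# train_data is assumed to be defined earlier.
train_args = {
    'learning_rate': wandb.config.learning_rate,
    'num_train_epochs': 2,
    'max_seq_length': 128,
    'doc_stride': 64,
    'overwrite_output_dir': True,
    'reprocess_input_data': False,
    'train_batch_size': 2,
    'fp16': False,
    'wandb_project': "simpletransformers",
}

model = QuestionAnsweringModel('distilbert', 'distilbert-base-cased', args=train_args)
model.train_model(train_data)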
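The question answering example reads learning_rate from wandb.config, which is populated only inside an active W&B run, such as one launched by a sweep. The sketch below assumes that setup. The sweep values and the build_train_args helper are illustrative, not taken from the original example; wandb.sweep() and wandb.agent() are the standard W&B sweep entry points.

```python
# Illustrative grid sweep over the learning rate (values are assumptions).
sweep_config = {
    "method": "grid",
    "parameters": {"learning_rate": {"values": [2e-5, 4e-5]}},
}


def build_train_args(config):
    """Build training args from a config mapping (hypothetical helper).

    `config` can be wandb.config inside a run, or a plain dict for testing.
    """
    return {
        "learning_rate": config["learning_rate"],
        "num_train_epochs": 2,
        "overwrite_output_dir": True,
        "wandb_project": "simpletransformers",
    }


# Works without W&B by passing a plain dict.
train_args = build_train_args({"learning_rate": 4e-5})
print(train_args)

# Assumed wiring with W&B and Simple Transformers (requires both installed,
# plus a train_data object defined elsewhere):
# import wandb
# from simpletransformers.question_answering import QuestionAnsweringModel
#
# def train():
#     wandb.init()
#     model = QuestionAnsweringModel(
#         "distilbert", "distilbert-base-cased", args=build_train_args(wandb.config)
#     )
#     model.train_model(train_data)
#
# sweep_id = wandb.sweep(sweep_config, project="simpletransformers")
# wandb.agent(sweep_id, function=train)
```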

Global arguments

Simple Transformers provides classes and training scripts for all common natural language tasks. The following is the complete list of global arguments that the library supports, with their default values. Refer to this list when you want to customize training behavior beyond the W&B-specific options shown earlier.
from multiprocessing import cpu_count  # used by the process_count default

global_args = {
  "adam_epsilon": 1e-8,
  "best_model_dir": "outputs/best_model",
  "cache_dir": "cache_dir/",
  "config": {},
  "do_lower_case": False,
  "early_stopping_consider_epochs": False,
  "early_stopping_delta": 0,
  "early_stopping_metric": "eval_loss",
  "early_stopping_metric_minimize": True,
  "early_stopping_patience": 3,
  "encoding": None,
  "eval_batch_size": 8,
  "evaluate_during_training": False,
  "evaluate_during_training_silent": True,
  "evaluate_during_training_steps": 2000,
  "evaluate_during_training_verbose": False,
  "fp16": True,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 1,
  "learning_rate": 4e-5,
  "local_rank": -1,
  "logging_steps": 50,
  "manual_seed": None,
  "max_grad_norm": 1.0,
  "max_seq_length": 128,
  "multiprocessing_chunksize": 500,
  "n_gpu": 1,
  "no_cache": False,
  "no_save": False,
  "num_train_epochs": 1,
  "output_dir": "outputs/",
  "overwrite_output_dir": False,
  "process_count": cpu_count() - 2 if cpu_count() > 2 else 1,
  "reprocess_input_data": True,
  "save_best_model": True,
  "save_eval_checkpoints": True,
  "save_model_every_epoch": True,
  "save_steps": 2000,
  "save_optimizer_and_scheduler": True,
  "silent": False,
  "tensorboard_dir": None,
  "train_batch_size": 8,
  "use_cached_eval_features": False,
  "use_early_stopping": False,
  "use_multiprocessing": True,
  "wandb_kwargs": {},
  "wandb_project": None,
  "warmup_ratio": 0.06,
  "warmup_steps": 0,
  "weight_decay": 0,
}
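Values you pass in args are meant to take precedence over these defaults. The sketch below illustrates that precedence with a plain dictionary merge over a small subset of the defaults. It is an assumption about the resulting behavior for illustration, not the library's actual merging code, and resolve_args is a hypothetical helper.

```python
# A small subset of the defaults listed above.
defaults = {
    "learning_rate": 4e-5,
    "num_train_epochs": 1,
    "train_batch_size": 8,
    "wandb_project": None,
}


def resolve_args(defaults, overrides):
    """Return a new dict where user overrides win over defaults (hypothetical helper)."""
    return {**defaults, **overrides}


resolved = resolve_args(
    defaults,
    {"num_train_epochs": 3, "wandb_project": "simpletransformers"},
)
print(resolved)
```

Only the keys you override change, so a short args dictionary is enough to enable W&B logging while keeping every other default.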

Additional resources

Refer to simpletransformers on GitHub for more detailed documentation. See Using simpleTransformer on common NLP applications, a W&B report that covers training transformers on some of the most popular GLUE benchmark datasets. Try it yourself on Colab.