---
license: mit
base_model: openai-community/gpt2-medium
tags:
- trl
- reward-trainer
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: 2c2-reward-medium
  results: []
---

# 2c2-reward-medium

This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4456
- Accuracy: 0.7936

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.41e-05
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_steps: 20
- num_epochs: 3.0

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 1.1138        | 0.2151 | 20   | 1.0849          | 0.5157   |
| 0.704         | 0.4301 | 40   | 0.7146          | 0.5622   |
| 0.5512        | 0.6452 | 60   | 0.5744          | 0.7084   |
| 0.5099        | 0.8602 | 80   | 0.5031          | 0.7478   |
| 0.5035        | 1.0753 | 100  | 0.4814          | 0.7639   |
| 0.4409        | 1.2903 | 120  | 0.4687          | 0.7671   |
| 0.4318        | 1.5054 | 140  | 0.4550          | 0.7831   |
| 0.4379        | 1.7204 | 160  | 0.4463          | 0.7871   |
| 0.4402        | 1.9355 | 180  | 0.4366          | 0.7863   |
| 0.3794        | 2.1505 | 200  | 0.4592          | 0.7936   |
| 0.3813        | 2.3656 | 220  | 0.4481          | 0.7920   |
| 0.3634        | 2.5806 | 240  | 0.4384          | 0.8016   |
| 0.3867        | 2.7957 | 260  | 0.4456          | 0.7936   |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
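The `reward-trainer` tag indicates this checkpoint was trained on (chosen, rejected) preference pairs, where the reported accuracy is the fraction of pairs in which the chosen response receives the higher scalar reward. As a minimal sketch of the standard Bradley-Terry formulation behind those two metrics (the reward values below are illustrative numbers, not outputs of this model):

```python
import math

# Hypothetical scalar rewards for three (chosen, rejected) preference pairs;
# illustrative values only, not produced by 2c2-reward-medium.
rewards_chosen = [1.2, 0.3, -0.5]
rewards_rejected = [0.4, 0.9, -1.1]

def pairwise_loss(chosen, rejected):
    """Ranking loss: mean of -log sigmoid(r_chosen - r_rejected) over pairs."""
    total = 0.0
    for rc, rr in zip(chosen, rejected):
        # -log sigmoid(m) == log(1 + exp(-m)), with m the reward margin
        total += math.log(1.0 + math.exp(-(rc - rr)))
    return total / len(chosen)

def pairwise_accuracy(chosen, rejected):
    """Fraction of pairs where the chosen response scores higher."""
    return sum(rc > rr for rc, rr in zip(chosen, rejected)) / len(chosen)

loss = pairwise_loss(rewards_chosen, rewards_rejected)
acc = pairwise_accuracy(rewards_chosen, rewards_rejected)
```

With these toy numbers, the second pair is mis-ranked, so the accuracy is 2/3 and the loss is pulled up by that pair's positive `-log sigmoid` term.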