Edit model card

phi-3-mini-QLoRA

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0514

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2

Training results

Training Loss Epoch Step Validation Loss
2.2314 0.0273 100 2.1230
1.5709 0.0546 200 1.5079
1.4201 0.0820 300 1.4112
1.3689 0.1093 400 1.3759
1.3444 0.1366 500 1.3509
1.3287 0.1639 600 1.3273
1.3019 0.1912 700 1.3038
1.2713 0.2185 800 1.2827
1.2596 0.2459 900 1.2630
1.2433 0.2732 1000 1.2459
1.2233 0.3005 1100 1.2310
1.2197 0.3278 1200 1.2177
1.2033 0.3551 1300 1.2055
1.1902 0.3825 1400 1.1963
1.1927 0.4098 1500 1.1876
1.1771 0.4371 1600 1.1791
1.1615 0.4644 1700 1.1702
1.1655 0.4917 1800 1.1641
1.1725 0.5191 1900 1.1585
1.1348 0.5464 2000 1.1522
1.1429 0.5737 2100 1.1464
1.141 0.6010 2200 1.1413
1.1458 0.6283 2300 1.1362
1.1268 0.6556 2400 1.1314
1.1218 0.6830 2500 1.1272
1.1277 0.7103 2600 1.1226
1.1092 0.7376 2700 1.1198
1.1282 0.7649 2800 1.1156
1.1027 0.7922 2900 1.1116
1.0951 0.8196 3000 1.1084
1.1001 0.8469 3100 1.1057
1.1027 0.8742 3200 1.1021
1.0989 0.9015 3300 1.0987
1.0917 0.9288 3400 1.0966
1.0832 0.9562 3500 1.0939
1.1074 0.9835 3600 1.0915
1.0692 1.0108 3700 1.0891
1.0868 1.0381 3800 1.0872
1.079 1.0654 3900 1.0855
1.0844 1.0927 4000 1.0831
1.0779 1.1201 4100 1.0819
1.0737 1.1474 4200 1.0797
1.0651 1.1747 4300 1.0775
1.0656 1.2020 4400 1.0764
1.0592 1.2293 4500 1.0739
1.07 1.2567 4600 1.0729
1.068 1.2840 4700 1.0719
1.0623 1.3113 4800 1.0701
1.0622 1.3386 4900 1.0691
1.0579 1.3659 5000 1.0678
1.0652 1.3933 5100 1.0667
1.0655 1.4206 5200 1.0654
1.0619 1.4479 5300 1.0642
1.0521 1.4752 5400 1.0635
1.0563 1.5025 5500 1.0625
1.0554 1.5298 5600 1.0611
1.0577 1.5572 5700 1.0599
1.0427 1.5845 5800 1.0590
1.0489 1.6118 5900 1.0583
1.0444 1.6391 6000 1.0578
1.0573 1.6664 6100 1.0562
1.0494 1.6938 6200 1.0555
1.0355 1.7211 6300 1.0551
1.0531 1.7484 6400 1.0544
1.0542 1.7757 6500 1.0540
1.0324 1.8030 6600 1.0535
1.0497 1.8304 6700 1.0532
1.0415 1.8577 6800 1.0529
1.0414 1.8850 6900 1.0522
1.0588 1.9123 7000 1.0520
1.0347 1.9396 7100 1.0519
1.0346 1.9669 7200 1.0516
1.043 1.9943 7300 1.0514

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.21.0
  • Tokenizers 0.19.1
Downloads last month
6
Inference API
Unable to determine this model’s pipeline type. Check the docs .

Model tree for UndefinedCpp/phi-3-mini-QLoRA

Adapter
this model