privacy-200k-masking

This model is a fine-tuned version of distilbert-base-multilingual-cased on an unknown dataset. It achieves the following results on the evaluation set:

  • eval_loss: 0.0949
  • eval_overall_precision: 0.9099
  • eval_overall_recall: 0.9306
  • eval_overall_f1: 0.9201
  • eval_overall_accuracy: 0.9692
  • eval_ACCOUNTNAME_f1: 0.9863
  • eval_ACCOUNTNUMBER_f1: 0.9551
  • eval_AGE_f1: 0.9454
  • eval_AMOUNT_f1: 0.9481
  • eval_BIC_f1: 0.9140
  • eval_BITCOINADDRESS_f1: 0.9227
  • eval_BUILDINGNUMBER_f1: 0.9056
  • eval_CITY_f1: 0.9351
  • eval_COMPANYNAME_f1: 0.9621
  • eval_COUNTY_f1: 0.9756
  • eval_CREDITCARDCVV_f1: 0.9201
  • eval_CREDITCARDISSUER_f1: 0.9767
  • eval_CREDITCARDNUMBER_f1: 0.8506
  • eval_CURRENCY_f1: 0.7277
  • eval_CURRENCYCODE_f1: 0.8398
  • eval_CURRENCYNAME_f1: 0.1576
  • eval_CURRENCYSYMBOL_f1: 0.9216
  • eval_DATE_f1: 0.7988
  • eval_DOB_f1: 0.6103
  • eval_EMAIL_f1: 0.9862
  • eval_ETHEREUMADDRESS_f1: 0.9624
  • eval_EYECOLOR_f1: 0.9779
  • eval_FIRSTNAME_f1: 0.9636
  • eval_GENDER_f1: 0.9852
  • eval_HEIGHT_f1: 0.9771
  • eval_IBAN_f1: 0.9513
  • eval_IP_f1: 0.0
  • eval_IPV4_f1: 0.8240
  • eval_IPV6_f1: 0.7389
  • eval_JOBAREA_f1: 0.9713
  • eval_JOBTITLE_f1: 0.9819
  • eval_JOBTYPE_f1: 0.9743
  • eval_LASTNAME_f1: 0.9439
  • eval_LITECOINADDRESS_f1: 0.8069
  • eval_MAC_f1: 0.9668
  • eval_MASKEDNUMBER_f1: 0.8084
  • eval_MIDDLENAME_f1: 0.9401
  • eval_NEARBYGPSCOORDINATE_f1: 0.9963
  • eval_ORDINALDIRECTION_f1: 0.9904
  • eval_PASSWORD_f1: 0.9690
  • eval_PHONEIMEI_f1: 0.9842
  • eval_PHONENUMBER_f1: 0.9690
  • eval_PIN_f1: 0.8584
  • eval_PREFIX_f1: 0.9594
  • eval_SECONDARYADDRESS_f1: 0.9880
  • eval_SEX_f1: 0.9952
  • eval_SSN_f1: 0.9813
  • eval_STATE_f1: 0.9664
  • eval_STREET_f1: 0.9607
  • eval_TIME_f1: 0.9560
  • eval_URL_f1: 0.9866
  • eval_USERAGENT_f1: 0.9901
  • eval_USERNAME_f1: 0.9743
  • eval_VEHICLEVIN_f1: 0.9699
  • eval_VEHICLEVRM_f1: 0.9725
  • eval_ZIPCODE_f1: 0.9018
  • eval_runtime: 3609.2787
  • eval_samples_per_second: 17.394
  • eval_steps_per_second: 8.697
  • epoch: 1.0
  • step: 73241
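
As a sanity check, the overall F1 reported above is the harmonic mean of the overall precision and recall; a quick pure-Python computation (no dependencies) reproduces the reported value:

```python
# Reported evaluation-set aggregates from the card above
precision = 0.9099
recall = 0.9306

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))  # → 0.9201, matching eval_overall_f1
```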

Model description

More information needed

Intended uses & limitations

More information needed
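
In the absence of documented usage, a minimal inference sketch for the intended masking use case, assuming the checkpoint is published as `taro-pudding/privacy-200k-masking` and follows the standard Hugging Face token-classification interface. The `mask_pii` helper is a hypothetical post-processing function, not part of the released model; the pipeline call is commented out because it downloads the checkpoint.

```python
def mask_pii(text, entities):
    """Replace each predicted span with a [LABEL] placeholder.

    `entities` follows the format returned by the Hugging Face
    token-classification pipeline with aggregation_strategy="simple":
    dicts carrying `start`, `end`, and `entity_group` keys.
    """
    # Work right-to-left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

# Hypothetical usage with the model itself (downloads the checkpoint):
# from transformers import pipeline
# ner = pipeline("token-classification",
#                model="taro-pudding/privacy-200k-masking",
#                aggregation_strategy="simple")
# text = "Contact Jane at jane@example.com"
# print(mask_pii(text, ner(text)))

# Offline demonstration with hand-written predictions:
demo = [{"start": 8, "end": 12, "entity_group": "FIRSTNAME"},
        {"start": 16, "end": 32, "entity_group": "EMAIL"}]
print(mask_pii("Contact Jane at jane@example.com", demo))
# → Contact [FIRSTNAME] at [EMAIL]
```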

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 2
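
The hyperparameters above map directly onto the `TrainingArguments` class in Transformers 4.40; a minimal sketch reconstructed from the list, with `output_dir` as an assumption. The Adam betas and epsilon reported in the card match the `Trainer` optimizer defaults, so they need no explicit arguments here.

```python
from transformers import TrainingArguments

# Sketch reconstructed from the hyperparameters listed above;
# output_dir is a hypothetical path, not taken from the card.
args = TrainingArguments(
    output_dir="privacy-200k-masking",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.2,
    num_train_epochs=2,
)
```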

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1

Model details

  • Repository: taro-pudding/privacy-200k-masking
  • Model size: 135M params
  • Tensor type: F32 (Safetensors)