joneill committed on
Commit 876243b
1 Parent(s): 709adc1

Added model files and documentation
README.md ADDED
@@ -0,0 +1,69 @@
+ # CarD-T: Carcinogen Detection via Transformers
+
+ ## Overview
+
+ CarD-T (Carcinogen Detection via Transformers) is a text-analytics approach that combines transformer-based machine learning with probabilistic statistical analysis to efficiently nominate carcinogens from scientific texts. The model is designed to address the challenges that current systems face in managing the burgeoning biomedical literature on carcinogen identification and classification.
+
+ ## Model Details
+
+ - **Architecture**: Based on Bio-ELECTRA (BioM-ELECTRA-Large-SQuAD2), a 335-million-parameter language model
+ - **Training Data**: PubMed abstracts featuring known carcinogens from International Agency for Research on Cancer (IARC) groups G1 and G2A
+ - **Task**: Named Entity Recognition (NER) for carcinogen identification
+ - **Performance**:
+   - Precision: 0.894
+   - Recall: 0.857
+   - F1 Score: 0.875
+
+ ## Features
+
+ - Efficient nomination of potential carcinogens from scientific literature
+ - Context classifier to enhance accuracy and manage computational demands
+ - Capable of identifying both chemical and non-chemical carcinogenic factors
+ - Trained on a comprehensive dataset of carcinogen-related abstracts from 2000–2024
+
+ ## Use Cases
+
+ - Streamlining toxicogenomic literature reviews
+ - Identifying potential carcinogens for further investigation
+ - Augmenting existing carcinogen databases with emerging candidates
+
+ ## Limitations
+
+ - Identifies potential candidates, not confirmed carcinogens
+ - Analysis is limited to abstract-level information
+ - May be influenced by publication trends and shifts in research focus
+
+
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ O'Neill, J., Reddy, G.A., Dhillon, N., Tripathi, O., Alexandrov, L., & Katira, P. (2024). CarD-T: Interpreting Carcinomic Lexicon via Transformers. bioRxiv.
+
+ ## License
+
+ MIT License
+
+ Copyright (c) 2024 Jamey O'Neill
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ ## Contact
+
+ For questions and feedback, please contact Jamey O'Neill at [email protected].
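CarD-T is a token-classification (NER) model over a BIO tag set; the config.json files in this commit list the exact labels (e.g. `B-carcinogen`/`I-carcinogen`, `B-cancertype`/`I-cancertype`). As an illustrative sketch only, not code from this repository, per-token BIO tags can be grouped into entity spans like this:

```python
def group_bio_entities(tokens, tags):
    """Group per-token BIO tags (e.g. B-carcinogen / I-carcinogen / O)
    into (entity_type, text) spans."""
    entities = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag extends the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities

# Hypothetical tags for illustration, not model output:
tokens = ["Benzene", "exposure", "is", "linked", "to", "acute", "myeloid", "leukemia"]
tags   = ["B-carcinogen", "O", "O", "O", "O", "B-cancertype", "I-cancertype", "I-cancertype"]
print(group_bio_entities(tokens, tags))
# → [('carcinogen', 'Benzene'), ('cancertype', 'acute myeloid leukemia')]
```

In practice this aggregation is what an NER pipeline's entity-grouping step performs on top of the raw per-token predictions.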
checkpoint-10595/config.json ADDED
@@ -0,0 +1,48 @@
+ {
+   "_name_or_path": "sultan/BioM-ELECTRA-Large-SQuAD2",
+   "architectures": [
+     "ElectraForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "embedding_size": 1024,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "O",
+     "1": "B-cancertype",
+     "2": "B-carcinogen",
+     "3": "B-negative",
+     "4": "I-cancertype",
+     "5": "I-carcinogen",
+     "6": "I-negative"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "B-cancertype": 1,
+     "B-carcinogen": 2,
+     "B-negative": 3,
+     "I-cancertype": 4,
+     "I-carcinogen": 5,
+     "I-negative": 6,
+     "O": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "electra",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "summary_activation": "gelu",
+   "summary_last_dropout": 0.1,
+   "summary_type": "first",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 28895
+ }
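A small sanity check (illustrative, not part of this repository) of the 7-label scheme above: `id2label` and `label2id` must be exact inverses, otherwise `ElectraForTokenClassification` would report the wrong entity names for its output logits.

```python
# The 7-label scheme from checkpoint-10595/config.json: "O" plus BIO tags
# for cancertype, carcinogen, and negative mentions.
id2label = {
    0: "O",
    1: "B-cancertype", 2: "B-carcinogen", 3: "B-negative",
    4: "I-cancertype", 5: "I-carcinogen", 6: "I-negative",
}

# Derive label2id as the inverse map, as the config stores it.
label2id = {label: i for i, label in id2label.items()}

# The two maps must round-trip exactly.
assert all(id2label[i] == label for label, i in label2id.items())
print(label2id["B-carcinogen"])  # → 2
```

Note that the checkpoint-3204 and checkpoint-8010 configs in this same commit use an 8-label scheme (adding `B-antineoplastic`), so their classifier heads are not weight-compatible with this one.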
checkpoint-10595/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1152dab87a3cb1cf2d6d8812b793cf31390c77c430ba4c20736601390303770a
+ size 1329781612
checkpoint-10595/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:15a0c6be3b8cad9cacda15dab0ad3bfb05a3f313d063e536180c0ccf651b85f1
+ size 2659794471
checkpoint-10595/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:17024af815d4eee454cd54e1c9d6acbc87c00976e22d68b0fae1dc0f611fdbe4
+ size 14244
checkpoint-10595/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cc25d54a6f628bcee11085d2055063ab51e6a482a0b0a73494d5bd691588b7b4
+ size 1064
checkpoint-10595/trainer_state.json ADDED
@@ -0,0 +1,228 @@
+ {
+   "best_metric": 0.0534161813557148,
+   "best_model_checkpoint": "/storage/BioM-ELECTRA-Large-SQuAD2_carDB_5e_neg_lg_SGD/checkpoint-4238",
+   "epoch": 5.0,
+   "eval_steps": 500,
+   "global_step": 10595,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.23596035865974516,
+       "grad_norm": 1.2747831344604492,
+       "learning_rate": 1.9063709296838132e-05,
+       "loss": 0.1626,
+       "step": 500
+     },
+     {
+       "epoch": 0.4719207173194903,
+       "grad_norm": 0.4436088800430298,
+       "learning_rate": 1.8119867862199152e-05,
+       "loss": 0.0822,
+       "step": 1000
+     },
+     {
+       "epoch": 0.7078810759792354,
+       "grad_norm": 1.6280406713485718,
+       "learning_rate": 1.7176026427560172e-05,
+       "loss": 0.0789,
+       "step": 1500
+     },
+     {
+       "epoch": 0.9438414346389806,
+       "grad_norm": 2.336548089981079,
+       "learning_rate": 1.6232184992921192e-05,
+       "loss": 0.0706,
+       "step": 2000
+     },
+     {
+       "epoch": 1.0,
+       "eval_accuracy_score": 0.9812832660700708,
+       "eval_f1": 0.8218279984573852,
+       "eval_loss": 0.05883064866065979,
+       "eval_precision": 0.8056710775047259,
+       "eval_recall": 0.8386462022825659,
+       "eval_runtime": 5.2595,
+       "eval_samples_per_second": 358.209,
+       "eval_steps_per_second": 44.871,
+       "step": 2119
+     },
+     {
+       "epoch": 1.1798017932987257,
+       "grad_norm": 2.0369768142700195,
+       "learning_rate": 1.528834355828221e-05,
+       "loss": 0.0496,
+       "step": 2500
+     },
+     {
+       "epoch": 1.415762151958471,
+       "grad_norm": 0.6217797994613647,
+       "learning_rate": 1.4344502123643229e-05,
+       "loss": 0.0487,
+       "step": 3000
+     },
+     {
+       "epoch": 1.651722510618216,
+       "grad_norm": 0.8187075257301331,
+       "learning_rate": 1.3400660689004247e-05,
+       "loss": 0.0455,
+       "step": 3500
+     },
+     {
+       "epoch": 1.8876828692779613,
+       "grad_norm": 1.2321038246154785,
+       "learning_rate": 1.2456819254365267e-05,
+       "loss": 0.0439,
+       "step": 4000
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy_score": 0.9828429938975649,
+       "eval_f1": 0.8423466462832028,
+       "eval_loss": 0.0534161813557148,
+       "eval_precision": 0.8187221396731055,
+       "eval_recall": 0.867375049193231,
+       "eval_runtime": 5.2808,
+       "eval_samples_per_second": 356.764,
+       "eval_steps_per_second": 44.69,
+       "step": 4238
+     },
+     {
+       "epoch": 2.1236432279377064,
+       "grad_norm": 1.0647934675216675,
+       "learning_rate": 1.1512977819726287e-05,
+       "loss": 0.0337,
+       "step": 4500
+     },
+     {
+       "epoch": 2.3596035865974514,
+       "grad_norm": 0.07234682887792587,
+       "learning_rate": 1.0569136385087306e-05,
+       "loss": 0.0247,
+       "step": 5000
+     },
+     {
+       "epoch": 2.595563945257197,
+       "grad_norm": 2.0315685272216797,
+       "learning_rate": 9.625294950448326e-06,
+       "loss": 0.0263,
+       "step": 5500
+     },
+     {
+       "epoch": 2.831524303916942,
+       "grad_norm": 0.034490641206502914,
+       "learning_rate": 8.681453515809346e-06,
+       "loss": 0.027,
+       "step": 6000
+     },
+     {
+       "epoch": 3.0,
+       "eval_accuracy_score": 0.9833888986371878,
+       "eval_f1": 0.853031465848043,
+       "eval_loss": 0.06263311207294464,
+       "eval_precision": 0.832272557094721,
+       "eval_recall": 0.8748524203069658,
+       "eval_runtime": 5.3057,
+       "eval_samples_per_second": 355.091,
+       "eval_steps_per_second": 44.481,
+       "step": 6357
+     },
+     {
+       "epoch": 3.067484662576687,
+       "grad_norm": 0.1344643086194992,
+       "learning_rate": 7.739499764039643e-06,
+       "loss": 0.0254,
+       "step": 6500
+     },
+     {
+       "epoch": 3.303445021236432,
+       "grad_norm": 0.07355394959449768,
+       "learning_rate": 6.795658329400662e-06,
+       "loss": 0.0163,
+       "step": 7000
+     },
+     {
+       "epoch": 3.5394053798961775,
+       "grad_norm": 0.7838532328605652,
+       "learning_rate": 5.85181689476168e-06,
+       "loss": 0.016,
+       "step": 7500
+     },
+     {
+       "epoch": 3.7753657385559225,
+       "grad_norm": 3.0708000659942627,
+       "learning_rate": 4.9079754601227e-06,
+       "loss": 0.0138,
+       "step": 8000
+     },
+     {
+       "epoch": 4.0,
+       "eval_accuracy_score": 0.9838763135832798,
+       "eval_f1": 0.8535911602209945,
+       "eval_loss": 0.06983982026576996,
+       "eval_precision": 0.8559556786703602,
+       "eval_recall": 0.8512396694214877,
+       "eval_runtime": 5.271,
+       "eval_samples_per_second": 357.425,
+       "eval_steps_per_second": 44.773,
+       "step": 8476
+     },
+     {
+       "epoch": 4.011326097215668,
+       "grad_norm": 0.10571020096540451,
+       "learning_rate": 3.966021708352997e-06,
+       "loss": 0.0144,
+       "step": 8500
+     },
+     {
+       "epoch": 4.247286455875413,
+       "grad_norm": 0.029509373009204865,
+       "learning_rate": 3.0221802737140165e-06,
+       "loss": 0.0086,
+       "step": 9000
+     },
+     {
+       "epoch": 4.483246814535158,
+       "grad_norm": 1.1270127296447754,
+       "learning_rate": 2.0783388390750357e-06,
+       "loss": 0.0066,
+       "step": 9500
+     },
+     {
+       "epoch": 4.719207173194903,
+       "grad_norm": 0.010910986922681332,
+       "learning_rate": 1.134497404436055e-06,
+       "loss": 0.0097,
+       "step": 10000
+     },
+     {
+       "epoch": 4.955167531854649,
+       "grad_norm": 1.0645971298217773,
+       "learning_rate": 1.906559697970741e-07,
+       "loss": 0.0076,
+       "step": 10500
+     },
+     {
+       "epoch": 5.0,
+       "eval_accuracy_score": 0.9837593339962176,
+       "eval_f1": 0.8539891556932611,
+       "eval_loss": 0.07769417762756348,
+       "eval_precision": 0.8406404879908502,
+       "eval_recall": 0.8677685950413223,
+       "eval_runtime": 5.252,
+       "eval_samples_per_second": 358.721,
+       "eval_steps_per_second": 44.935,
+       "step": 10595
+     }
+   ],
+   "logging_steps": 500,
+   "max_steps": 10595,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 5,
+   "save_steps": 500,
+   "total_flos": 1.7692937221754256e+16,
+   "train_batch_size": 8,
+   "trial_name": null,
+   "trial_params": null
+ }
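The trainer state above records an evaluation entry at the end of each epoch; `best_metric` (0.0534…) and `best_model_checkpoint` (checkpoint-4238, i.e. epoch 2) correspond to the lowest `eval_loss`. A minimal sketch (not repository code) of that selection over an abridged `log_history`:

```python
# Eval entries abridged from checkpoint-10595/trainer_state.json:
log_history = [
    {"epoch": 1.0, "eval_loss": 0.05883064866065979, "eval_f1": 0.8218279984573852, "step": 2119},
    {"epoch": 2.0, "eval_loss": 0.0534161813557148,  "eval_f1": 0.8423466462832028, "step": 4238},
    {"epoch": 3.0, "eval_loss": 0.06263311207294464, "eval_f1": 0.853031465848043,  "step": 6357},
    {"epoch": 4.0, "eval_loss": 0.06983982026576996, "eval_f1": 0.8535911602209945, "step": 8476},
    {"epoch": 5.0, "eval_loss": 0.07769417762756348, "eval_f1": 0.8539891556932611, "step": 10595},
]

# "best_metric"/"best_model_checkpoint" track the minimum eval_loss:
best = min(log_history, key=lambda e: e["eval_loss"])
print(best["step"], best["eval_loss"])  # → 4238 0.0534161813557148
```

Note that `eval_loss` bottoms out at epoch 2 while `eval_f1` keeps inching upward through epoch 5, so which checkpoint counts as "best" depends on the metric the run is configured to track.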
checkpoint-10595/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07af839673be954537faf350c212a49f3f745cf8247da4cb4f9a1d9f73b78ed7
+ size 5048
checkpoint-3204/config.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "_name_or_path": "sultan/BioM-ELECTRA-Large-SQuAD2",
+   "architectures": [
+     "ElectraForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "embedding_size": 1024,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "O",
+     "1": "B-antineoplastic",
+     "2": "B-cancertype",
+     "3": "B-carcinogen",
+     "4": "B-negative",
+     "5": "I-cancertype",
+     "6": "I-carcinogen",
+     "7": "I-negative"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "B-antineoplastic": 1,
+     "B-cancertype": 2,
+     "B-carcinogen": 3,
+     "B-negative": 4,
+     "I-cancertype": 5,
+     "I-carcinogen": 6,
+     "I-negative": 7,
+     "O": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "electra",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "summary_activation": "gelu",
+   "summary_last_dropout": 0.1,
+   "summary_type": "first",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 28895
+ }
checkpoint-3204/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ca26bb96ef03d72f27f204e9972313d8190bf4c74542e7b5ba3c035bd9011e4
+ size 1329785712
checkpoint-3204/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f27b65e1ecc59b25c4d7e5c7daf5576fa34b0dfa99c71f5de7be4333aec324da
+ size 2659802663
checkpoint-3204/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:223e37654453351bb53133440aabdb815af7d246c62afead675019647840ae7f
+ size 14244
checkpoint-3204/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a90c28e7909084a367599a7cdab33ed7a9cb38d9886e04e2a407c17d10355fba
+ size 1064
checkpoint-3204/trainer_state.json ADDED
@@ -0,0 +1,87 @@
+ {
+   "best_metric": 0.07832363992929459,
+   "best_model_checkpoint": "/storage/BioM-ELECTRA-Large-SQuAD2_carDB_5e_neg_lg_SGD/checkpoint-3204",
+   "epoch": 2.0,
+   "eval_steps": 500,
+   "global_step": 3204,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.3121098626716604,
+       "grad_norm": 3.2046010494232178,
+       "learning_rate": 4.690387016229713e-05,
+       "loss": 0.1789,
+       "step": 500
+     },
+     {
+       "epoch": 0.6242197253433208,
+       "grad_norm": 1.161939263343811,
+       "learning_rate": 4.378277153558052e-05,
+       "loss": 0.1113,
+       "step": 1000
+     },
+     {
+       "epoch": 0.9363295880149812,
+       "grad_norm": 2.1341910362243652,
+       "learning_rate": 4.066167290886392e-05,
+       "loss": 0.1054,
+       "step": 1500
+     },
+     {
+       "epoch": 1.0,
+       "eval_accuracy_score": 0.9690997325938397,
+       "eval_f1": 0.7794316644113668,
+       "eval_loss": 0.08875501155853271,
+       "eval_precision": 0.7747886241352806,
+       "eval_recall": 0.7841306884480747,
+       "eval_runtime": 4.093,
+       "eval_samples_per_second": 348.153,
+       "eval_steps_per_second": 43.733,
+       "step": 1602
+     },
+     {
+       "epoch": 1.2484394506866416,
+       "grad_norm": 0.97569340467453,
+       "learning_rate": 3.754057428214732e-05,
+       "loss": 0.079,
+       "step": 2000
+     },
+     {
+       "epoch": 1.5605493133583022,
+       "grad_norm": 0.8073092699050903,
+       "learning_rate": 3.441947565543071e-05,
+       "loss": 0.0685,
+       "step": 2500
+     },
+     {
+       "epoch": 1.8726591760299627,
+       "grad_norm": 2.6909968852996826,
+       "learning_rate": 3.129837702871411e-05,
+       "loss": 0.0671,
+       "step": 3000
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy_score": 0.973506982271962,
+       "eval_f1": 0.814260389994091,
+       "eval_loss": 0.07832363992929459,
+       "eval_precision": 0.8248204309656824,
+       "eval_recall": 0.8039673278879813,
+       "eval_runtime": 4.1937,
+       "eval_samples_per_second": 339.796,
+       "eval_steps_per_second": 42.683,
+       "step": 3204
+     }
+   ],
+   "logging_steps": 500,
+   "max_steps": 8010,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 5,
+   "save_steps": 500,
+   "total_flos": 4245597826238592.0,
+   "train_batch_size": 8,
+   "trial_name": null,
+   "trial_params": null
+ }
checkpoint-3204/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36534a66cd54fa62d55a78ae7da66791486f97dbf75f55360327b517cd4e0be3
+ size 5048
checkpoint-8010/config.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "_name_or_path": "sultan/BioM-ELECTRA-Large-SQuAD2",
+   "architectures": [
+     "ElectraForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "embedding_size": 1024,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "O",
+     "1": "B-antineoplastic",
+     "2": "B-cancertype",
+     "3": "B-carcinogen",
+     "4": "B-negative",
+     "5": "I-cancertype",
+     "6": "I-carcinogen",
+     "7": "I-negative"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "B-antineoplastic": 1,
+     "B-cancertype": 2,
+     "B-carcinogen": 3,
+     "B-negative": 4,
+     "I-cancertype": 5,
+     "I-carcinogen": 6,
+     "I-negative": 7,
+     "O": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "electra",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "summary_activation": "gelu",
+   "summary_last_dropout": 0.1,
+   "summary_type": "first",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 28895
+ }
checkpoint-8010/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:28ff692670e07e3443424d23688d7f8c5f7ab04b3a81f11285f1ae8c826d24d3
+ size 1329785712
checkpoint-8010/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:75b61d9f1a1c4a763eefe45bc88b93a6caaff5621848dba6693e1c84f9e02b5f
+ size 2659802663
checkpoint-8010/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:24255887216b6ed59d1dea7f442c2bf9f0d4baf66cc6133b44860e5e7a22bdb9
+ size 14244
checkpoint-8010/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b119d907611dd4a92192cad349abbf6251e9f178cd9bdccbc47a3fd72a25015
+ size 1064
checkpoint-8010/trainer_state.json ADDED
@@ -0,0 +1,193 @@
+ {
+   "best_metric": 0.07832363992929459,
+   "best_model_checkpoint": "/storage/BioM-ELECTRA-Large-SQuAD2_carDB_5e_neg_lg_SGD/checkpoint-3204",
+   "epoch": 5.0,
+   "eval_steps": 500,
+   "global_step": 8010,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.3121098626716604,
+       "grad_norm": 3.2046010494232178,
+       "learning_rate": 4.690387016229713e-05,
+       "loss": 0.1789,
+       "step": 500
+     },
+     {
+       "epoch": 0.6242197253433208,
+       "grad_norm": 1.161939263343811,
+       "learning_rate": 4.378277153558052e-05,
+       "loss": 0.1113,
+       "step": 1000
+     },
+     {
+       "epoch": 0.9363295880149812,
+       "grad_norm": 2.1341910362243652,
+       "learning_rate": 4.066167290886392e-05,
+       "loss": 0.1054,
+       "step": 1500
+     },
+     {
+       "epoch": 1.0,
+       "eval_accuracy_score": 0.9690997325938397,
+       "eval_f1": 0.7794316644113668,
+       "eval_loss": 0.08875501155853271,
+       "eval_precision": 0.7747886241352806,
+       "eval_recall": 0.7841306884480747,
+       "eval_runtime": 4.093,
+       "eval_samples_per_second": 348.153,
+       "eval_steps_per_second": 43.733,
+       "step": 1602
+     },
+     {
+       "epoch": 1.2484394506866416,
+       "grad_norm": 0.97569340467453,
+       "learning_rate": 3.754057428214732e-05,
+       "loss": 0.079,
+       "step": 2000
+     },
+     {
+       "epoch": 1.5605493133583022,
+       "grad_norm": 0.8073092699050903,
+       "learning_rate": 3.441947565543071e-05,
+       "loss": 0.0685,
+       "step": 2500
+     },
+     {
+       "epoch": 1.8726591760299627,
+       "grad_norm": 2.6909968852996826,
+       "learning_rate": 3.129837702871411e-05,
+       "loss": 0.0671,
+       "step": 3000
+     },
+     {
+       "epoch": 2.0,
+       "eval_accuracy_score": 0.973506982271962,
+       "eval_f1": 0.814260389994091,
+       "eval_loss": 0.07832363992929459,
+       "eval_precision": 0.8248204309656824,
+       "eval_recall": 0.8039673278879813,
+       "eval_runtime": 4.1937,
+       "eval_samples_per_second": 339.796,
+       "eval_steps_per_second": 42.683,
+       "step": 3204
+     },
+     {
+       "epoch": 2.184769038701623,
+       "grad_norm": 0.14074726402759552,
+       "learning_rate": 2.818352059925094e-05,
+       "loss": 0.0534,
+       "step": 3500
+     },
+     {
+       "epoch": 2.4968789013732833,
+       "grad_norm": 0.8002931475639343,
+       "learning_rate": 2.5062421972534333e-05,
+       "loss": 0.043,
+       "step": 4000
+     },
+     {
+       "epoch": 2.808988764044944,
+       "grad_norm": 1.326310157775879,
+       "learning_rate": 2.194132334581773e-05,
+       "loss": 0.0433,
+       "step": 4500
+     },
+     {
+       "epoch": 3.0,
+       "eval_accuracy_score": 0.9748935327324948,
+       "eval_f1": 0.8186646433990896,
+       "eval_loss": 0.09053385257720947,
+       "eval_precision": 0.7989633469085524,
+       "eval_recall": 0.839362115908207,
+       "eval_runtime": 4.09,
+       "eval_samples_per_second": 348.41,
+       "eval_steps_per_second": 43.765,
+       "step": 4806
+     },
+     {
+       "epoch": 3.1210986267166043,
+       "grad_norm": 0.9473728537559509,
+       "learning_rate": 1.8820224719101125e-05,
+       "loss": 0.0318,
+       "step": 5000
+     },
+     {
+       "epoch": 3.4332084893882646,
+       "grad_norm": 0.391970157623291,
+       "learning_rate": 1.569912609238452e-05,
+       "loss": 0.0228,
+       "step": 5500
+     },
+     {
+       "epoch": 3.7453183520599254,
+       "grad_norm": 5.073329448699951,
+       "learning_rate": 1.258426966292135e-05,
+       "loss": 0.0224,
+       "step": 6000
+     },
+     {
+       "epoch": 4.0,
+       "eval_accuracy_score": 0.9751906506883232,
+       "eval_f1": 0.8262295081967214,
+       "eval_loss": 0.09821399301290512,
+       "eval_precision": 0.819433817903596,
+       "eval_recall": 0.8331388564760793,
+       "eval_runtime": 4.1062,
+       "eval_samples_per_second": 347.04,
+       "eval_steps_per_second": 43.593,
+       "step": 6408
+     },
+     {
+       "epoch": 4.057428214731585,
+       "grad_norm": 0.055937759578228,
+       "learning_rate": 9.463171036204745e-06,
+       "loss": 0.0204,
+       "step": 6500
+     },
+     {
+       "epoch": 4.369538077403246,
+       "grad_norm": 0.6860196590423584,
+       "learning_rate": 6.342072409488139e-06,
+       "loss": 0.0116,
+       "step": 7000
+     },
+     {
+       "epoch": 4.681647940074907,
+       "grad_norm": 0.6975510716438293,
+       "learning_rate": 3.2209737827715358e-06,
+       "loss": 0.011,
+       "step": 7500
+     },
+     {
+       "epoch": 4.9937578027465666,
+       "grad_norm": 0.33860084414482117,
+       "learning_rate": 9.987515605493134e-08,
+       "loss": 0.0091,
+       "step": 8000
+     },
+     {
+       "epoch": 5.0,
+       "eval_accuracy_score": 0.9755125284738041,
+       "eval_f1": 0.8289676425269646,
+       "eval_loss": 0.11285488307476044,
+       "eval_precision": 0.8210606638687524,
+       "eval_recall": 0.8370283936211591,
+       "eval_runtime": 4.1141,
+       "eval_samples_per_second": 346.37,
+       "eval_steps_per_second": 43.509,
+       "step": 8010
+     }
+   ],
+   "logging_steps": 500,
+   "max_steps": 8010,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 5,
+   "save_steps": 500,
+   "total_flos": 1.064813249753856e+16,
+   "train_batch_size": 8,
+   "trial_name": null,
+   "trial_params": null
+ }
checkpoint-8010/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36534a66cd54fa62d55a78ae7da66791486f97dbf75f55360327b517cd4e0be3
+ size 5048
config.json ADDED
@@ -0,0 +1,48 @@
+ {
+   "_name_or_path": "sultan/BioM-ELECTRA-Large-SQuAD2",
+   "architectures": [
+     "ElectraForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "embedding_size": 1024,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "O",
+     "1": "B-cancertype",
+     "2": "B-carcinogen",
+     "3": "B-negative",
+     "4": "I-cancertype",
+     "5": "I-carcinogen",
+     "6": "I-negative"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "B-cancertype": 1,
+     "B-carcinogen": 2,
+     "B-negative": 3,
+     "I-cancertype": 4,
+     "I-carcinogen": 5,
+     "I-negative": 6,
+     "O": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "electra",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "summary_activation": "gelu",
+   "summary_last_dropout": 0.1,
+   "summary_type": "first",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 28895
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2fc3d970de77d2af227efab4238d737a8226de685bea1a93bc42848c723b9e9a
+ size 1329781612
runs/Apr25_14-35-27_hpcf/events.out.tfevents.1714080930.hpcf.137937.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:64afb7a940a161fa73270408d5c7bdfd08cd268948f591dde724000e76873b33
+ size 11444
runs/Apr26_09-58-52_hpcf/events.out.tfevents.1714150736.hpcf.183226.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:de80d861e08c3f13afa8fec93cdc5f4e42c442309baa5f920daa4b9cdaa03721
+ size 12443
runs/Apr26_10-20-51_hpcf/events.out.tfevents.1714152055.hpcf.183659.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:78ad061897c41c39551a6a3fb59cf510f60d389690b2792b223efbe66fc4cf5b
+ size 5268
runs/Apr26_10-21-30_hpcf/events.out.tfevents.1714152093.hpcf.186484.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7036534514f03fdd01de11863b01a94b6e43cbfeba22e6d634a073eeb2b2b5a4
+ size 12443
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,61 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "ignore_mismatched_sizes": true,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "padding": true,
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "ElectraTokenizer",
+   "truncation": true,
+   "unk_token": "[UNK]"
+ }
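The tokenizer config above caps inputs at `model_max_length` 512 with truncation enabled, so any document longer than 512 tokens loses its tail unless it is split first. One common workaround (an assumed approach, not code from this repository) is to tag overlapping windows of token ids and merge the predictions:

```python
def sliding_windows(token_ids, max_len=512, stride=128):
    """Split a token-id sequence into overlapping windows so text longer
    than the model's 512-token limit can still be tagged piecewise.
    `stride` is the overlap carried between consecutive windows."""
    if len(token_ids) <= max_len:
        return [token_ids]
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # last window reached the end of the sequence
        start += max_len - stride
    return windows

# Example with dummy token ids (a 1000-token "document"):
ids = list(range(1000))
ws = sliding_windows(ids, max_len=512, stride=128)
print(len(ws), [len(w) for w in ws])  # → 3 [512, 512, 232]
```

Entities predicted inside the overlap region appear in two windows, so a merge step must deduplicate them (e.g. by character offset).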
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07af839673be954537faf350c212a49f3f745cf8247da4cb4f9a1d9f73b78ed7
+ size 5048
vocab.txt ADDED
The diff for this file is too large to render. See raw diff