antoinelouis committed on
Commit
77c4a1f
1 Parent(s): 37ae2c0

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -99,8 +99,8 @@ with BM25 negatives.
  #### Implementation
 
  The model is initialized from the [almanach/camembert-base](https://huggingface.co/almanach/camembert-base) checkpoint and optimized via a combination of the InfoNCE
- ranking loss with a temperature of 0.05 and the FLOPS regularization loss with quadratic increase of lambda until step 33k after which it remains constant with lambda_q
- = 3e-4 and lambda_d = 1e-4. The model is fine-tuned on one 80GB NVIDIA H100 GPU for 100k steps using the AdamW optimizer with a batch size of 128, a peak learning rate
+ ranking loss with a temperature of 0.05 and the FLOPS regularization loss with quadratic increase of lambda until step 33k after which it remains constant with lambda_q=3e-4
+ and lambda_d=1e-4. The model is fine-tuned on one 80GB NVIDIA H100 GPU for 100k steps using the AdamW optimizer with a batch size of 128, a peak learning rate
  of 2e-5 with warm up along the first 4000 steps and linear scheduling. The maximum sequence lengths for questions and passages length were fixed to 32 and 128 tokens.
  Relevance scores are computed with the cosine similarity.
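
For reference, the objective described in the README passage above can be sketched roughly as below. This is a minimal illustration, not the authors' training code: the function names (`flops_regularizer`, `lambda_at_step`, `training_loss`) are hypothetical, and for brevity it scores queries against in-batch passages rather than the mined BM25 negatives mentioned in the hunk header.

```python
# Minimal PyTorch sketch (assumed, not the released training code) of the loss described above:
# InfoNCE ranking loss (temperature 0.05) over cosine similarities, plus FLOPS regularization
# whose weights increase quadratically until step 33k, then stay at lambda_q=3e-4, lambda_d=1e-4.
import torch
import torch.nn.functional as F


def flops_regularizer(reps: torch.Tensor) -> torch.Tensor:
    # FLOPS term: sum over dimensions of the squared mean absolute activation.
    return (reps.abs().mean(dim=0) ** 2).sum()


def lambda_at_step(step: int, lambda_max: float, ramp_steps: int = 33_000) -> float:
    # Quadratic increase of lambda until `ramp_steps`, constant afterwards.
    return lambda_max * min(step / ramp_steps, 1.0) ** 2


def training_loss(q_reps: torch.Tensor, d_reps: torch.Tensor, step: int,
                  temperature: float = 0.05,
                  lambda_q: float = 3e-4, lambda_d: float = 1e-4) -> torch.Tensor:
    # q_reps: (batch, dim) question representations; d_reps: (batch, dim) passage
    # representations, where d_reps[i] is the positive for q_reps[i] and the other
    # in-batch passages act as negatives (hypothetical simplification).
    q_norm = F.normalize(q_reps, dim=-1)
    d_norm = F.normalize(d_reps, dim=-1)
    scores = q_norm @ d_norm.t() / temperature        # cosine similarities, temperature-scaled
    labels = torch.arange(q_reps.size(0), device=q_reps.device)
    infonce = F.cross_entropy(scores, labels)         # InfoNCE ranking loss
    reg = (lambda_at_step(step, lambda_q) * flops_regularizer(q_reps)
           + lambda_at_step(step, lambda_d) * flops_regularizer(d_reps))
    return infonce + reg
```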