Checking multiple policy rules

#11
by AmenRa - opened

Hi, can the model check multiple policy rules in a single pass?

Google org

ShieldGemma was trained and evaluated for a single policy classification per inference call, and that's how we recommend you use it.

That said, prompts are fungible, and we would be interested in any evidence from the community about how the model performs on multi-policy classification.

Hi,

Following up on this: the model appears to perform very poorly when checking multiple policies at once, but very well when checking just one. Do you have any recommendations for formatting the prompt to improve multi-policy performance?

Google org

We don't have specific prompt recommendations for multi-policy-per-prompt use at this time. The model was trained for single-policy-per-prompt detection and we don't expect it to perform well in a multi-policy-per-prompt context.
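
For anyone who needs to check several policies, one workaround consistent with the single-policy recommendation is to make one call per policy and combine the verdicts afterwards. The following is only a minimal sketch under stated assumptions, not an official recipe: `score_policy` is a hypothetical callable that you would implement yourself (it should build the single-policy prompt, run the model, and return a violation probability), and the `0.5` threshold is an arbitrary placeholder rather than a recommended value.

```python
from typing import Callable, Dict


def check_policies(
    text: str,
    policies: Dict[str, str],
    score_policy: Callable[[str, str], float],
    threshold: float = 0.5,  # assumption: placeholder cut-off, tune on your own data
) -> Dict[str, bool]:
    """Run one classification call per policy and return a verdict per policy name.

    `score_policy(text, guideline)` is a hypothetical user-supplied function that
    performs a single-policy inference call and returns a violation probability.
    """
    verdicts: Dict[str, bool] = {}
    for name, guideline in policies.items():
        # Exactly one policy is passed per call, matching how the model was trained.
        score = score_policy(text, guideline)
        verdicts[name] = score >= threshold
    return verdicts


def violates_any(verdicts: Dict[str, bool]) -> bool:
    """Combine per-policy verdicts: flag the text if any single policy is violated."""
    return any(verdicts.values())
```

This costs one inference call per policy instead of one call overall, but it keeps every call inside the single-policy setting described above.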

Hey folks, not to seem unappreciative of the work done, but in the interest of avoiding mishaps it would probably be wise to state explicitly somewhere in the model card that only a single policy should be used at a time. I didn't see anything in the paper about this limitation and (as of writing) don't see anything on the model card either. It wasn't until a member of our team noticed that multiple policies caused a dramatic drop in performance that we investigated and found this to be the case.
