Update README.md
@@ -6,17 +6,13 @@ To our knowledge, this is the first attempt at using any auto-prompting *framewo
We accomplish this using a *deep* language program with several layers of alternating `Attack` and `Refine` modules in the following optimization loop:

*Figure 1: Overview of DSPy for red-teaming. The DSPy MIPRO optimizer, guided by an LLM as a judge, compiles our language program into an effective red-teamer against Vicuna.*

The following table shows the effectiveness of the chosen architecture, as well as the benefit of DSPy compilation, measured by attack success rate (ASR):

| **Architecture** | **ASR** |
|:------------:|:----------:|
| None (Raw Input) | 10% |
| Architecture (5 Layer) | 26% |
| Architecture (5 Layer) + Optimization | 44% |

*Table 1: ASR with raw harmful inputs, the un-optimized architecture, and the architecture after DSPy compilation.*
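
The layered `Attack`/`Refine` loop and the ASR figures above can be illustrated with a minimal plain-Python sketch. This is not the repository's code and does not use the DSPy API: the names `LM`, `attack`, `refine`, `red_team`, and `attack_success_rate` are hypothetical, the prompt strings are placeholders, and the exact layer layout (each `Refine` step feeding a critique into the next `Attack` step, with a final `Attack` producing the output) is an assumption. The MIPRO compilation step is not shown.

```python
from typing import Callable, Sequence

# Hypothetical type: any callable that maps a prompt to a text completion.
LM = Callable[[str], str]


def attack(lm: LM, intent: str, critique: str) -> str:
    """One Attack module: draft an adversarial prompt, using the last critique."""
    return lm(f"ATTACK intent={intent!r} critique={critique!r}")


def refine(lm: LM, intent: str, attempt: str, response: str) -> str:
    """One Refine module: critique an attempt given the target's response."""
    return lm(f"REFINE intent={intent!r} attempt={attempt!r} response={response!r}")


def red_team(intent: str, attacker: LM, target: LM, layers: int = 5) -> str:
    """Alternate Attack and Refine modules, then emit a final attack prompt."""
    critique = ""
    for _ in range(layers - 1):
        attempt = attack(attacker, intent, critique)
        critique = refine(attacker, intent, attempt, target(attempt))
    return attack(attacker, intent, critique)


def attack_success_rate(
    intents: Sequence[str],
    attacker: LM,
    target: LM,
    judge: Callable[[str, str], bool],
    layers: int = 5,
) -> float:
    """Fraction of intents for which the judge marks the target's response a success."""
    if not intents:
        raise ValueError("intents must be non-empty")
    hits = 0
    for intent in intents:
        final_prompt = red_team(intent, attacker, target, layers)
        hits += bool(judge(intent, target(final_prompt)))
    return hits / len(intents)
```

In this sketch the "None (Raw Input)" row would correspond to sending each intent to the target directly, skipping `red_team`. The judge is passed in as a plain callable so that an LLM-based judge, as in Figure 1, or a simple rule can be swapped in.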