Update README.md

This commit is contained in:
2025-10-21 08:06:33 +00:00
parent 0cf6ae8385
commit 84fae50904

@@ -6,17 +6,13 @@ To our knowledge, this is the first attempt at using any auto-prompting *framewo
We accomplish this using a *deep* language program with several layers of alternating `Attack` and `Refine` modules in the following optimization loop:
![Overview of DSPy for red-teaming](https://www.dropbox.com/scl/fi/4ebg4jrsebbvkpgfs8fjp/feedforward.png?rlkey=tq2saicjukzolhs1fjn30egyf&st=dmuqltlu&dl=0)
![Overview of DSPy for red-teaming](https://cdn.prod.website-files.com/66f89b6eb96e685709a53e09/6783565e10c519704c177998_DSPy-Redteam.png)
*Figure 1: Overview of DSPy for red-teaming. The DSPy MIPRO optimizer, guided by an LLM-as-a-judge, compiles our language program into an effective red-teamer against Vicuna.*
The following table shows the effectiveness of the chosen architecture, as well as the benefit of DSPy compilation:
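The alternating `Attack` and `Refine` layers described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the function names, the argument order, and the choice to query the target model between layers are all assumptions, and the stub callables stand in for real language-model calls.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures; the real modules would wrap language-model calls.
AttackFn = Callable[[str, str], str]        # (intent, critique) -> attack prompt
RefineFn = Callable[[str, str, str], str]   # (intent, attack prompt, target response) -> critique
TargetFn = Callable[[str], str]             # attack prompt -> target response


def run_layers(
    intent: str,
    attack: AttackFn,
    refine: RefineFn,
    target: TargetFn,
    num_layers: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate Attack and Refine steps and return the final attack prompt.

    Each layer produces a new attack prompt from the intent and the latest
    critique. Every layer except the last then queries the target and asks
    Refine for a critique that feeds the next layer.
    """
    critique = ""            # the first Attack layer has no critique yet
    attack_prompt = intent
    trace: List[str] = []    # attack prompt produced by each layer
    for layer in range(num_layers):
        attack_prompt = attack(intent, critique)
        trace.append(attack_prompt)
        if layer < num_layers - 1:
            response = target(attack_prompt)
            critique = refine(intent, attack_prompt, response)
    return attack_prompt, trace


# Deterministic stubs so the sketch runs without any model access.
def stub_attack(intent: str, critique: str) -> str:
    return f"attack({intent}|{critique})"


def stub_refine(intent: str, attack_prompt: str, response: str) -> str:
    return f"critique#{len(response)}"


def stub_target(attack_prompt: str) -> str:
    return attack_prompt.upper()
```

In a DSPy program, each stub would presumably be replaced by a module with its own prompt, which is what gives the optimizer something to tune at every layer.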
| **Architecture** | **ASR** |
|:------------:|:----------:|
| None (Raw Input) | 10% |
| Architecture (5 Layer) | 26% |
| Architecture (5 Layer) + Optimization | 44% |
![Results](https://cdn.prod.website-files.com/66f89b6eb96e685709a53e09/678357036bff3a56f1161706_678356ec1f1cbdbead37e11d_Screenshot%25202025-01-12%2520at%252012.45.10%25E2%2580%25AFAM.png)
*Table 1: ASR with raw harmful inputs, the unoptimized architecture, and the architecture after DSPy compilation.*
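For readers unfamiliar with the metric, the sketch below assumes ASR means attack success rate, computed as the fraction of attempts that the judge marks as successful. The helper name and the boolean verdict format are hypothetical; the judge used in this work may return a graded score that is thresholded instead.

```python
from typing import Sequence


def attack_success_rate(verdicts: Sequence[bool]) -> float:
    """Return the fraction of attempts judged successful (assumed ASR definition)."""
    if not verdicts:
        raise ValueError("need at least one verdict to compute ASR")
    return sum(1 for v in verdicts if v) / len(verdicts)


# Hypothetical verdicts for four attempts, for illustration only.
example_verdicts = [True, False, False, True]
example_asr = attack_success_rate(example_verdicts)
```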