Directional Stimulus Prompting
Li et al., (2023) (opens in a new tab) proposes a new prompting technique to better guide the LLM in generating the desired summary.
A tuneable policy LM is trained to generate the stimulus/hint. Seeing more use of RL to optimize LLMs.
The figure below shows how Directional Stimulus Prompting compares with standard prompting. The policy LM can be small and optimized to generate the hints that guide a black-box frozen LLM.
Image Source: Li et al., (2023) (opens in a new tab)
Full example coming soon!