Deep learning models have changed the way we approach complex tasks. But derivative-based optimizers fall short in many real-world applications, where a gradient is unavailable or impractical to compute. In a new paper, researchers from DeepMind propose Optimization by PROmpting (OPRO), a method that uses large language models (LLMs) as optimizers and opens up new possibilities for solving a wide array of problems.
What makes OPRO unique is its use of natural language descriptions instead of formal mathematical definitions. By describing the optimization problem in natural language and instructing the LLM to generate solutions based on the problem description and previous solutions, OPRO provides a highly adaptable and flexible approach to optimization.
The process of OPRO begins with a "meta-prompt" that includes a natural language description of the task, along with examples, instructions, and solutions. As the optimization process unfolds, the LLM generates candidate solutions based on the problem description and previous solutions. These solutions are then evaluated and assigned quality scores. The best-scoring solutions and their scores are added to the meta-prompt, enriching the context for the next round of solution generation. This iterative process continues until the model stops proposing better solutions.
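The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names, the meta-prompt wording, the `top_k` and `patience` parameters, and the toy proposer standing in for an LLM call are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (solution text, quality score)

def build_meta_prompt(task: str, scored: List[Scored], top_k: int = 3) -> str:
    """Combine the task description with the best solutions found so far."""
    best = sorted(scored, key=lambda pair: pair[1])[-top_k:]  # ascending, best last
    lines = [task, "", "Previous solutions and their scores:"]
    lines += [f"solution: {sol} | score: {score}" for sol, score in best]
    lines += ["", "Propose a new solution with a higher score."]
    return "\n".join(lines)

def opro_loop(
    task: str,
    propose: Callable[[str], str],     # hypothetical stand-in for an LLM call
    evaluate: Callable[[str], float],  # assigns a quality score to a solution
    initial: List[str],
    max_steps: int = 20,
    patience: int = 3,                 # assumed stopping rule: no gain for N steps
) -> Scored:
    scored = [(s, evaluate(s)) for s in initial]
    best_score = max(score for _, score in scored)
    stale = 0
    for _ in range(max_steps):
        candidate = propose(build_meta_prompt(task, scored))
        score = evaluate(candidate)
        scored.append((candidate, score))
        if score > best_score:
            best_score, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # the proposer has stopped finding better solutions
    return max(scored, key=lambda pair: pair[1])

# Toy stand-ins so the sketch runs without a model: the "LLM" reads the best
# solution listed in the meta-prompt and proposes the next integer.
def toy_evaluate(solution: str) -> float:
    return -((int(solution) - 7) ** 2)  # highest score at the hidden target 7

def toy_propose(meta_prompt: str) -> str:
    shown = [ln for ln in meta_prompt.splitlines() if ln.startswith("solution: ")]
    best = shown[-1][len("solution: "):].split(" | ")[0]
    return str(int(best) + 1)

if __name__ == "__main__":
    print(opro_loop("Find the hidden integer.", toy_propose, toy_evaluate, ["0"]))
```

In a real setup, `propose` would send the meta-prompt to a model and parse its reply, and `evaluate` would be whatever scoring function the task defines.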
One of the key advantages of using LLMs for optimization is their ability to understand natural language. This means users can specify target metrics such as accuracy while providing additional instructions. OPRO also capitalizes on LLMs' ability to detect in-context patterns, allowing the model to identify optimization trajectories and build upon existing good solutions to construct potentially better ones.
To validate the effectiveness of OPRO, the researchers tested it on well-known mathematical optimization problems, such as linear regression and the "traveling salesman problem." While OPRO may not be the most efficient way to solve these problems, the results were promising. On small-scale problems, LLMs were able to capture optimization directions based on the past optimization trajectory provided in the meta-prompt.
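To make the idea of an optimization trajectory concrete, here is a small sketch for the linear regression case. It scores candidate (w, b) pairs by squared error and renders them as text, worst first, so the best pair appears last. The text format, the `keep` parameter, and the sample data are assumptions for illustration, not the format used in the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def squared_error(w: float, b: float, data: List[Point]) -> float:
    """Sum of squared residuals for the line y = w * x + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in data)

def format_trajectory(pairs: List[Tuple[float, float]],
                      data: List[Point], keep: int = 5) -> str:
    """Render the `keep` best (w, b) pairs as text, highest loss first."""
    scored = [(w, b, squared_error(w, b, data)) for w, b in pairs]
    scored.sort(key=lambda t: t[2], reverse=True)  # best (lowest loss) ends up last
    return "\n".join(f"w={w}, b={b}, loss={loss}" for w, b, loss in scored[-keep:])

if __name__ == "__main__":
    data = [(x, 2 * x + 1) for x in range(5)]  # points on the line y = 2x + 1
    print(format_trajectory([(1, 1), (2, 1), (0, 0)], data))
```

A block of text like this, placed in the meta-prompt, is what lets the model see which direction has been reducing the loss.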
But the real potential of OPRO lies in its ability to optimize LLM prompts. Experiments show that prompt engineering can dramatically affect the output of a model. By finding the optimal prompt that maximizes task accuracy, OPRO enables LLMs like OpenAI's ChatGPT and Google's PaLM to deliver better results.
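For prompt optimization, the quality score is the task accuracy a candidate prompt achieves. The sketch below shows one way such a scorer might look. The `answer_fn` callable is a hypothetical stand-in for a call to the model being prompted, and the toy version used here follows an arbitrary rule purely so the example runs; it says nothing about how real models respond.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)

def prompt_accuracy(instruction: str, examples: List[Example],
                    answer_fn: Callable[[str, str], str]) -> float:
    """Fraction of examples answered correctly when using this instruction."""
    correct = sum(answer_fn(instruction, q).strip() == a for q, a in examples)
    return correct / len(examples)

def best_prompt(candidates: List[str], examples: List[Example],
                answer_fn: Callable[[str, str], str]) -> str:
    """Return the candidate instruction with the highest accuracy."""
    return max(candidates, key=lambda c: prompt_accuracy(c, examples, answer_fn))

# Toy stand-in for a model: it only adds correctly when the instruction
# contains the word "carefully". This rule is arbitrary and illustrative.
def toy_answer(instruction: str, question: str) -> str:
    if "carefully" in instruction:
        return str(sum(int(term) for term in question.split("+")))
    return "unknown"

if __name__ == "__main__":
    examples = [("2+2", "4"), ("3+5", "8")]
    candidates = ["Answer the question.", "Answer carefully."]
    print(best_prompt(candidates, examples, toy_answer))
```

In an OPRO-style setup, a score like this would be attached to each candidate prompt before it is added to the meta-prompt.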
However, it's important to note that LLMs do not possess human-like reasoning abilities. Their responses are highly dependent on the format of the prompt, and semantically similar prompts can yield different results. This highlights the need for model-specific and task-specific prompt formats.
While there is still much to learn about the inner workings of LLMs, OPRO provides a systematic way to explore the vast space of possible prompts and find the one that works best for a specific type of problem. As we continue to uncover the potential of LLMs in optimization, OPRO represents a significant step forward in our understanding and utilization of these powerful models.