Paper Notes: LLM Powered Autonomous Agents
date
Dec 29, 2023
slug
paper-LLM-Powered_Autonomous_Agents
status
Published
tags
Paper
summary
In an LLM-powered autonomous agent system, the LLM functions as the agent’s brain, complemented by several key components: planning, memory, and tool use.
type
Post
cover
LLM POWERED AUTONOMOUS AGENTS
1. AGENT SYSTEM OVERVIEW
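In an LLM-powered autonomous agent system, the LLM functions as the agent’s brain, complemented by several key components:
- Planning: task decomposition and self-learning
- Memory: short-term (in-context) and long-term (information retrieval)
- Tools: APIs for additional information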
2. PLANNING
2.1 Task Decomposition
- Chain of Thought (CoT):
The model is instructed to “think step by step” so that it spends more test-time computation decomposing hard tasks into smaller and simpler steps.
- Tree of Thoughts (ToT):
Extends chain of thought by exploring multiple reasoning possibilities at each step.
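To make "exploring multiple reasoning possibilities at each step" concrete, here is a minimal sketch of a breadth-first search over partial thoughts. The names `tree_of_thoughts_bfs`, `toy_propose`, and `toy_score` are hypothetical, and the toy proposer and scorer stand in for LLM calls; this is an illustration under those assumptions, not the method from the paper.

```python
from typing import Callable, List


def tree_of_thoughts_bfs(
    root: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int,
    beam_width: int,
) -> str:
    """Expand every state in the frontier, then keep only the best few per level."""
    frontier = [root]
    for _ in range(depth):
        # Each state branches into several candidate next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Keep the highest-scoring candidates (ties keep their original order).
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


# Toy stand-ins (assumptions): in a real agent both would be LLM prompts.
TARGET = "cat"


def toy_propose(state: str) -> List[str]:
    """Propose three continuations of the current partial thought."""
    return [state + ch for ch in "act"]


def toy_score(state: str) -> float:
    """Count positions that already match the target."""
    return float(sum(a == b for a, b in zip(state, TARGET)))


best = tree_of_thoughts_bfs("", toy_propose, toy_score, depth=3, beam_width=2)
print(best)
```

Chain of thought corresponds roughly to the special case where only one continuation is followed at each step; keeping several candidates per level is what makes the search a tree.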
2.2 Self-Reflection
Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes.
- Self-reflection: improving capability across iterations by reviewing past attempts.
- ReAct
Integrates reasoning and acting within the LLM by extending the action space to be a combination of task-specific discrete actions and the language space. In short, it merges the task action space with the language space.
- Reflexion
A framework that equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the action space follows the setup in ReAct, where the task-specific action space is augmented with language to enable complex reasoning steps. In short, an RL mechanism combined with an LLM yields interpretable self-reflection.
- Chain of Hindsight
Encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback. In short, a post-hoc review.
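The Reflexion-style loop described above (act, receive a binary reward, store a reflection, retry) can be sketched as follows. Everything here is a hypothetical toy: `reflexion_loop`, `toy_attempt`, `toy_evaluate`, and `toy_reflect` are illustrative names, and the actor, evaluator, and reflector would each be an LLM or environment call in a real system.

```python
from typing import Callable, List, Optional, Tuple

CANDIDATES = [10, 12, 14]  # toy action space (assumption)


def reflexion_loop(
    attempt: Callable[[List[str]], Optional[int]],
    evaluate: Callable[[Optional[int]], bool],
    reflect: Callable[[Optional[int]], str],
    max_trials: int,
) -> Tuple[Optional[int], List[str]]:
    """Retry a task, feeding verbal reflections on failures back into memory."""
    memory: List[str] = []
    for _ in range(max_trials):
        result = attempt(memory)        # act, conditioned on past reflections
        if evaluate(result):            # binary reward: success or failure
            return result, memory
        memory.append(reflect(result))  # store a lesson for the next trial
    return None, memory


def toy_attempt(memory: List[str]) -> Optional[int]:
    """Pick the first candidate that no reflection tells us to avoid."""
    return next((c for c in CANDIDATES if f"avoid {c}" not in memory), None)


def toy_evaluate(result: Optional[int]) -> bool:
    """Binary reward: the toy task is to find a multiple of 7."""
    return result is not None and result % 7 == 0


def toy_reflect(result: Optional[int]) -> str:
    """Turn a failed attempt into a short reflection string."""
    return f"avoid {result}"


answer, reflections = reflexion_loop(toy_attempt, toy_evaluate, toy_reflect, max_trials=5)
print(answer, reflections)
```

The point of the sketch is the data flow: the reward is only a pass/fail signal, and the improvement comes from the reflections accumulated in memory rather than from any weight update.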
3. MEMORY
3.1 Memory Types
- Sensory Memory (earliest stage; images and sounds):
This is the earliest stage of memory, providing the ability to retain impressions of sensory information (visual, auditory, etc.) after the original stimuli have ended.
- Short-Term Memory (STM):
It stores information that we are currently aware of and that is needed to carry out complex cognitive tasks such as learning and reasoning.
- Long-Term Memory (LTM):
- Explicit/Declarative memory: life events, facts, and concepts
- Implicit/Procedural memory: unconscious memory involving skills and routines that are performed automatically
3.2 Maximum Inner Product Search (MIPS)
A standard practice is to save the embedding representation of information into a vector store database that can support fast maximum inner-product search (MIPS).
- Some algorithms (I am skipping the details for now, since this is outside my field):
- LSH
- ANNOY
- HNSW
- FAISS
- ScaNN
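For reference, exact MIPS is simple to state: score every stored embedding by its inner product with the query and keep the top k. The sketch below does this by brute force in plain Python. The names `inner_product` and `mips_top_k` and the toy vectors are assumptions for illustration; this is not the API of any library listed above, which trade some exactness for speed on large stores.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mips_top_k(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (index, score) for the k stored vectors with the largest inner product."""
    scored = [(inner_product(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, s) for s, i in scored[:k]]


# Toy "vector store": four 2-dimensional embeddings (made-up values).
store = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [-1.0, 0.0]]
top = mips_top_k([0.0, 1.0], store, k=2)
print(top)
```

Note that the inner product is not normalized: the vector at index 2 ranks first partly because of its larger norm, which is one way MIPS differs from cosine similarity. The brute-force scan costs one dot product per stored vector, which is the cost approximate methods aim to avoid.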
4. TOOL USE
Equipping LLMs with external tools can significantly extend the model capabilities.
- HuggingGPT
Uses the LLM as a controller.
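As a rough illustration of the "LLM as controller" idea, the sketch below has a planner choose which tools to call and a controller loop dispatch those calls. It is not HuggingGPT's actual pipeline: `TOOLS`, `stub_planner`, and `run_controller` are hypothetical names, and the keyword-based planner merely stands in for an LLM that would produce the plan.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: each tool maps a string argument to a string result.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


def stub_planner(request: str) -> List[Tuple[str, str]]:
    """Stand-in for the LLM: turn a request into (tool name, argument) steps."""
    plan: List[Tuple[str, str]] = []
    if "count" in request:
        plan.append(("word_count", "hello agent world"))
    if "shout" in request:
        plan.append(("upper", "hello"))
    return plan


def run_controller(
    request: str,
    planner: Callable[[str], List[Tuple[str, str]]],
    tools: Dict[str, Callable[[str], str]],
) -> List[str]:
    """Execute each planned step with the matching tool and collect the results."""
    results: List[str] = []
    for tool_name, argument in planner(request):
        if tool_name not in tools:
            # An LLM may name a tool that does not exist; report it, do not crash.
            results.append(f"{tool_name}: unknown tool")
            continue
        results.append(f"{tool_name}: {tools[tool_name](argument)}")
    return results


outputs = run_controller("count then shout", stub_planner, TOOLS)
print(outputs)
```

In a real system the collected results would typically be passed back to the LLM to compose the final answer; that step is omitted here.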