Paper Notes: LLM Powered Autonomous Agents

date

Dec 29, 2023

slug

paper-LLM-Powered_Autonomous_Agents

status

Published

LLM POWERED AUTONOMOUS AGENTS

1. AGENT SYSTEM OVERVIEW

In an LLM-powered autonomous agent system, the LLM functions as the agent’s brain, complemented by several key components:

Planning: task decomposition, self-learning

Memory: short-term (in-context), long-term (information retrieval)

Tool use: APIs for extra information

2. PLANNING

2.1 Task Decomposition

Chain of Thought:

The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps.
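A minimal sketch of the prompting side of this idea. The helper names `build_cot_prompt` and `split_steps` are hypothetical, and the sample completion is hand-written; no model is called.

```python
from typing import List

# Trigger phrase that nudges the model to spell out intermediate steps.
COT_TRIGGER = "Let's think step by step."


def build_cot_prompt(question: str) -> str:
    """Wrap a question so the answer starts with the step-by-step trigger."""
    return f"Q: {question}\nA: {COT_TRIGGER}"


def split_steps(completion: str) -> List[str]:
    """Split a completion into its non-empty lines, one per reasoning step."""
    return [line.strip() for line in completion.splitlines() if line.strip()]


prompt = build_cot_prompt("A shop sells pens at 3 for $2. How much do 12 pens cost?")

# Hand-written stand-in for a model reply (an assumption, not real output).
sample_completion = (
    "12 pens is 4 groups of 3.\n"
    "Each group costs $2.\n"
    "4 * 2 = 8, so the answer is $8."
)
steps = split_steps(sample_completion)
```

The decomposition itself happens inside the model; the code only shows how the instruction is attached and how the resulting steps might be pulled apart afterwards.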

Tree of Thoughts:

Extends Chain of Thought by exploring multiple reasoning possibilities at each step.
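To make "multiple possibilities at each step" concrete, here is a rough breadth-first sketch with a beam. The `propose`, `score`, and `is_goal` callbacks are assumptions standing in for LLM calls (thought generation and evaluation), and the toy task (pick numbers from 1, 3, 4 that sum to 10) is only for illustration.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # a partial solution, i.e. the thoughts chosen so far


def tree_of_thoughts_bfs(
    root: State,
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 3,
    max_depth: int = 4,
) -> Optional[State]:
    """Expand several candidate thoughts per step and keep the best few."""
    frontier = [root]
    for _ in range(max_depth):
        # Branch: every state on the frontier proposes several next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_goal(candidate):
                return candidate
        # Prune: keep only the highest-scoring states for the next level.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return None


TARGET = 10


def propose(state: State) -> List[State]:
    # Toy generator: append 1, 3, or 4 unless that overshoots the target.
    return [state + (n,) for n in (1, 3, 4) if sum(state) + n <= TARGET]


def score(state: State) -> float:
    # Toy evaluator: states closer to the target score higher.
    return -(TARGET - sum(state))


def is_goal(state: State) -> bool:
    return sum(state) == TARGET


solution = tree_of_thoughts_bfs((), propose, score, is_goal)
```

Chain of Thought corresponds to the special case where each step proposes a single continuation, so there is nothing to prune.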

2.2 Self-Reflection

Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes.

Self-reflection: improving capability across iterations by reviewing past attempts.

ReAct

Integrates reasoning and acting within the LLM by extending the action space to be a combination of task-specific discrete actions and the language space. In short, it merges the task action space with the language space.
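A rough sketch of how the combined action space can look in a loop: the model emits free-text thoughts (language space) plus an action (task space), and each tool result is fed back as an observation. The `Action: name[arg]` syntax, the `finish` action, the `lookup` tool, and the scripted model are all assumptions for illustration.

```python
import re
from typing import Callable, Dict, List

# Assumed action syntax: "Action: tool_name[argument]".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def react_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model steps (thought + action) with tool observations."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected: "Thought: ...\nAction: name[arg]"
        transcript += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            break  # no parsable action, stop instead of guessing
        name, arg = match.group(1), match.group(2)
        if name == "finish":
            return arg  # the argument of finish[...] is the final answer
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"
    return ""


def make_scripted_llm(replies: List[str]) -> Callable[[str], str]:
    """Stand-in for a real model: replays canned replies in order."""
    replies_iter = iter(replies)
    return lambda transcript: next(replies_iter)


facts = {"capital of France": "Paris"}
tools = {"lookup": lambda key: facts.get(key, "not found")}
llm = make_scripted_llm([
    "Thought: I should look this up.\nAction: lookup[capital of France]",
    "Thought: The observation answers the question.\nAction: finish[Paris]",
])
answer = react_loop(llm, tools, "What is the capital of France?")
```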

Reflexion

A framework that equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the action space follows the setup in ReAct, where the task-specific action space is augmented with language to enable complex reasoning steps. In short, an RL mechanism plus an LLM yields interpretable self-reflection.
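A toy sketch of the trial loop as I read it: attempt, receive a binary reward, and on failure store a verbal reflection in memory for the next trial. The `attempt`, `evaluate`, and `reflect` callbacks are hypothetical stand-ins for the actor, the reward model, and the self-reflection step; the number-guessing task is only there to make it run.

```python
from typing import Callable, List, Optional, Tuple


def reflexion_loop(
    attempt: Callable[[List[str]], int],
    evaluate: Callable[[int], int],
    reflect: Callable[[int], str],
    max_trials: int = 3,
) -> Tuple[Optional[int], List[str]]:
    """Retry a task, carrying reflections on failed trials forward in memory."""
    memory: List[str] = []  # dynamic memory of past reflections
    for _ in range(max_trials):
        output = attempt(memory)    # the actor conditions on past reflections
        reward = evaluate(output)   # binary reward: 1 = success, 0 = failure
        if reward == 1:
            return output, memory
        memory.append(reflect(output))  # turn the failure into a verbal lesson
    return None, memory


SECRET = 7


def attempt(memory: List[str]) -> int:
    # Toy actor: start at 5 and go one higher for every stored reflection.
    return 5 + len(memory)


def evaluate(output: int) -> int:
    return 1 if output == SECRET else 0


def reflect(output: int) -> str:
    return f"{output} was wrong; try a larger number."


result, reflections = reflexion_loop(attempt, evaluate, reflect)
```

The reflections stay as plain text, which is what keeps the self-reflection readable.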

Chain of Hindsight

Encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback. In effect, a post-hoc review.
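One way to picture "a sequence of past outputs, each annotated with feedback" is as a single text sequence ordered from worst to best. The `Attempt` record, the rating field, and the exact layout are assumptions for illustration, not the paper's data format.

```python
from typing import List, NamedTuple


class Attempt(NamedTuple):
    output: str     # a past model output
    rating: float   # assumed numeric rating used only for ordering
    feedback: str   # natural-language feedback on that output


def build_hindsight_sequence(prompt: str, attempts: List[Attempt]) -> str:
    """Lay out past outputs from worst to best, each preceded by its feedback."""
    ordered = sorted(attempts, key=lambda a: a.rating)
    lines = [prompt]
    for a in ordered:
        lines.append(f"{a.feedback}: {a.output}")
    return "\n".join(lines)


attempts = [
    Attempt("The agent uses tools.", 0.4, "Too vague"),
    Attempt("The agent plans, remembers, and calls tools.", 0.9, "Clear and complete"),
    Attempt("Agent.", 0.1, "Not a sentence"),
]
sequence = build_hindsight_sequence("Summarize the agent design in one line.", attempts)
```

Ordering by rating means the sequence ends on the best output, so a model trained to continue such sequences is pushed toward the better-rated end.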

3. MEMORY

3.1 Memory Types

Sensory Memory (earliest stage; images/sounds):

This is the earliest stage of memory, providing the ability to retain impressions of sensory information (visual, auditory, etc.) after the original stimuli have ended.

Short-Term Memory (STM):

It stores information that we are currently aware of and that is needed to carry out complex cognitive tasks such as learning and reasoning.

Long-Term Memory (LTM):

Explicit/Declarative memory: life events, facts and concepts
Implicit/Procedural memory: unconscious and involves skills and routines that are performed automatically

3.2 Maximum Inner Product Search (MIPS)

A standard practice is to save the embedding representation of information into a vector store database that can support fast maximum inner-product search (MIPS).
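For intuition, here is the exact (brute-force) version of the search that a vector store speeds up: score every stored embedding by its inner product with the query and keep the top k. The tiny 2-D "embeddings" and the `mips` helper are made up for illustration; real stores use approximate indexes instead of a full scan.

```python
from typing import Dict, List, Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mips(query: Sequence[float], store: Dict[str, Sequence[float]], k: int = 1) -> List[str]:
    """Return the keys of the k stored vectors with the largest inner product."""
    scored = [(inner_product(query, vec), key) for key, vec in store.items()]
    scored.sort(reverse=True)  # highest inner product first
    return [key for _, key in scored[:k]]


# Hypothetical embeddings; a real system would get these from an embedding model.
store = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.3],
    "car": [0.1, 0.95],
}
top_two = mips([1.0, 0.0], store, k=2)
```

The full scan costs time linear in the number of stored vectors per query, which is the cost that approximate nearest-neighbor methods trade a little accuracy to avoid.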

Some algorithms (skipping these for now; not my area):

LSH
ANNOY
HNSW
FAISS
ScaNN

4. TOOL USE

Equipping LLMs with external tools can significantly extend the model capabilities.

HuggingGPT

The LLM acts as the controller.
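A loose sketch of the controller pattern: the LLM turns a request into a task list, and each task is dispatched to whichever registered model handles it. The `plan` function stands in for the LLM call, and the task names and "models" below are hypothetical; this is not HuggingGPT's actual interface.

```python
from typing import Any, Callable, Dict, List

Task = Dict[str, str]  # assumed shape: {"task": <task type>, "input": <text>}


def run_controller(
    request: str,
    plan: Callable[[str], List[Task]],
    models: Dict[str, Callable[[str], Any]],
) -> Dict[str, Any]:
    """Plan tasks with the controller, then route each one to an expert model."""
    results: Dict[str, Any] = {}
    for task in plan(request):  # in a real system this is an LLM call
        model = models.get(task["task"])
        if model is None:
            results[task["task"]] = "no model available"
            continue
        results[task["task"]] = model(task["input"])
    return results


def plan(request: str) -> List[Task]:
    # Hard-coded stand-in for the LLM's task planning.
    return [
        {"task": "uppercase", "input": request},
        {"task": "word_count", "input": request},
        {"task": "translate", "input": request},
    ]


# Plain functions standing in for external expert models.
models = {
    "uppercase": str.upper,
    "word_count": lambda text: len(text.split()),
}
results = run_controller("hello agent world", plan, models)
```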