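Agent Memory Systems: Authoritative References

An index of papers, blog posts, and industry frameworks, ordered from ideas to engineering.

I. Foundational Papers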

1. Generative Agents: Interactive Simulacra of Human Behavior

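Authors: Joon Sung Park et al. (Stanford + Google Research)
Date: April 2023
Link: https://arxiv.org/abs/2304.03442
Why it is required reading: The "genesis" paper for agent memory systems. It proposes a complete cognitive architecture: a Memory Stream; Retrieval, which scores each memory on three dimensions, recency, importance, and relevance (the paper combines them as a weighted sum rather than a product); Reflection, a higher-level synthesis step triggered when accumulated importance crosses a threshold; and hierarchical Planning. Twenty-five agents in the Smallville sandbox town show emergent behavior. The ablation study shows that observation, planning, and reflection are each indispensable (reported effect size d = 8.16).
Key contribution: Established the paradigm that memory is not a database but a cognitive foundation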
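
The retrieval step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `Memory` fields, the toy two-dimensional embeddings, and the function names are hypothetical, and the constants (decay factor 0.995 per hour, equal weights, reflection threshold 150) follow the values the paper reports for its own implementation and should be treated as illustrative defaults.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float          # 1-10 rating (assigned by an LLM in the paper)
    hours_since_access: float  # time since the memory was last retrieved
    embedding: list[float]     # hypothetical toy embedding


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(values: list[float]) -> list[float]:
    # Rescale to [0, 1] so the three signals are comparable before summing.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def retrieve(memories, query_embedding, k=3, decay=0.995, weights=(1.0, 1.0, 1.0)):
    """Return the k memories with the highest weighted-sum score."""
    if not memories:
        return []
    recency = min_max([decay ** m.hours_since_access for m in memories])
    importance = min_max([m.importance for m in memories])
    relevance = min_max([cosine(m.embedding, query_embedding) for m in memories])
    w_rec, w_imp, w_rel = weights
    scored = [
        (w_rec * r + w_imp * i + w_rel * v, m)
        for r, i, v, m in zip(recency, importance, relevance, memories)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


def should_reflect(recent_importances, threshold=150.0):
    # Reflection fires once the summed importance of recent events exceeds a threshold.
    return sum(recent_importances) > threshold
```

A recent, fairly important, on-topic memory outranks one that is only recent or only important, which is the behavior the three-way score is meant to produce.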

2. MemGPT: Towards LLMs as Operating Systems

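Authors: Charles Packer, Sarah Wooders et al. (UC Berkeley)
Date: October 2023
Link: https://arxiv.org/abs/2310.08560
Why it is required reading: Brings the operating-system idea of virtual memory to LLMs: paging between main memory (the context window) and external storage creates the illusion of unbounded context. The agent itself decides when to page memories in and out. This idea directly shaped the later Letta framework and the design of tiered memory systems in general.
Key contribution: Memory management = virtual context management, with the agent autonomously controlling page-in/page-out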
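
The paging idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `VirtualContext`, `count_tokens`, and the keyword search are hypothetical stand-ins, not MemGPT or Letta APIs. In MemGPT the model issues memory operations as function calls; in this sketch eviction is a hard-coded oldest-first policy and page-in is a plain method call.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


class VirtualContext:
    def __init__(self, budget: int):
        self.budget = budget
        self.main: deque[str] = deque()  # in-context messages, oldest first
        self.archive: list[str] = []     # external storage, unbounded

    def used(self) -> int:
        return sum(count_tokens(m) for m in self.main)

    def append(self, message: str) -> None:
        self.main.append(message)
        # Page out the oldest messages until the window fits the budget again.
        while self.used() > self.budget and len(self.main) > 1:
            self.archive.append(self.main.popleft())

    def page_in(self, keyword: str, limit: int = 1) -> list[str]:
        # Move matching archived messages back into the window.
        hits = [m for m in self.archive if keyword.lower() in m.lower()][:limit]
        for hit in hits:
            self.archive.remove(hit)
            self.append(hit)
        return hits
```

Paging a message back in can push other messages out, so the window stays within budget while the archive keeps everything that was evicted.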

II. Survey Papers (for a systematic view of the whole field)

3. A Survey on the Memory Mechanism of Large Language Model based Agents

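Date: April 2024
Link: https://arxiv.org/abs/2404.13501
Summary: Systematically reviews the requirements, design, and evaluation methods of agent memory, covering applications such as personal assistants, social simulation, role-playing, open-world games, code generation, and recommender systems. A good panoramic map.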

4. Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers

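Date: March 2026
Link: https://arxiv.org/abs/2603.07670
Summary: The most recent survey at the time of writing, covering 2022-2026. It formalizes memory as a write-manage-read loop and proposes a three-dimensional taxonomy: temporal scope, representational substrate, and control policy. It identifies five mechanism families: context-resident compression, retrieval-augmented stores, reflective self-improvement, hierarchical virtual context, and policy-learned management.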
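
A toy instance of a write-manage-read loop is sketched below. It is only an illustration of the three phases: the `LoopMemory` class, its age-based forgetting rule, and the word-overlap ranking are hypothetical choices, and the survey's taxonomy dimensions are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    step: int  # logical time at which the record was written


@dataclass
class LoopMemory:
    max_age: int = 100
    records: list[Record] = field(default_factory=list)

    def write(self, text: str, step: int) -> None:
        # Write phase: append raw observations without filtering.
        self.records.append(Record(text, step))

    def manage(self, now: int) -> None:
        # Manage phase: forget stale records and keep the newest copy of duplicates.
        newest: dict[str, Record] = {}
        for r in self.records:
            if now - r.step > self.max_age:
                continue
            kept = newest.get(r.text)
            if kept is None or r.step > kept.step:
                newest[r.text] = r
        self.records = sorted(newest.values(), key=lambda r: r.step)

    def read(self, query: str, k: int = 2) -> list[str]:
        # Read phase: rank by word overlap with the query, dropping non-matches.
        q = set(query.lower().split())
        scored = [(len(q & set(r.text.lower().split())), r) for r in self.records]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [r.text for _, r in hits[:k]]
```

Separating the three phases keeps the write path cheap and lets the manage step run on its own schedule, for example between sessions.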

5. Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions

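Date: May 2025
Link: https://arxiv.org/abs/2505.00675
Summary: Proposes six basic memory operations: Consolidation, Updating, Indexing, Forgetting, Retrieval, and Compression. It approaches the topic from a cognitive-science perspective, which makes it a good way to understand why systems are designed the way they are.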

6. From Human Memory to AI Memory

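Date: April 2025
Link: https://arxiv.org/abs/2504.15965
Summary: Maps human memory mechanisms onto AI memory, with a three-dimensional classification: object, form, and time. It helps you build the chain of reasoning "how the human brain does it → how AI imitates it → where the gaps are".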

III. Key Industry Blog Posts

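7. Harrison Chase: "Your Harness, Your Memory"

Author: Harrison Chase (LangChain co-founder)
Date: April 11, 2026
Link: https://langchain.com/blog/your-harness-your-memory
Key points:
- "Managing context, and therefore memory, is a core capability and responsibility of the agent harness."
- Memory is a core responsibility of the harness, not a pluggable external module
- A closed harness (memory behind an API) means platform lock-in: your memory is controlled by a third party
- The leaked Claude Code source runs to 512,000 lines, all of it harness
Takeaway: Choose an open-source harness plus storage you own (Postgres / MongoDB) so that your memory stays portable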

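8. AgentPatterns.ai: "Harness-Memory Coupling as a Design Axis"

Link: https://agentpatterns.ai/agent-design/harness-memory-coupling/
Key points: Identifies 7 load-bearing seams, the points where harness and memory meet:
1. When instructions are loaded (system prompt / first turn / lazy)
2. Where skill metadata lives (system prompt / messages / retrieved on demand)
3. Whether the agent can modify its own instructions at runtime
4. Which fields must survive context compaction
5. Whether interaction records are persisted, queryable, and re-injectable
6. How stored memories are surfaced to the agent (tool results / system notes / priming)
7. The file-system view visible to the agent on each turn
Takeaway: Switching harnesses means changing the answers to all 7 questions at once. That is why "plugging a memory store into a harness" does not work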
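
One way to make the seven seams concrete is to write them down as an explicit record and diff two harnesses against each other. The sketch below is purely illustrative: `HarnessMemoryContract`, its field names, and the example values are hypothetical and do not come from the article.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HarnessMemoryContract:
    instruction_loading: str               # seam 1: system prompt / first turn / lazy
    skill_metadata_location: str           # seam 2: system prompt / messages / on demand
    self_modifying_instructions: bool      # seam 3: can the agent edit its own instructions
    compaction_survivors: tuple[str, ...]  # seam 4: fields that must survive compaction
    transcript_persistence: str            # seam 5: persisted / queryable / re-injectable
    memory_surfacing: str                  # seam 6: tool results / system notes / priming
    filesystem_view: str                   # seam 7: what the agent can see each turn


def changed_seams(old: HarnessMemoryContract, new: HarnessMemoryContract) -> list[str]:
    """List the seams whose answers differ between two harnesses."""
    return [f.name for f in fields(old) if getattr(old, f.name) != getattr(new, f.name)]
```

Running `changed_seams` before a migration gives a checklist of the memory behaviors that will silently change with the new harness.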

9. Anthropic: "Effective Context Engineering for AI Agents"

Date: September 29, 2025
Link: https://anthropic.com/engineering/effective-context-engineering-for-ai-agents
Key points:
- Context engineering is the natural evolution of prompt engineering
- The core problem is not "writing one good prompt" but managing context across a long session
- This includes retaining memory between turns, compaction (summarizing), and guarding against context rot
- Two platform capabilities, Context Editing and the Memory Tool, were released around the same time
Takeaway: "a clean context beats a large context"
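
Compaction, as summarized in the key points above, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: `summarize` is a placeholder for an LLM call, `count_tokens` is a crude word counter, and none of these names correspond to Anthropic's Context Editing or Memory Tool APIs.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages: list[str]) -> str:
    # Placeholder for an LLM summarization call: keep the first five words of each message.
    return "Summary: " + " | ".join(" ".join(m.split()[:5]) for m in messages)


def compact(history: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    """Replace older turns with one summary once the history exceeds the budget."""
    total = sum(count_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return list(history)
    cut = len(history) - keep_recent
    old, recent = history[:cut], history[cut:]
    return [summarize(old)] + recent
```

The most recent turns are kept verbatim because they usually carry the active task state; a single pass may still leave the history above budget, so a real harness would repeat or tighten the summary.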

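10. Andrej Karpathy: Context Engineering > Prompt Engineering

Date: June 2025
Source: A post on X (Twitter); Tobi Lütke (Shopify CEO) introduced the term first, and Karpathy amplified it
Key points:
- "Context engineering is the delicate art and science of filling the context window"
- Context includes system prompts, user input, conversation history, available tools, memory, and retrieved information
- Too little → the model lacks the information it needs; too much → cost rises and performance drops
- Doing this well "is highly non-trivial"
Takeaway: At its core, memory management is the allocation of an attention budget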
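
The "attention budget" framing can be made concrete with a small greedy allocator. This sketch is an assumption-laden illustration, not something from the post: `Candidate`, the priority values, and the word-count tokenizer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    kind: str        # e.g. "system", "memory", "history", "retrieved"
    text: str
    priority: float  # higher means more valuable to the current step


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def fill_context(candidates: list[Candidate], budget: int) -> list[Candidate]:
    """Greedily admit the highest-priority items that still fit in the token budget."""
    chosen: list[Candidate] = []
    remaining = budget
    for c in sorted(candidates, key=lambda c: c.priority, reverse=True):
        cost = count_tokens(c.text)
        if cost <= remaining:
            chosen.append(c)
            remaining -= cost
    return chosen
```

Items that do not fit are skipped rather than truncated, so a long low-priority document cannot crowd out short high-priority memories.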

11. Hindsight: "The Missing Layer in Every Agent Harness"

Date: May 4, 2026
Link: https://hindsight.vectorize.io/blog/2026/05/04/agent-harness-needs-memory
Key points:
- Today's mainstream harnesses (Claude Code, Cursor, Cline, and others) almost all lack a learning memory layer
- Every session starts from scratch; CLAUDE.md / AGENTS.md are static files that need manual upkeep
- Real memory requires selective retention, cross-session recall, reflection on experience, and scope control
Related post: Your Agent Is Not Forgetful. It Was Never Given a Memory. (April 23, 2026)

IV. Harness Engineering Framework Articles

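12. OpenAI: "Harness Engineering: Leveraging Codex in an Agent-First World"

Author: Ryan Lopopolo (OpenAI)
Date: February 11, 2026
Link: https://openai.com/index/harness-engineering/
Key points:
- Built an internal product of roughly 1 million lines of code with Codex agents: zero hand-written code, 5 months, with the team growing from 3 to 7 people
- The engineer's role shifts from writing code to designing environments, specifying intent, and building feedback loops
- Three pillars: Context Engineering (AGENTS.md as a table of contents), Architectural Constraints (enforced with linters), and Entropy Management (automatic cleanup of low-quality generated code)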

13. VOID: "Harness Engineering: Building Reliable AI Agents in Production"

Date: February 11, 2026
Link: https://void.ma/en/publications/harness-engineering-reliable-ai-agents-production/
Key points: The 6 pillars of a harness:
1. Context Engineering (memory management)
2. Skills & Tools (tool protocols)
3. Workflow (workflow orchestration)
4. Validation Loops
5. Steering & Guardrails
6. Observability
Core principle: "critical → deterministic, creative → non-deterministic. The harness draws the line."
Case study: automated CVE remediation with a 100% locally deployed Qwen3.5-27B

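14. ddhigh: "Harness Engineering: The Core Engineering Discipline of the AI Agent Era"

Date: March 27, 2026
Link: https://www.ddhigh.com/en/2026/03/27/ai-agent-harness-engineering/
Key points:
- The same model's SWE-bench score can differ by 20-30 percentage points across harnesses
- Production incidents: Replit Agent deleting 1,206 database records, Gemini CLI moving user files irrecoverably, and Amazon Q being attacked through a malicious prompt
- AI safety incidents rose 56.4% year over year from 2023 to 2024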

15. tianpan.co: "Harness Engineering: The Discipline That Determines Whether Your AI Agents Actually Work"

Date: February 17, 2026
Link: https://tianpan.co/blog/2026-02-17-harness-engineering-agent-first-software-development
Key points: A two-part framework of Guides (feedforward) and Sensors (feedback). The same model improved from 52.8 to 66.5 on SWE-bench through harness changes alone.

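V. Open-Source Projects (implementations you can compare hands-on)

Project	Core idea	Link
Letta (formerly MemGPT)	Virtual context management, main/archival memory tiers, agent-controlled page-in/out	https://github.com/letta-ai/letta
Mem0	Automatically extracts, deduplicates, and updates memories from conversations; minimal API	https://github.com/mem0ai/mem0
Hindsight	Multi-strategy retrieval (semantic, BM25, graph, temporal ranking), entity graph	https://github.com/hindsight-ai/hindsight
Zep	Time-aware retrieval + automatic summarization + entity graph	https://github.com/getzep/zep
LangChain Memory	Multiple memory types (Buffer, Summary, VectorStore, Entity)	https://python.langchain.com/docs/modules/memory/
Generative Agents	The original Stanford "Smallville" implementation	https://github.com/joonspk-research/generative_agents
Mempalace	Wing/Room/Drawer hierarchy + KG + cross-project Tunnels	(see your memory framework notes)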

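VI. Suggested Reading Path

Stage 1 (understand the paradigm)
  ├── Paper #1: Generative Agents ← the genesis of memory systems
  ├── Paper #2: MemGPT ← memory management from an operating-system perspective
  └── Blog #9: Anthropic Context Engineering ← the evolution from prompt to context

Stage 2 (build the big picture)
  ├── Survey #3 or #4 ← read one in full to build a taxonomy
  └── Survey #5: Rethinking Memory ← the six basic operations

Stage 3 (understand where memory sits in the harness)
  ├── Blog #7: Harrison Chase ← "memory is the skeleton of the harness"
  ├── Blog #8: AgentPatterns ← the 7 load-bearing seams
  └── Blog #12: OpenAI ← what a harness looks like in engineering practice

Stage 4 (hands-on comparison)
  └── Open-source project table → pick Mem0 or Letta and compare it side by side with your own implementation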

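VII. Information Sources (channels for keeping up with new material)

Papers

Channel	Notes
arxiv.org/list/cs.AI	Daily new AI papers; search for "agent memory", "context management", "long-term memory LLM"
arxiv.org/list/cs.CL	The computational linguistics section, where most LLM papers appear
Semantic Scholar	Paper search engine, arguably easier to use than Google Scholar; sorts by citation count or date and offers AI-generated summaries
Papers With Code	Papers paired with their open-source implementations; search "memory agent" to go straight to the code
ConnectedPapers	Enter a seed paper (e.g. Generative Agents) to generate a graph of related papers and discover related work you did not know about
Hugging Face Papers	AI papers voted on daily by the community; practice-oriented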

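Blogs / Newsletters

Channel	Notes
LangChain Blog	Harrison Chase's team; first-hand thinking on agent architecture, memory, and harnesses
Anthropic Engineering Blog	Source of the Context Engineering article and related concepts
OpenAI Blog	Where the Harness Engineering concept was introduced
Letta Blog	The MemGPT team; focused, in-depth work on memory systems
AgentPatterns.ai	A systematic, high-quality catalog of agent design patterns
Simon Willison's Blog	An independent observer of the LLM toolchain; frequent in-depth analysis
Latent Space Podcast	One of the most active podcasts/newsletters in AI engineering; often interviews framework authors directly
The Batch (deeplearning.ai)	Andrew Ng's weekly newsletter; broad coverage, good for keeping a sense of overall direction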

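Communities / Aggregators

Channel	Notes
r/LocalLLaMA	One of Reddit's most active LLM communities; practice-oriented, with frequent discussion of memory approaches
r/LangChain	LangChain ecosystem discussion; the memory module is a frequent topic
Hacker News	Search "agent memory" and "harness engineering"; the comments are often more valuable than the article itself
X/Twitter lists	Follow: @hwchase17 (Harrison Chase), @karpathy, @swabormeister (Sarah Wooders), @AnthropicAI, @simonw
GitHub Trending	Check the Python / TypeScript trending lists regularly; agent-related projects surface periodically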

Courses

Channel	Notes
DeepLearning.AI Short Courses	Free short courses by Andrew Ng in partnership with framework authors, including topics on LangChain memory and agents
Stanford CS324	Stanford's LLM course; covers agent architecture
