Paper Notes - Can We Edit Factual Knowledge by In-Context Learning

1. Information

Title: Can We Edit Factual Knowledge by In-Context Learning?
Link: IKE Paper
Source: Empirical Methods in Natural Language Processing (EMNLP)
Date: 2023.05.22

2. Summary

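This paper explores whether in-context learning (ICL) can be used to edit the factual knowledge stored in large language models (LLMs). It proposes IKE (In-Context Knowledge Editing), a method that leaves the model parameters untouched and instead steers the LLM toward the edited knowledge through a context of constructed demonstrations. Experiments show that IKE matches or exceeds gradient-based methods on efficacy, generalization, and specificity, while substantially reducing computational overhead and side effects.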

3. Background

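Large language models perform well on NLP tasks, but the factual knowledge they store can be outdated or wrong. Traditional knowledge editing methods update model parameters through gradients, which is computationally expensive and prone to side effects. In-context learning requires no parameter updates, and this work explores its potential for knowledge editing.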

4. Research Objective

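The main goal is to explore the potential of ICL for knowledge editing in LLMs, which comes down to two requirements. The first is generalization: the updated prediction should carry over to different textual descriptions of the same knowledge. The second is specificity: modifying the target fact should not affect other, unrelated knowledge.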

5. Method

5.1 In-Context Learning (ICL)

ICL was proposed by Brown et al. in 2020. Given \(k\) demonstrations \(C=\left\{\left(x_1, y_1\right), \ldots,\left(x_k, y_k\right)\right\}\), it predicts the output \(y\) for an input \(x\) without updating any parameters.
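To make the setup concrete, here is a minimal sketch of how \(k\) demonstrations and an input \(x\) could be concatenated into one prompt. The function name `build_icl_prompt`, the newline-separated layout, and the example sentences are all assumptions for illustration; the notes do not prescribe a particular layout.

```python
from typing import List, Tuple


def build_icl_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Concatenate k (x_i, y_i) demonstrations, then append the input x."""
    lines = [f"{x} {y}" for x, y in demos]
    lines.append(query)  # the model is expected to continue this line with y
    return "\n".join(lines)


# Hypothetical demonstrations, chosen only to show the layout.
demos = [
    ("The capital of France is", "Paris."),
    ("The capital of Japan is", "Tokyo."),
]
prompt = build_icl_prompt(demos, "The capital of Italy is")
print(prompt)
```

No parameter is touched anywhere in this process; everything the model learns about the task comes from the text of the prompt.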

5.2 In-Context Knowledge Editing (IKE)

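To inject a target fact \(f=\left(x^*, y^*\right)\) into an LLM, IKE constructs \(k\) demonstrations \(C = \left\{c_1, \ldots, c_k\right\}\). The goal of knowledge editing is to maximize \(\mathcal{P}\left(y^* \mid x, f, C\right)\) when the prompt \(x\) falls inside the editing scope of the target prompt \(x^*\), that is, \(x \in \mathcal{D}_{x^*}\) (the generalization goal), and to minimize the distance between \(\mathcal{P}(y \mid x, f, C)\) and \(\mathcal{P}(y \mid x)\) when \(x \notin \mathcal{D}_{x^*}\) (the specificity goal). Constructing the demonstrations is central to meeting both goals, and the authors split it into two sub-problems: how to format each demonstration, and how to select and order the in-context demonstrations.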
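The two goals can be written side by side as below. This is only a restatement under assumptions: \(d\) stands for some unspecified distance between distributions (the notes do not say which one), and treating the demonstrations \(C\) as the thing being chosen is my reading of the setup.

```latex
\begin{aligned}
\text{generalization:}\quad & \max_{C}\ \mathcal{P}\left(y^* \mid x, f, C\right), && x \in \mathcal{D}_{x^*} \\
\text{specificity:}\quad & \min_{C}\ d\big(\mathcal{P}(\cdot \mid x, f, C),\ \mathcal{P}(\cdot \mid x)\big), && x \notin \mathcal{D}_{x^*}
\end{aligned}
```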

5.2.1 Demonstration Formatting

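Each demonstration \(c_i\) contains a new fact \(f_i=\left(x_i^*, y_i^*\right)\), a probing prompt \(x_i\), and its prediction \(y_i\). The demonstrations should teach the LLM to copy, update, and retain predictions for different prompts:

copy: The first step of injecting a new fact is to teach the LLM to copy the prediction for the target prompt of the new fact. In copy demonstrations, \(x_i=x_i^*\) and \(y_i=y_i^*\).
update: Knowledge editing is more than having the LLM repeat the new fact. For the edit to generalize, predictions for prompts inside the editing scope should be updated as well. In update demonstrations, \(x_i \in \mathcal{D}_{x_i^*}\) and \(y_i=y_i^*\).
retain: For the edit to be specific, the LLM should keep its original predictions on out-of-scope prompts. In retain demonstrations, \(x_i \notin \mathcal{D}_{x_i^*}\) and \(y_i\) should be the original answer \(y^o_i\).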

IKE uses a template \(\mathcal{T}\) to convert \(f\), \(x\), and \(y\) into natural language: \(\mathcal{T}(f, x, y)\) = New Fact: f, Prompt: x y.
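As a small illustration, the template can be rendered by a one-line string function. The name `format_demo` is hypothetical, and putting the fact and the prompt on separate lines is an assumption that follows the layout of the examples in section 7.1 of these notes.

```python
def format_demo(new_fact: str, prompt: str, answer: str) -> str:
    """Render T(f, x, y) as 'New Fact: f' followed by 'Prompt: x y'."""
    # The line break between the two parts is an assumed layout choice.
    return f"New Fact: {new_fact}\nPrompt: {prompt} {answer}"


demo = format_demo(
    "The president of the US is Joe Biden.",
    "The president of the US is?",
    "Joe Biden.",
)
print(demo)
```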

5.2.2 Demonstration Organization

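To edit a fact \(f\) in an LLM, \(k\) demonstrations \(C=\left\{c_1, \ldots, c_k\right\}\) are constructed from the training corpus. An unsupervised retriever selects the \(k\) nearest neighbors by cosine similarity, computed over the input prompt \(x^*\) together with its original answer \(y^o\) and target prediction \(y^*\). Concretely, a pretrained sentence encoder \(E\) encodes the prompt \(x^*\) of the new fact \(f\), the records in the training corpus are encoded in the same way, and the k-NN facts are retrieved by cosine similarity. The order of the in-context demonstrations also depends on cosine similarity: \(\cos \left(c_1, f\right)<\cos \left(c_2, f\right)<\ldots<\cos \left(c_k, f\right)\), where \(c_1, \ldots, c_k\) are placed in the context from left to right, so the demonstration most similar to \(f\) sits last.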
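The retrieve-then-order step can be sketched as follows. The important caveat is that `encode` here is a toy bag-of-words counter standing in for the pretrained sentence encoder \(E\), so that the example runs without any model; the function names and the tiny corpus are likewise hypothetical.

```python
import math
from collections import Counter
from typing import List


def encode(text: str) -> Counter:
    """Toy stand-in for the sentence encoder E: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_and_order(query: str, corpus: List[str], k: int) -> List[str]:
    """Pick the k records most similar to the query, least similar first."""
    q = encode(query)
    ranked = sorted(corpus, key=lambda r: cosine(encode(r), q), reverse=True)
    top_k = ranked[:k]
    # Reverse so that similarity increases from left to right in the context.
    return top_k[::-1]


corpus = [
    "Apple was founded by Steve Jobs",
    "The capital of France is Paris",
    "The president of Russia is Putin",
]
ordered = retrieve_and_order("The president of the US is Joe Biden", corpus, k=2)
print(ordered)
```

With a real sentence encoder only `encode` and `cosine` would change; the selection and the ascending ordering stay the same.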

6. Conclusion

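IKE edits knowledge efficiently without modifying any parameters, and it achieves good generalization and specificity.

The method scales to large models, is computationally cheap, and causes few side effects.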

7. Notes

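7.1 How copy, update, and retain demonstrations are selected

The copy, update, and retain demonstrations are selected in a ratio of 1:3:4.

Copy demonstrations inject the new knowledge directly. Because the target prompt is unique, only a few of them are needed.

Update demonstrations strengthen generalization. Because they must cover many paraphrased questions, relatively more of them are needed.

Retain demonstrations keep unrelated knowledge unchanged. Because the training corpus contains a large number of unrelated prompts, more demonstrations are needed to guide the model to preserve that knowledge correctly.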
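As a quick arithmetic check, the 1:3:4 ratio can be turned into per-type counts for a given context size. The helper `split_by_ratio` and the value `k = 32` are assumptions chosen for illustration; these notes do not state the value of \(k\).

```python
from typing import Dict, Tuple


def split_by_ratio(k: int, ratio: Tuple[int, int, int] = (1, 3, 4)) -> Dict[str, int]:
    """Split k demonstrations into copy/update/retain counts by the given ratio."""
    total = sum(ratio)
    if k % total != 0:
        raise ValueError(f"k={k} is not divisible by the ratio total {total}")
    unit = k // total
    names = ("copy", "update", "retain")
    return {name: part * unit for name, part in zip(names, ratio)}


counts = split_by_ratio(32)  # k = 32 is a hypothetical context size
print(counts)  # {'copy': 4, 'update': 12, 'retain': 16}
```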

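Copy demonstrations

Purpose: teach the model to memorize the target knowledge directly, that is, to predict \(y^*\) given the prompt \(x^*\).

Construction:

\(x_i = x^*\), the target prompt itself.

\(y_i = y^*\), the target answer.

For example: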

New Fact: The president of the US is Joe Biden.
Prompt: The president of the US is? Joe Biden.

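Update demonstrations

Purpose: strengthen generalization, so that the model answers paraphrased questions about the target knowledge correctly.

Construction:

Select from the training corpus prompts related to the target prompt \(x^*\), that is, \(x_i \in \mathcal{D}_{x^*}\).

Set every answer to the target answer \(y^*\).

For example: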

New Fact: The president of the US is Joe Biden.
Prompt: Who is the president of the US? Joe Biden.
New Fact: The president of the US is Joe Biden.
Prompt: The current president of the US is? Joe Biden.

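Retain demonstrations

Purpose: make sure unrelated knowledge is not affected, so that the model keeps its original predictions.

Construction:

Select from the training corpus prompts unrelated to the target prompt, that is, \(x_i \notin \mathcal{D}_{x^*}\).

Keep the original answer, \(y_i = y^o_i\).

For example: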

New Fact: The president of the US is Joe Biden.
Prompt: Who is the president of Russia? Putin.
New Fact: The president of the US is Joe Biden.
Prompt: Who created Apple Inc.? Steve Jobs.
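The three construction rules above can be summarized in one small function. This is a sketch under a strong simplification: the editing scope \(\mathcal{D}_{x^*}\) is passed in as an explicit set of paraphrases, which is an assumption made so the example can run, and the names `Demo` and `make_demo` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Demo:
    kind: str      # "copy", "update", or "retain"
    new_fact: str
    prompt: str
    answer: str


def make_demo(new_fact: str, target_prompt: str, target_answer: str,
              probe: str, scope: Set[str],
              original_answer: Optional[str] = None) -> Demo:
    """Build one demonstration according to where the probe prompt falls."""
    if probe == target_prompt:
        # copy: the probe is the target prompt, answer with y*
        return Demo("copy", new_fact, probe, target_answer)
    if probe in scope:
        # update: an in-scope paraphrase, answer with y* as well
        return Demo("update", new_fact, probe, target_answer)
    if original_answer is None:
        raise ValueError("a retain demonstration needs the original answer")
    # retain: an out-of-scope prompt keeps its original answer y^o
    return Demo("retain", new_fact, probe, original_answer)


fact = "The president of the US is Joe Biden."
target = "The president of the US is?"
scope = {"Who is the president of the US?"}  # assumed, explicitly listed scope

copy_demo = make_demo(fact, target, "Joe Biden.", target, scope)
update_demo = make_demo(fact, target, "Joe Biden.", "Who is the president of the US?", scope)
retain_demo = make_demo(fact, target, "Joe Biden.", "Who created Apple Inc.?", scope, "Steve Jobs.")
print(copy_demo.kind, update_demo.kind, retain_demo.kind)
```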

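7.2 An example of editing a model with IKE

Suppose we want to edit one fact in a language model, updating "the president of the US is Obama" to "the president of the US is Biden".

Old fact: The president of the US is Obama.

New fact: The president of the US is Joe Biden.
Demonstration Formatting: following the paper, three types of demonstrations are needed: copy, update, and retain.
Copy demonstration: injects the new fact directly.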
New Fact: The president of the US is Joe Biden.
Prompt: The president of the US is?
Answer: Joe Biden.
Update demonstration: updates the answer for a prompt related to the target fact.
New Fact: The president of the US is Joe Biden.
Prompt: Who is the current president of the US?
Answer: Joe Biden.
Retain demonstration: keeps the original prediction for a prompt unrelated to the target fact.
New Fact: The president of the US is Joe Biden.
Prompt: Who is the president of Russia?
Answer: Putin.
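Demonstration Organization: retrieve from the training corpus the demonstrations most relevant to the target fact (in a 1:3:4 ratio) and order them by relevance. Suppose the following demonstrations are selected: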
Context C:
- New Fact: The president of the US is Joe Biden. Prompt: The president of the US is? Answer: Joe Biden.
- New Fact: The president of the US is Joe Biden. Prompt: Who is the current president of the US? Answer: Joe Biden.
- New Fact: The president of the US is Joe Biden. Prompt: Who was the previous president of the US? Answer: Obama.
- New Fact: The president of the US is Joe Biden. Prompt: Who is the president of Russia? Answer: Putin.
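Model input and output: the demonstrations above are fed to the language model as context, and the model adjusts its prediction based on them.
Model input: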
Context C:
New Fact: The president of the US is Joe Biden. Prompt: The president of the US is? Answer: Joe Biden.
New Fact: The president of the US is Joe Biden. Prompt: Who is the current president of the US? Answer: Joe Biden.
New Fact: The president of the US is Joe Biden. Prompt: Who was the previous president of the US? Answer: Obama.
New Fact: The president of the US is Joe Biden. Prompt: Who is the president of Russia? Answer: Putin.

Query: Who is the president of the US?
Expected output:
Answer: Joe Biden.
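The model input in this walkthrough can be assembled with a few lines of string handling. This sketch only builds the text; it makes no call to a language model. The function name `build_ike_input` is hypothetical, and the layout (a `Context C:` header, one demonstration per line, a blank line, then `Query:`) simply mirrors the example above.

```python
from typing import List, Tuple


def build_ike_input(new_fact: str, demos: List[Tuple[str, str]], query: str) -> str:
    """Lay out the context demonstrations followed by the query."""
    lines = ["Context C:"]
    for prompt, answer in demos:
        lines.append(f"New Fact: {new_fact} Prompt: {prompt} Answer: {answer}")
    lines.append("")  # blank line between the context and the query
    lines.append(f"Query: {query}")
    return "\n".join(lines)


demos = [
    ("The president of the US is?", "Joe Biden."),
    ("Who is the current president of the US?", "Joe Biden."),
    ("Who was the previous president of the US?", "Obama."),
    ("Who is the president of Russia?", "Putin."),
]
model_input = build_ike_input(
    "The president of the US is Joe Biden.",
    demos,
    "Who is the president of the US?",
)
print(model_input)
```

The resulting string would then be passed to the unmodified model, which is expected to continue with the edited answer.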


http://hellochuanyang.github.io/2025/01/19/论文笔记-Can-We-Edit-Factual-Knowledge-by-In-Context-Learning/

Author

阿阳

Published on

January 19, 2025
