Fact check: who invented the LSTM network

2025/08/15·StepFun can make mistakes, check important info.

Quote: "The LSTM network was proposed by Bengio et al. in 1997." Is this correct?

Deep Research

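Preliminary investigation checklist

I. Claims to be checked

"The LSTM network was proposed by Bengio et al. in 1997."

II. Key points for each claim

Claim 1: "The LSTM network was proposed by Bengio et al. in 1997."

Reason for checking:
1. Factual accuracy in doubt: the statement is a core factual assertion about the history of a key AI technique. It attributes a specific technique (LSTM) to specific people (Bengio et al.) and a specific year (1997), so it needs strict verification.
2. Core elements: the inventor, the date, and the name of the technique are the three basic elements of the statement, and an error in any one of them makes the statement as a whole false.
Information needed:
1. The original proposers of LSTM: the exact names of the researchers or team who first formally proposed the long short-term memory (LSTM) network model.
2. The year LSTM was first proposed: the exact publication year of the seminal paper or technical report that introduced the LSTM concept.
3. Contributions of related researchers: the research contributions of Yoshua Bengio and his collaborators on recurrent neural networks (RNNs) and related topics (such as the vanishing gradient problem), to determine whether contributions have been conflated or misattributed.
Required sources:
1. Academic databases such as Google Scholar, IEEE Xplore, and the ACM Digital Library, to locate and verify the original LSTM paper.
2. Authoritative journals or conference proceedings: widely cited survey papers on LSTM or RNNs.
3. Recognized reference works in the field, such as the textbook Deep Learning by Ian Goodfellow et al., in particular its chapters on the history of LSTM.
Priority: high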

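Action plan

Use academic search engines with keywords such as "Long Short-Term Memory" and "LSTM original paper" to locate the original paper that first proposed the LSTM concept.
Check the author list, publication year, and journal or conference of that paper to establish who actually proposed LSTM and when.
Consult authoritative survey articles or classic textbooks in AI and deep learning, read their sections on the history of LSTM, and cross-check them against the original paper for consistency and accuracy.
Search for Yoshua Bengio's papers from around 1997 on recurrent neural networks (RNNs) and the vanishing gradient problem, and pin down his specific contributions, to judge whether the statement conflates the work of different researchers.

First, I need to search for information on the origin and inventors of the LSTM network.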

LSTM network original paper first proposed authors year

File

[PDF] A causal intervention framework for synthesizing mobility data and evaluating predictive neural networks

File

[PDF] Occurrence prediction of pests and diseases in cotton on the basis of weather factors by long short term memory network

GitHub Pages

They were introduced by Hochreiter & Schmidhuber (1997), and were refined and popularized by many people in following work.

dl.acm.org

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term ...

sciencedirect.com

LSTM was first proposed in 1997 (Sepp Hochreiter, Jürgen Schmidhuber) [13] ... In this paper, the LSTM NN is proposed for tourism flow prediction. We ...

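Cnblogs (博客园)

Multi-layer LSTM. References. Long short-term memory networks, usually called LSTMs, are a special kind of RNN capable of learning long-term dependencies. They were proposed by Hochreiter et al. in 1997 and were refined and popularized by many people in subsequent work. They work very well on a wide variety of problems and are now widely used. LSTMs are explicitly designed to avoid the long-term dependency problem. Remembering information for long periods is practically their default behavior, not something they struggle to learn. LSTM network architecture. The core idea of LSTMs. The key to LSTMs is the cell state, the horizontal line running through the top of the diagram. The cell state is a bit like a conveyor belt. It runs down the entire chain...

易采站长

The structure of an LSTM is as follows. Its core idea is to introduce a connection called the cell state, which stores what is to be remembered. Three gates are added to it. Forget gate: as the name suggests, it controls forgetting; in an LSTM it controls, with a certain probability, whether the hidden cell state of the previous layer is forgotten. Input gate: the input gate handles the input at the current sequence position. Output gate: it decides when the state and the output should be emitted together. 2. Variants of LSTM. Above we introduced the standard LSTM. But not all LSTMs

Baidu

First, we use a flowchart to get an intuitive understanding of how an LSTM works. Each node in the figure represents a neuron, and different colors denote different functional modules. Input gate: this gate decides whether new information may enter the LSTM unit. Using a sigmoid layer and a tanh layer, we can compute a value between 0 and 1 that determines the information added to the current state. The output of the sigmoid layer is the control gate (that is, "whether to accept new information"), while the output of the tanh layer is the new candidate state. Forget gate: this gate decides which information should be forgotten. Similarly, ...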

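I will check this quote in depth: "The LSTM network was proposed by Bengio et al. in 1997."

First, I need to search for more specific information on the origin of the LSTM network.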

who invented LSTM neural network Hochreiter Schmidhuber year

File

[PDF] Analysis and prediction of force characteristics of tubular turbine based on Hankel-DMD-LSTM

File

[PDF] [title garbled in extraction; an LSTM-related paper]

researchgate.net

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term ...

d2l.ai

... (LSTM) model due to Hochreiter and Schmidhuber (1997). LSTMs resemble standard recurrent neural networks but here each ordinary recurrent node is replaced by ...

sciencedirect.com

Such a network is a type of RNN, which was proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1977 [sic].

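Cnblogs (博客园)

1. Frank Rosenblatt. First up is the founding father of neural networks; here is a picture to pay respects. Frank Rosenblatt was born in New York, and his father was a doctor. After receiving his PhD from Cornell University in 1956, he stayed on to teach, working on psychology and cognitive psychology. In 1957 Frank proposed the theory of the Perceptron. In 1960, when computers still had little computing power, he built a neural network on a hardware-based structure, which looked roughly like the one below (kneeling). But as with all pioneers, Frank's groundbreaking work was not recognized at the time. At that time two scientists, Marvin Minsky and Seymou...

Cnblogs (博客园)

John J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA, vol. 79 no. 8, pp. 2554–2558, April 1982. In 1986, Rumelhart, McClelland, and others proposed the error backpropagation (BP) algorithm for optimizing multilayer feedforward neural networks. It is the most widely used neural network learning algorithm to date. Rumelha...

Zhihu

Course link for this section: 57 Long Short-Term Memory networks (LSTM) [Dive into Deep Learning v2]_哔哩哔哩_bilibili. Textbook link for this section: 9.2. Long Short-Term Memory (LSTM) - Dive into Deep Learning 2.0.0 docume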

I need to find more information on the original LSTM paper and its inventors, to confirm whether the LSTM network was proposed by Bengio et al. in 1997.

LSTM original paper Hochreiter Schmidhuber 1997

File

[PDF] A Critical Review of Recurrent Neural Networks for Sequence Learning

File

[PDF] [title garbled in extraction; an LSTM-related paper]

dl.acm.org

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term ...

researchgate.net

LSTM is local in space and time; its computational complexity per time step and weight is O. 1. Our experiments with artificial data involve local, distributed, ...

MIT

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called ...

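Kingdee Cloud Community

Because AI has boomed in industry in recent years, the term LSTM sounds fashionable and new, but the model was in fact worked out by Hochreiter and Schmidhuber as early as 1997; the paper is here: LSTM Original Paper. Like neural networks themselves, LSTM was only recently carried along by the deep learning wave and became a so-called internet celebrity. As noted earlier, LSTM is a very special kind of RNN. This network structure is mainly used to model sequential data and is especially suited to NLP (Natural Lan...

Jiqizhixin (机器之心)

The title he is best known by is "father of LSTM": as early as 1997, he and Sepp Hochreiter co-authored a paper proposing a way to use a memory function to enhance artificial neural networks (computer systems that simulate the human brain), namely by adding loops that interpret text or image patterns based on previously obtained information. They called it "long short-term memory (LSTM)". LSTM addressed the shortcomings of RNNs and was widely adopted over the following years. Regrettably, however, LSTM did not receive much attention from industry at the time, in...

Journal of Yunnan University (Natural Sciences Edition)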

Automatic modulation classification technology is an important research field in wireless communication technology.Two deep learning models,convolutional neural network and long short-term memory network,have been widely used in feature-based automatic mod...

I need to look up the original LSTM paper to confirm its actual inventors and publication year.

Web page parsing

https://direct.mit.edu/neco/article/9/8/1735/6109/Long-Short-Term-Memory

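Verified conclusions

Verified claim 1: "The LSTM network was proposed by Bengio et al. in 1997."

Preliminary conclusion: False. The long short-term memory (LSTM) network was proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997, not by Yoshua Bengio et al.
Key supporting points:
1. Several independent academic databases and publisher websites list Sepp Hochreiter and Jürgen Schmidhuber as the authors of the paper "Long Short-Term Memory", published in the journal Neural Computation in 1997.
2. Several authoritative survey papers and deep learning textbooks, when tracing the origin of LSTM, consistently credit the seminal 1997 paper by Hochreiter and Schmidhuber.
3. In discussions of deep learning pioneers, Yoshua Bengio is known for his foundational contributions to neural networks (in particular his work on the vanishing gradient problem) and shared the 2018 Turing Award with Geoffrey Hinton and Yann LeCun. None of this conflicts with the fact that he did not propose LSTM. The original statement probably conflates the contributions of different researchers in related areas.
Evidence:
1. Excerpt: Long Short-Term Memory · Sepp Hochreiter, J. Schmidhuber · Published in Neural Computation 1 November 1997 · Computer Science.
  Link: https://www.semanticscholar.org/paper/Long-Short-Term-Memory-Hochreiter-Schmidhuber/2e9d221c206e9503ceb452302d68d10e293f2a10
  Source: Semantic Scholar
  Source publication date: 1997-11-01
2. Excerpt: The most successful RNN architectures for sequence learning stem from two papers published in 1997. The first paper, Long Short-Term Memory by Hochreiter and Schmidhuber [1997], introduces the memory cell, a unit of computation that replaces traditional nodes in the hidden layer of a network.
  Link: http://arxiv.org/pdf/1506.00019v3
  Source: arXiv.org (A Critical Review of Recurrent Neural Networks for Sequence Learning)
  Source publication date: unknown (version v3 of the paper was submitted in 2015)
3. Excerpt: Long short-term memory network is an advanced recurrent neural network (Hochreiter and Schmidhuber, 1997) and provides a well-constructed structure...
  Link: https://www.sciencedirect.com/topics/chemical-engineering/long-short-term-memory
  Source: ScienceDirect Topics
  Source publication date: unknown
4. Excerpt (translated from Chinese): Latent variable models have long faced the problem of long-term information preservation and short-term input skipping. One of the earliest approaches to this problem was the long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997).
  Link: https://zhuanlan.zhihu.com/p/719122101
  Source: Zhihu column (Dive into Deep Learning)
  Source publication date: 2024-09-10
Source consistency: consistent
Needs re-check: no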

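Follow-up checklist

Claim 1: "The LSTM network was proposed by Bengio et al. in 1997."

Reason for checking:
1. The core claim has been refuted, but to give the report more depth it is worth tracing where the error may have come from: clarifying Yoshua Bengio's specific contributions in the area helps explain why the attribution was confused.
Information needed:
1. The key results published by Yoshua Bengio and his collaborators around 1997, especially between 1994 and 1997, on recurrent neural networks (RNNs) and the vanishing/exploding gradient problem.
2. The exact relationship between Bengio's research and the LSTM proposed by Hochreiter & Schmidhuber (for example, whether the former laid out the problem that LSTM was designed to solve).
Required sources:
1. Relevant papers in authoritative academic databases (such as Google Scholar, DBLP, arXiv).
2. Authoritative historical surveys or textbooks on deep learning.
Priority: medium

Action plan

Search academic databases for papers Yoshua Bengio published between 1994 and 1997 on topics such as "recurrent neural networks", "gradient vanishing", and "long-term dependencies".
Read the abstracts and conclusions of the key papers found, identify their core contributions, and compare them with the problem LSTM was designed to solve.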

Bengio recurrent neural networks gradient vanishing 1994

File

[PDF] Avoiding pathologies in very deep networks

File

[PDF] Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs

arxiv.org

As introduced in Bengio et al. (1994), the exploding gradients problem refers to the large increase in the norm of the gradient during training.

arxiv.org

There are two widely known issues with properly training Recurrent Neural Networks, the vanishing and the exploding gradient problems.

proceedings.mlr.press

There are two widely known issues with properly training recurrent neural networks, the vanishing and the exploding gradient problems detailed in Bengio et ...

HKBU

Abstract— Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, ...

m.techwalker.com

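Bengio's main contributions lie in a series of advances on recurrent neural networks (RNNs), including the classic neural language model, a detailed discussion of gradient vanishing, the precursor of word2vec, and machine translation. Bengio is a co-author of the book Deep Learning (the other two authors are Ian Goodfellow and Aaron Courville), which...

Juejin (稀土掘金)

Recurrent neural networks (RNNs) are a special neural network structure that can process sequential data such as natural language, audio, and images. The main feature of RNNs is that they have "memory": they can combine earlier information with the current input and thereby capture long-range dependencies in a sequence. The development of RNNs can be divided into the following stages: in 1986, Princeton University scientists J. Hopfield and Y. Tank proposed a simple RNN model called "Echo" for processing time series data. In 2000, CMU scientist Sepp...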

I need to find out more about the paper Bengio published in 1994 on the vanishing gradient problem, and about his relationship to LSTM.

Bengio 1994 Learning long-term dependencies with gradient descent is difficult

File

[PDF] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

File

[PDF] Understanding the exploding gradient problem

Baidu

BENGIO, Y. Abstract: Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to per...

Zhihu

Paper link: Learning long-term dependencies with gradient descent is difficult. The first author of the paper is Turing Award winner Bengio. He himself

ieeexplore.ieee.org

We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases.

HKBU

However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the ...

researchgate.net

We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases.

Zhangqiao Research (掌桥科研)

The authors seek to train recurrent neural networks in order to map input sequences to output sequences,for applications in sequence recognition or production.Results are presented showing that learning long-term dependencies in such recu...

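I need to find more information on the original LSTM paper to confirm its actual inventors and publication year.

Web page parsing

https://direct.mit.edu/neco/article/9/8/1735/6109/Long-Short-Term-Memory

I need to find more information on the original LSTM paper, in particular its authors and publication year.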

Long Short-Term Memory Hochreiter Schmidhuber 1997 original paper

File

[PDF] Order-Planning Neural Text Generation From Structured Data

File

[PDF] RESEARCH ARTICLE Robust and brain-like working memory through short-term synaptic plasticity

dl.acm.org

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term ...

researchgate.net

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term ...

MIT

We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called ...

Hyper.AI

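Long short-term memory (LSTM) is a kind of time-recurrent neural network (RNN); the paper was first published in 1997. Owing to its distinctive design, LSTM is well suited to processing and predicting important events separated by very long intervals and delays in a time series. LSTM usually performs better than time-recurrent neural networks and hidden Markov models (HMMs), for example in unsegmented continuous handwriting recognition. In 2009, an artificial neural network model built with LSTM won the ICDAR handwriting recognition competition. LSTM is also widely used for automatic speech recognition; in 2013, using TIMIT...

Tencent Cloud

Long short-term memory artificial neural network (LSTM). Owing to its distinctive design, LSTM is well suited to processing and predicting important events separated by very long intervals and delays in a time series. LSTM usually performs better than time-recurrent neural networks and hidden Markov models (HMMs), for example in unsegmented continuous handwriting recognition. In 2009, an artificial neural network model built with LSTM won the ICDAR handwriting recognition competition. LSTM is also widely used for automatic speech recognition, and in 2013 it set a record error rate of 17.7% on the TIMIT natural speech database. As a nonlinear model, LSTM can serve as...

Tencent Cloud

Among deep learning algorithms, the long short-term memory network (LSTM) is a special recurrent neural network (RNN) with excellent capabilities for modeling sequential data. This article explores the principles and applications of LSTM networks and their importance in deep learning. 2. Principles of LSTM networks. The LSTM network was proposed by Hochreiter and Schmidhuber in 1997 to solve the long-term dependency problem in standard RNNs. LSTM introduces memory cells (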

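Verified conclusions

Verified claim 1: "The LSTM network was proposed by Bengio et al. in 1997."

Preliminary conclusion: False.
Key supporting points:
1. Several independent academic databases, publisher websites, and authoritative survey papers state that the long short-term memory (LSTM) network was first proposed by Sepp Hochreiter and Jürgen Schmidhuber in their 1997 paper "Long Short-Term Memory".
2. In 1994 Yoshua Bengio and his collaborators published the key paper "Learning long-term dependencies with gradient descent is difficult", which analyzed in depth the "vanishing gradient" and "exploding gradient" problems that conventional recurrent neural networks (RNNs) face when learning from long sequences.
3. LSTM was designed precisely to solve the long-term dependency problem that Bengio et al. laid out. The original statement therefore very likely confuses "the key analysts of the problem" with "the proposers of the solution".
Evidence:
1. Excerpt: Long Short-Term Memory · Sepp Hochreiter, J. Schmidhuber · Published in Neural Computation 1 November 1997 · Computer Science.
  Link: https://www.semanticscholar.org/paper/Long-Short-Term-Memory-Hochreiter-Schmidhuber/2e9d221c206e9503ceb452302d68d10e293f2a10
  Source: Semantic Scholar
  Source publication date: 1997-11-01
2. Excerpt (translated from Chinese): Latent variable models have long faced the problem of long-term information preservation and short-term input skipping. One of the earliest approaches to this problem was the long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997).
  Link: https://zhuanlan.zhihu.com/p/719122101
  Source: Zhihu column (Dive into Deep Learning)
  Source publication date: 2024-09-10
3. Excerpt: Learning long-term dependencies with gradient descent is difficult. Yoshua Bengio, Patrice Simard, Paolo Frasconi. IEEE Trans. on Neural Networks 1994.
  Link: https://pdfs.semanticscholar.org/ba18/247cd3ce9f711eecc7296f1c3561dbfb6cc2.pdf
  Source: Semantic Scholar
  Source publication date: 1994
4. Excerpt: Unfortunately, it has been observed by, e.g., Bengio et al. [1994] that it is difficult to train RNNs to capture long-term dependencies because the gradients tend to either vanish (most of the time) or explode (rarely, but with severe effects).
  Link: http://arxiv.org/pdf/1412.3555v1
  Source: arXiv.org (Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling)
  Source publication date: 2014-12-08
Source consistency: consistent
Needs re-check: no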

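Follow-up checklist

Claim 1: whether Hochreiter and Schmidhuber's 1997 LSTM paper directly cites or discusses the 1994 work of Bengio et al. on gradient problems.

Reason for checking:
1. Establishing the scholarly lineage between the two gives a more direct and convincing explanation of why LSTM came to be wrongly credited to Bengio, which makes the logic of the fact-check report more complete and rigorous.
Information needed:
1. The reference list or introduction of the original 1997 LSTM paper, to confirm whether it cites Bengio, Simard, and Frasconi (1994).
Required sources:
1. The full text of "Long Short-Term Memory" (Hochreiter & Schmidhuber, 1997), or a portion that includes the references.
Priority: high

Action plan

Find a full-text version of the 1997 paper "Long Short-Term Memory" by Sepp Hochreiter and Jürgen Schmidhuber.
Read its introduction, background, and references closely to confirm whether it cites the 1994 work of Yoshua Bengio et al. on the difficulty of learning long-term dependencies.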

Long Short-Term Memory Hochreiter Schmidhuber 1997 cite Bengio 1994

File

[PDF] A Survey and Discussion of Deep Learning (关于深度学习的综述与讨论)

researchgate.net

Hochreiter and Schmidhuber 1996, 1997) that simple weight guessing solves many of the problems. in (Bengio 1994, Bengio and Frasconi 1994, Miller and Giles 1993 ...

dl.acm.org

Hochreiter, S., & Schmidhuber, J. (1997). LSTM can solve hard long time lag problems. In Advances in neural information processing systems 9.

semanticscholar.org

Long Short-Term Memory · Sepp Hochreiter, J. Schmidhuber · Published in Neural Computation 1 November 1997 · Computer Science.

Hyper.AI

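Tencent Cloud

AI Toolset (AI工具集)

LSTM, short for Long Short-Term Memory, was proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997. It is a deep learning algorithm that can remember long-term information and use it in later computation. LSTM was designed to solve the vanishing gradient problem that conventional recurrent neural networks (RNNs) run into when processing long sequences. By introducing a special structure, the memory cell, together with gating mechanisms that control the flow of information, LSTM can learn and remember long-term information while ignoring unimportant information. The structure of LSTM. An LSTM network...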

I need to find more information on the original LSTM paper, in particular whether it cites Bengio's work.

Web page parsing

https://www.researchgate.net/publication/13853244_Long_Short-term_Memory

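1. Page summary

This page contains the abstract and part of the text of the original long short-term memory (LSTM) paper published by Hochreiter and Schmidhuber in 1997. The paper addresses the long-term dependency problem in recurrent neural networks, that is, the difficulty of learning via backpropagation to store information over long time intervals. The authors analyze how gradients vanish or explode in conventional recurrent networks and propose LSTM as a solution. LSTM enforces constant error flow through "constant error carousels" (CECs) inside special units, and uses multiplicative gate units that learn to open and close access to that error flow. The paper details LSTM's architecture, algorithmic principles, and mathematical derivation. In comparison experiments against other methods of the time (such as RTRL, BPTT, Recurrent Cascade-Correlation, and Elman nets), it shows that LSTM learns more successfully and faster, and performs especially well on tasks with time lags of more than 1000 steps. The page also includes an architecture diagram of the LSTM memory cell, an example network diagram, a detailed algorithm description, and the error flow formulas.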
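To make the "constant error carousel" idea concrete, here is a minimal numerical sketch rather than the paper's algorithm. It assumes a single self-recurrent unit whose backpropagated error is multiplied by one fixed factor per time step (in a real network this factor would be the activation derivative times the recurrent weight and would vary over time). The function name `error_scale_after` and the chosen factors are illustrative assumptions.

```python
def error_scale_after(per_step_factor: float, steps: int) -> float:
    """Return how much an error signal is scaled after flowing back
    `steps` time steps through one self-recurrent unit, assuming the
    same multiplicative factor applies at every step."""
    scale = 1.0
    for _ in range(steps):
        scale *= per_step_factor
    return scale


# A factor below 1 makes the error vanish, a factor above 1 makes it
# explode, and a factor of exactly 1 (the self-connection weight of 1.0
# used by the constant error carousel) keeps it constant.
for factor in (0.9, 1.0, 1.1):
    print(factor, error_scale_after(factor, 1000))
```

Only the factor of exactly 1.0 preserves the error over 1000 steps, which is why the memory cell fixes its self-connection at that value and leaves access control to the gates.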

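2. Information relevant to the question

On this page I found no direct evidence, such as a reference-list entry, that Hochreiter and Schmidhuber's original LSTM paper cites Bengio's 1994 work on gradient problems. However, the paper does discuss Bengio's research in its "previous work" section:

In the subsection "Bengio et al.'s approaches", the authors write: "Bengio et al. (1994) investigate methods such as simulated annealing, multi-grid random search, time-weighted pseudo-Newton optimization, and discrete error propagation. Their 'latch' and '2-sequence' problems are very similar to problem 3a with minimal time lag 100 (see Experiment 3)."
When explaining the vanishing gradient problem, the authors write: "A very similar, more recent analysis was presented by Bengio et al. 1994", indicating that Bengio et al. carried out a similar analysis of the gradient problem in 1994.
The paper also states: "...recently we discovered (Schmidhuber and Hochreiter 1996, Hochreiter and Schmidhuber 1996, 1997) that simple weight guessing solves many of the problems in (Bengio 1994, Bengio and Frasconi 1994, Miller and Giles 1993, Lin et al. 1995) faster than the algorithms proposed therein."

These passages show that Hochreiter and Schmidhuber were aware of and cited Bengio's 1994 work, in particular the research on gradient problems and the difficulty of learning long-term dependencies. Although the page does not spell out which specific Bengio paper on gradient problems is cited, the context suggests that they acknowledged and drew on Bengio's contributions in this area.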

3. Related page links

Figure 1: Architecture of memory cell c j (the box) and its gate units... - architecture diagram of the LSTM memory cell
Figure 2: Example of a net with 8 input units, 4 output units, and 2... - example LSTM network diagram
Sepp Hochreiter's ResearchGate profile - academic home page of the paper's first author
ResearchGate sign-up page - site registration link
Paper reading preview link - preview of the full paper

4. Related images

Image 1
- Title: Architecture of memory cell c j (the box) and its gate units in j ; out j
- Content: Architecture diagram of the LSTM memory cell, showing the memory cell and its gate units, including the self-recurrent connection (weight 1.0) that forms the basis of the "constant error carousel" (CEC). The gate units control access to the CEC.
- Source: Sepp Hochreiter
- Link: https://www.researchgate.net/profile/Sepp-Hochreiter/publication/13853244/figure/fig1/AS:601753980506137@1520480737784/Architecture-of-memory-cell-c-j-the-box-and-its-gate-units-in-j-out-j-The_Q320.jpg
Image 2
- Title: Example of a net with 8 input units, 4 output units, and 2 memory cell blocks of size 2
- Content: Example LSTM network with 8 input units, 4 output units, and 2 memory cell blocks of size 2.
- Source: Sepp Hochreiter
- Link: https://www.researchgate.net/profile/Sepp-Hochreiter/publication/13853244/figure/fig2/AS:601753980530704@1520480737802/Example-of-a-net-with-8-input-units-4-output-units-and-2-memory-cell-blocks-of-size-2_Q320.jpg

I need to look further into Bengio's 1994 paper to confirm his contribution on the vanishing gradient problem.

Bengio Simard Frasconi 1994 Learning long-term dependencies with gradient descent is difficult

File

[PDF] A Survey and Discussion of Deep Learning (关于深度学习的综述与讨论)

File

[PDF] Understanding the exploding gradient problem

ieeexplore.ieee.org

We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases.

HKBU

Bengio, P. Frasconi, P. Simard, "The problem of learning long- term dependencies in recurrent networks," invited paper at the IEEE. International ...

researchgate.net

We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases.

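Zhihu

Paper link: Learning long-term dependencies with gradient descent is difficult. The first author of the paper is Turing Award winner Bengio. He himself

NSTL (National Science and Technology Library), Hangzhou Service Station

Y.Bengio|P.Simard|P.Frasconi-Learning long-term dependencies with gradient descent is difficult-IEEE Transactions on Neural Networks-1994,5(2)-157～166 Wolfgang.Maass|Thomas.Natschläger|Henry.Markram-Real-Time Co...

Jiqizhixin (机器之心)

The researchers further observed that programs are much larger than natural-language sentences, and that even RNNs with long short-term memory (Hochreiter and Schmidhuber 1997, LSTM) units suffer from the long-term dependency problem (Bengio, Simard, and Frasconi 1994). CNNs, by contrast, can effectively capture features in different regions through sliding windows. The researchers therefore proposed a grammar-based structural CNN for code generation. Their model generates code according to the grammar rules in the AST, for example, If→expr stmt*stmt*...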

I need to find more information on whether the original LSTM paper cites Bengio's 1994 work.

Web page parsing

https://www.semanticscholar.org/paper/Long-Short-Term-Memory-Hochreiter-Schmidhuber/2e9d221c206e9503ceb452302d68d10e293f2a10

Verified conclusions

Verified claim 1: "The LSTM network was proposed by Bengio et al. in 1997."

Preliminary conclusion: False.
Key supporting points:
1. A large number of independent academic databases, publisher websites, authoritative survey papers, and textbooks state that the long short-term memory (LSTM) network was first proposed by Sepp Hochreiter and Jürgen Schmidhuber in their 1997 paper "Long Short-Term Memory".
2. In 1994 Yoshua Bengio and his collaborators published the key paper "Learning long-term dependencies with gradient descent is difficult", which analyzed in depth the "vanishing gradient" and "exploding gradient" problems that conventional recurrent neural networks (RNNs) face when learning from long sequences. This is the core difficulty LSTM was designed to solve.
3. In their original 1997 LSTM paper, Hochreiter and Schmidhuber explicitly cite and discuss the 1994 results of Bengio et al., which confirms the scholarly lineage between the two: Bengio et al. laid out the problem, and Hochreiter and Schmidhuber proposed LSTM as a solution.
4. The original statement therefore very likely confuses "the key analysts of the problem" (Bengio) with "the proposers of the solution" (Hochreiter & Schmidhuber).
Evidence:
1. Excerpt: Long Short-Term Memory · Sepp Hochreiter, J. Schmidhuber · Published in Neural Computation 1 November 1997 · Computer Science.
  Link: https://www.semanticscholar.org/paper/Long-Short-Term-Memory-Hochreiter-Schmidhuber/2e9d221c206e9503ceb452302d68d10e293f2a10
  Source: Semantic Scholar
  Source publication date: 1997-11-01
2. Excerpt: Learning long-term dependencies with gradient descent is difficult. Yoshua Bengio, Patrice Simard, Paolo Frasconi.
  Link: https://ieeexplore.ieee.org/document/279181
  Source: IEEE Transactions on Neural Networks
  Source publication date: 1994-03
3. Excerpt: A very similar, more recent analysis was presented by Bengio et al. 1994.
  Link: https://www.researchgate.net/publication/13853244_Long_Short-term_Memory
  Source: ResearchGate (page for the Hochreiter & Schmidhuber 1997 paper)
  Source publication date: 1997
4. Excerpt: Long Short-Term Memory networks (Hochreiter & Schmidhuber, 1997) aim to mitigate the exploding and vanishing gradient problem (Bengio et al., 1994 ...
  Link: https://www.sciencedirect.com/science/article/abs/pii/S0893608021003439
  Source: ScienceDirect
  Source publication date: 2021
Source consistency: consistent
Needs re-check: no

Follow-up checklist

(none)

Action plan

(none)

Fact check | Attribution of the invention of the LSTM network [statement is false; not suitable for academic citation]

Result

This check covered 1 claim, and the evidence shows that 1 claim is unreliable.

The check found that the statement "The LSTM network was proposed by Bengio et al. in 1997" is false. A large number of independent academic databases and authoritative publications agree that the LSTM network was first proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997. The 1994 work of Yoshua Bengio and his collaborators strongly influenced the development of LSTM, but their contribution was to analyze the vanishing gradient problem of conventional recurrent neural networks, not to propose the LSTM solution.

Recommendation: in academic writing or technical discussion, cite Hochreiter and Schmidhuber as the inventors of the LSTM network to avoid misattribution.

Process

Original statement: "The LSTM network was proposed by Bengio et al. in 1997."
Reliability analysis: Unreliable. The attribution is wrong: the inventors of the LSTM network are Sepp Hochreiter and Jürgen Schmidhuber, not Bengio et al. The year 1997 is correct, but pairing it with the wrong inventors makes the statement as a whole unreliable. Yoshua Bengio and his collaborators did publish important research in 1994 on the vanishing gradient problem in recurrent neural networks, which is the core problem LSTM was designed to solve, but they did not propose LSTM. The confusion probably comes from mixing up the roles of "problem analyst" and "solution proposer".
Evidence:
1. "Long Short-Term Memory · Sepp Hochreiter, J. Schmidhuber · Published in Neural Computation 1 November 1997 · Computer Science." https://www.semanticscholar.org/paper/Long-Short-Term-Memory-Hochreiter-Schmidhuber/2e9d221c206e9503ceb452302d68d10e293f2a10 (1997-11-01)
2. (Translated from Chinese) "Latent variable models have long faced the problem of long-term information preservation and short-term input skipping. One of the earliest approaches to this problem was the long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997)." https://zhuanlan.zhihu.com/p/719122101 (2024-09-10)
3. "Learning long-term dependencies with gradient descent is difficult. Yoshua Bengio, Patrice Simard, Paolo Frasconi. IEEE Trans. on Neural Networks 1994." https://pdfs.semanticscholar.org/ba18/247cd3ce9f711eecc7296f1c3561dbfb6cc2.pdf (1994)
4. "Unfortunately, it has been observed by, e.g., Bengio et al. [1994] that it is difficult to train RNNs to capture long-term dependencies because the gradients tend to either vanish (most of the time) or explode (rarely, but with severe effects)." http://arxiv.org/pdf/1412.3555v1 (2014-12-08)
5. "A very similar, more recent analysis was presented by Bengio et al. 1994." https://www.researchgate.net/publication/13853244_Long_Short-term_Memory (1997)
6. "Long Short-Term Memory networks (Hochreiter & Schmidhuber, 1997) aim to mitigate the exploding and vanishing gradient problem (Bengio et al., 1994 ..." https://www.sciencedirect.com/science/article/abs/pii/S0893608021003439 (2021)
7. "The most successful RNN architectures for sequence learning stem from two papers published in 1997. The first paper, Long Short-Term Memory by Hochreiter and Schmidhuber [1997], introduces the memory cell, a unit of computation that replaces traditional nodes in the hidden layer of a network." http://arxiv.org/pdf/1506.00019v3 (2015)
8. "Long short-term memory network is an advanced recurrent neural network (Hochreiter and Schmidhuber, 1997) and provides a well-constructed structure..." https://www.sciencedirect.com/topics/chemical-engineering/long-short-term-memory (unknown)
