References
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2014. “Neural
Machine Translation by Jointly Learning to Align and Translate.”
arXiv Preprint arXiv:1409.0473. https://arxiv.org/abs/1409.0473.
Bronstein, Michael M., Joan Bruna, Taco Cohen, and Petar Veličković.
2021. “Geometric Deep Learning: Grids, Groups, Graphs, Geodesics,
and Gauges.” arXiv Preprint arXiv:2104.13478.
https://arxiv.org/abs/2104.13478.
Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan,
Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language
Models Are Few-Shot Learners.” arXiv Preprint
arXiv:2005.14165. https://arxiv.org/abs/2005.14165.
Copeland, B. Jack. 2002. “Hypercomputation: Computing More Than
the Turing Machine.” Minds and Machines 12 (4): 461–502.
https://doi.org/10.1023/A:1021111623943.
Courbariaux, Matthieu, Yoshua Bengio, and Jean-Pierre David. 2015.
“BinaryConnect: Training Deep Neural Networks with Binary Weights
During Propagations.” arXiv Preprint arXiv:1511.00363.
https://arxiv.org/abs/1511.00363.
Dziri, Nouha, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang,
Bill Yuchen Lin, Peter West, et al. 2023. “Faith and Fate: Limits
of Transformers on Compositionality.” arXiv Preprint
arXiv:2305.18654. https://arxiv.org/abs/2305.18654.
Gödel, Kurt. 1931. “Über Formal Unentscheidbare Sätze Der
Principia Mathematica Und Verwandter Systeme I.” Monatshefte
Für Mathematik Und Physik 38 (1): 173–98. https://doi.org/10.1007/BF01700692.
Gu, Albert, and Tri Dao. 2023. “Mamba: Linear-Time Sequence
Modeling with Selective State Spaces.” arXiv Preprint
arXiv:2312.00752. https://arxiv.org/abs/2312.00752.
Hamkins, Joel David, and Andy Lewis. 2000. “Infinite Time Turing
Machines.” The Journal of Symbolic Logic 65 (2):
567–604. https://doi.org/10.2307/2586556.
Kaplan, Jared, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin
Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario
Amodei. 2020. “Scaling Laws for Neural Language Models.”
arXiv Preprint arXiv:2001.08361. https://arxiv.org/abs/2001.08361.
LeCun, Yann. 2022. “A Path Towards Autonomous Machine
Intelligence.” OpenReview. https://openreview.net/pdf?id=BZ5a1r-kVsf.
———. 2024. “Objective-Driven AI: Towards AI Systems That Can
Learn, Remember, Reason, and Plan.” Seattle, WA: Dean W. Lytle
Electrical & Computer Engineering Endowed Lecture, University of
Washington. https://www.ece.uw.edu/wp-content/uploads/2024/01/lecun-20240124-uw-lyttle.pdf.
Penrose, Roger. 1994. Shadows of the Mind: A Search for the Missing
Science of Consciousness. Oxford, UK: Oxford University Press.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986.
“Learning Representations by Back-Propagating Errors.”
Nature 323 (6088): 533–36. https://doi.org/10.1038/323533a0.
Shazeer, Noam, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc
Le, Geoffrey Hinton, and Jeff Dean. 2017. “Outrageously Large
Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.”
arXiv Preprint arXiv:1701.06538. https://arxiv.org/abs/1701.06538.
Su, Jianlin, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, and Yunfeng
Liu. 2021. “RoFormer: Enhanced Transformer with Rotary Position
Embedding.” arXiv Preprint arXiv:2104.09864.
https://arxiv.org/abs/2104.09864.
Turing, A. M. 1936. “On Computable Numbers, with an Application to
the Entscheidungsproblem.” Proceedings of the London
Mathematical Society 42 (2): 230–65. https://doi.org/10.1112/plms/s2-42.1.230.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion
Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017.
“Attention Is All You Need.” arXiv Preprint arXiv:1706.03762.
https://arxiv.org/abs/1706.03762.