True or False Questions
For linearly inseparable problems (e.g., XOR operation), a single-layer perceptron cannot solve them, but a multilayer perceptron can handle them.
When scammers use AI to synthesize a 'grandchild's' voice to deceive elderly people into transferring money, this risk demonstrates the misuse of speech synthesis technology.
In the case of Amazon's unmanned store, the system uses object recognition technology to identify products selected by customers.
In AI art creation, tools like DALL-E and Midjourney can generate images from audio descriptions.
The 1956 Dartmouth Conference is considered the birth of artificial intelligence as a field.
AlphaFold is an AI system that can predict DNA structures, which is important for biological research.
The M-P neuron model was proposed by Warren McCulloch alone to simulate how biological neurons work.
In machine vision, the ImageNet database, created by Li Fei-Fei, helped validate the power of artificial neural network models.
Geoffrey Hinton's breakthrough in 2006 that significantly improved handwritten digit recognition performance was achieved by increasing the depth of neural networks.
The three key technical elements of autonomous driving are sensors, algorithms, and a grading system.
One reason big model technologies may exacerbate the spread of misinformation is their ability to generate highly realistic synthetic content at low cost.
Video content generated by visual large models (such as Sora) is entirely based on precise simulations of the physical laws of the real world.
The high demand for computing resources during the training of big models is temporary; with algorithm optimization, energy consumption will no longer pose a challenge.
When large language models exhibit 'emergent abilities', it often indicates that after reaching a critical scale, the model suddenly acquires new capabilities not explicitly trained for.
The 'new paradigm' in the era of big models refers to solving all artificial intelligence problems simply by increasing the size of model parameters.
The generation process of visual large models usually involves cross-modal alignment technology that maps textual semantics to image pixel space.
With each iteration of the GPT series, there has been a continuous increase in the number of model parameters, the size of the training data, and the length of the context window.
Models in the GPT series adopt the complete encoder-decoder architecture of the Transformer.
The self-attention mechanism allows the model to directly focus on information from all other positions in the input sequence when processing a word at a certain position.
The Transformer architecture completely abandons the recurrent neural network (RNN) structure, relying solely on fully connected layers for sequence modeling.
The core task of language models is to predict the probability distribution of subsequent words based on preceding text.
Traditional machine learning typically has more advantages than deep learning when dealing with complex tasks.
The basic idea of pre-training methods is to first train a deeper network and then split it into multiple shallow networks for separate use.
Deep learning has had no significant impact on the field of natural language processing.
The backpropagation algorithm calculates the error at the output end and propagates the error signal backward to update weights.
In deep neural networks, the function of the input layer is to classify data and output the final results.
When discussing AI applications in healthcare, the main focus is on replacing doctors for all diagnostic decisions rather than assisting them.
Brain-inspired computing chips have shown superior general-purpose performance compared to Nvidia GPUs in running large models with hundreds of billions of parameters like ChatGPT.
'Superintelligence' refers to specialized programs that defeat human champions in particular board games.
The issue of value alignment is important because powerful AI systems with goals misaligned with human values could lead to unforeseen negative consequences.
The core idea of embodied intelligence is that intelligent behavior can be achieved without a physical body, solely through abstract symbolic computation.
The primary design objective of brain-inspired computing chips is to significantly reduce energy consumption while maintaining high computational power.
In interdisciplinary integration, artificial intelligence can assist astronomers in automatically filtering out celestial bodies with research value from massive observational data.
The fundamental difference between Artificial General Intelligence (AGI) and Narrow AI lies in the former being capable of handling only textual data, while the latter can process only image data.
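Illustrative sketch for the XOR statement above: the Python snippet below shows one way to see why linear separability matters. It is a minimal sketch under stated assumptions, not a proof or a reference implementation. The weight grid, the step activation, and the hand-picked hidden-layer weights (one OR-like unit and one AND-like unit) are all assumptions chosen for illustration. The grid search only checks a finite set of weights for a single-layer perceptron, while the two-layer network is wired by hand rather than trained.

```python
from itertools import product

# Truth table for XOR: inputs -> expected output.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z):
    """Threshold activation: fires only for strictly positive input."""
    return 1 if z > 0 else 0


def perceptron(x, w1, w2, b):
    """Single-layer perceptron with two inputs and one output."""
    return step(w1 * x[0] + w2 * x[1] + b)


def mlp(x):
    """Two-layer network with hand-picked (assumed) weights."""
    h_or = step(x[0] + x[1] - 0.5)    # hidden unit acting like OR
    h_and = step(x[0] + x[1] - 1.5)   # hidden unit acting like AND
    return step(h_or - h_and - 0.5)   # OR and not AND == XOR


# Assumed search grid: weights and bias from -3.0 to 3.0 in steps of 0.5.
grid = [i / 2 for i in range(-6, 7)]

# Collect every single-layer weight setting on the grid that reproduces XOR.
single_layer_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(perceptron(x, w1, w2, b) == y for x, y in XOR.items())
]

# Evaluate the hand-wired two-layer network on all four inputs.
mlp_outputs = {x: mlp(x) for x in XOR}

print("single-layer settings that solve XOR:", len(single_layer_solutions))
print("two-layer network matches XOR:", mlp_outputs == XOR)
```

The empty search result is consistent with the general argument: a single threshold unit would need b <= 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b <= 0 at the same time, which is contradictory.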