Artificial General Intelligence (AGI)
Artificial General Intelligence (Lex Fridman)
Something about Lex Fridman
MIT AGI Mission: Engineer Intelligence
Goals:
- Avoid the pitfalls of the “black box”: the media often reports on AI as if it were fiction. Hype is our first enemy.
- Avoid the pitfalls of “I am just a scientist”.
How far away are we from creating intelligent systems?
Analogy: we are in a dark room looking for the light switch, with no knowledge of where it is.
Exploration, meaning travel for the sake of discovery and adventure, is a human compulsion.
Building machines that see and think like people (Josh Tenenbaum)
Something about Josh Tenenbaum
We have AI technologies, but no real AI
Intelligence is not just about pattern recognition, it is about modeling the world.
Sally port (entry point): visual intelligence is our near-term focus.
Some part of your brain is tracking the whole world around you, and you use that world model to plan your actions.
The roots for common sense
Reverse engineer the brain: figure out how it formulates a goal and manages to achieve it.
How do we build this architecture?
- Symbolic language for knowledge representation
- Probabilistic inference in generative models to capture uncertainty
- Neural networks for pattern recognition
Inference means that our model runs a few low-precision simulations for a few time steps.
Mental simulation engines based on probabilistic programs
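The idea of inference as a handful of noisy, low-precision simulations can be sketched as follows. This is a toy illustration, not the model presented in the talk: the scenario (a ball sliding toward a table edge), the function names `simulate` and `prob_falls`, and every parameter value are assumptions made up for this example.

```python
import random

def simulate(pos, vel, friction, edge, steps=20, dt=0.1):
    """Crude forward simulation: does the ball pass the table edge?"""
    for _ in range(steps):
        pos += vel * dt
        vel = max(0.0, vel - friction * dt)  # friction slows the ball down
        if pos > edge:
            return True
    return False

def prob_falls(pos, vel, edge, n_sims=10, noise=0.1, seed=0):
    """Estimate P(ball falls) from a few noisy simulations.

    Perceptual uncertainty is modeled (as an assumption) by jittering the
    observed position, the observed velocity and the unknown friction
    independently in every run.
    """
    rng = random.Random(seed)
    falls = 0
    for _ in range(n_sims):
        noisy_pos = pos + rng.gauss(0.0, noise)
        noisy_vel = vel + rng.gauss(0.0, noise)
        friction = abs(rng.gauss(0.5, noise))
        falls += simulate(noisy_pos, noisy_vel, friction, edge)
    return falls / n_sims
```

Calling `prob_falls(0.0, 1.0, edge=1.0)` returns a graded judgment between 0 and 1 rather than a yes/no answer, which is the point of treating the simulator as a probabilistic program.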
OpenAI Meta-Learning and Self-play (Ilya Sutskever)
Why do neural networks work?
The shortest program that fits the training data gives the best possible generalization.
Reinforcement Learning is a good framework for building intelligent agents.
But note that the RL framework is not quite complete, because it assumes the reward is given by the environment. In reality, the agent rewards itself.
Reinforcement Learning algorithms in a nutshell
Try something new by adding randomness to your actions, and compare the result to your expectation.
If the result was better than expected, do more of the same in the future.
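The "try, compare, reinforce" loop above can be sketched on a multi-armed bandit. This is a minimal sketch using a softmax gradient-bandit update with a running reward baseline; the choice of that particular update, the function name `train_bandit`, and the reward values are assumptions for illustration, not something taken from the talk.

```python
import math
import random

def softmax(prefs):
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train_bandit(true_means, steps=3000, lr=0.1, seed=0):
    """Learn action probabilities on a hypothetical noisy bandit."""
    rng = random.Random(seed)
    prefs = [0.0] * len(true_means)   # action preferences (the policy)
    expected = 0.0                    # running estimate of the reward
    for _ in range(steps):
        probs = softmax(prefs)
        # Try something: sample an action at random from the policy.
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = true_means[action] + rng.gauss(0.0, 0.1)
        # Compare the result to the expectation.
        surprise = reward - expected
        # Better than expected: make this action more likely in the future.
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * surprise * (indicator - probs[a])
        expected += 0.1 * surprise
    return softmax(prefs)
```

For example, `train_bandit([0.1, 0.5, 0.9])` returns a probability vector that concentrates on the last arm, the one with the highest mean reward.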
Model-free RL: Two classes of algorithms
Policy Gradients:
- Just take the gradient
- Stable, easy to use
- Very few tricks needed
- On policy
Q-learning based:
- Less stable, more sample efficient
- Won’t explain how it works
- Off policy: can be trained on data generated by some other policy
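To make the "off policy" point concrete, here is a minimal tabular Q-learning sketch in which all training data comes from a uniformly random behavior policy, yet the learned greedy policy is a different, better one. The five-state chain environment, the names `step` and `q_learning`, and the hyperparameters are hypothetical choices for this example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Hypothetical chain environment: reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == LEFT else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behavior policy: uniformly random, unrelated to the greedy policy.
            action = rng.choice((LEFT, RIGHT))
            nxt, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # not from the action the behavior policy will actually take.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q
```

After training, picking `max(q[s])` in every non-terminal state gives the "always go right" policy, even though the data was generated by a policy that moves left half of the time.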
Meta-learning
Our dream is:
- Learn to learn
- Train a system on many tasks
- Resulting system can solve new tasks quickly
Exploration in RL: a key challenge
Random behavior must generate some reward, and you must get rewards from time to time; otherwise learning will not occur. So if the reward is too sparse, the agent cannot learn.
It would be nice if learning was hierarchical
Current RL learns by trying out random actions at each timestep
The agent may require a real “model” to really solve this problem.
Self-Play: that is very cool
Crux: the agents create the environment by virtue of acting in it.
Here comes the question: can we train AGI via self-play among multiple agents?
It’s unknown.
MSRA presentation given by Yoshua Bengio
Principles in Bengio’s idea
World Models
We (human beings) have a mental model that captures facts about our world to some extent, and humans generalize better than other animals thanks to a more accurate internal model of the underlying causal relationships.
Shortcoming in current model
So long as our machine learning models “cheat” by relying only on superficial statistical regularities, they remain vulnerable to out-of-distribution examples.
Possible solutions
Prediction
Predicting future situations (e.g., the effect of planned actions) that are far from anything seen before, while involving known concepts, is an essential component of reasoning, intelligence, and science.
Invariance
Our systems need invariances that reflect deep understanding: current models for recognition and generation clearly do not understand the crucial abstractions.
Imagination
Real-life applications often require generalization to regimes not seen during training. Humans can project themselves into situations they have never seen or experienced.
Subjective Knowledge
Our brain can come up with control policies that influence specific aspects of the world. What an agent acquires by interacting with the world is not universal knowledge; it is subjective knowledge.
Present Development Stage
More elements as prior
- Spatial & temporal scales
- Marginal independence
- Simple dependencies between factors
- Consciousness prior: arXiv:1709.08568
Causal / Mechanism independence
- Controllable factors
Content-based Attention (Attention Mechanism)
Used to select a few relevant abstract concepts that together make up a thought.
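Content-based attention can be sketched as follows: a query is compared with every stored key, and a softmax over the similarity scores decides how much each stored value contributes. The scaled dot-product scoring used here is one common choice and is an assumption of this sketch, not necessarily the form used in the presentation; the function name `attention` is hypothetical.

```python
import math

def attention(query, keys, values):
    """Soft content-based attention over a small memory.

    Each key is scored by its dot product with the query; a softmax turns
    the scores into weights, which are then used to mix the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)                         # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return weights, output
```

The weights sum to one, and the entries whose keys best match the query dominate the output, which is how a few relevant concepts get selected out of many.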
TODO future work
The ability to do credit assignment through very long time spans. This is also a shortcoming of current RNN architectures: by contrast, I can remember something I did last year.
Godel Machines, Meta-Learning, and LSTMs (Jurgen Schmidhuber)
Simplicity is beauty
The history of science is a history of compression progress.
Humans are curious, and curiosity is a strategy discovered by evolution (someone who explores the unknown world has a higher chance of solving the problems he needs to solve to survive in this world).
Consciousness may be a byproduct of problem-solving.
Since 2015 (it is now 2018), the work has focused on the CM (controller-model) system, in which the controller is given the opportunity to learn by itself.
AMA (Ask Me Anything) on reddit: Jurgen Schmidhuber
Question: What’s something that’s true, but almost nobody agrees with you on?
Intelligence is just the product of a few principles that will be considered very simple in hindsight. There is partial justification for this:
Some approaches are theoretically optimal in an abstract sense although they consist of just a few formulas:
Human beings make predictions based on observations. Every AI scientist wants to find a theoretically optimal way of predicting:
Normally we do not know the true conditional probability $p(\text{next} \mid \text{past})$. But assume we do know that $p$ is in some set $P$ of distributions.
Giving each $q$ in $P$ a weight $w_q$, we obtain the Bayes mix: $M(x)=\sum_q w_q\, q(x)$. We can predict using $M$ instead of the optimal but unknown $p$.
Let $L_M(n)$ and $L_p(n)$ be the total expected losses of the $M$-predictor and the $p$-predictor.
Then $L_M(n)-L_p(n)$ is at most of the order of $\sqrt{L_p(n)}$. That is, $M$ is not much worse than $p$. And in general, no other predictor can do better than that!
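A small sketch may help make the Bayes mix concrete. It assumes the simplest possible setting: the set $P$ is a finite list of Bernoulli (coin-flip) models over bits, and the mixture is applied sequentially by reweighting each model with the probability it assigned to every observed bit. The function name `bayes_mix_predict` and the example numbers are hypothetical.

```python
def bayes_mix_predict(models, weights, history):
    """Return M(next = 1 | history) for a Bayes mixture of Bernoulli models.

    models  : candidate probabilities q(x = 1), one per model q in P
    weights : prior weights w_q (positive, summing to 1)
    history : observed bits so far
    """
    posterior = list(weights)
    for bit in history:
        # Reweight each model by how well it predicted the observed bit.
        posterior = [w * (q if bit == 1 else 1.0 - q)
                     for w, q in zip(posterior, models)]
        total = sum(posterior)
        posterior = [w / total for w in posterior]
    # The mixture prediction is the posterior-weighted average of the models.
    prediction = sum(w * q for w, q in zip(posterior, models))
    return prediction, posterior
```

If the data really comes from one of the models in the list, the posterior weight concentrates on that model as the history grows, so the mixture's predictions approach those of the unknown true $p$ without ever being told which model it is.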
Once we have an optimal predictor, in principle we should also have an optimal decision maker or reinforcement learner that always picks the action sequences with the highest predicted success; that is a universal AI.
His favorite Theory of Consciousness (TOC)
Karl Popper famously said: “All life is problem solving.” No theory of consciousness is necessary to define the objectives of a general problem solver. From an AGI point of view, consciousness is at best a by-product of a general problem solving procedure.
Where do the symbols and self-symbols underlying consciousness and sentience come from? I think they come from data compression during problem solving.