Eric Jang, Sonic AI
Eric Jang
Person
44
Mentions
Episodes
44
Claims
Based on Eric Jang's experience, large language models like Claude Opus 4.6 and 4.7 are effective at hyperparameter optimization and experiment execution but are not yet capable of high-level strategi...
Expert perspective
Eric Jang
May 15
Eric Jang's project to rebuild AlphaGo was funded by a $10,000 donation from Prime Intellect, of which approximately $7,000 was spent on research and model serving.
Expert perspective
Eric Jang
May 15
DeepMind's institutional experience in solving games like Go and StarCraft likely provided a positive transfer of research skills to their subsequent work on large language models.
Expert perspective
Eric Jang
May 15
Systems like AlphaStar and OpenAI's Dota bot used an algorithm called Neural Fictitious Self-Play (NFSP) instead of Monte Carlo Tree Search.
Expert perspective
Eric Jang
May 15
The 2021 paper by Andy Jones on scaling laws for board games showed that it is possible to predict the amount of compute required to solve a larger version of a board game.
Expert perspective
Eric Jang
May 15
Most Go practitioners today train against the AI model KataGo.
Expert perspective
Eric Jang
May 15
All Go AIs are trained under, and score their games with, the Tromp-Taylor rules because those rules are completely unambiguous for computers.
Expert perspective
Eric Jang
May 15
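To illustrate why Tromp-Taylor scoring is easy to mechanize, here is a minimal sketch of its area-scoring rule: a player's score is their stones on the board plus the empty points that reach only their color. The board encoding (`.`, `X`, `O`), the function name, and the example position are assumptions made for this illustration; komi and move legality are left out.

```python
from collections import deque

EMPTY, BLACK, WHITE = ".", "X", "O"  # hypothetical board encoding


def tromp_taylor_score(board):
    """Return (black_points, white_points) for a list of equal-length strings."""
    rows, cols = len(board), len(board[0])
    score = {BLACK: 0, WHITE: 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone counts as one point
            elif (r, c) not in seen:
                # Flood-fill this empty region and record which colors it touches.
                region_size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region_size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor == EMPTY:
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    queue.append((ny, nx))
                            else:
                                borders.add(neighbor)
                # The region scores only if it reaches exactly one color.
                if len(borders) == 1:
                    score[borders.pop()] += region_size
    return score[BLACK], score[WHITE]


example_board = [
    ".X.O.",
    ".X.O.",
    ".X.O.",
    ".X.O.",
    ".X.O.",
]
black_points, white_points = tromp_taylor_score(example_board)
```

In the example position the middle column touches both colors, so it scores for neither side. No judgment about dead stones or seki is needed, which is what makes the rule unambiguous for a program.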
The naive search tree for a game of Go has a complexity on the order of 361 to the power of 300, which is larger than the number of atoms in the universe.
Expert perspective
Eric Jang
May 15
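The size of that number can be checked with exact integer arithmetic. The sketch below assumes 361 candidate moves per turn and a 300-move game, as in the claim, and uses the commonly quoted rough figure of 10^80 atoms in the observable universe, which is an assumption not stated in the source.

```python
import math

BOARD_POINTS = 19 * 19   # 361 intersections on a 19x19 board
GAME_LENGTH = 300        # assumed number of moves in a game
ATOMS_EXPONENT = 80      # rough, commonly quoted estimate (assumption)

# Python integers are arbitrary precision, so this is exact.
naive_tree_size = BOARD_POINTS ** GAME_LENGTH

# Number of decimal digits, and the base-10 exponent for comparison.
digits = len(str(naive_tree_size))
exponent = GAME_LENGTH * math.log10(BOARD_POINTS)

print(f"361^300 has {digits} digits (about 10^{exponent:.0f})")
print("larger than 10^80:", naive_tree_size > 10 ** ATOMS_EXPONENT)
```

The result is on the order of 10^767, which exceeds the atom estimate by nearly 700 orders of magnitude.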
Eric Jang found that Claude 4.6 was able to generate a reasonable data structure for a Monte Carlo Tree Search implementation.
Expert perspective
Eric Jang
May 15
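The source does not show the data structure Claude produced. As a point of reference only, a minimal sketch of a commonly used MCTS node layout with an AlphaZero-style PUCT selection rule might look like the following. All names (`Node`, `select_child`, `backup`, `c_puct`) and the constant 1.5 are assumptions for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                      # policy probability of the move into this node
    visit_count: int = 0
    value_sum: float = 0.0            # from the view of the player who moved into this node
    children: dict = field(default_factory=dict)  # move -> Node

    def mean_value(self):
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent, child, c_puct=1.5):
    # Exploitation term plus a prior-weighted exploration bonus.
    explore = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + explore


def select_child(parent, c_puct=1.5):
    """Return the (move, child) pair with the highest PUCT score."""
    return max(parent.children.items(), key=lambda kv: puct_score(parent, kv[1], c_puct))


def backup(path, leaf_value):
    """Propagate a leaf evaluation up the path, flipping sign at each ply."""
    value = leaf_value
    for node in reversed(path):
        node.visit_count += 1
        node.value_sum += value
        value = -value


root = Node(prior=1.0, visit_count=1)
root.children = {"a": Node(prior=0.9), "b": Node(prior=0.1)}
first_move, first_child = select_child(root)   # the high-prior move is tried first
backup([root, first_child], leaf_value=-1.0)   # it turns out badly for the mover
second_move, _ = select_child(root)            # search shifts to the other move
```

Storing each node's value from the perspective of the player who moved into it lets the parent compare children directly, at the cost of the sign flip in `backup`.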
The KataGo paper found that aggregating global features throughout the network was useful for giving the model a global sense of the board state.
Expert perspective
Eric Jang
May 15
A Go-playing neural network trained on expert human data, without any search, can become a very strong player that beats most humans by simply taking the highest-probability move from its policy netwo...
Expert perspective
Eric Jang
May 15
KataGo proposed an architecture that can be trained on both 9x9 and 19x19 boards, enabling effective transfer learning from the smaller board to the larger one, particularly for the value head.
Expert perspective
Eric Jang
May 15
The Monte Carlo Tree Search (MCTS) algorithm provides a low-variance learning signal for every action, in contrast to naive reinforcement learning methods which suffer from high variance.
Expert perspective
Eric Jang
May 15
Current large language model reinforcement learning (LLM RL) typically treats an entire generated sequence as a single action, which is a major contributor to the high variance of the learning signal.
Expert perspective
Eric Jang
May 15
Q-learning propagates value estimates backward over trajectories an agent has already visited, whereas Monte Carlo Tree Search plans forward over trajectories the agent has not yet been to.
Expert perspective
Eric Jang
May 15
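The backward direction of Q-learning can be seen in a small tabular sketch: the update is applied to transitions the agent has already experienced, and sweeping them in reverse carries the final reward back toward the start. The state names, the two-step trajectory, and the settings `alpha=1.0` and `gamma=0.9` are assumptions chosen to keep the arithmetic easy to follow.

```python
from collections import defaultdict


def q_update(q, trajectory, alpha=1.0, gamma=0.9):
    """Apply the tabular Q-learning update over a visited trajectory, last step first."""
    for state, action, reward, next_state, done in reversed(trajectory):
        # Bootstrap from the best known value of the next state (0 if terminal or unseen).
        best_next = 0.0 if done else max(q[next_state].values(), default=0.0)
        target = reward + gamma * best_next
        q[state][action] += alpha * (target - q[state][action])
    return q


q = defaultdict(lambda: defaultdict(float))
# (state, action, reward, next_state, done) tuples the agent has already visited.
trajectory = [
    ("s0", "right", 0.0, "s1", False),
    ("s1", "right", 1.0, "s2", True),
]
q_update(q, trajectory)
```

After the sweep, the reward of 1.0 at the end has reached `s0` discounted once by `gamma`. An MCTS step, by contrast, would expand moves from the current state that have not been played yet and evaluate them before committing.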
The compute required to be the first to achieve a research breakthrough is always much larger than the compute it takes for others to catch up and replicate the result.
Expert perspective
Eric Jang
May 15
Off-policy training can harm performance if the replay buffer contains too many states that the current policy would never visit, causing the model to waste capacity on irrelevant states.
Expert perspective
Eric Jang
May 15
Many algorithmic improvements that act as "compute multipliers" may not stack effectively because they can have correlated benefits or become redundant as hardware performance increases.
Expert perspective
Eric Jang
May 15
To learn tabula rasa like AlphaZero or KataGo, one can start by playing tens of thousands of random games on a smaller 9x9 board to generate enough data to train an initial, reasonably accurate value ...
Expert perspective
Eric Jang
May 15
Neural Fictitious Self-Play (NFSP) works by training a "best response" policy against a fixed opponent using a model-free reinforcement learning algorithm.
Expert perspective
Eric Jang
May 15