“Vision Language Action (VLA) models are ineffective for robotics because most of their parameters are dedicated to language, making them strong at encoding knowledge and nouns but weak at understanding physics and verbs.”

Jim Fan · Robotics
