
Researching agents that see, think, and act. Somewhere between global minima and the edge of chaos.
Hope you find something that makes you curious here.
papers
- MultiNet v1.0: A Comprehensive Benchmark for Evaluating Multimodal Reasoning and Action Models Across Diverse Domains
  [project] [arxiv] [src]
- An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models
  [project] [arxiv]
- Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
  [project] [arxiv] [src]
- Benchmarking Vision, Language, & Action Models On Robotic Learning Tasks
  [project] [arxiv] [src]