Sakana AI, Founded by Transformer Co-Author, Unveils Evolutionary Approach to Building AI Models

Tokyo-based AI startup Sakana AI, co-founded by a key author of the seminal “Attention Is All You Need” paper that introduced the Transformer architecture, has officially unveiled its first open-source models. The company is pioneering a novel method called “evolutionary model merging,” which challenges the industry’s standard approach of building AI by training massive models from scratch.

Founded by former Google researchers David Ha and Llion Jones, Sakana AI is taking a page from nature’s playbook. Instead of relying on brute-force computation and vast datasets, its technique combines and adapts existing open-source models, treating them like organisms in an ecosystem. In a process inspired by natural selection, candidate merges of different models are generated, scored, and selected over successive generations, so that the strengths of the source models combine into new, more specialized versions without extensive retraining. Because the approach reuses weights that have already been trained, it needs far less compute than training a comparable model from scratch.
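To make the idea concrete, here is a minimal sketch of an evolutionary search over merge weights. It is a toy illustration under stated assumptions, not Sakana AI's implementation: each "model" is a short list of numbers standing in for its parameters, the merge is a plain weighted average, and the score is distance to a hypothetical target vector rather than a real benchmark. The function names (`merge`, `fitness`, `evolve_merge`) and all parameters are invented for this example.

```python
import random


def merge(models, weights):
    """Weighted average of parameter vectors (a simple linear merge)."""
    total = sum(weights)
    size = len(models[0])
    return [
        sum(w * m[i] for w, m in zip(weights, models)) / total
        for i in range(size)
    ]


def fitness(params, target):
    """Toy score: negative squared error against a hypothetical target.

    A real system would score the merged model on a benchmark instead.
    """
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve_merge(models, target, generations=50, population=20,
                 elite=5, sigma=0.1, seed=0):
    """Search for mixing weights with a small elitist evolution strategy."""
    rng = random.Random(seed)
    n = len(models)

    # Start with each source model on its own (one-hot weights),
    # then fill the rest of the population with random mixtures.
    pop = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    while len(pop) < population:
        pop.append([rng.random() + 1e-9 for _ in range(n)])

    def score(weights):
        return fitness(merge(models, weights), target)

    for _ in range(generations):
        # Selection: keep the best candidates unchanged (elitism).
        pop.sort(key=score, reverse=True)
        parents = pop[:elite]

        # Variation: children are mutated copies of surviving parents.
        children = []
        while len(parents) + len(children) < population:
            parent = rng.choice(parents)
            child = [max(1e-9, w + rng.gauss(0.0, sigma)) for w in parent]
            children.append(child)
        pop = parents + children

    best = max(pop, key=score)
    return best, merge(models, best)


if __name__ == "__main__":
    # Three toy "models", each strong in a different dimension.
    source_models = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    hypothetical_target = [0.5, 0.3, 0.2]
    best_weights, merged = evolve_merge(source_models, hypothetical_target)
    print("mixing weights:", best_weights)
    print("merged parameters:", merged)
```

Because the population starts with each source model unmerged and the best candidates are carried over unchanged each generation, the final merge in this sketch can never score worse than the best individual source model. No gradient-based training takes place; the only cost is evaluating candidates.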

The first models released are meant to demonstrate the technique. One is a Japanese-language model with mathematical reasoning ability, created by merging a Japanese language model with English-centric math models; it outperforms its individual predecessors on key benchmarks. Another is a Japanese vision-language model adept at describing images.

This methodology points to a different way of developing AI. By building on the collective progress of the open-source community, Sakana AI is pursuing a more sustainable and accessible path to innovation. Its work suggests that the future of AI may lie not solely in creating ever-larger models, but in the intelligent and creative combination of what already exists. The Tokyo-based lab, backed by $30 million in seed funding from investors including Lux Capital and Khosla Ventures, is positioning itself as a new player in the global AI landscape, championing a more collaborative and efficient paradigm.
