We present Audio Flamingo 3 (AF3), a fully open state-of-the-art (SOTA) large audio-language model that advances reasoning and understanding across speech, sound, and music. AF3 introduces: (i) AF-Whisper, a unified audio encoder trained using a novel strategy for joint representation learning across all three modalities of speech, sound, and music; (ii) flexible, on-demand thinking, allowing the model to do chain-of-thought-type reasoning before answering; (iii) multi-turn, multi-audio chat; (iv) long audio understanding and reasoning (including speech) up to 10 minutes; and (v) voice-to-voice interaction. To enable these capabilities, we propose several large-scale training datasets curated using novel strategies, including AudioSkills-XL, LongAudio-XL, AF-Think, and AF-Chat, and train AF3 with a novel five-stage curriculum-based training strategy. Trained on only open-source audio data, AF3 achieves new SOTA results on over 20 (long) audio understanding and reasoning benchmarks, surpassing both open-weight and closed-source models trained on much larger datasets.
To save everyone a click: It’s a non-commercial license (with a very rude yoink clause, if anyone is foolish enough to build something on it.)
By the by, there’s a good chance that AI models are not copyrightable under US law, which would make the license moot in the US. In other regions, such as the EU, it likely holds.
3.3 Use Limitation. The Work and any derivative works thereof only may be used or intended for use non-commercially. Notwithstanding the foregoing, NVIDIA Corporation and its affiliates may use the Work and any derivative works commercially. As used herein, “non-commercially” means for non-commercial research and educational purposes only.
> fully open
> looks inside
> not open source
I know that AI output is not copyrightable, because it wasn’t made by a human.
However, the model itself is the product of a shit ton of work, and I doubt any court would rule it non-copyrightable.
That’s not how US copyright works.
https://en.wikipedia.org/wiki/Sweat_of_the_brow