From Multi-Modal Models to a Multi-Modal Civilization
Multi-modal is the buzzword of the moment in AI circles. The prevailing wisdom sounds reasonable enough: there's no one-size-fits-all model, so you pick the right tool for the right task. A language model for text, a vision model for images, an audio model for speech. You assemble a