Google Open Sources File-Identifying Magika AI For Malware Hunters And Others
The Register, Monday, February 12th, 2024
Cool, but it's 2024 - needs more hype, hand wringing, and flashy staged demos to be proper ML
Google has open sourced Magika, an in-house machine-learning-powered file identifier, as part of its AI Cyber Defense Initiative, which aims to give IT network defenders and others better automated tools.
Working out the true contents of a user-submitted file is perhaps harder than it looks. It's not safe to assume the file type from, say, its extension, and relying on heuristics and human-crafted rules - such as those in the widely used libmagic - to identify the actual nature of a document from its data is, in Google's view, "time consuming and error prone."
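To make the problem concrete, here is a minimal Python sketch of the human-crafted-rules approach: compare a file's leading "magic" bytes against a small signature table and flag files whose content disagrees with their extension. The table and the function names (`sniff_type`, `extension_of`, `looks_mislabeled`) are hypothetical and written for illustration; this is neither libmagic's actual rule database nor Magika's API.

```python
from pathlib import PurePath

# Hypothetical, hand-written signature table for illustration only.
# Each entry pairs the leading bytes of a format with a short label.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"PK\x03\x04", "zip"),
    (b"MZ", "windows-executable"),
    (b"\x7fELF", "elf-executable"),
]


def sniff_type(data: bytes) -> str:
    """Return the label of the first signature that matches the start of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def extension_of(filename: str) -> str:
    """Return the lowercased extension without the leading dot."""
    return PurePath(filename).suffix.lower().lstrip(".")


def looks_mislabeled(filename: str, data: bytes) -> bool:
    """True when the content is recognized and disagrees with the extension."""
    detected = sniff_type(data)
    return detected != "unknown" and detected != extension_of(filename)


if __name__ == "__main__":
    # A Windows executable header hiding behind a .pdf name.
    print(looks_mislabeled("invoice.pdf", b"MZ\x90\x00\x03\x00"))
    # A genuine PDF header with a matching extension.
    print(looks_mislabeled("report.pdf", b"%PDF-1.7\n"))
```

Even this toy version shows where hand-written rules start to strain: a .docx file is a ZIP container, so the table above would flag it as mislabeled unless someone adds a further rule to look inside the archive, and every new format needs the same kind of manual attention.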