MHTML
multipart/related
Magic Bytes
Offset: 0
4D 49 4D 45 2D 56 65 72 73 69 6F 6E 3A 20 31 2E 30
MHTML is a web archive format, specified by the Internet Engineering Task Force (IETF) in RFC 2557, that combines an HTML document and its external resources into a single file. Web browsers and email clients use it to save complete webpages, embedding images, style sheets, and scripts as parts of a multipart/related MIME message. The format is generally safe for archiving static content, but files from untrusted sources should be handled carefully because embedded scripts may execute when the archive is opened.
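Because an MHTML archive is an ordinary MIME message, a general-purpose MIME parser can enumerate the resources it embeds. The following is a minimal sketch using Python's standard email package; the SAMPLE archive, its boundary string, the example.com URLs, and the list_parts helper are all made up for illustration.

```python
from email import message_from_bytes
from typing import List, Optional, Tuple

# Hypothetical two-part archive built in memory for illustration only.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; type="text/html"; boundary="----demo"\r\n'
    b"\r\n"
    b"------demo\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Location: https://example.com/\r\n"
    b"\r\n"
    b'<html><body><img src="logo.png"></body></html>\r\n'
    b"------demo\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: https://example.com/logo.png\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"------demo--\r\n"
)


def list_parts(raw: bytes) -> List[Tuple[str, Optional[str]]]:
    """Return (content type, Content-Location) for each embedded part."""
    message = message_from_bytes(raw)
    if not message.is_multipart():
        return []  # not a multipart message, so there are no embedded parts
    return [
        (part.get_content_type(), part["Content-Location"])
        for part in message.get_payload()
    ]
```

Calling list_parts(SAMPLE) yields one entry for the HTML document and one for the image. Listing parts this way does not render or execute anything, which makes it a reasonable first look at an archive from an untrusted source.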
Validation Code
How to validate .mhtml files in Python
Python
def is_mhtml(file_path: str) -> bool:
    """Check whether a file starts with the MHTML magic bytes."""
    # ASCII bytes of "MIME-Version: 1.0" (17 bytes at offset 0)
    signature = bytes([0x4D, 0x49, 0x4D, 0x45, 0x2D, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6F, 0x6E, 0x3A, 0x20, 0x31, 0x2E, 0x30])
    with open(file_path, "rb") as f:
        return f.read(17) == signature
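The prefix check above only matches files whose first header is MIME-Version. MIME header order is not fixed, so an archive may place other headers (for example From: or Subject:) first and fail that check, while any MIME email that happens to start with MIME-Version would pass it. A more lenient heuristic, sketched here with a hypothetical looks_like_mhtml helper, parses the top-level headers and checks the declared content type instead.

```python
from email.parser import BytesHeaderParser


def looks_like_mhtml(raw: bytes) -> bool:
    """Heuristic: the top-level headers declare a multipart/related MIME message."""
    # Parse only the header block; the body is left untouched.
    headers = BytesHeaderParser().parsebytes(raw)
    return (
        headers["MIME-Version"] is not None
        and headers.get_content_type() == "multipart/related"
    )
```

This is still a heuristic, not full validation: it does not confirm that the boundary parts are well formed or that the root part is HTML. Passing only the first few kilobytes of a file is usually enough, since the headers sit at the start.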
How to validate .mhtml files in Node.js
Node.js
function isMHTML(buffer: Buffer): boolean {
  const signature = Buffer.from([0x4D, 0x49, 0x4D, 0x45, 0x2D, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6F, 0x6E, 0x3A, 0x20, 0x31, 0x2E, 0x30]);
  return buffer.subarray(0, 17).equals(signature);
}
How to validate .mhtml files in Go
Go
import "bytes"

// IsMHTML reports whether data starts with the ASCII signature "MIME-Version: 1.0".
func IsMHTML(data []byte) bool {
	signature := []byte{0x4D, 0x49, 0x4D, 0x45, 0x2D, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6F, 0x6E, 0x3A, 0x20, 0x31, 0x2E, 0x30}
	if len(data) < 17 {
		return false
	}
	return bytes.Equal(data[:17], signature)
}
API Endpoint
GET
/api/v1/mhtml
curl https://filesignature.org/api/v1/mhtml
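The same request can be made from Python with the standard library. This is a rough sketch: the signature_url and fetch_signature helpers are hypothetical, the assumption that other formats follow the same /api/v1/<format> pattern is not confirmed here, and because the response schema is not documented on this page the body is returned as raw text rather than parsed.

```python
from urllib.request import urlopen

# Base URL taken from the curl example; the per-format path pattern is an assumption.
API_BASE = "https://filesignature.org/api/v1"


def signature_url(fmt: str) -> str:
    """Build the endpoint URL for a format name such as "mhtml"."""
    return f"{API_BASE}/{fmt}"


def fetch_signature(fmt: str, timeout: float = 10.0) -> str:
    """Send a GET request and return the raw response body as text."""
    with urlopen(signature_url(fmt), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, fetch_signature("mhtml") requests the same URL as the curl command.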