Perfect Office document
application/msword
Magic Bytes
Offset: 0
D0 CF 11 E0 A1 B1 1A E1
The Microsoft Word Binary File Format (.doc) is a proprietary document format developed and maintained by Microsoft Corporation. It stores formatted text, tables, and embedded objects, and was the default format of Microsoft Word until the XML-based DOCX format replaced it in Office 2007. The 8-byte signature above belongs to the OLE2 Compound File Binary container that wraps a .doc file; legacy .xls, .ppt, and .msi files use the same container and the same signature, so a match identifies the container rather than Word specifically. The format is still common in archives, but files from untrusted sources should be handled with care because they can carry embedded macros.
Validation Code
How to validate .doc files in Python
Python
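def is_doc(file_path: str) -> bool:
    """Check whether the file starts with the OLE2 signature used by .doc files."""
    # Same bytes as the "Magic Bytes" entry above, read from offset 0.
    signature = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])
    with open(file_path, "rb") as f:
        # A file shorter than 8 bytes yields a short read and compares unequal.
        return f.read(8) == signature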
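Because the signature alone only says that the file is an OLE2 compound file, a slightly stricter check can also look at a few fixed header fields. The sketch below is one possible approach, not part of the original snippet: the name `looks_like_cfb` is hypothetical, and the field offsets (version at 24 to 27, byte-order mark at 28, sector shift at 30) follow the Compound File Binary header layout as published by Microsoft. It still does not prove the file is a Word document; that would require reading the container's directory for a Word-specific stream.

```python
import struct

# Signature from the "Magic Bytes" entry above.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_cfb(header: bytes) -> bool:
    """Return True if `header` begins with a plausible compound-file header.

    Hypothetical helper: expects at least the first 32 bytes of the file.
    """
    if len(header) < 32 or header[:8] != CFB_SIGNATURE:
        return False
    # Four little-endian uint16 fields at offsets 24..31:
    # minor version, major version, byte-order mark, sector shift.
    _minor, major, byte_order, sector_shift = struct.unpack_from("<4H", header, 24)
    if byte_order != 0xFFFE:
        return False
    # Version 3 uses 512-byte sectors (shift 9); version 4 uses 4096-byte sectors (shift 12).
    return (major, sector_shift) in {(3, 9), (4, 12)}
```

A caller would typically pass `f.read(32)` from a file opened in binary mode; anything that fails this check can be rejected before it reaches a full parser.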
How to validate .doc files in Node.js
Node.js
function isDOC(buffer: Buffer): boolean {
  const signature = Buffer.from([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1]);
  return buffer.subarray(0, 8).equals(signature);
}
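How to validate .doc files in Go
Go
import "bytes"

// IsDOC reports whether data starts with the OLE2 signature used by .doc files.
func IsDOC(data []byte) bool {
	signature := []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}
	if len(data) < 8 {
		return false
	}
	return bytes.Equal(data[:8], signature)
}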
API Endpoint
GET /api/v1/doc
curl https://filesignature.org/api/v1/doc
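The same request can be made from Python with the standard library. This is a minimal sketch under stated assumptions: only the URL and the GET method come from the curl line above; the helper names `build_request` and `fetch_doc_signature` are hypothetical, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# URL taken verbatim from the curl example above.
API_URL = "https://filesignature.org/api/v1/doc"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it (hypothetical helper)."""
    return urllib.request.Request(url, method="GET")


def fetch_doc_signature(timeout: float = 10.0) -> str:
    """Send the request and return the response body as text.

    The response schema is an unknown here, so no parsing is attempted.
    """
    with urllib.request.urlopen(build_request(), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `print(fetch_doc_signature())` would show whatever the endpoint returns; inspect that output before deciding how to parse it.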