WARC (.warc)
.warc file signature | application/warc
Web ARChive (WARC) is a standardized archival file format for storing web crawl content, developed by the International Internet Preservation Consortium and standardized as ISO 28500. It is used by web archives, libraries, and preservation systems to record HTML pages, images, metadata, and HTTP transactions for long-term access and replay. WARC files are generally safe, but they may contain large embedded resources or active web content captured from untrusted sites.
Magic Bytes
Offset 0
57 41 52 43 2F
Sources: Apache Tika
Extension
.warc
MIME Type
application/warc
Byte Offset
0
Risk Level
Safe
Validation Code
How to validate .warc files in Python
def is_warc(file_path: str) -> bool:
    """Check whether a file starts with the WARC magic bytes ("WARC/")."""
    signature = bytes([0x57, 0x41, 0x52, 0x43, 0x2F])
    with open(file_path, "rb") as f:
        return f.read(5) == signature
How to validate .warc files in Node.js (TypeScript)
function isWARC(buffer: Buffer): boolean {
  // ASCII "WARC/", the start of the record version line
  const signature = Buffer.from([0x57, 0x41, 0x52, 0x43, 0x2f]);
  return buffer.subarray(0, 5).equals(signature);
}
How to validate .warc files in Go
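import "bytes"

// IsWARC reports whether data begins with the WARC magic bytes ("WARC/").
func IsWARC(data []byte) bool {
	signature := []byte{0x57, 0x41, 0x52, 0x43, 0x2F}
	// HasPrefix returns false when data is shorter than the signature.
	return bytes.HasPrefix(data, signature)
}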
API Endpoint
/api/v1/warc
curl https://filesignature.org/api/v1/warc
See the full API documentation for all endpoints and parameters.
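As a rough Python counterpart to the curl command above, the sketch below requests the same endpoint with the standard library. The helper names `signature_url` and `fetch_signature` are hypothetical, and the idea that other formats follow the same `/api/v1/<name>` pattern is an assumption. The response format is not described here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; applying it to other formats is an assumption.
API_BASE = "https://filesignature.org/api/v1"


def signature_url(fmt: str) -> str:
    """Build the lookup URL for a format name such as "warc" or ".warc"."""
    return f"{API_BASE}/{fmt.strip().lstrip('.').lower()}"


def fetch_signature(fmt: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the raw response body as text.

    The response schema is not documented on this page, so nothing is parsed.
    """
    with urllib.request.urlopen(signature_url(fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_signature("warc")` issues the same GET request as the curl command and needs network access.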
Frequently Asked Questions
What is a .warc file?
A .warc file is identified by the magic bytes 57 41 52 43 2F at byte offset 0. Web ARChive (WARC) is a standardized archival file format for storing web crawl content, developed by the International Internet Preservation Consortium and standardized as ISO 28500. It is used by web archives, libraries, and preservation systems to record HTML pages, images, metadata, and HTTP transactions for long-term access and replay. WARC files are generally safe, but they may contain large embedded resources or active web content captured from untrusted sites.
What are the magic bytes for .warc files?
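The magic bytes for WARC files are 57 41 52 43 2F at byte offset 0, which is the ASCII string "WARC/" that begins the version line of the first record (for example, WARC/1.0). These bytes identify an uncompressed WARC file regardless of its extension. Gzip-compressed archives, commonly named .warc.gz, begin with the gzip signature instead.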
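The five hex values are printable ASCII characters, which a short Python check can confirm. This is only an illustration of the byte values listed above.

```python
# Decode the documented signature bytes to see the text they spell.
signature = bytes.fromhex("57 41 52 43 2F")
text = signature.decode("ascii")
print(text)  # WARC/
```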
How do I validate a .warc file?
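To validate a .warc file, read the first five bytes of the file and compare them against the known magic bytes (57 41 52 43 2F) at offset 0. This is more reliable than checking the file extension alone, because a file can be renamed to any extension. A matching signature shows that the file starts like a WARC file; it does not prove that every record in it is well formed.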
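The comparison described above can be sketched in Python with one extension: WARC files are often stored gzip-compressed, in which case the file starts with the gzip signature (1F 8B) and the WARC bytes only appear after decompression. The function name `sniff_warc` and its return labels are hypothetical, and the check still looks only at the start of the data, not at the record structure.

```python
import gzip
import zlib
from typing import Optional

WARC_MAGIC = b"WARC/"      # 57 41 52 43 2F
GZIP_MAGIC = b"\x1f\x8b"   # standard gzip signature


def sniff_warc(file_path: str) -> Optional[str]:
    """Return "warc", "warc.gz", or None (the labels are illustrative)."""
    with open(file_path, "rb") as f:
        head = f.read(len(WARC_MAGIC))
    if head == WARC_MAGIC:
        return "warc"
    if head.startswith(GZIP_MAGIC):
        try:
            # Decompress only the first few bytes of the gzip stream.
            with gzip.open(file_path, "rb") as gz:
                if gz.read(len(WARC_MAGIC)) == WARC_MAGIC:
                    return "warc.gz"
        except (OSError, EOFError, zlib.error):
            # Truncated or corrupt gzip data: treat as not recognized.
            return None
    return None
```

Returning a label instead of a boolean lets a caller decide whether it needs to decompress the file before reading records.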
What is the MIME type for .warc files?
The primary MIME type for .warc files is application/warc.
Is it safe to open .warc files?
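WARC (.warc) files are generally safe to open. They are classified as low risk because they primarily contain data rather than executable code. However, an archive can hold large embedded resources and active web content, such as scripts captured from untrusted sites, so always ensure files come from a trusted source.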