ORC

application/octet-stream

Safe

Magic Bytes

Offset: 0
4F 52 43

Optimized Row Columnar (ORC) is a columnar data storage format originally developed by Hortonworks for the Apache Hadoop ecosystem and now maintained by the Apache Software Foundation. It provides efficient compression and built-in indexing for large-scale analytical processing in distributed frameworks such as Apache Hive, Presto, and Spark. Although the format itself is considered safe, readers should validate schema metadata when decompressing files, because malformed input can otherwise cause resource exhaustion or buffer overflows during automated data ingestion.

Extension

.orc

MIME Type

application/octet-stream

Byte Offset

0

Risk Level

Safe

Validation Code

How to validate .orc files in Python

Python
def is_orc(file_path: str) -> bool:
    """Return True if the file starts with the ORC magic bytes ("ORC")."""
    # This only inspects the 3-byte header; it does not parse the file.
    signature = bytes([0x4F, 0x52, 0x43])
    with open(file_path, "rb") as f:
        return f.read(3) == signature
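
The header check above accepts any file that happens to begin with "ORC". A slightly stricter sketch is shown below. It assumes, per the ORC specification, that the file ends with a postscript whose final field is the string "ORC", followed by a single byte giving the postscript length. The name looks_like_orc is hypothetical, and files from very old writers may lack the trailing magic, so treat this as a heuristic rather than full validation.

```python
import os

ORC_MAGIC = b"ORC"


def looks_like_orc(file_path: str) -> bool:
    """Heuristic: ORC magic at the start and just before the final length byte."""
    # Smallest plausible layout: 3-byte header + 3-byte trailing magic + 1 length byte.
    if os.path.getsize(file_path) < 2 * len(ORC_MAGIC) + 1:
        return False
    with open(file_path, "rb") as f:
        if f.read(len(ORC_MAGIC)) != ORC_MAGIC:
            return False
        # Assumption: the last byte is the postscript length and the
        # postscript's magic string sits immediately before it.
        f.seek(-(len(ORC_MAGIC) + 1), os.SEEK_END)
        return f.read(len(ORC_MAGIC)) == ORC_MAGIC
```

Neither check replaces a real ORC reader. Both only help reject obviously mislabeled files before they reach an ingestion pipeline.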

How to validate .orc files in Node.js

Node.js
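// Type annotations are TypeScript; remove them to run as plain JavaScript.
function isORC(buffer: Buffer): boolean {
  const signature = Buffer.from([0x4F, 0x52, 0x43]);
  // subarray clamps to the buffer length, so inputs shorter than 3 bytes return false.
  return buffer.subarray(0, 3).equals(signature);
}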
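
How to validate .orc files in Go

Go
import "bytes"

// IsORC reports whether data begins with the ORC magic bytes ("ORC").
func IsORC(data []byte) bool {
    signature := []byte{0x4F, 0x52, 0x43}
    if len(data) < 3 {
        return false
    }
    return bytes.Equal(data[:3], signature)
}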

API Endpoint

GET /api/v1/orc
curl https://filesignature.org/api/v1/orc
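
The same request can be made from Python. This is a minimal sketch using only the standard library. The helper names signature_url and fetch_signature are hypothetical, and so is the assumption that the format slug is lowercase. The response schema is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://filesignature.org/api/v1"


def signature_url(fmt: str) -> str:
    """Build the endpoint URL for a format slug such as 'orc'."""
    # Assumption: slugs are lowercase, as in the documented /api/v1/orc path.
    return f"{BASE_URL}/{fmt.strip().lower()}"


def fetch_signature(fmt: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text, unparsed."""
    with urllib.request.urlopen(signature_url(fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_signature("orc") issues the same GET request as the curl command.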

Related Formats