Cookie #3: Why File Type Validation is Always an Untrusted Check
Attackers turn file type validation into an easy bypass.
Newsletter on Secure Coding and Web Security
The Secure Cookie is meant to help you write safer code, ship secure applications with less frustration, and expand your skills as a security-aware developer. Expect deep dives into OWASP guidelines, coding safeguards, secure architecture designs, and web security tips.
Everything you learn here can be put into practice on tablab.io, the platform I built with passion to offer secure coding hands-on labs for developers who are serious about their craft.
Hi Friends,
Welcome to the 3rd issue of the Secure Cookie newsletter.
File type validation in upload features is usually based on three indicators: the Content-Type header, the Magic Number signature, and the File Extension. However, none of them can be fully trusted from a security perspective.
At best, these checks serve as superficial filters to quickly discard obvious mismatches, but they cannot guarantee that the file is what it claims to be. Extensions can be renamed, headers can be spoofed, and magic numbers can be forged.
Let's break down their practical role and show how they can be used wisely, meaning as quick filters that can simplify validation without giving them more trust than they deserve.
File Type Validation via the Content-Type Header
The Content-Type header is used to indicate the original MIME type (e.g., image/png, text/plain, application/pdf) of the resource prior to any content encoding applied before transmission.
MIME, short for Multipurpose Internet Mail Extensions, is a standard developed in the early 1990s to enable emails to include multimedia content and other binary files, and it is also employed on the web to define the nature of the data in the message body, the encoding applied, and how it should be processed or displayed:
HTTP/1.1 200 OK
Content-Type: multipart/form-data; boundary="ExampleBoundary"

--ExampleBoundary
Content-Disposition: form-data; name="text"

Here is the text you were looking for.
--ExampleBoundary
Content-Disposition: form-data; name="file"; filename="example.jpg"
Content-Type: image/jpeg

[binary JPEG data]
--ExampleBoundary--
Validating file types using the MIME type from the Content-Type header is unreliable for security because the header is provided by the client and can be trivially spoofed, even though some libraries and packages rely on it to assert that an upload matches the expected type.
As an example, the following implementation uses multer to handle file uploads. multer identifies the file type via the received Content-Type HTTP header, and the code mistakenly uses that value as the basis for a security check:
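const express = require("express");
const multer = require("multer");

const ALLOWED_TYPES = ["image/jpeg", "image/png"];

const upload = multer({
  ...
});

const app = express();

app.post("/upload", upload.single("file"), (req, res) => {
  if (!req.file) {
    res.status(400).json({ message: "No file uploaded" });
    return;
  }

  // mimetype comes from the Content-Type of the uploaded part,
  // so its value is chosen by the client, not derived from the content.
  const { mimetype } = req.file;
  if (!ALLOWED_TYPES.includes(mimetype)) {
    res.status(400).json({ message: "Unexpected file type" });
    return;
  }

  res.send("File uploaded successfully");
});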
How to Manipulate the Content-Type Header
Malicious users can manipulate the Content-Type header in HTTP requests to bypass validation that relies on this header to determine the file type. The following curl command offers a straightforward way to demonstrate how to send a malicious PHP file while spoofing the Content-Type header to appear as a JPEG file:
curl -F "file=@malicious.php;type=image/jpeg" https://domain.tbl/upload
If the server relies solely on the Content-Type header for validation, this request will be accepted and the file treated as a JPEG despite containing PHP code.
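To see why the header carries no weight, here is a minimal sketch of a multipart body assembled by hand. The helper name buildMultipartBody and the boundary value are made up for illustration and are not part of curl or multer. The point is only that every part header, including Content-Type, is plain text written by the sender.

```javascript
// Hypothetical helper: assembles a single-file multipart/form-data body by hand.
// Every value in the part headers, including Content-Type, is chosen by the sender.
const buildMultipartBody = ({ boundary, fieldName, filename, declaredType, content }) =>
  [
    `--${boundary}`,
    `Content-Disposition: form-data; name="${fieldName}"; filename="${filename}"`,
    `Content-Type: ${declaredType}`,
    "",
    content,
    `--${boundary}--`,
    "",
  ].join("\r\n");

const body = buildMultipartBody({
  boundary: "ExampleBoundary",
  fieldName: "file",
  filename: "malicious.php",
  declaredType: "image/jpeg", // spoofed: the payload below is PHP, not a JPEG
  content: '<?php system($_GET["cmd"]); ?>',
});

console.log(body);
```

A server that only reads the declared type of this part sees image/jpeg, while the bytes that follow are PHP source.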
File Type Validation via the Magic Number
The magic number is a unique sequence of bytes located at the beginning of a file's content that is used to identify the file type, according to a list of file signatures. These bytes serve as a signature for the file, allowing the operating system or applications to determine its type, even without relying on the file extension:
jpeg (jpg) files start with FF D8 FF DB (corresponding to ÿØÿÛ).
png files start with 89 50 4E 47 0D 0A 1A 0A (corresponding to ‰PNG␍␊␚␊).
pdf files start with 25 50 44 46 2D (corresponding to %PDF-).
zip files start with 50 4B 03 04 (corresponding to PK␃␄).
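As a rough sketch of how a signature lookup works, the snippet below compares the first bytes of a buffer against the four signatures listed above. The SIGNATURES table and the detectBySignature function are illustrative names, not the API of any particular library, and a real detector covers far more formats and variants.

```javascript
// Signatures taken from the list above; the table and function names are illustrative.
const SIGNATURES = [
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff, 0xdb] },
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] },
  { mime: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] },
];

// Returns the MIME type whose signature matches the leading bytes, or null.
const detectBySignature = (buffer) => {
  const match = SIGNATURES.find(
    ({ bytes }) =>
      bytes.length <= buffer.length && bytes.every((byte, i) => buffer[i] === byte)
  );
  return match ? match.mime : null;
};

const pngHeader = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
console.log(detectBySignature(pngHeader));
console.log(detectBySignature(Buffer.from("plain text")));
```

Note that the lookup only inspects a handful of leading bytes and says nothing about the rest of the file.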
The code snippet below uses the npm file-type package, which identifies the file type via the magic number, and then proceeds with a security check that can be easily circumvented through the manipulation of the file's signature:
import express from "express";
import multer from "multer";
import { fileTypeFromBuffer } from "file-type";

const ALLOWED_TYPES = ["image/jpeg", "image/png"];

const upload = multer({
  ...
});

const app = express();

app.post("/upload", upload.single("file"), async (req, res) => {
  if (!req.file) {
    res.status(400).json({ message: "No file uploaded" });
    return;
  }

  const { buffer } = req.file;

  // Get the MIME type from the buffer's magic number.
  // fileTypeFromBuffer resolves to undefined when no signature matches.
  const fileType = await fileTypeFromBuffer(buffer);
  if (!fileType || !ALLOWED_TYPES.includes(fileType.mime)) {
    res.status(400).json({ message: "Unexpected file type" });
    return;
  }

  res.send("File uploaded successfully");
});
Bypassing Magic Number Checks
Malicious users can easily prepend a valid magic number to malicious files, making them seem legitimate. For instance, adding the %PDF-2.0 signature at the start of a webshell file can trick the system into thinking it's actually a PDF file. The process looks like this:
Let's start by showing the content of the original webshell.php file:
:~$ cat webshell.php
<?php system($_GET["cmd"]); ?>
Then, determine the file type by using the file system command, confirming its original format as PHP:
:~$ file webshell.php
webshell.php: PHP script text, ASCII text
Proceed to add the appropriate magic number for a PDF file (i.e., %PDF-2.0) at the beginning of the file:
:~$ echo "%PDF-2.0$(cat webshell.php)" > webshell.php
Display the content of the webshell.php file once more to verify the change:
:~$ cat webshell.php
%PDF-2.0<?php system($_GET["cmd"]); ?>
Show the hex dump of the file to check the initial bytes, ensuring they correspond to 25 50 44 46 2D as stated in the list of file signatures:
:~$ xxd webshell.php
00000000: 2550 4446 2d32 2e30 3c3f 7068 7020 7379  %PDF-2.0<?php sy
00000010: 7374 656d 2824 5f47 4554 5b22 636d 6422  stem($_GET["cmd"
00000020: 5d29 3b20 3f3e 0a                        ]); ?>.
Determine the file type again using the file command, and notice how it has changed to PDF:
:~$ file webshell.php
webshell.php: PDF document, version 2.0
As you can see, the result is a malicious file that any tool relying on the magic number will report as a PDF.
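The same forgery can be reproduced in a few lines of Node.js. This is only a sketch: looksLikePdf is a hypothetical signature-only check standing in for any tool that trusts the magic number, not the logic of the file command or of a specific package.

```javascript
// PDF signature from the list of file signatures: 25 50 44 46 2D ("%PDF-").
const PDF_SIGNATURE = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x2d]);

// Hypothetical signature-only check, standing in for any magic-number-based tool.
const looksLikePdf = (buffer) =>
  buffer.subarray(0, PDF_SIGNATURE.length).equals(PDF_SIGNATURE);

// Same content as webshell.php, then the same file with %PDF-2.0 prepended.
const webshell = Buffer.from('<?php system($_GET["cmd"]); ?>\n');
const forged = Buffer.concat([Buffer.from("%PDF-2.0"), webshell]);

console.log(looksLikePdf(webshell)); // false: starts with "<?php"
console.log(looksLikePdf(forged)); // true: starts with "%PDF-"
console.log(forged.includes("<?php")); // true: the PHP payload is still there
```

Eight prepended bytes are enough to change the verdict, while the payload stays untouched.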
File Type Validation via the File Extension
In security terms, a file extension is little more than noise. Its role is limited to guiding the operating system, particularly Windows, in choosing which application should open the file. Unix-like systems, however, often ignore extensions altogether and instead rely on the magic number to determine the format. Since extensions are not enforced by the OS as protection mechanisms and serve only as usability metadata, they cannot be regarded as a reliable security measure.
That said, legitimate files are still expected to carry the appropriate extension. This is why extension validation is performed: not as a true security control, but as a superficial filter to quickly reject obvious inconsistencies.
For instance, the following multer implementation validates extensions by checking for .jpg, .jpeg, or .png, but this simplistic approach is insecure and can be circumvented through a double extension evasion technique:
const express = require("express");
const multer = require("multer");

const upload = multer({
  ...
});

const app = express();

app.post("/upload", upload.single("file"), (req, res) => {
  if (!req.file) {
    res.status(400).json({ message: "No file uploaded" });
    return;
  }

  const { originalname } = req.file;

  // Insecure on purpose: the pattern is not anchored to the end of the name,
  // so it matches an allowed extension anywhere (e.g., image.png.php).
  if (!originalname.match(/\.(jpg|jpeg|png)/)) {
    res.status(400).json({ message: "Unexpected file extension" });
    return;
  }

  res.send("File uploaded successfully");
});
How Attackers Bypass File Extension Checks
Using uppercase letters (e.g., .pHp, .pHP5 or .ASP).
Adding a valid extension before the execution extension (e.g., image.png.php or image.png.php5).
Adding special characters at the end (e.g., file.php%20, file.php%0d%0a or file.php/).
Tricking the server-side extension parser by using techniques such as inserting junk data (null bytes) between extensions (e.g., image.php%00.png or image.php\x00.png).
Adding another layer of extensions (e.g., image.png.jpg.php or image.php%00.png%00.jpg).
Putting the execution extension before the valid extension, which can be useful in case of server misconfigurations (e.g., image.php.png).
Using NTFS Alternate Data Streams (ADS) on Windows by inserting a colon character : after a forbidden extension and before a permitted one (e.g., image.asp:.jpg).
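To check how the unanchored pattern from the earlier multer example fares against some of these names, the sketch below runs it over a small, hand-picked list. The candidates array is illustrative; the pattern is the one used in that example.

```javascript
// Same unanchored, case-sensitive pattern as the vulnerable handler shown earlier.
const NAIVE_PATTERN = /\.(jpg|jpeg|png)/;

// Hand-picked sample names: one legitimate file and several evasion attempts.
const candidates = [
  "photo.png",
  "image.png.php",
  "image.php%00.png",
  "image.php.png",
  "image.asp:.jpg",
  "shell.pHp",
];

for (const name of candidates) {
  const verdict = NAIVE_PATTERN.test(name) ? "accepted" : "rejected";
  console.log(`${name} -> ${verdict}`);
}
```

Every name that contains an allowed extension anywhere is accepted, including those ending in .php. Only shell.pHp is rejected, and only because it contains no image extension at all.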
Performing Secure File Extension Validation
File extensions should only be validated after file names have been properly sanitized. Keeping this order prevents attackers from hiding tricks within filenames and ensures extension checks are consistently applied.
Use the following guidelines to keep uploads restricted to safe, authorised extensions:
Decode file names from URL-encoded format prior to validation to prevent bypass techniques like null byte characters (e.g., image.php%00.png).
In cases where the web application only accepts a single file type (e.g., .pdf), hardcode the allowed extension when storing the file. If multiple file types are permitted, define an allow-list that restricts file extensions to only those necessary for business needs (e.g., .jpg, .jpeg and .png).
Reject files that have missing or multiple extensions, unless explicitly required, to reduce the risk of exploitation.
Apply robust filtering when validating to avoid common pitfalls, such as regex patterns that can be bypassed.
The code snippet below secures the file upload feature by properly decoding the file name before validation, applying an allow-list of extensions, and rejecting files with multiple or missing extensions:
const multer = require("multer");

const ALLOWED_EXTENSIONS = [".jpg", ".jpeg", ".png"];

const isAllowedFileExtension = (filename) => {
  let decodedFilename;
  try {
    decodedFilename = decodeURIComponent(filename);
  } catch {
    return false; // Malformed percent-encoding (decodeURIComponent throws URIError)
  }
  const lowerCaseFilename = decodedFilename.toLowerCase();
  const lastDotIndex = lowerCaseFilename.lastIndexOf(".");
  if (lastDotIndex === -1) return false; // No extension found
  if (lowerCaseFilename.split(".").length - 1 > 1) return false; // Multiple extensions found
  const extension = lowerCaseFilename.slice(lastDotIndex);
  return ALLOWED_EXTENSIONS.includes(extension);
};

const upload = multer({
  ...
  fileFilter: (req, file, cb) => {
    if (isAllowedFileExtension(file.originalname)) {
      cb(null, true);
      return;
    }
    const error = new multer.MulterError("LIMIT_UNEXPECTED_FILE", file.fieldname);
    error.message = "Unexpected file extension";
    cb(error);
  }
});
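The snippet above covers the multi-type case. For the single-type guideline (hardcoding the extension when storing the file), one possible sketch is shown below. It assumes a PDF-only application and a randomly generated stored name. The helper buildStoredFilename is hypothetical, and using a UUID for the name is an illustrative choice, not something prescribed by the guidelines.

```javascript
const { randomUUID } = require("node:crypto");

// Assumption: the application accepts PDFs only, so the stored extension is fixed.
const STORED_EXTENSION = ".pdf";

// Hypothetical helper: the client-supplied name is never used for storage,
// so no extension trick in the original filename can reach the file system.
const buildStoredFilename = () => `${randomUUID()}${STORED_EXTENSION}`;

console.log(buildStoredFilename());
```

Because the stored name never includes client input, whatever extension the uploader chose has no effect on how the file is saved.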
File Extension Validation on the Client-Side
File extensions can also be easily checked on the frontend using the HTML accept attribute, which helps prevent users from sending unexpected files, though it is not reliable for security purposes:
<input type="file" id="fileInput" accept=".jpg, .jpeg, .png" />
To wrap up, the most reliable way to determine a file's type is not by checking the Content-Type header, the Magic Number signature, or the File Extension, but by evaluating its actual content with specialised security tools and platforms, and relying on them to decide whether the file should be accepted or rejected. We'll take a closer look at this approach in the upcoming post.
Ready to explore a full code scenario and put it into practice?
Unlock free access to tablab.io by subscribing to the newsletter, and begin building real expertise. Start practicing, enjoy learning, and level up your skills.
Interesting Reads
Some interesting articles I read in the past days:
Cognitive load is what matters by Artem Zakirullin
Frequent reauth doesn't make you more secure by Avery Pennarun