
Does Converting To PDF Remove Metadata?

Published Aug 29, 2025 4 min read

No, converting a document to PDF does not automatically remove all metadata.

"Printing to PDF" can discard certain types of internal revision history, but the resulting PDF usually still carries basic, and potentially sensitive, metadata from the original document. Which metadata transfers depends on the source application, the conversion method, and the settings selected during export.

The two main types of metadata and how conversion affects them

Document metadata falls broadly into two categories, and conversion treats each one differently.

File description metadata

This is the basic, high-level information that describes the document itself. It includes fields such as:

  • Author: The name of the user who created the file.
  • Title: The designated title of the document.
  • Subject: A summary of the document's content.
  • Keywords: Terms associated with the document for searchability.
  • Creation and Modification Dates: The dates and times the document was first created and last changed.
  • Creator and Producer: The Creator is the application that authored the original document (e.g., Microsoft Word); the Producer is the software that converted it to PDF (e.g., Adobe PDF Library).

Conversion impact: This category of metadata is almost always carried over when converting a document to PDF, especially if you use a standard "Save As PDF" or "Export" function. This is by design, as it helps with document organization, searchability, and accessibility.
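To see which of these fields survived a conversion, the raw bytes of the PDF can be scanned for the standard document information keys. The following is a minimal sketch using only the Python standard library, not a full PDF parser: the function name find_info_fields and the sample bytes are made up for illustration, and the sketch only finds values stored as plain literal strings. Values held in compressed object streams, hex strings, or UTF-16 strings will be missed, and a key such as /Title can also appear elsewhere in a file (for example in bookmarks), so treat the result as a first look rather than a verdict.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate")


def find_info_fields(pdf_bytes: bytes) -> dict:
    """Return the first plain-text value found for each standard key.

    Limitation: only literal strings such as "/Author (Jane Doe)" are
    matched; compressed, hex-encoded, or UTF-16 values are not decoded.
    """
    found = {}
    for key in INFO_KEYS:
        # "/Key (value)", allowing backslash-escaped characters in value.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


# Hypothetical fragment of a PDF file, used only to demonstrate the scan.
SAMPLE = (b"1 0 obj\n<< /Title (Q3 Plan) /Author (Jane Doe) "
          b"/Producer (Example PDF Library) >>\nendobj\n")
print(find_info_fields(SAMPLE))
```

For a real file, pass the result of open(path, "rb").read() instead of SAMPLE. A dedicated PDF library or the document properties dialog of a PDF viewer gives a more reliable answer.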

Revision metadata

This type of metadata contains the detailed history of a document's creation and editing process. It is often internal to the original application and generally hidden from immediate view. Examples include:

  • Tracked Changes: In a program like Microsoft Word, the full history of revisions, additions, and deletions.
  • Comments: The remarks or annotations made by collaborators.
  • Document Versioning: Older, un-redacted versions of the document's text.
  • Hidden Content: Text that is covered by another object or has been given a "hidden" attribute.

Conversion impact: This type of metadata is more volatile during conversion.

  • Printing to PDF: The "Print to PDF" command renders the document as it would be printed, so the new file contains only what reaches the page. This usually drops revision history, older versions, and hidden text. Two caveats apply: if the application is set to print markup, tracked changes and comments will appear visibly in the output, so accept or hide them first; and the print driver may still write fields such as Title, Author, and Producer into the new file.
  • Exporting to PDF: The "Save as PDF" or "Export" feature in many modern applications may or may not include revision metadata, depending on the software and the options selected. For example, Word's PDF export options include a checkbox for document properties and a choice between publishing the document with or without markup.
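Before converting, it can help to check whether the source file contains revision data at all. A .docx file is a ZIP archive of XML parts, so a rough check is possible with the Python standard library. The sketch below assumes the conventional part names (word/document.xml and word/comments.xml) and the conventional "w:" namespace prefix that Word writes; the function name find_revision_data is hypothetical. It counts tags with a regular expression rather than parsing the XML, so the numbers are indicative only.

```python
import re
import zipfile


def find_revision_data(docx_path: str) -> dict:
    """Count tracked insertions, tracked deletions, and comments."""
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_path) as archive:
        body = archive.read("word/document.xml").decode("utf-8")
        # <w:ins> and <w:del> wrap tracked insertions and deletions.
        # Requiring a space or ">" after the name avoids matching
        # unrelated tags such as <w:insideH> or <w:delText>.
        counts["insertions"] = len(re.findall(r"<w:ins[ >]", body))
        counts["deletions"] = len(re.findall(r"<w:del[ >]", body))
        # Comments live in their own part, which exists only if used.
        if "word/comments.xml" in archive.namelist():
            comments = archive.read("word/comments.xml").decode("utf-8")
            counts["comments"] = len(re.findall(r"<w:comment[ >]", comments))
    return counts
```

If any count is above zero, resolve the tracked changes and delete the comments in the source application, or run its built-in document inspection feature, before producing the PDF.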

Potential risks of leaving metadata intact

Retaining metadata can lead to unintended and sometimes serious consequences, particularly in professional and legal contexts.

  • Confidential Information Exposure: Metadata can expose who created or edited a document, when it was worked on, and sometimes file paths or server names that show where it was saved. This information can be confidential or sensitive.
  • Revealing Internal Communications: The comments and tracked changes in a document may contain information about internal discussions or client strategies that were never meant for external eyes.
  • Liability in Legal Cases: In a litigation context, failure to remove metadata from a document before it is disclosed to opposing counsel can expose privileged or damaging information and create legal liability.

How to effectively remove metadata from a PDF

Given that conversion is not a surefire way to remove all metadata, a dedicated process is required.

1. Manual removal with a PDF editor

For the most comprehensive and direct approach, use a PDF editing tool like Adobe Acrobat Pro.

  1. Open the PDF and go to File > Properties.
  2. Navigate to the Description tab to edit or delete fields like Title, Author, Subject, and Keywords.
  3. For a deeper clean, run Acrobat Pro's "Remove Hidden Information" tool, which finds and removes hidden data, including metadata, attached files, and invisible text.

2. Use a dedicated metadata scrubbing tool

Various applications and services specialize in metadata removal (often called "metadata scrubbing" or "sanitization"). This is distinct from redaction, which permanently removes visible content from the page. These tools are often used in legal and compliance settings to ensure no sensitive information is overlooked before sharing documents.
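Whichever tool does the scrubbing, a quick spot check of the output can catch obvious leftovers. The sketch below is an illustration only: the function name leftover_terms and the sample bytes are hypothetical. It searches the raw bytes of a PDF for terms you know to be sensitive, such as an author name, in the two text encodings PDF strings commonly use. It cannot see inside compressed streams, so an empty result does not prove the file is clean; a non-empty result does show that something was missed.

```python
def leftover_terms(pdf_bytes: bytes, terms: list) -> list:
    """Return the terms that still appear in the raw bytes of a PDF."""
    hits = []
    for term in terms:
        # PDF text strings are commonly stored as single-byte text
        # or as UTF-16BE, so look for both byte patterns.
        candidates = (term.encode("latin-1", "ignore"),
                      term.encode("utf-16-be"))
        if any(c and c in pdf_bytes for c in candidates):
            hits.append(term)
    return hits


# Hypothetical example: the author name survived, the project name did not.
scrubbed = b"<< /Author (Jane Doe) /Producer (Example PDF Library) >>"
print(leftover_terms(scrubbed, ["Jane Doe", "Project Falcon"]))
```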

3. Print to PDF

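As previously mentioned, a virtual "Print to PDF" driver is a simple way to create a new file that contains only the printed content of the original document, discarding most hidden metadata. Check the new file's document properties afterward, since the driver may still set fields such as Title or Author.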

4. Adjust export settings

When converting from an original application like Microsoft Word, be sure to check the export or "Save As" options. Look for checkboxes to control whether document properties, comments, or tracked changes are included in the final PDF.
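If the export will include document properties, it is worth knowing what they currently say. In a .docx file these values are stored in the docProps/core.xml part, which can be read with the Python standard library. This is a minimal sketch: the function name read_core_properties is hypothetical, and only three commonly sensitive fields are read.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Label to report -> element to look up in docProps/core.xml.
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
}


def read_core_properties(docx_path: str) -> dict:
    """Return the non-empty core properties stored in a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    props = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            props[label] = element.text
    return props
```

Any value returned here is a candidate for ending up in the PDF's Title or Author fields, so clear it in the source application or exclude document properties in the export options.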

Final considerations

For casual use, such as sharing a document with colleagues, leaving basic metadata intact is generally not a concern. However, for any document being distributed publicly or shared with outside parties, especially in legal, financial, or academic fields, it is a non-negotiable best practice to take deliberate steps to inspect and remove metadata. Relying on conversion alone is a risk that can lead to embarrassing and potentially damaging information leaks.
