Top 3 ways to extract images from .docx/.doc Word documents
Lately I had an image embedded in a .docx Word document, and I absolutely needed the original image file to get the picture at its best quality. See the screenshot below? My document contains a picture that shows up as a small thumbnail, but I need the original high-resolution version. This tutorial shows how to get it; it works with Microsoft Word 2007 and above.
There are several ways to retrieve or extract images from a Word document. In this article I list the top 3. Feel free to use whichever you want, but expect different results. Here is the article index:
- Renaming the .docx to .zip, opening the zip file and extracting the images (Best way!)
- Copying and pasting the image into a picture processing application (Alright)
- Saving the document as a web page and retrieving the generated images (Nice!)
Alternatively, you could just use the Snipping Tool from Windows Vista/7, which lets you save a portion of the screen as an image. Try it; it's pretty much self-explanatory, but the final quality of the picture depends on how much you can fit on your screen.
Option #1: renaming the .docx to .zip
This is definitely the best way to retrieve your images. Microsoft Word's .docx documents are actually zip files! Before you ask: what if your document is a .doc, .html, .odt or any other format? Well, just open the document in Word and save it as a .docx file; Word will convert it to a zip/docx file for you. So here is my file:
Note: if you do not see the .docx at the end of the file name, you *MUST* follow this tutorial for this to work (this will help you show file extensions in Windows Explorer).
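If you'd rather check than guess, here is a minimal Python sketch that tells you whether a file really is a zip-based .docx before you rename anything. It assumes that a genuine .docx is a zip archive containing a "word/document.xml" entry; the function name is_zip_based_docx is my own invention, not something Word or Python provides.

```python
import zipfile


def is_zip_based_docx(path):
    """Return True if `path` is a zip archive that contains word/document.xml."""
    # An old binary .doc (even one renamed to .docx) is not a zip archive.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # Assumption: the main document part lives at word/document.xml.
        return "word/document.xml" in archive.namelist()
```

Calling is_zip_based_docx("products.docx") (a file name assumed for illustration) should return False for a file that still needs to be opened in Word and saved as .docx first.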
I am going to rename the file to "products.zip" here. Note that, as seen in the screenshot below, the icon changes:
Now, double-click your .zip file to open it with whatever program is associated with zip files on your computer. In my case, WinRAR does the job. If the document still opens in Word, you probably did not read carefully enough -- go back to the part where I say to follow this tutorial.
Inside the archive, the pictures live in the word/media folder. These are the original images that were inserted in the document, regardless of the size at which they appear when viewing your document. You can extract them to your disk by selecting them and dragging and dropping them into the folder of your choice.
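If you have many documents to go through, the same trick can be scripted. Below is a rough Python sketch, not a polished tool: it opens the .docx directly as a zip archive (no renaming needed) and copies everything stored under word/media/ into a folder. The function name extract_docx_images and the file and folder names in the usage line are made up for illustration.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_docx_images(docx_path, out_dir):
    """Copy every file under word/media/ in the .docx into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything outside word/media/.
            if info.is_dir() or not info.filename.startswith("word/media/"):
                continue
            # Keep only the base name so nothing is written outside out_dir.
            target = out_dir / PurePosixPath(info.filename).name
            target.write_bytes(archive.read(info))
            saved.append(target)
    return saved
```

For example, extract_docx_images("products.docx", "extracted_images") returns the list of paths it wrote. The files come out byte for byte as they are stored in the archive, so no quality is lost along the way.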
Option #2: copying and pasting the image
Another obvious option is to simply select the picture in Word and press Ctrl+C to copy it to the clipboard:
If you use a design or picture processing tool such as Photoshop or Paint Shop Pro, just open it and paste the picture by pressing Ctrl+V. I don't have anything like this, so I'll just open Paint:
Press Ctrl+V to paste the image into the current canvas in Paint. Unfortunately, the picture is copied at the same size as it appears in Word. If you want a bigger version, you'll have to resize the image in Word first, copy it again, and paste it again in Paint! This is obviously not the best way to go, but it can be easier and faster than the first option I described earlier.
Option #3: saving the document as a Web page
This last option is technically the easiest in my opinion. First, open your document in Word, then save it as a Web page:
When asked to select the format, make sure to choose the "Web page" format. Do NOT select "Web page (filtered)": that option exports low-resolution images. The "Web page" option, on the other hand, exports the original images as well as thumbnails.
Once the save is done, there's the document itself, but also a folder that Word generated for you. Open the folder to reveal the images (both in original and thumbnail sizes):
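Since that folder mixes originals and thumbnails, a small script can help sort them out. This Python sketch lists the image files in a folder from biggest to smallest. It relies on my assumption that the original is usually the larger file, which is a rule of thumb rather than a guarantee; the function name, the list of extensions and the folder name in the usage line are all illustrative.

```python
from pathlib import Path

# Extensions treated as images here; adjust the set to your needs.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff", ".emf", ".wmf"}


def images_largest_first(folder):
    """Return the image files in `folder`, sorted by file size, biggest first."""
    files = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(files, key=lambda p: p.stat().st_size, reverse=True)
```

Something like images_largest_first("products_files") would then put the likely originals at the top of the list, with the thumbnails further down.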
That's it! That wasn't so hard, was it? I still wonder why Microsoft didn't originally include this feature in the Word application itself. In my opinion, we should be able to just right-click an image and see a "Save image as..." option in the menu.
Now, why did I bother writing about this on what is seemingly a sysadmin's/developer's blog, you ask? Because I've been looking this up on the web and most of the answers I found were unsatisfying. It took me ages before I finally stumbled upon a blog article that revealed the docx/zip trick. When I can't (or can barely) find the answer to a question, I usually whip up a quick article about what I was looking for, and hopefully it ends up helping people.