For my research project I am interested in the DICOM meta-data of a couple of hundred patients. Is it possible to access the data and transfer it to for example an excel file?
Yes, you can access DICOM meta-data using, for example, DicomWorks (free software), other medical imaging software such as OsiriX, or even ImageJ (by pressing CTRL+I after opening the image).
For the other part of your question, I developed an automated approach to bulk-process the CT files, extract the information, and export it automatically from MATLAB into Excel.
If you know a bit of programming in Python, I would recommend pydicom. It's fairly easy to use, and a small script could dump the metadata to a text file that can then be imported into Excel.
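A minimal sketch of such a script, assuming pydicom is installed. The list of tags in FIELDS and the function names are only illustrative choices, not part of pydicom; the output is a CSV file, which Excel opens directly.

```python
import csv
import os

# Keywords to export; an illustrative selection, adjust it to your study.
FIELDS = ["PatientID", "PatientSex", "PatientBirthDate", "StudyDate", "Modality"]


def header_row(ds, path, fields=FIELDS):
    """Build one CSV row from a dataset; missing tags become empty strings."""
    row = {"File": path}
    for name in fields:
        row[name] = str(getattr(ds, name, ""))
    return row


def dump_headers(root, out_csv, fields=FIELDS, reader=None):
    """Walk `root`, read each DICOM header found, write one CSV row per file.

    Returns the number of rows written. Keep `out_csv` outside `root`.
    """
    if reader is None:
        import pydicom  # third-party: pip install pydicom

        def reader(path):
            # stop_before_pixels=True reads the header and skips the image data
            return pydicom.dcmread(path, stop_before_pixels=True)

    written = 0
    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["File"] + list(fields))
        writer.writeheader()
        for folder, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(folder, name)
                try:
                    ds = reader(path)
                except Exception:
                    continue  # not a readable DICOM file, skip it
                writer.writerow(header_row(ds, path, fields))
                written += 1
    return written
```

Calling dump_headers("path/to/dicom_folder", "headers.csv") would then produce one row per readable file.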
Not sure exactly your needs, but you might look into DICOM Browser which includes GUI and command line interfaces. It's open source, built on the wonderful dcm4che libraries. Paper here: http://www.springerlink.com/content/5394mt1115711205/. Binaries and source here: http://nrg.wustl.edu/software/dicom-browser/
It will show some errors: just ignore them in almost all cases
Dicom2 can also convert to jpeg or png
****
***
To process thousands of image files spread across a directory tree...??
A script is usually needed.
****
You can use pydicom, dicom2, dcmdump, and similar tools, depending on your preferred computer ecosystem.
***
Using windows and an external USB disk drive, I did it last year.
Only 2,400,000 files processed with a humble and old laptop (Toshiba r200). A batch run that lasted 13-14 days ...Good battery and a UPS is a must ;o)) !
The script copied files (both dicom and txt) to different directories following a DICOM tag filter...in my case just following modality (US/RF).
Now I have 220,000+220,000 files to postprocess with a different set of scripts...
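For anyone who wants to try the same kind of sorting with pydicom instead, here is a rough sketch. It is not the script I used; the function name, the numbered file prefix and the folder layout are assumptions for illustration, and it only copies the DICOM files, not any companion txt files.

```python
import os
import shutil


def sort_by_modality(src_root, dst_root, wanted=("US", "RF"), read_modality=None):
    """Copy each DICOM file under src_root into dst_root/<Modality>/.

    Returns a dict mapping modality to the number of files copied.
    dst_root should not be located inside src_root.
    """
    if read_modality is None:
        import pydicom  # third-party: pip install pydicom

        def read_modality(path):
            # Read only the Modality tag, which is much cheaper than the full file
            ds = pydicom.dcmread(path, stop_before_pixels=True,
                                 specific_tags=["Modality"])
            return ds.get("Modality", "")

    counts = {}
    for folder, _dirs, files in os.walk(src_root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                modality = read_modality(path)
            except Exception:
                continue  # unreadable or non-DICOM file
            if modality not in wanted:
                continue
            target_dir = os.path.join(dst_root, modality)
            os.makedirs(target_dir, exist_ok=True)
            # A running number keeps equal file names from different folders apart
            n = counts.get(modality, 0) + 1
            shutil.copy2(path, os.path.join(target_dir, f"{n:07d}_{name}"))
            counts[modality] = n
    return counts
```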
MATLAB will be a good bet. You can script in MATLAB and therefore save and compare elements from the structure set created by reading in the DICOM header info.
- DicomWorks (mentioned above) can be used to insert/delete/change selected DICOM tags in many files simultaneously without programming knowledge;
- ImageJ can be used to extract/use data on DICOM tags; enclosed macro renames all DICOM files in selected folder to names created from DICOM tags, but after modification it could be used to output selected tags into ImageJ's log window or into a file.
Python has a very nice library, pydicom. It's easy to extract all the data you specifically want, then export it to an MS Excel spreadsheet (there is a Python library to do that also).
The code is very minimal (~20 lines of overhead, then the code for your specific data items), and with the guides already on the web you should be able to implement and test it within 2 hours (even if you don't know Python).
I would also like to extract information (name, date of birth, sex, exam ID, study date) from a single DICOMDIR file covering more than 4000 patients (300 MB).
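One possible approach, sketched under assumptions: pydicom can read a DICOMDIR, whose records sit in DirectoryRecordSequence, each with a DirectoryRecordType such as PATIENT or STUDY. The sketch assumes each PATIENT record is followed by its own STUDY records, which is common but not guaranteed, and that the optional fields (birth date, sex) may be empty. The function names are made up for illustration.

```python
import csv

COLUMNS = ["PatientName", "PatientBirthDate", "PatientSex", "StudyID", "StudyDate"]


def dicomdir_rows(records):
    """Yield one dict per STUDY record, joined with the preceding PATIENT record.

    Assumes records are stored patient first, then that patient's studies.
    """
    patient = None
    for rec in records:
        kind = str(getattr(rec, "DirectoryRecordType", "")).strip()
        if kind == "PATIENT":
            patient = rec
        elif kind == "STUDY" and patient is not None:
            yield {
                "PatientName": str(getattr(patient, "PatientName", "")),
                "PatientBirthDate": str(getattr(patient, "PatientBirthDate", "")),
                "PatientSex": str(getattr(patient, "PatientSex", "")),
                "StudyID": str(getattr(rec, "StudyID", "")),
                "StudyDate": str(getattr(rec, "StudyDate", "")),
            }


def dicomdir_to_csv(dicomdir_path, out_csv):
    """Read a DICOMDIR with pydicom and write one CSV row per study."""
    import pydicom  # third-party: pip install pydicom

    ds = pydicom.dcmread(dicomdir_path)
    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        for row in dicomdir_rows(ds.DirectoryRecordSequence):
            writer.writerow(row)
```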
The DICOM file format is not as simple as other image formats such as JPEG or PNG. First of all, you need to understand the structure of the file to know how the information is encoded. Even though there are freely available solutions to read and display DICOM file content, if you are really interested, please try to learn the basics. Of course, it's really interesting. The best way to start is to refer to the DICOM standard chapters at http://dicom.nema.org/standard.html (it needs a lot of patience and free time). The following two are the best software to play around with the DICOM header and image content.
1. syngo fastView is available at http://www.healthcare.siemens.com/medical-imaging-it/syngo-special-topics/syngo-fastview
2. DICOM viewer from Philips at http://philips-dicom-viewer.software.informer.com/
3. Read the attached pdf file to get the idea of basics.
Install the above software, and try to load the attached images. Image courtesy: National Cancer Institute.
I have a similar question to the initial one, but I want to query the DICOM meta-information from a PACS (DICOM server) without the image data. I need to fill a database with this meta-information to make it searchable for retrospective studies.
It can be solved by downloading the whole image (header + pixels) and throwing the pixels away, but with 100 000 patients the network overhead would be too high. So I would like to query the meta-information only. How?
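As far as I understand, the standard DICOM query service (C-FIND) already returns attributes without any pixel data, though only the keys the server supports at the requested query level, not the complete header. A rough sketch with pynetdicom follows, assuming the PACS accepts Study Root C-FIND from my AE title; the host, port, AE title "RESEARCH" and the choice of keys are placeholders.

```python
def study_query_keys(patient_id="", study_date=""):
    """Keys for a STUDY-level C-FIND.

    An empty string asks the server to return that attribute for every match.
    """
    return {
        "QueryRetrieveLevel": "STUDY",
        "PatientID": patient_id,
        "PatientName": "",
        "PatientBirthDate": "",
        "PatientSex": "",
        "StudyInstanceUID": "",
        "StudyDate": study_date,
        "ModalitiesInStudy": "",
    }


def find_studies(host, port, keys, aet="RESEARCH"):
    """Send one C-FIND and return the matches as a list of plain dicts."""
    # Third-party packages: pip install pydicom pynetdicom
    from pydicom.dataset import Dataset
    from pynetdicom import AE
    from pynetdicom.sop_class import StudyRootQueryRetrieveInformationModelFind

    query = Dataset()
    for name, value in keys.items():
        setattr(query, name, value)

    ae = AE(ae_title=aet)
    ae.add_requested_context(StudyRootQueryRetrieveInformationModelFind)
    assoc = ae.associate(host, port)
    if not assoc.is_established:
        raise ConnectionError("association rejected or aborted")

    results = []
    try:
        responses = assoc.send_c_find(query, StudyRootQueryRetrieveInformationModelFind)
        for status, identifier in responses:
            # 0xFF00 and 0xFF01 are the "pending" statuses that carry a match
            if status and status.Status in (0xFF00, 0xFF01) and identifier is not None:
                results.append({k: str(identifier.get(k, "")) for k in keys})
    finally:
        assoc.release()
    return results
```

A date range such as study_query_keys(study_date="20200101-20200131") keeps each query small, which matters with 100 000 patients.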
Exiftool is able to scan whole directory trees of dicom images.
Binary and sources.
https://www.sno.phy.queensu.ca/~phil/exiftool/
Just export a DICOMDIR structure (as you would when preparing a DVD) to a folder on a local disk. Any directory structure, of any depth, will work, even without DICOMDIR files.
Change to the folder with the DICOMDIR file. Copy exiftool to the same working folder, or a folder with executable rights, or included in the windows path.
This command example will generate a CSV file in the current folder (.) for all files without any extension (a regrettable but common dicom filename setting):
exiftool -csv -r -ext "." . > exif.csv
It will generate the exif.csv file in this same folder, with tabular (comma-separated) data for all your images and a few hundred columns.
* There are exiftool options to reduce the output to a few DICOM tags.
Exiftool will process 5-10 images per second or faster.
Importing these CSV files into LibreOffice works better than into MS Excel. Then, from LibreOffice, you may be able to export the sorted/cleaned file to your preferred format.
I've used win7 and Linux Mint versions successfully.
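If the few hundred columns are too many for the spreadsheet, the CSV can also be trimmed afterwards with a few lines of Python. This is only a sketch using the standard csv module: SourceFile is the first column exiftool writes, while the other column names in the usage line are assumptions that should be checked against the header of your own exif.csv.

```python
import csv


def trim_csv(in_csv, out_csv, keep):
    """Copy only the columns listed in `keep` (those that exist) to out_csv.

    Returns the list of columns actually written.
    """
    with open(in_csv, newline="", encoding="utf-8", errors="replace") as src:
        reader = csv.DictReader(src)
        header = reader.fieldnames or []
        present = [col for col in keep if col in header]
        with open(out_csv, "w", newline="", encoding="utf-8") as dst:
            # extrasaction="ignore" drops every column not listed in `present`
            writer = csv.DictWriter(dst, fieldnames=present, extrasaction="ignore")
            writer.writeheader()
            for row in reader:
                writer.writerow(row)
    return present
```

For example: trim_csv("exif.csv", "exif_small.csv", ["SourceFile", "PatientID", "Modality", "StudyDate"]).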
Thank you for the answer, but it does not solve the problem.
I don't want to access the stored files at all. I don't want to pull several tebibytes (or terabytes) of data onto my computer. I need to use a TCP/IP network connection, and I want to get the meta-information from the PACS server itself. So I am looking for a DICOM protocol that can be used in a standardized manner to get the meta-information over the network. In this scenario the PACS server needs to see the whole file, not me.
Your solution requires *dcm files that are available on the server or locally. Locally stored files are not interesting, because the pixels will be thrown away and the network overhead of fetching the objects could be very high. On the other hand, in most cases the user cannot log into the server to run scripts.
You, as a DICOM expert who can see what is behind the scenes: is there any initiative to extend DICOM network communication so that all of the meta-data of a DICOM object can be retrieved without the pixels?
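One standard mechanism that seems to match this is the DICOMweb "Retrieve Metadata" transaction (WADO-RS), which returns the attributes of each instance as JSON with the bulk pixel data left out. Whether it helps depends on the PACS actually exposing a DICOMweb endpoint, which is an assumption here. A sketch using only the Python standard library follows; the base URL is a placeholder, authentication is not handled, and the function names are made up.

```python
import json
import urllib.request


def metadata_url(base_url, study_uid):
    """URL of the WADO-RS Retrieve Metadata resource for one study."""
    return f"{base_url.rstrip('/')}/studies/{study_uid}/metadata"


def flatten(instance):
    """Turn one DICOM JSON object into {tag: value}.

    Single values become scalars; nested sequences are skipped in this sketch.
    """
    flat = {}
    for tag, element in instance.items():
        vr = element.get("vr")
        if vr == "SQ":
            continue
        values = element.get("Value", [])
        if vr == "PN":
            # Person names are objects; keep the alphabetic representation
            values = [v.get("Alphabetic", "") for v in values if isinstance(v, dict)]
        flat[tag] = values[0] if len(values) == 1 else values
    return flat


def fetch_study_metadata(base_url, study_uid):
    """Download the metadata of every instance in a study, without pixel data."""
    request = urllib.request.Request(
        metadata_url(base_url, study_uid),
        headers={"Accept": "application/dicom+json"},
    )
    with urllib.request.urlopen(request) as response:
        return [flatten(instance) for instance in json.load(response)]
```

The keys of the returned dicts are eight-digit hexadecimal tags, for example "00100010" for the patient name.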