Links to Image Archive Resources

This list links the various components and interests of the National Cancer Institute Cancer Imaging Program Image Archive Committee, and related websites. There are special sections on image archive formats and standardization.


General References on Biomedical Image Archives

NCI Image Archive Committee Listserver
A listserver for medical image archive technology, applications, standards, and related topics sponsored by the National Cancer Institute Cancer Imaging Program.

NCI Image Archive Management Workshop Report - August 2000 [PDF File]
The National Cancer Institute (NCI) workshop entitled "Image Archive Management" that was presented on August 28 and 29, 2000, at the Natcher Conference Center on the National Institutes of Health (NIH) campus is summarized. The purpose of this workshop was to solicit expert input for the planned development of an archival system to make imaging databases readily accessible by the broad scientific community. This PDF file is the workshop report published in Academic Radiology.

BIRN - Biomedical Informatics Research Network
The BIRN is a National Center for Research Resources (NCRR) initiative aimed at creating a testbed to address neuroscience researchers' need to access and analyze data at a variety of levels of aggregation located at diverse sites throughout the country. The BIRN testbed will bring together hardware and develop software necessary for a scalable network of databases and computational resources. Issues of user authentication, data integrity, security, and data ownership will also be addressed. The BIRN initiative has created consortiums of biomedical technology and clinical research centers that are working together to address two fundamental biomedical research issues: 1) integrating data across modalities and scales; and 2) merging difficult-to-acquire data with heterogeneous collection attributes from multiple research sites. These initiatives are only first steps in creating the infrastructure to support the synergistic approaches needed to solve challenging biomedical problems. The BIRN Coordinating Center has a separate website.

Bristol Biomedical Image Archive
The Bristol Biomedical Image Archive is an online collection of about 8500 medical, dental, and veterinary images for use in teaching and learning. All the images have been donated by academics working in the biomedical fields in different countries. Hosted at the Institute for Learning and Research Technology, University of Bristol, UK.

Global Image Database
The Global Image Database (GID) is a web-based structured central repository for scientific annotated images. The GID was designed to manage images from a wide spectrum of imaging domains ranging from microscopy to automated screening. The annotations in the GID define the source experiment of the images by describing who the authors of the experiment are, when the images were created, the biological origin of the experimental sample and how the sample was processed for visualization. A collection of experimental imaging protocols provides details of the sample preparation, and labeling, or visualization procedures. In addition, the entries in the GID reference these imaging protocols with the probe sequences or antibody names used in labeling experiments. The GID annotations are searchable by field or globally.

The BioImage Homepage
A European initiative for a database of multidimensional biological images. The BioImage database project, funded by the European Union, is a collaboration between eight European groups. Its aim is to provide the general scientific community with a flexible and searchable database of multi-dimensional biological images.

Open Archives Initiative
The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. The fundamental technological framework and standards that are developing to support this work are, however, independent of both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials. As a result, the Open Archives Initiative is currently an organization and an effort explicitly in transition, and is committed to exploring and enabling this new and broader range of applications. As we gain greater knowledge of the scope of applicability of the underlying technology and standards being developed, and begin to understand the structure and culture of the various adopter communities, we expect that we will have to make continued evolutionary changes to both the mission and organization of the Open Archives Initiative.


Image Archive Technology

David Clunie's Medical Image Format Site
This website includes FAQs on medical image data, applications, and formats. It was assembled and is sponsored by a key expert and advocate from the DICOM community.

Very Large Data Base Endowment Inc.
The Very Large Data Base Endowment Inc. (VLDB Endowment) is a non-profit organization incorporated in the United States for the sole purpose of promoting and exchanging scholarly work in databases and related fields throughout the world. The contents of the VLDB journal are available. An annual VLDB Conference schedule and many PDF files with extended abstracts are available.

Digital Medical Image Archive Technology
An effective, low cost, long term archive is critical to the successful implementation of a PACS system and any filmless plan. Performance and capacity requirements must be carefully understood and cost/performance tradeoffs must be made in light of these requirements. Provided by StorageTech Corp.

Microsoft Research - Scalable Servers
Microsoft is exploring techniques to build large servers as arrays of commodity processors, disks, and interconnects - Scalable Networks and Platforms (SNAP). The resulting computer cluster should be as easy to program, manage, and use as a single system. In addition, by using spare modules and redundant storage, the cluster should mask component failures and so provide highly-available services. This work combines the expertise of the NTclusters group to help define the requirements for clusters and the SQLserver database team to add fault-tolerance, scalability, and parallelism to SQLserver.

Terra-server™
A one-node terabyte geo-spatial database server (the Terra-Server™), and a 45-node cluster doing a billion transactions per day. SAP + SQL + NT-Cluster failover demos, a 50 GB mail store, a 50k user POP3 mail server, a 100 million-hits-per-day web server, and 64-bit addressing SQL Server were also shown. The TerraServer started as a joint research project between Aerial Images, Inc., Microsoft, the USGS, and Compaq. The TerraServer concept grew out of the convergence of two needs. Aerial Images, Inc. wanted to sell imagery online and Microsoft Research needed a large database to demonstrate the capabilities of its new database software.

Teradata Corp.
Teradata, a division of NCR Corporation, offers powerful analytical solutions that help businesses drive growth. Teradata solutions include the Teradata warehouse, along with analytical applications for customer relationship management, operations/financial management, business performance management and e-business. NCR Corporation (NYSE: NCR) is a leader in providing Relationship Technology™ solutions to customers worldwide. In addition to Teradata solutions, NCR offers store automation systems and automated teller machines (ATMs). NCR employs over 33,000 people in more than 100 countries, and is a component stock of the Standard & Poor's 500 Index.

Archive Builders Corp.
Archive Builders assists organizations with their plans for document management, document imaging systems and digital libraries. One valuable service has proven to be advice and discussion of document management plans drawn up by organizations considering a system installation. They offer onsite systems analysis, requirements planning and assistance in writing system specifications. Whitepapers and presentation materials used in the document management class taught by Archive Builders are available free for download. Please include a reference to http://www.ArchiveBuilders.com if you make reference to these materials or distribute them. The papers are updated periodically.

MySQL - Open Source Relational Database
MySQL is the world's most popular Open Source Database, designed for speed, power and precision in mission critical, heavy load use. MySQL AB is the company that develops, supports and markets the MySQL database server globally. Their mission is to make superior data management available and affordable for all, and to contribute to building the mission-critical high-volume systems and products of tomorrow. The product is available at zero price under the GNU General Public License (GPL), and it is sold under a commercial license to those who do not wish to be bound by the terms of the GPL. The MySQL database server embodies an ingenious software architecture that maximizes speed and customizability. Extensive reuse of code within the software and minimal but functionally rich features have resulted in a database management system unmatched in speed, compactness, stability and ease of deployment. The unique separation of the core server from the table handler makes it possible to run MySQL under strict transaction control or with ultrafast transactionless disk access, whichever is most appropriate for the situation. Today MySQL is the most popular open source database server in the world with more than 2 million installations powering websites, data warehouses, business applications, logging systems and more. Customers such as Yahoo! Finance, MP3.com, Motorola, NASA, Silicon Graphics, and Texas Instruments use the MySQL server in mission-critical applications.


Image Archive Standards

DICOM - Digital Imaging and Communications in Medicine
The DICOM Standards Committee exists to create and maintain international standards for communication of biomedical diagnostic and therapeutic information in disciplines that use digital images and associated data. The goals of DICOM are to achieve compatibility and to improve workflow efficiency between imaging systems and other information systems in healthcare environments worldwide. DICOM is a cooperative standard. Therefore, connectivity works because vendors cooperate in testing via scheduled public demonstration, over the Internet, and during private test sessions. Every major diagnostic medical imaging vendor in the world has incorporated the standard into their product design and most are actively participating in the enhancement of the standard. Most of the professional societies throughout the world have supported and are participating in the enhancement of the standard as well. DICOM is used or will soon be used by virtually every medical profession that utilizes images within the healthcare industry. These include cardiology, dentistry, endoscopy, mammography, ophthalmology, orthopedics, pathology, pediatrics, radiation therapy, radiology, surgery, etc. DICOM is even used in veterinary medical imaging applications.

Health Level Seven (HL7)
Founded in 1987, Health Level Seven, Inc. is a not-for-profit, ANSI-accredited standards developing organization that provides standards for the exchange, management and integration of data that supports clinical patient care and the management, delivery and evaluation of healthcare services. Its 2,200 members represent over 500 corporate members, including 90 percent of the largest information systems vendors serving healthcare. HL7's endeavors are sponsored, in part, by the support of its benefactors: CAP Gemini Ernst & Young U.S. LLC, Eclipsys Corporation, Eli Lilly & Company, IDX Systems Corporation, Johnson & Johnson, McKesson Information Solutions, Microsoft Corporation, Philips Medical Systems, Quest Diagnostics Inc., Siemens Medical Solutions Health Services, Sun Microsystems and the U.S. Department of Veterans Affairs.

NCI Information Technology Standards (External)
Various information technology standards relevant to the National Cancer Institute (NCI) and the NCI Center for Bioinformatics (NCICB). This table provides links to many of the standards, standard development groups or standards-based tool developers' sites, especially XML, XSL, SOAP and related sites.

Cancer Bioinformatics Infrastructure Objects (caBIO)
The caBIO [Microsoft Word File] object modeling effort is an on-going effort to model the domains of cancer research. The caBIO objects simulate the behavior of actual components in biomedicine such as genes, chromosomes, sequences, agents, trials, ontologies, etc. They provide access to a variety of data sources including GenBank, Unigene, LocusLink, Ensembl, GoldenPath (through DAS), and NCICB's CGAP (Cancer Genome Anatomy Project) data repositories. The current object model was designed via the interaction of domain experts and IT professionals. The object model is designed using an iterative software development approach to accommodate new requirements for modeling genomic information. Details of each object were identified during domain analysis and include information provided by domain experts as well as industry standards. caBIO is an "open source" software project.

XML Standards for Biology and Medicine
A set of links to diverse XML standards in biology and medicine is provided, including clinical trial data, gene expression, taxonomy, documents, and cell biology.

CODATA - Committee on Data for Science and Technology
CODATA, the Committee on Data for Science and Technology, is an interdisciplinary Scientific Committee of the International Council for Science (ICSU). CODATA was established over 30 years ago and its secretariat is located in Paris, France. CODATA seeks: 1) improvement of the quality and accessibility of data, as well as the methods by which data are acquired, managed, analyzed and evaluated, with a particular emphasis on developing countries; 2) facilitation of international cooperation among those collecting, organizing and using data; 3) promotion of an increased awareness in the scientific and technical community of the importance of these activities; and 4) consideration of data access and intellectual property issues.

National Informatics Standards in Cancer [PDF File]
The National Cancer Institute (NCI) has been building a compendium of cancer-related vocabulary and the technical resources to maintain and to disseminate the vocabulary. Standard vocabulary is an aspect of a larger set of standards that will be needed to provide cancer-related information and services in a structured form readily interpretable by people and computers. NCI must become a much more active voice in setting technical standards and promulgating cancer-related vocabulary.

SANE - Scanner Access Now Easy Image Data Format
SANE stands for "Scanner Access Now Easy" and is an application programming interface (API) that provides standardized access to any raster image scanner hardware (flatbed scanner, hand-held scanner, video- and still-cameras, frame-grabbers, etc.). The SANE API is public domain and its discussion and development is open to everybody. SANE is a universal scanner interface. The value of such a universal interface is that it allows writing just one driver per image acquisition device rather than one driver for each device and application. So, if you have three applications and four devices, traditionally you'd have had to write 12 different programs. With SANE, this number is reduced to seven: the three applications plus the four drivers. Of course, the savings get even bigger as more and more drivers and/or applications are added.
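The driver-count arithmetic in the SANE description can be checked with a minimal sketch. The function names below are hypothetical and are not part of the SANE API; the sketch only assumes, as the description does, that without a shared interface every application needs its own program for every device.

```python
def programs_without_shared_interface(applications: int, devices: int) -> int:
    """One dedicated program for every (application, device) pair."""
    return applications * devices


def programs_with_shared_interface(applications: int, devices: int) -> int:
    """Each application and each device driver is written once against the shared API."""
    return applications + devices


if __name__ == "__main__":
    # (3, 4) is the example from the description; (10, 20) shows how the gap widens.
    for apps, devs in [(3, 4), (10, 20)]:
        print(
            apps,
            devs,
            programs_without_shared_interface(apps, devs),
            programs_with_shared_interface(apps, devs),
        )
```

The count grows as a product in the first case and as a sum in the second, which is why the savings increase as more drivers or applications are added.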

Cancer Informatics
"Cancer Informatics: Essential Technologies for Clinical Trials" is a book published in January 2002 that describes the National Cancer Institute's vision of a Cancer Informatics Infrastructure (CII). By exploring the best that the Internet and information technology have to offer, the CII will facilitate clinical trials for all who are involved, including the patient and the myriad health professionals involved in cancer trials.

NIH Home Page on Sharing Research Data
NIH is developing a new statement on sharing research data that will expect and support the timely release and sharing of final research data from NIH-supported studies for use by other researchers. Investigators submitting an NIH application will be required to include a plan for data sharing or to state why data sharing is not possible. This is an extension of NIH policy on sharing research resources. There is a link to a list of Frequently Asked Questions on data sharing.

Open Source Health Care Resources
Open source refers to software that comes with the source code in a form that customers can modify for their own needs and resell or give away to others under the same terms. Software in the Public Interest has registered the phrase "open source" as a certification mark and www.opensource.org contains a detailed definition of what open source means. An open source computed tomography (CT) simulator is available, for example.


Image Archive Applications - Clinical Trials

CDISC - Clinical Data Interchange Standards Consortium
CDISC is an open, multidisciplinary, non-profit organization committed to the development of industry standards to support the electronic acquisition, exchange, submission and archiving of clinical trials data and metadata for medical and biopharmaceutical product development. The mission of CDISC is to lead the development of global, vendor-neutral, platform independent standards to improve data quality and accelerate product development.

FDA Electronic Submissions and Review
The Food and Drug Administration (FDA) regulates drugs, biologics and medical devices. The FDA Center for Drug Evaluation and Research's Electronic Regulatory Submissions and Review (ERSR) web page provides information about the electronic submission of regulatory information to the Center and the review of it by CDER staff.

QARC - Quality Assurance Review Center
The Quality Assurance Review Center (QARC) is a Global Data and Review Center, providing Radiotherapy Quality Assurance and Diagnostic Imaging data management programs for several National Cancer Institute (NCI) supported Cooperative Groups and International Pharmaceutical Companies. QARC is a research program within the University of Massachusetts Medical School. It is an established research resource for clinical investigators around the world.

RCET - Resource Center for Emerging Technologies
The Resource Center for Emerging Technologies (RCET) at the University of Florida (UF) provides advanced technical resources necessary to support radiotherapy. The use of medical informatics is expected to facilitate education, collaboration, and peer review, as well as provide an environment in which clinical investigators can receive, share, and analyze voluminous multimodality clinical data.

Image Guided Therapy Center at Washington University
The Image-Guided Therapy Center (ITC) (formerly known as the 3DQA Center) WWW server at Washington University School of Medicine in St. Louis, Missouri supports image-based 3D conformal radiotherapy (CRT) multi-institutional trials.

ACRIN - American College of Radiology Imaging Network
The American College of Radiology Imaging Network (ACRIN) is a National Cancer Institute-funded cooperative group. ACRIN's overarching goal is, through clinical trials of diagnostic imaging and image-guided therapeutic technologies, to generate information that will lengthen and improve the quality of the lives of cancer patients.

NIH CIT Medical Image Repository
At the NIH Center for Information Technology (CIT), in collaboration with NINDS, the Computational Methods and Applications Group (CMAG) has developed a Web-based medical image archive system for the archive of imaging and clinical data from the Suburban Hospital study. This archive system provides secure Web interfaces for clinical data entry, data upload, database query, and data download. CMAG is currently developing a separate archive system for the GAIN study.


Image Archive Applications - General

USF Digital Mammography Database
The Digital Database for Screening Mammography (DDSM) is a resource for use by the mammographic image analysis research community. The primary purpose of the database is to facilitate sound research in the development of computer algorithms to aid in screening. Secondary purposes of the database may include the development of algorithms to aid in the diagnosis and the development of teaching or training aids. The database contains approximately 2,500 studies. Each study includes two images of each breast, along with some associated patient information (age at time of study, ACR breast density rating, subtlety rating for abnormalities, ACR keyword description of abnormalities) and image information (scanner, spatial resolution, ...). Images containing suspicious areas have associated pixel-level "ground truth" information about the locations and types of suspicious regions. Also provided are software both for accessing the mammogram and truth images and for calculating performance figures for automated image analysis algorithms.

Mouse Brain Library
The MBL (http://mbl.org/) consists of high-resolution images and databases of brains from many genetically-characterized strains of mice. There are numerous uses of the MBL, but the developers' mission is to systematically map and characterize genes that modulate architecture of the mammalian CNS (for a complete description of projects refer to the P20 Human Brain Project award: Informatics Center for Mouse Neurogenetics). MBL databases also include detailed information on genomes of many strains of mice. The collection now consists of images from approximately 800 brains and numerical data from just over 8000 mice. MBL can be searched for cases by strain, age, sex, body or brain weight. Images of the slide collection are available at a series of resolutions. The base resolution is 24.5 ± 0.5 µm per pixel in the XY plane with a 150 µm interval between sections (300 µm on each slide, 2 slides per case). Significantly higher resolution images of single sections (4.5 µm/pixel) have been acquired for over a hundred cases marked with a blue "hi-res" button. The developers are now collecting 1 µm/pixel images for specific parts of the brain: at present, the neocortex, hippocampus, and the dorsal lateral geniculate nucleus. Very high resolution images (<0.2 µm/pixel) are available for C57BL/6J using the iScope, a web-controlled microscope equipped with DIC optics.

List of Image Database and Image Collection Sites
Assembled for the AMIA Fall Congress while Mark E. Shelton was a postdoctoral fellow at the University of Missouri-Columbia, this list gives a summary of many websites, some of which are no longer active.

RSNA Databases and Teaching Files: Medical Images
A set of links to web-accessible medical image collections, primarily at academic institutions, is included in the RSNA Education Portal. These sites comprise most radiological subspecialty areas and modality types, especially case-based teaching files.

CMU Computer Vision Test Images
The Computer Vision Homepage was established at Carnegie Mellon University in 1994 to provide a central location for World Wide Web links relating to computer vision research. The emphasis of the Computer Vision Homepage is on computer vision research rather than on commercial products. A comprehensive set of links to publicly accessible websites with computer vision test images is offered.

ECVNet Image Data Bases List
This page contains pointers to sites offering public access to image collections via the Internet. There you can find color and grayscale still images, medical images, textures, sequences, stereo pairs, range images, etc.

Computer Vision, Machine Learning, and Image Databases
A list of web links for computer vision, machine learning, and image databases is provided by Kelly at the University of Iowa. This list includes journals & conferences, organizations, courses, laboratories, bibliographies, content-based retrieval systems, image databases, image data mining, context-based indexing papers, and other resources.

MedPix™ Medical Image Database
MedPix™ is a fully web-enabled and cross-platform database, integrating images and textual information. The primary "target audience" includes resident and practicing physicians, medical students, graduate nursing students and other post-graduate trainees. The material is organized by disease category, disease location (organ system), and by patient profiles. The database can be searched through multiple internal text search engines. In addition, search formulations can be sent directly to PubMed, or to other outside search engines with just ONE CLICK. Registered users may browse the image database through a "slide sorter" module. Contributed content may be © copyrighted by the original author/contributor, and is used with permission.

BRAID: Brain Image Database
The BRAID project is developing database technology for the manipulation and analysis of 3-dimensional brain images derived from MRI, PET, CT, etc. BRAID is based on the Illustra server, an object/relational or SQL3 DBMS, which allows a standard relational DBMS to be augmented with application-specific datatypes and operators. The BRAID project is adding operations and datatypes to support querying, manipulation and analysis of 3D medical images, including: image datatypes, image operators, statistical operators, and a web interface.

Gastrolab Image Library
Gastrolab Endoscopy Pictures Archive
This website is an image library that will eventually contain pictures of every disease that makes visible changes in the digestive system. Most of the endoscopic pictures are taken with Olympus videoendoscopes. The picture quality in this library is not as good as in the original pictures - the original quality would have made transmission times too long. In this image library typical x-ray-findings in gastroenterologic diseases are illustrated. This website is provided as a free service by The Wasa Workgroup on Intestinal Disorders, GASTROLAB, Vasa, Finland.

The Stanford Visible Female
"The Stanford Visible Female is an Academic Project sponsored by the Division of Anatomy and SUMMIT. Central to the project is a series of 95 photographed cryosections of a reproductive-age female cadaveric pelvis acquired in 1993. From these cross-sectional data, several research projects have arisen. These range from 2D imaging correlations with independent MR data to 3D models developed for anatomically accurate surgical simulation."

Visible Human Project
The Visible Human Project is creating complete, anatomically detailed, three-dimensional representations of the male and female human body. The current phase of the project is collecting transverse CT, MRI and cryosection images of representative male and female cadavers at one millimeter intervals. Includes an extensive collection of links to projects based upon the Visible Human data.

Visible Human Server
"...a virtual anatomic construction kit on the web using the Visible Human dataset." Features include: "Extract slices, curved surfaces, and slice animations from both datasets (male and female)"; "Interactively navigate by slicing through the male dataset in real-time"; "Construct 3D anatomical scenes using combinations of slices and 3D models of internal structures from the male dataset, and extract 3D animations"; "Add voice comments to video sequences generated using the applets." Peripheral Systems Lab (Prof. Hersch and his team), Computer Science Dept., Ecole Polytechnique Fédérale de Lausanne.

NIDCR - Craniofacial and Skeletal Diseases Branch
The NIDCR imaging web page will allow the NIH research and clinical community to collaborate on imaging studies through the internet. All authorized users on the NIH campus and abroad will be able to display and review the studies posted on the imaging web page with the NIH developed imaging tool Medical Image Processing, Analysis and Visualization (MIPAV).


Imaging in Clinical Trials

Image Collection in Clinical Trials [PDF File]
From the Image Guided Therapy QA Center, Washington University in St. Louis, this is W.B. Harms, Sr.'s presentation at the Society of Clinical Trials meeting in Denver, CO on 20 May 2001. The PowerPoint file has been converted to PDF.

How does CDISC work with the FDA?
First, the FDA has identified and assigned three liaisons to CDISC: Dr. Randy Levin (CDER), Michael Fauntleroy (CBER) and Sham Gupta (CBER). These individuals work with CDISC on a regular basis in terms of reviewing output of CDISC teams and suggesting future direction. They attend CDISC conference calls and meetings. CDISC also has several other FDA representatives who work with us by providing feedback, attending meetings with CDISC at the FDA and joining us in sessions and presentations at industry conferences.

Multidisciplinary Approach to Data Standards for Clinical Development [PDF File]
This article originally appeared in APPLIED CLINICAL TRIALS, Volume 11, Number 4, pages 35-44, April 2002, by Rebecca Kush, PhD who is president of CDISC. A common interchange standard for clinical data is described that will save time, effort, and money for everyone involved—and CDISC continues to develop new functional models to prove it.


Clinical Trials Image Archive Technology

CardioNow, Inc.
CardioNow's service, which is specifically designed to handle the large file sizes (greater than 200 megabytes) associated with DICOM cardiology images, enables study investigators to send complete trial images from their cath lab to the angiographic core lab in near real-time. All cases sent via the CardioNow network are transmitted and archived in native DICOM so the original image quality is preserved. Furthermore, cases associated with clinical trials are coded and anonymized to protect patient confidentiality. By facilitating the secure, electronic transmission of cases, CardioNow eliminates the unnecessary delays and expenses associated with copying, shipping and storing CDs and cine films.

DICOM 3.0 Implementation Workshop
The Washington University Image-Guided Therapy Center (ITC) held a workshop on the clinical implementation of the DICOM 3.0 standard for participation in advanced technology radiation therapy multi-institutional trials. The workshop reviewed the DICOM 3.0 objects required for multi-institutional trial participation as well as ITC's DICOM 3.0 Conformance Statement [PDF File] for all of those objects. Additional items covered were the use of Part 10 file sets, media of exchange of the DICOM data, patient confidentiality issues, and assistance the ITC can provide to those vendors who are planning to implement DIMSE (networked) communications but not (Part 10) file set creators. The DICOM objects discussed include Image Sets (CT, MRI and ultrasound), Structure Sets (region of interest contours: critical structures, tumor and target volumes), Plans (external beam and permanent prostate seed), Doses (both dose matrices and dose-volume histograms) and RT Images (e.g. DRRs, on-line images, digitized films, etc.). This workshop was held on Saturday, March 16, 2002.


NIH Information Standards

NIH Computer Security Requirements
The Department of Health and Human Services (DHHS) Automated Information Systems Security Program (AISSP) Handbook gives us guidelines for determining the sensitivity of information and the criticality of data processing capabilities as they pertain to the mission of an office. All NIH data has some degree of sensitivity, even data that is intended for unrestricted access by many and varied individuals and groups. Further information is available from http://www.cio.gov/.

NCICB Plans and Priorities - Standards-Based Repository (SBR)
NCICB is building a repository to comprehensively store the data used by NCI programs in adherence with the international standards used by many other Federal organizations. This keystone effort will ensure that all data from NCICB supported programs can be easily shared, whether from clinical trials, animal model programs, basic research, or any other discipline. NCICB is collaborating to create a unified national repository of health-related meta-data to describe much of the data held by NCI. The repository will be compliant with international standards and will make finding and using data held by NCI much easier.

NCI Common Data Elements
A repository of terms called Common Data Elements (CDEs) that medical providers may use to collect patient information for clinical trials or for cancer care. NCI hopes to facilitate uniform standards for both cancer clinical trials and patient care by assembling and maintaining this information on-line for all interested caregivers. The CDEs are compliant with the ISO 11179 standard. Of particular interest are “Medical Imaging” CDEs, especially those developed for lung cancer CT screening.

ISO 11179: Specification and Standardization of Data Elements
ISO 11179 is a standard for describing data elements used in databases and documents that specifies basic aspects of data element composition, including metadata. The standard applies to the formulation of data element representations and meaning as shared among people and machines; it does not apply to the physical representation of data as bits and bytes at the machine level. This standard is used as the basis for the NCI Common Data Elements.

Dublin Core Metadata Initiative
The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global workshops, conferences, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.

NLM Communications Engineering Branch
Projects in the Communications Engineering Branch focus on R&D in image engineering: the capture, storage, processing, online retrieval, transmission and display of both biomedical documents (mainly journals) and medical imagery. The data repositories available from the NLM Communications Engineering Branch have been collected from a variety of sources. This collection contains digitized versions of radiographs and rare manuscripts. Data Repositories include the National Health and Nutrition Examination Surveys (NHANES) with collateral data and x-ray images.


Other Federal Agencies

DARPA Information Processing Technology Office
DARPA IPTO will create Information Processing Technology for new generation intelligent systems, transforming our national infrastructure to enhance global stability. The IPTO has a 4-part mission: 1) Create transformational information technologies to anticipate and meet National Security imperatives; 2) Validate technologies with prototypes of real National Security solutions; 3) Lead, stimulate, and complement commercial technology; and 4) Transition technologies to National Security users, via partnerships with other DARPA offices, industry, armed services, and government agencies.

NIST Fingerprint Image Software (NFIS)
A new public domain software release from the Image Group of the National Institute of Standards and Technology (NIST). NIST Fingerprint Image Software (NFIS) contains software technology, developed for the Federal Bureau of Investigation (FBI), designed to facilitate and support the automated manipulation and processing of fingerprint images. Source code for over 50 different utilities and an extensive User's Guide are distributed on CD-ROM free of charge without licensing and usage restrictions. A listing of test data produced by the NIST Image Group for use in evaluating automated OCR, fingerprint classification/matching, and face recognition systems is available. In addition, NIST offers image databases that constitute standard reference data for a variety of applications.

CDC Health Information and Surveillance Systems Board
The Centers for Disease Control and Prevention (CDC) Health Information and Surveillance Systems Board (HISSB) website lists organizations and resources related to the development of health information standards. These include Coordinators/Promoters of Standards Development, Standards Development Organizations, and Classification/Nomenclature Systems.

Public Health Data Standards Consortium
In November 1998, the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC), in conjunction with the Agency for Healthcare Research and Quality (AHRQ) and the National Committee on Vital and Health Statistics (NCVHS), convened a workshop to examine the implications of the Health Insurance Portability and Accountability Act of 1996 (HIPAA) for the practice of public health and health services research. The workshop, "The Implications of HIPAA's Administrative Simplification Provisions for Public Health and Health Services Research," brought together 85 leaders in health statistics, research, and informatics to examine the challenges and opportunities presented by HIPAA. This resulted in creation of a new consortium, officially established in January 1999 as the Public Health Data Standards Consortium, that serves as a mechanism for ongoing representation of public health and health services research interests in HIPAA implementation and other data standards-setting processes.

NASA ESAD Scientific Data Purchase Program
The Scientific Data Purchase (SDP) is a demonstration program developed in response to the President's Space Policy, directing NASA to purchase remote sensing data from the private sector. Initiated in fiscal year 1997, the SDP was funded under the Earth Science Enterprise (ESE) Program to provide scientific data to the ESE science community. The $50 million program is an opportunity to advance global-systems research, to strengthen the U.S. economy through development of remote sensing technologies, and to test a new way of doing business. The NASA Earth Science Applications Directorate (ESAD) at the John C. Stennis Space Center in Mississippi manages the SDP.

NASA EOSDIS Core System Information for Scientists = ECS Info
The Earth Observing System Data and Information System (EOSDIS) is designed to archive unprecedented amounts of Earth observing data from a wide range of instruments collecting information over decades. Its diverse user community can search, retrieve, and analyze any of these observations, also over a period of decades. EOS data products need descriptive information, or metadata, to enable users and data providers to locate and use the information. Over several years, numerous teams of scientists, computer scientists, and information engineers have collaborated to develop the data model, with its metadata attributes and how they are organized, to meet these needs. A catalog of EOSDIS related information has been prepared.

Astronomy Digital Image Library (ADIL)
ADIL collects astronomical, research-quality images and makes them available to the astronomical community and the general public. Patrons access the Library through the World Wide Web to search for and browse images. Once images are located in the Library, users may download them to their local machines in FITS format for further analysis. The Library provides a number of benefits not only to those looking for images, but also to those who add images to the Library's growing collection. The Library is being developed and maintained by the Radio Astronomy Imaging Group at the National Center for Supercomputing Applications (NCSA) on the campus of the University of Illinois at Urbana-Champaign (UIUC) with support from:


XML and DICOM

What is XML?
A markup language is a mechanism to identify structures in a document. The XML specification defines a standard way to add markup to documents. In order to appreciate XML, it is important to understand why it was created. XML was created so that richly structured documents could be used over the web. The only viable alternatives, HTML and SGML, are not practical for this purpose. This is the 1st part of a technical introduction to XML.

Why is XML so important? and What is metadata?
XML allows us to focus the problem on metadata, and metadata is the key to interoperability. Metadata is one of the critical success factors in sharing information. Metadata is also one of the critical success factors in storing information cost-effectively. Metadata can make information sharing and storage efforts great successes, or great failures. Metadata costs money and has its own ROI. There are many specific recommendations on XML and metadata by the Federal Council of Chief Information Officers.

Transcoding DICOM to XML
Supplement 23 to DICOM (Digital Imaging and Communications in Medicine), Structured Reporting, is a specification that supports a semantically rich representation of image and waveform content, enabling experts to share image and related patient information. DICOM SR supports the representation of textual and coded data linked to images and waveforms. Nevertheless, models that work as bridges between the DICOM relational model and open object-oriented technologies are needed. An object-oriented model to represent the DICOM SR standard and generate XML-exchangeable representations using World Wide Web Consortium specifications was developed. [ABSTRACT] A distributed database course project on the exchange of DICOM-compatible medical images using XML was done at the University of Waterloo.
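
As a rough illustration of what transcoding DICOM attributes to XML can involve, the Python sketch below serializes a few attributes into an XML fragment. The attribute tags and keywords (Modality, PatientName, StudyInstanceUID) are standard DICOM data elements, but the XML element names ("dataset", "attribute"), the function name, and the sample values are assumptions made for this example; they are not the model described in the linked abstract or the Waterloo project.

```python
import xml.etree.ElementTree as ET

# Sample attributes keyed by (group, element). The values are placeholders
# invented for illustration, not data from any real study.
SAMPLE = {
    (0x0010, 0x0010): ("PatientName", "ANONYMIZED"),
    (0x0008, 0x0060): ("Modality", "CT"),
    (0x0020, 0x000D): ("StudyInstanceUID", "1.2.3.4"),
}


def attributes_to_xml(attributes):
    """Serialize DICOM-style attributes to an XML string.

    The <dataset>/<attribute> layout is a hypothetical schema chosen
    for this sketch, not a layout defined by the DICOM standard.
    """
    root = ET.Element("dataset")
    # Emit attributes in ascending tag order, as DICOM data sets are ordered.
    for (group, element), (keyword, value) in sorted(attributes.items()):
        node = ET.SubElement(
            root,
            "attribute",
            {"tag": f"{group:04X}{element:04X}", "keyword": keyword},
        )
        node.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(attributes_to_xml(SAMPLE))
```

Keeping the numeric tag alongside the keyword means a receiver can still identify an attribute even if it does not recognize the keyword.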

XML for Molecular Biology
A list of XML resources compiled by Paul Gordon that may be of use to the bioinformatician. If you don't think XML is where it's at, count how many times the word appears in the EU/US Workshop on Large Scientific Databases report.

Object Management Group (OMG)
The OMG was formed to create a component-based software marketplace by hastening the introduction of standardized object software. The OMG's charter includes the establishment of industry guidelines and detailed object management specifications to provide a common framework for application development. Conformance to these specifications will make it possible to develop a heterogeneous computing environment across all major hardware platforms and operating systems. The nearly 800 member companies of the Object Management Group produce and maintain a suite of specifications that support distributed, heterogeneous software development projects from analysis and design through coding, deployment, runtime, and maintenance. These include:

  1. Model Driven Architecture (MDA)
  2. Unified Modeling Language (UML)
  3. MetaObject Facility (MOF)
  4. XML Metadata Interchange (XMI)
  5. Common Warehouse Metamodel (CWM)
  6. Common Object Request Broker Architecture (CORBA)
  7. Object Management Architecture (OMA)

XML - CORBA - DICOM
What Digital Imaging and Communications in Medicine (DICOM) could look like in Common Object Request Broker Architecture (CORBA) and Extensible Markup Language (XML). [ABSTRACT]

CDISC Operational Data Model (ODM)
The final version 1.1 Specification for the Operational Data Model (ODM) was released by CDISC on 9 May 2002. The XML-based Operational Data Model "provides a format for representing the study metadata, study data, and administrative data associated with a clinical trial. It represents only the data that would be transferred among different software systems during a trial, or archived after a trial. It need not represent any information internal to a single system, for example, information about how the data would be stored in a particular database." The version 1.1 release includes the text of the specification, with XML DTDs and supporting documentation. ODM v1.1 Final "represents the culmination of more than three years of effort by a multi-disciplinary team of pharmaceutical and biotechnology sponsors and technology vendors; the development team believes the CDISC 1.1 DTD is now ready for widespread adoption among sponsors, vendors and CROs to facilitate the interchange of clinical trial data."

DARPA Agent Markup Language (DAML)
The World Wide Web (WWW) contains a large amount of information and is expanding at a rapid rate. Most of that information is currently being represented using the Hypertext Markup Language (HTML), which is designed to allow web developers to display information in a way that is accessible to humans for viewing via web browsers. While HTML allows us to visualize the information on the web, it doesn't provide much capability to describe the information in ways that facilitate the use of software programs to find or interpret it. The World Wide Web Consortium (W3C) has developed the Extensible Markup Language (XML), which allows information to be more accurately described using tags. As an example, the word Algol on a web site might represent a computer language, a star or an oceanographic research ship. The use of XML to provide metadata markup, such as Algol, makes the meaning of the word unambiguous. However, XML has a limited capability to describe the relationships (schemas or ontologies) with respect to objects. The use of ontologies provides a very powerful way to describe objects and their relationships to other objects. The DAML language is being developed as an extension to XML and the Resource Description Framework (RDF). The latest release of the language (DAML+OIL) provides a rich set of constructs with which to create ontologies and to mark up information so that it is machine readable and understandable.
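
The Algol example can be made concrete with a small Python sketch. The tag names used here ("language", "star", "ship") and the helper function are hypothetical and chosen only for illustration; they are not taken from DAML, RDF, or any published schema. The point is simply that a program can tell the three senses apart once the word is wrapped in a descriptive tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names are illustrative assumptions,
# not part of DAML or any published vocabulary.
DOCUMENT = """
<mentions>
  <language>Algol</language>
  <star>Algol</star>
  <ship>Algol</ship>
</mentions>
"""


def meanings(xml_text, word):
    """Return the tag names under which `word` appears in the document."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root if (child.text or "").strip() == word]


print(meanings(DOCUMENT, "Algol"))
```

Plain tags like these distinguish the senses but say nothing about how a language, a star, and a ship relate to other things, which is the gap the entry says ontology languages such as DAML+OIL are meant to fill.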

XML Multimedia Radiology Report
The clinical display of radiologic information as an interactive multimedia report is accomplished using a multimedia report model based on Extensible Markup Language (XML), rather than a traditional workstation model. XML does not replace existing standards (i.e., Digital Imaging and Communications in Medicine [DICOM], Transmission Control Protocol/Internet Protocol [TCP/IP]). Instead, it provides a powerful framework that is used in combination with existing standards to allow system designers to modify display characteristics based on user need. The application of XML to the clinical display of radiologic information is described. [ABSTRACT]

Review/Tutorial on Standards for Radiology Networks
Medical communication standards, i.e., HL7, DICOM, and in the near future the migration towards XML, support the interoperability between the IT subsystems and pave the way to patient information systems with access to unified and complete electronic medical records (EMR). Furthermore, with standardized communication techniques, such as CORBAmed [PDF File], an object-oriented design of healthcare applications will be possible in the near future. [ABSTRACT]

MIMOS: A framework for exchanging medical image processing results
DICOM presently supports structured reporting of image studies, but does not accommodate semantics in the image handling domain. This can impede the exchange and the interpretation of processing results. To overcome this limitation, a framework based on a formal grammar was developed, with documents encoded using XML. [ABSTRACT]


Implementation of Biological Databases

SIDB - Scientific Image Data Base (SIDB)
A web-driven open source database for 2-D and 3-D images (http://sidb.sourceforge.net/), specifically designed for (confocal) microscopy units, but applicable wherever groups of users collaborate with images.

OpenHealth™ -- Open source software in health care
Electronic medical records and networks are the solutions to the technical issues around coordinating the work of diverse health care professionals caring for a single person across multiple sites. Open source software has potential to overcome some of the obstacles now being encountered in this transition: 1) Open source reference implementations of medical record standards could speed their adoption and increase interoperability in practice. The differences in adoption between TCP/IP and ISO network protocols illustrate the importance of reference implementations. 2) Open source software could reduce the issue of "Who pays?" in community health networks by eliminating per user and per site license costs and unbundling implementation and support charges.

ASN.1 - Abstract Syntax Notation One
ASN.1, or Abstract Syntax Notation One, is an International Organization for Standardization (ISO) data representation format used to achieve interoperability between platforms. The National Center for Biotechnology Information (NCBI) uses ASN.1 for the storage and retrieval of data such as nucleotide and protein sequences, structures, genomes, and MEDLINE records. It permits computers and software systems of all types to reliably exchange both the data structure and content. The NCBI Software Development ToolKit (known as the 'NCBI Toolbox') is a set of software and data exchange specifications used by NCBI to produce portable, modular software for molecular biology. The software in the Toolbox is primarily designed to read ASN.1 format records. It is freely available to the public, and can be used in its own right or as a foundation for building tools with similar properties.

VISIM: Information Retrieval and Exploration in Large Medical Image Collections
Visual information systems in medicine (VISIM) are emerging that are capable of retrieving items from large collections of images and exploring connections between them to discover new insights, confirm hypotheses, or search for similar findings. The advance of these systems is at the crossroads of computer vision, man-machine interaction and image database technology, raising many novel issues that need to be addressed. This one-day workshop was held in Utrecht, NL on October 18, 2001.

Biological Databases: Information Retrieval
Author/Sponsor: David Landsman (NCBI, NIH) - Videocast lecture [PDF File] from the current topics in genome analysis course.

Biological Databases: Content and Submission
Author/Sponsor: Francis Ouellette (Canadian Center for Molecular Medicine and Therapeutics) - Videocast lecture [PDF File] from the current topics in genome analysis course.

NCSA Emerge
Emerge is a National Center for Supercomputing Applications (NCSA) effort to develop middleware components of a new distributed search infrastructure that addresses the scale and heterogeneity of scientific data. The components enable search services to interoperate across scientific domains by providing user-configurable tools for mapping between metadata schemas, performing search queries against multiple data sources, and performing query pre- and post-processing. Access to the search services is through platform-neutral standard and emerging-standard tools such as Z39.50, Open Archives, XML, and Java. This work was done in collaboration with the National Cancer Institute.

Micromine, Corp.
Chris Barnes's consultant services company in Powder Springs, GA has processed mammograms (and many other types of imagery) using multimedia data warehouses and data mining that combine pixel-searchable image archives with linkable descriptive data.

PEIPA -- the Pilot European Image Processing Archive
PEIPA is an archive of material relating to the processing of images, with an emphasis on image analysis and computer vision. The archive is supported by the British Machine Vision Association, the University of Essex, and the EU-funded project Performance Characterization in Computer Vision.


Cancer Image Archives

National Digital Mammography Archive (NDMA)
The National Digital Mammography Archive (NDMA [PDF File]) represents a collaborative effort between the University of Pennsylvania Medical Center (including the National Scalable Cluster Project - NSCP), the University of Chicago Department of Radiology, the University of North Carolina - Chapel Hill School of Medicine, the Department of Radiology - Breast Imaging - Sunnybrook and Women's College Health Sciences Centre of the University of Toronto, and Advanced Computing Technologies Division of BWXT Y-12 L.L.C. in Oak Ridge Tennessee. This is a Next Generation Internet project sponsored by the National Library of Medicine. NDMA seeks to develop a testbed that demonstrates the feasibility of a national breast imaging archive and network infrastructure [PDF File] to support digital mammography using Next Generation Internet (NGI) technologies.

Virtual Cancer Image Data Warehouse
At the National Cancer Center (Tokyo, Japan), more than 100 virtual cancer images have been created from CT or MR data of individual patients with cancer (Cancer Edutainment Virtual Reality Theater: CEVRT). These images can be used to help explain procedures and findings to the patient, to obtain informed consent, to simulate surgery, and to estimate cancer invasion of surrounding organs. A web-based object-oriented database [PDF File] was created to access these cancer images and to register medical images at international research sites via the Internet.

Mammographic Image Analysis Society - Mammographic Database
The original MIAS Database (digitised at 50 micron pixel edge) has been reduced to 200 micron pixel edge and clipped/padded so that every image is 1024 pixels x 1024 pixels. There are 322 cases. Reference: J Suckling et al (1994) "The Mammographic Image Analysis Society Digital Mammogram Database" Excerpta Medica. International Congress Series 1069, pp 375-378.
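
Going from a 50 micron to a 200 micron pixel edge is a factor-of-four reduction along each axis, followed by clipping or padding to a fixed 1024 x 1024 frame. The Python sketch below shows one way such a step could be done. The entry does not say how the MIAS images were actually reduced or where padding was placed, so the block averaging, the bottom/right padding, the zero fill value, and the function names are all assumptions for illustration.

```python
def downsample(image, factor=4):
    """Reduce resolution by averaging each factor x factor block.

    Block averaging is an assumed method; the source does not state
    how the MIAS images were reduced. `image` is a list of equal-length
    rows of integer pixel values.
    """
    rows = len(image) // factor
    cols = len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def clip_or_pad(image, size=1024, fill=0):
    """Force an image to size x size.

    Rows and columns beyond `size` are clipped; missing ones are padded
    with `fill` at the bottom and right (an assumed placement).
    """
    out = []
    for row in image[:size]:
        row = row[:size]
        out.append(row + [fill] * (size - len(row)))
    while len(out) < size:
        out.append([fill] * size)
    return out


# Tiny demonstration: an 8 x 12 image becomes 2 x 3, then is padded to 4 x 4.
tiny = [[10] * 12 for _ in range(8)]
print(clip_or_pad(downsample(tiny), size=4))
```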

UCSF Digital Mammography Warehouse
The design and development of a digital mammography data warehouse to facilitate clinical and research activities is described (SPIE Medical Imaging 2002 meeting). A data warehouse is a complete and consistent integration of data from many information sources. It enables users to explore the warehouse for various analyses and decision support purposes. The information system incorporates breast imaging data from a diversity of existing clinical systems into a digital data warehouse. Various types of breast imaging data, including patient demographics, family history, digital mammography and radiological reports, will be acquired from the University of California San Francisco digital mammography PACS modules, as well as the Radiological Information System.

UCSF Neuroimaging Data Warehouse
An image data warehouse infrastructure containing a broad array of biomedical imaging and clinical data is built on a Picture Archiving and Communication Systems (PACS) environment. Up to now, the primary purpose of most database systems and tools was to meet the needs of operational systems, which are typically transactional in nature. In contrast, an object-oriented analysis and design (OOAD) process was used for epilepsy research and patient care. The implementation is based on a Java CORBA (Common Object Request Broker Architecture) and Web-based architecture that separates the graphical user interface presentation, data warehouse business services, data staging area, and backend source systems into distinct software layers.


Biological Databases

Molecular Biology Database Collection
The Molecular Biology Database Collection is an online resource listing key databases of value to the biological community. This Collection is intended to bring fellow scientists' attention to high-quality databases that are available throughout the world, rather than just be a lengthy listing of all available databases. As such, this up-to-date listing is intended to serve as the initial point from which to find specialized databases that may be of use in biological research. The databases included in this Collection provide new value to the underlying data by virtue of curation, new data connections or other innovative approaches. Short, searchable summaries and updates for each of the databases included in the Collection are available through the Nucleic Acids Research Web site at http://nar.oupjournals.org.

Nucleic Acids Research - Special Database Issue (2002)
The 2002 Database Issue of Nucleic Acids Research is the ninth in a series dedicated to factual biological databases. These databases have become an essential resource for working biologists and the aim of this compilation is to provide descriptions of the most important of these databases and especially to introduce newly compiled databases that provide specialist information in the biological area. In the current issue (Jan 2002), there are descriptions of 2112 databases.


Last Updated: March 2003