

Imaging systems - Current and future possibilities of Medical Informatics

This text was written by Jon Berg <jon.berg|a|turtlemeat.com> in spring 2005 for the Computer Science course Medical Informatics at Tromsø University, Norway.


11. Information retrieval in medical informatics systems

Information retrieval is the identification and efficient use of recorded media. Previously, the main purpose of information retrieval systems was to search the biomedical literature. In recent years the need for information retrieval has grown as new media have been incorporated into information storage systems. These new media include gene and protein structures and the types of multimedia that have become common, for example digitized sound, video and pictures.


Theory behind Information Retrieval

The information retrieval process can be decomposed into four main tasks:

  • Indexing.
  • Query formulation.
  • Retrieval.
  • Evaluation and Refinement.


In indexing, the information is represented as compactly as possible to facilitate rapid and efficient retrieval. In many cases an index is a list of the words a person is likely to search for. In its simplest form, the inverted index, it is a list of words, each with a pointer to where the information can be found. Indexing can also be done by humans, as in MEDLINE. There, the person doing the indexing places each article in one of 15 trees based on its subject headings. Full-text indexing is a widely used method of indexing content automatically. It first extracts all the words in the text, then removes stop words, which are common words such as “I”, “was”, “go” and “do”. The remaining words are stemmed to remove common endings such as “s”, “es”, “ed” and “ing”. Each word is then weighted according to its importance and how well it discriminates the document it appears in from other documents. A simple weighting formula is TF*IDF, where TF is log(frequency of the term in the given document) and IDF is log(number of documents / number of documents containing the term).
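The full-text indexing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the suffix list and the function names `stem`, `terms` and `build_index` are made up for this example, and the stemmer is a crude stand-in for a real one. The weights follow the TF*IDF formula exactly as given in the text.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word and suffix lists (assumptions).
STOP_WORDS = {"i", "was", "go", "do", "the", "a", "of", "and", "in", "is"}
SUFFIXES = ("ing", "es", "ed", "s")


def stem(word):
    """Strip one common ending, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    """Extract words, drop stop words, then stem what is left."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def build_index(docs):
    """Build an inverted index {term: {doc_id: weight}} from {doc_id: text}."""
    counts = {doc_id: Counter(terms(text)) for doc_id, text in docs.items()}

    # Number of documents each term appears in.
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())

    index = defaultdict(dict)
    n_docs = len(docs)
    for doc_id, counter in counts.items():
        for term, freq in counter.items():
            # TF and IDF as defined in the text. Note that a term occurring
            # only once gets TF = log(1) = 0; many systems use 1 + log(freq).
            tf = math.log(freq)
            idf = math.log(n_docs / doc_freq[term])
            index[term][doc_id] = tf * idf
    return dict(index)
```

Calling `build_index({"a": "fever fever cough", "b": "coughing"})` gives "fever" a positive weight in document "a", while "cough", which appears in every document, gets weight zero because its IDF is log(1).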


Query formulation is the process of transforming what a person is searching for into a well-formed query. A query will often contain Boolean operators such as AND, OR and NOT. It may also contain wildcards that stand for one or several letters. Often the user fills out a user interface and the query is constructed from what the user enters. Users may also be allowed to enter natural-language queries, that is, ordinary sentences without any particular syntax.
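As a rough illustration of a Boolean query with wildcards, the sketch below evaluates a query built from three form fields against an inverted index that maps each term to a set of document ids. The field names `must`, `any_of` and `must_not`, the function names and the sample index are assumptions made for this example. Wildcards use the shell-style syntax of Python's standard `fnmatch` module, where `?` stands for one letter and `*` for several.

```python
from fnmatch import fnmatchcase


def postings(index, pattern):
    """Union of the document sets of every indexed term matching the pattern."""
    matched = set()
    for term, doc_ids in index.items():
        if fnmatchcase(term, pattern):
            matched |= doc_ids
    return matched


def boolean_search(index, all_docs, must=(), any_of=(), must_not=()):
    """Evaluate (must AND ...) AND (any_of OR ...) AND NOT (must_not ...)."""
    result = set(all_docs)
    for pattern in must:  # AND: every pattern has to match
        result &= postings(index, pattern)
    if any_of:  # OR: at least one pattern has to match
        either = set()
        for pattern in any_of:
            either |= postings(index, pattern)
        result &= either
    for pattern in must_not:  # NOT: drop documents matching the pattern
        result -= postings(index, pattern)
    return result


# Hypothetical index: term -> ids of the documents that contain it.
SAMPLE_INDEX = {
    "asthma": {1, 2},
    "asthmatic": {3},
    "child": {1, 3},
    "adult": {2},
}
```

With this sample index, `boolean_search(SAMPLE_INDEX, {1, 2, 3}, must=["asthma*"], must_not=["adult"])` corresponds to the query "asthma* NOT adult" and returns documents 1 and 3.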


Retrieval is the process that takes place after a user has entered what he or she is searching for and a query has been constructed. It involves matching, ranking and display. Matching selects the entries that satisfy the query. Ranking sorts the matched results in a particular order, which can be chronological, by relevance or alphabetical.


Display is the process of outputting the search result to the user.
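The three retrieval steps, matching, ranking and display, can be sketched as three small functions. The record layout (`title`, `published`, `text`), the scoring rule (number of occurrences of the query terms) and the function names are assumptions made for this example, not part of any particular system.

```python
from datetime import date


def match(records, query_terms):
    """Keep records whose text contains every query term.

    The score is simply the total number of occurrences of the query terms.
    Punctuation and stemming are ignored to keep the sketch short.
    """
    hits = []
    for record in records:
        words = record["text"].lower().split()
        counts = [words.count(term) for term in query_terms]
        if all(counts):
            hits.append({**record, "score": sum(counts)})
    return hits


def rank(hits, order="relevance"):
    """Sort matched records by relevance, date (newest first) or title."""
    if order == "relevance":
        return sorted(hits, key=lambda h: h["score"], reverse=True)
    if order == "chronological":
        return sorted(hits, key=lambda h: h["published"], reverse=True)
    if order == "alphabetical":
        return sorted(hits, key=lambda h: h["title"].lower())
    raise ValueError(f"unknown order: {order}")


def display(hits):
    """Format the ranked results as numbered lines for output to the user."""
    return [
        f"{position}. {hit['title']} ({hit['published'].isoformat()})"
        for position, hit in enumerate(hits, start=1)
    ]
```

A caller would chain the steps, for example `display(rank(match(records, ["cold"]), order="chronological"))`, where `records` is a list of dictionaries with the three keys named above.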


Current state of information retrieval

Information retrieval has benefited greatly from the introduction of the Internet, especially the web. The web provides a means of distributing many types of information, both text and multimedia. It is a great source of information, but much of that information is aimed at the health consumer, not the health professional. Finding credible information on the web is also difficult for the health professional because the quality of the information is hard to measure. One approach to giving health professionals quality information is that health organizations that already have a great deal of credibility in the health community have created their own websites. Other initiatives are aggregation services that allow users to search information collected from many credible sources.


Future challenges in information retrieval

The web provides a promising way to distribute information. However, information on the web can be a problem to use because it is more easily changed. Unlike the static content published in journals, web content may be revised as the evidence changes, which raises the question of whether earlier versions should be archived. The web also makes it possible for authors to publish their work without going through journals. This makes it easy to get findings out, but such work lacks the integrity and quality control that articles must pass through to be published in traditional paper journals. As more and more multimedia information is used in health care, information retrieval will also have to deal with retrieving multimedia information. Data such as movie clips, sounds and pictures cannot be searched directly by entering text. One solution has been to attach metadata to the data and match queries against the metadata. This metadata has to be derived from the context in which the data is used, for example the text entries surrounding it where it appears in a health record.
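The metadata approach can be illustrated with a minimal sketch in which keywords for a media file are derived from the text that surrounds it in a record, and queries are matched against those keywords. The `MediaItem` class, the file names, the stop-word list and both functions are hypothetical and serve only as an example; a real system would need far richer metadata.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stop-word list for this example only.
STOP_WORDS = frozenset({"the", "of", "a", "and", "in", "with", "shows"})


@dataclass
class MediaItem:
    filename: str
    context: str  # text surrounding the item where it appears in the record
    keywords: set = field(default_factory=set)


def derive_keywords(item):
    """Fill in the item's metadata from the words of its surrounding text."""
    words = re.findall(r"[a-z]+", item.context.lower())
    item.keywords = {w for w in words if w not in STOP_WORDS}
    return item


def find_media(items, query):
    """Return the file names whose keywords contain every query word."""
    wanted = set(re.findall(r"[a-z]+", query.lower()))
    return [item.filename for item in items if wanted <= item.keywords]
```

For instance, an image stored next to the note "Chest X-ray shows pneumonia in the left lung" would be found by the text query "pneumonia lung", even though the image itself contains no searchable text.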

Turtlmeat.com 2004-2011 ©