Identification of Units and Other Terms in Czech Medical Records

Zvára K; Kašpar V

doi:10.24105/ejbi.2010.06.1.15

Healthcare documentation in the Czech Republic usually takes the form of free text formatted only with spaces, tabs and line breaks. Extracting information from such documentation is a challenge; meeting it would allow physicians with no knowledge of Czech to use Czech medical reports, and would allow the information to be transferred into a structured form. The task can be approached with finite-state machines, with linguistic analysis, or with statistical methods. This article summarizes our findings from using finite-state machines together with commonly used code lists. Excerpts from real medical reports are translated into English in a way that demonstrates the same or similar problems as arise in Czech. The original Czech excerpts are available in the Czech version of this article.
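
To make the finite-state approach concrete, the following is a minimal sketch of how a number followed by a unit might be recognized in free text. It is an illustration only, not the method described in the article: the function name `find_quantities`, the tiny `UNITS` code list and the sample sentence are all assumptions introduced here, and a real system would presumably load a standard code list of units instead.

```python
# Hypothetical, deliberately tiny code list of units (an assumption for
# illustration; not taken from the article).
UNITS = {"mg", "g", "kg", "ml", "l", "mmol/l", "mmHg", "cm"}


def find_quantities(text):
    """Scan text with a small hand-written finite-state machine and
    return (number, unit) pairs whose unit appears in UNITS."""
    results = []
    i, n = 0, len(text)
    while i < n:
        # State START: skip characters until a digit is seen.
        if not text[i].isdigit():
            i += 1
            continue
        # State INT: consume the integer part.
        start = i
        while i < n and text[i].isdigit():
            i += 1
        # State FRAC: optional decimal separator followed by digits.
        # A comma is accepted because Czech text commonly writes "5,4".
        if i + 1 < n and text[i] in ",." and text[i + 1].isdigit():
            i += 1
            while i < n and text[i].isdigit():
                i += 1
        number = text[start:i].replace(",", ".")
        # State GAP: optional spaces or tabs between number and unit.
        j = i
        while j < n and text[j] in " \t":
            j += 1
        # State UNIT: consume letters and '/' as a candidate unit token.
        k = j
        while k < n and (text[k].isalpha() or text[k] == "/"):
            k += 1
        unit = text[j:k]
        # Accept only if the candidate is in the code list.
        if unit in UNITS:
            results.append((float(number), unit))
            i = k
    return results


sample = "Glucose 5,4 mmol/l, weight 82 kg, given 500mg, 3 times daily."
quantities = find_quantities(sample)
```

On the sample sentence this sketch picks out `5,4 mmol/l`, `82 kg` and `500mg`, and ignores `3 times` because `times` is not in the code list. It makes no attempt to handle ranges, compound values such as blood pressure readings, or abbreviations that collide with ordinary words.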
