From documents to datasets: challenges and solutions in the context of IDMP and pharmacology

Industry

To obtain authorization to bring a medicinal product to market, 200,000 pages of text need to be submitted. Once the IDMP directive (EU) takes effect, pharma companies will have to submit datasets instead. The impact is enormous and the challenges are manifold. Semantic Web technology is well positioned to address many of them. This presentation focuses on one of these challenges. When the authorization for an existing product has to be renewed, an IDMP-compliant dataset has to be compiled. Some 70 to 80 percent of the data points are described in the text and cannot be obtained from IT systems. Manual data entry is error-prone and does not scale, since it is estimated that the total number of data points may exceed 1,700 for a single submission. Based on state-of-the-art entity extraction software, a solution has been developed that generates those parts of the dataset that can be obtained from the text. The presentation describes some of the major challenges that had to be overcome and details the solutions that were found. It also presents some results and describes the major business requirements that need to be met.

Speakers:

Dr. Jan Voskuil

A cognitive linguist by background, Jan is a technology evangelist in the field of Linked Data and specializes in language processing, Semantic Web technology and AI. Jan is currently CEO of Taxonic, which he co-founded in 2012. Taxonic is a consultancy that focuses on applying Linked Data technologies to real-world business problems.

Access the Recording and Slide Deck?

As a registered participant, you have received a login to access the recording and slide deck. You may also purchase an on-demand ticket (36,- incl. VAT).
