Hello,

I'm working on a project that involves clinical named entity recognition, relation extraction, and related tasks. I'm currently using the scispaCy library for NER. However, I'm looking for an open-source package for relation extraction from clinical notes. For example, given the sentence "Dementia due to Alzheimer disease.", I expect a model to recognize the relationship: that it is not just dementia, but dementia due to Alzheimer disease.
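To make the expected output concrete, here is a minimal rule-based sketch (not a trained model). It assumes a single hypothetical pattern, "X due to Y", and a hypothetical relation name `CAUSED_BY`; the function name `extract_due_to` is also made up for illustration. It only shows the kind of (entity, relation, entity) triple I am after, and a real system would have to handle far more varied phrasing.

```python
import re

# Hypothetical pattern: "<effect> due to <cause>" within a single sentence.
DUE_TO = re.compile(
    r"(?P<effect>[^.]+?)\s+due to\s+(?P<cause>[^.]+)", re.IGNORECASE
)

def extract_due_to(text):
    """Return (effect, relation, cause) triples for each 'X due to Y' sentence."""
    triples = []
    # Naive sentence split on periods; good enough for this illustration only.
    for sentence in text.split("."):
        match = DUE_TO.search(sentence.strip())
        if match:
            triples.append(
                (match.group("effect").strip(), "CAUSED_BY", match.group("cause").strip())
            )
    return triples

triples = extract_due_to(
    "Dementia due to Alzheimers disease. Kidney failure due to liver disease."
)
print(triples)
```

For the two example sentences this yields one triple per sentence, with the condition as the first element and its stated cause as the last.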

After spending some time reading articles and searching Google, I found the following packages:

1. SemRep

2. BioBERT

3. Clinical BioBERT

etc.

From the articles, I also learned that Clinical BioBERT seems to be the most suitable model. However, when I tried running the model with the transformers library, I got the following output.

Code:

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model =  AutoModelForTokenClassification.from_pretrained("emilyalsentzer/Bio_Discharge_Summary_BERT")

tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_Discharge_Summary_BERT")

nlp = pipeline('ner', model=model, tokenizer=tokenizer)

text = "Dementia due to Alzheimers disease. Kidney failure due to liver disease."

nlp(text)

Output:

[{'entity': 'LABEL_1', 'index': 1, 'score': 0.562394917011261, 'word': 'dementia'}, {'entity': 'LABEL_0', 'index': 2, 'score': 0.5325632691383362, 'word': 'due'}, {'entity': 'LABEL_1', 'index': 3, 'score': 0.5473843812942505, 'word': 'to'}, {'entity': 'LABEL_1', 'index': 4, 'score': 0.5070908069610596, 'word': 'alzheimer'}, {'entity': 'LABEL_0', 'index': 5, 'score': 0.5742462873458862, 'word': '##s'}, {'entity': 'LABEL_1', 'index': 6, 'score': 0.5498184561729431, 'word': 'disease'}, {'entity': 'LABEL_1', 'index': 7, 'score': 0.5163406133651733, 'word': '.'}, {'entity': 'LABEL_1', 'index': 8, 'score': 0.5038259625434875, 'word': 'kidney'}, {'entity': 'LABEL_1', 'index': 9, 'score': 0.5872519612312317, 'word': 'failure'}, {'entity': 'LABEL_0', 'index': 10, 'score': 0.523786723613739, 'word': 'due'}, {'entity': 'LABEL_1', 'index': 11, 'score': 0.5193214416503906, 'word': 'to'}, {'entity': 'LABEL_1', 'index': 12, 'score': 0.5457456707954407, 'word': 'liver'}, {'entity': 'LABEL_1', 'index': 13, 'score': 0.5755748748779297, 'word': 'disease'}, {'entity': 'LABEL_1', 'index': 14, 'score': 0.5418881177902222, 'word': '.'}]

From the above output, I expected labels such as disease, organ, etc. However, the model labeled every token as 'LABEL_1' or 'LABEL_0'.
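To make the token-level output easier to read, here is a small sketch that merges WordPiece continuation pieces (those starting with "##", such as '##s') back into whole words and keeps the label of each piece. The helper name `merge_wordpieces` is hypothetical, and the sample is a shortened copy of the output above. It does not change the labels themselves, which remain 'LABEL_0' and 'LABEL_1'.

```python
def merge_wordpieces(tokens):
    """Merge '##'-prefixed continuation pieces into the preceding word."""
    words = []
    for tok in tokens:
        piece = tok["word"]
        if piece.startswith("##") and words:
            # Continuation piece: append it to the previous word, keep its label.
            words[-1]["word"] += piece[2:]
            words[-1]["labels"].append(tok["entity"])
        else:
            # Start of a new word.
            words.append({"word": piece, "labels": [tok["entity"]]})
    return words

# Shortened copy of the pipeline output shown above (scores omitted).
sample = [
    {"entity": "LABEL_1", "index": 4, "word": "alzheimer"},
    {"entity": "LABEL_0", "index": 5, "word": "##s"},
    {"entity": "LABEL_1", "index": 6, "word": "disease"},
]

merged = merge_wordpieces(sample)
print(merged)
```

After merging, 'alzheimer' and '##s' appear as the single word 'alzheimers', carrying both of the labels the model assigned to its pieces.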

How do I use Clinical BioBERT to extract relations? Please advise.
