# HealthHistoryBERT-en
HealthHistoryBERT-en was pre-trained from the bert-base-uncased checkpoint on health insurance patient data organized as historical sentences. The original goal of the training was to predict hospitalizations, but since the models can be applied to other tasks, we are making them available to the scientific community. This model was trained on English data translated from Portuguese health insurance data. Other training approaches can be found at:
## Other pre-trained models
- HealthHistoryRoBERTa-en
- HealthHistoryRoBERTa-pt
- HealthHistoryBERT-en
- HealthHistoryBioBERT-en
- HealthHistoryBio_ClinicalBERT-en
- HealthHistoryBERTimbau-pt
## Other models fine-tuned to predict hospitalizations
- HealthHistoryOpenLLaMA3Bv2-en-ft
- HealthHistoryOpenLLaMA7Bv2-en-ft
- HealthHistoryOpenLLaMA13B-en-ft
- HealthHistoryOpenCabrita3B-pt-ft
- HealthHistoryRoBERTa-en-ft
- HealthHistoryRoBERTa-pt-ft
- HealthHistoryBERTimbau-pt-ft
- HealthHistoryBERT-en-ft
- HealthHistoryBioBERT-en-ft
- HealthHistoryBio_ClinicalBERT-en-ft
## Pretraining Data
The model was pre-trained on 837,159 historical sentences from health insurance patients, generated using the approach described in the paper Predicting Hospitalization from Health Insurance Data.
## Model Pretraining
### Pretraining Procedures
The model was trained on an NVIDIA GeForce RTX A5000 24 GB GPU in the laboratories of the IT department at UFPR (Federal University of Paraná). The model parameters were initialized from bert-base-uncased.
### Pretraining Hyperparameters
We used a batch size of 16, a maximum sequence length of 512, 4 gradient accumulation steps, a masked-language-model probability of 0.15, 1 epoch, and a learning rate of 10⁻⁴ to pre-train this model.
### Pre-training Time
Training took 5 hours and 22 minutes per epoch.
## How to use the model
Load the model via the transformers library:

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("efbaro/HealthHistoryBERT-en")
model = AutoModel.from_pretrained("efbaro/HealthHistoryBERT-en")
```
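Because the model was pre-trained with a masked-language-modeling objective, one quick way to probe it is the `fill-mask` pipeline. The example sentence below is illustrative, and the predicted tokens are not guaranteed; this is a hedged usage sketch, not part of the authors' evaluation:

```python
from transformers import pipeline

# Masked-token prediction with the pre-trained checkpoint.
fill = pipeline("fill-mask", model="efbaro/HealthHistoryBERT-en")

# [MASK] is bert-base-uncased's mask token; the sentence is illustrative.
for pred in fill("the patient was admitted with [MASK] pain."):
    print(pred["token_str"], round(pred["score"], 3))
```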
## More Information
Refer to the original paper, Predicting Hospitalization with LLMs from Health Insurance Data.
Also refer to a related article from this research, Predicting Hospitalization from Health Insurance Data.
## Questions?
Email:
- Everton F. Baro: efbaro@inf.ufpr.br, everton.barros@ifpr.edu.br
- Luiz S. Oliveira: luiz.oliveira@ufpr.br
- Alceu de Souza Britto Junior: alceu.junior@pucpr.br