PHeP - Healthcare Predictive System
Model hyperparameters
| Hyperparameter | Value |
| --- | --- |
| Epochs | 8 |
| Max Sequence Length | 512 |
| Train Batch Size | 10 |
| Learning Rate | 1e-8 |
| Hidden Size | 768 |
| Number of Hidden Layers | 12 |
| Number of Attention Layers | 12 |
| Intermediate Size | 3072 |
| Hidden Activation | gelu |
| Hidden Dropout Probability | 0.1 |
| Initializer Range | 0.02 |
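
For readers who want these values in code, the following is a minimal sketch of the settings above as a Python dataclass. The class name `PHePHyperparameters`, the field names, and the `validate` helper are hypothetical and chosen for illustration; they are not taken from the PHeP codebase. Only the values come from the table.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PHePHyperparameters:
    """Hypothetical container; the values mirror the hyperparameter table."""

    epochs: int = 8
    max_sequence_length: int = 512
    train_batch_size: int = 10
    learning_rate: float = 1e-8
    hidden_size: int = 768
    num_hidden_layers: int = 12
    num_attention_layers: int = 12  # label kept as given in the table
    intermediate_size: int = 3072
    hidden_act: str = "gelu"
    hidden_dropout_prob: float = 0.1
    initializer_range: float = 0.02

    def validate(self) -> None:
        """Basic sanity checks (illustrative assumptions, not PHeP rules)."""
        positive_fields = (
            "epochs",
            "max_sequence_length",
            "train_batch_size",
            "learning_rate",
            "hidden_size",
            "num_hidden_layers",
            "num_attention_layers",
            "intermediate_size",
            "initializer_range",
        )
        for name in positive_fields:
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        if not 0.0 <= self.hidden_dropout_prob < 1.0:
            raise ValueError("hidden_dropout_prob must be in [0, 1)")


if __name__ == "__main__":
    config = PHePHyperparameters()
    config.validate()
    # Print each setting as "name = value".
    for name, value in asdict(config).items():
        print(f"{name} = {value}")
```

Freezing the dataclass keeps the settings immutable once constructed, so a training run cannot change them silently; a variant can be created with `dataclasses.replace(config, epochs=4)`.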