Model Download

RF-models

Model          Size
IT+LC (1)      2355 KB
IT+LC (2)      2509 KB
IT+LC (3)      2421 KB

CNN-models for prediction

CNN_Model          Size
IT+LC CNN (1)      1.36 GB
IT+LC CNN (2)      1.36 GB
IT+LC CNN (3)      1.36 GB

Together with the code and the feature files, these models can be used to generate the final scores.

Example tree of the ITLC/RF model (tree number 01)

Structure of the CNN

The structure of the CNN was printed with TensorFlow's built-in summary() function. The output shape and the number of parameters of each layer are shown below:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 90, 87, 64)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 90, 43, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 89, 42, 128)       32896
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 89, 21, 128)       0
_________________________________________________________________
flatten (Flatten)            (None, 239232)            0
_________________________________________________________________
dense (Dense)                (None, 512)               122487296
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 1026
=================================================================
Total params: 122,521,538
Trainable params: 122,521,538
Non-trainable params: 0
_________________________________________________________________
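
The shapes and parameter counts in the summary can be checked by hand. The sketch below is only an illustration: it assumes a single-channel 91 x 88 input, 2 x 2 convolution kernels with stride 1 and 'valid' padding, and (1, 2) pooling windows. None of these values is stated in the summary; they are inferred from the reported output shapes and parameter counts, and the helper names are hypothetical.

```python
# Assumed (inferred from the summary, not stated in the source):
# input 91x88x1, 2x2 kernels, stride 1, 'valid' padding, pooling window (1, 2).

def conv2d(shape, filters, kernel=(2, 2)):
    """Return (output shape, parameter count) for a 'valid', stride-1 Conv2D."""
    h, w, c = shape
    kh, kw = kernel
    params = (kh * kw * c + 1) * filters  # weights plus one bias per filter
    return (h - kh + 1, w - kw + 1, filters), params

def max_pool(shape, pool=(1, 2)):
    """Return the output shape of a MaxPooling2D layer (it has no parameters)."""
    h, w, c = shape
    return (h // pool[0], w // pool[1], c)

def dense(n_in, units):
    """Return the parameter count of a fully connected layer (weights + biases)."""
    return (n_in + 1) * units

shape = (91, 88, 1)
shape, p1 = conv2d(shape, 64)          # (90, 87, 64), 320 params
shape = max_pool(shape)                # (90, 43, 64)
shape, p2 = conv2d(shape, 128)         # (89, 42, 128), 32896 params
shape = max_pool(shape)                # (89, 21, 128)
flat = shape[0] * shape[1] * shape[2]  # 239232 flattened features
p3 = dense(flat, 512)                  # 122487296 params
p4 = dense(512, 2)                     # 1026 params
total = p1 + p2 + p3 + p4
print(shape, flat, total)              # (89, 21, 128) 239232 122521538
```

Almost all of the parameters sit in the first dense layer, which is consistent with the 1.36 GB size of the saved CNN models.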

Experimental parameter settings

Platform: GTX-2080ti GPU, Intel(R) Core(TM) i7-9700 CPU, 64 GB DDR4 RAM

CNN parameters:
epochs=50, batch_size=2, loss=categorical_crossentropy,
optimizer=RMSprop, learning_rate=0.000002, metrics=accuracy

RandomForestClassifier(n_estimators=120)

Kernel PCA:

KernelPCA(n_components=Feature_dimension, kernel='poly', gamma=0.15, degree=2)

Feature_dimension (lncRNA/RF): 64 to 4096 (4096 selected)
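
As a minimal sketch of how these settings map onto scikit-learn, the snippet below chains KernelPCA and RandomForestClassifier with the parameters listed above. Whether the original code chains the two steps in a pipeline is an assumption; the data are synthetic stand-ins, and `feature_dimension` is set to 8 (instead of the selected 4096) only so that the toy example runs quickly. The `random_state` argument is also added here for reproducibility and is not part of the listed settings.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 16))   # synthetic stand-in for the real features
y = np.tile([0, 1], 20)         # synthetic binary labels

feature_dimension = 8           # toy value; the settings above select 4096

model = make_pipeline(
    # Polynomial-kernel PCA with the gamma and degree listed above.
    KernelPCA(n_components=feature_dimension, kernel="poly", gamma=0.15, degree=2),
    # Random forest with 120 trees; random_state is an illustrative assumption.
    RandomForestClassifier(n_estimators=120, random_state=0),
)
model.fit(X, y)

# Probability of the positive class for each sample.
scores = model.predict_proba(X)[:, 1]
print(scores.shape)
```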

___________________________________________________________________

Parameters concerning Repetitions of CV

FRS: negative sampling random seeds (file random seeds). DRS: dataset division random seeds.

Evaluation inside dataset                  FRS repetitions   DRS repetitions   Trials
Type1                                      5                 5                 25
Type2 (5-fold cross-validation)            5                 5                 25
Type3 (10-fold cross-validation)           5                 5                 25
Type3 (HC involved)                        3                 5                 15
Negative sampling rate (NR exploration)    3                 3                 9

The high-throughput constructed dataset (HC) is very large and takes 10 GiB of memory. We therefore conducted only three repetitions for it.

Parameters concerning Repetitions of Transfer Verification

Prediction parts                           Train file repetitions   Test file repetitions   Metrics calculation
Transfer verification (HC not involved)    5                        5                       20 (25-5)
Transfer verification (HC involved)        3                        3                       6 (9-3)

To avoid AUC inflation in the transfer verification, results for which the train and test files were generated with the same random seed are excluded from the metrics calculation.
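
The counts 20 (25-5) and 6 (9-3) follow from enumerating every train/test seed combination and dropping those with matching seeds. The sketch below illustrates this under the assumption that the train and test repetitions draw from the same set of seeds, indexed here as 0..n-1; the function name is hypothetical.

```python
from itertools import product

def transfer_pairs(n_train, n_test):
    """List (train_seed, test_seed) pairs, excluding pairs that share a seed."""
    return [
        (train_seed, test_seed)
        for train_seed, test_seed in product(range(n_train), range(n_test))
        if train_seed != test_seed
    ]

print(len(transfer_pairs(5, 5)))  # 20 pairs kept out of 25 (HC not involved)
print(len(transfer_pairs(3, 3)))  # 6 pairs kept out of 9 (HC involved)
```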