How to split a dataset into training and test sets?

Question

How can I split a data set into a training set and a test set in MATLAB? I have already created a data set named

first_data (contains 3000 samples)

I want to use $2000$ samples as the training set and $1000$ samples as the test set.

I am new to MATLAB, thanks in advance.

score 2 · Accepted Answer · answered Dec 02 '14 at 11:40

Well, put your data in a matrix with one sample per row and train your model on the first 2000 rows (I am assuming this is supervised learning). Then you can test your model on the remaining 1000 rows and compute your error. In MATLAB, assuming each row of first_data is one sample, this would be:

 training_data = first_data(1:2000,:);
 test_data     = first_data(2001:end,:);

An alternative approach is to split the data into k sections, train on k-1 of them, and test on the one that is left. Doing this repeatedly, so that each section serves as the test set once, helps to avoid over-fitting. For more detail, read about the bias-variance dilemma and cross-validation.
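The k-section idea above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the random stand-in for first_data, the choice k = 5, and the variable names fold_id, train_rows and test_rows are all assumptions, and the model fitting step is left as a comment.

```matlab
% Stand-in for the asker's data: 3000 samples, one per row
% (the 5 columns are an arbitrary assumption).
first_data = rand(3000, 5);

k = 5;                          % number of sections (assumed value)
n = size(first_data, 1);        % number of samples

rng(0);                         % fix the seed so the folds are reproducible
idx     = randperm(n);          % shuffled row indices
fold_id = mod(0:n-1, k) + 1;    % section label (1..k) for each shuffled position

for fold = 1:k
    test_rows  = idx(fold_id == fold);   % rows held out in this round
    train_rows = idx(fold_id ~= fold);   % rows from the other k-1 sections

    training_data = first_data(train_rows, :);
    test_data     = first_data(test_rows, :);

    % Fit the model on training_data and record its error on test_data here.
end
```

Averaging the k recorded errors gives one overall estimate. If the Statistics and Machine Learning Toolbox is available, cvpartition offers a ready-made way to build such partitions.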

Usually data comes ordered by class, therefore this is not a great answer. Something like this is probably more useful: http://stackoverflow.com/questions/5444248/random-order-of-rows-matlab — Paul, Jun 04 '15 at 07:49
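A minimal sketch of the shuffling suggested in the comment above, assuming first_data holds one sample per row. The random stand-in data, the fixed seed, and the names idx and n_train are assumptions made for illustration.

```matlab
% Stand-in for the asker's data: 3000 samples, one per row
% (the 5 columns are an arbitrary assumption).
first_data = rand(3000, 5);

rng(0);                          % fix the seed so the split is reproducible
n   = size(first_data, 1);       % number of samples
idx = randperm(n);               % random permutation of the row indices

n_train       = 2000;            % training set size from the question
training_data = first_data(idx(1:n_train), :);
test_data     = first_data(idx(n_train+1:end), :);
```

Because the rows are drawn in random order, both sets should contain a mix of classes even when first_data is sorted by class. If labels are stored in a separate variable, index them with the same idx so samples and labels stay aligned.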
