All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
First, separate the target and the features into two variables. Store the features in X and the target in a second variable.
Then use train_test_split to split the data into training and testing sets: 80% training, 20% testing, with random_state=0.
Store the resulting splits in separate variables; the scaling code below expects the feature splits to be named X_train and X_test.
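The splitting step can be sketched as follows. This is only an illustration under stated assumptions: the activity's dataset is not shown here, so a synthetic one from make_classification stands in for it, and the names y, y_train and y_test are hypothetical choices rather than names given by the activity.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the activity's dataset (assumption, not the real data).
# X holds the features; y (hypothetical name) holds the target.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# 80% training, 20% testing, with a fixed seed so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
```

With 200 rows, test_size=0.2 leaves 160 rows for training and 40 for testing; fixing random_state means the same rows land in each set on every run.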
Train a linear SVM (import LinearSVC) using the training data, and store the model in svm. You can specify model parameters such as C.
Remember to standardize the dataset (code provided below), and store the results in X_train_sd and X_test_sd:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train_sd = sc_X.fit_transform(X_train)
X_test_sd = sc_X.transform(X_test)
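Putting standardization and training together, a minimal sketch might look like the following. The synthetic dataset, the target names y, y_train and y_test, and the values C=1.0 and max_iter=10000 are assumptions made for illustration, not settings prescribed by the activity.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in data and the 80/20 split (assumptions for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training features only, then reuse it on the test
# features so no information from the test set leaks into training.
sc_X = StandardScaler()
X_train_sd = sc_X.fit_transform(X_train)
X_test_sd = sc_X.transform(X_test)

# Linear SVM; C controls regularization strength (smaller C = stronger).
svm = LinearSVC(C=1.0, random_state=0, max_iter=10000)
svm.fit(X_train_sd, y_train)
```

The model is fit on X_train_sd rather than X_train because linear SVMs are sensitive to feature scale; features with large ranges would otherwise dominate the margin.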
Calculate the f1-score on both the training and testing sets, running the code in a Jupyter Notebook.
Store the two results in separate variables.
The expected score varies depending on the specifics of the problem and the data. However, for a well-defined, simple problem with a large and diverse training dataset, a well-trained machine learning model could achieve an f1-score of over 85% in some cases.
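The scoring step can be sketched as follows. As before, the synthetic dataset and the names y, y_train, y_test, f1_train and f1_test are hypothetical; the activity's own variable names are not shown here. The example assumes a binary target, which is what f1_score expects by default; a multiclass target would need an explicit average argument.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic binary-classification data standing in for the real dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardize using statistics from the training set only.
sc_X = StandardScaler()
X_train_sd = sc_X.fit_transform(X_train)
X_test_sd = sc_X.transform(X_test)

# Train the linear SVM on the standardized training data.
svm = LinearSVC(C=1.0, random_state=0, max_iter=10000)
svm.fit(X_train_sd, y_train)

# f1_score takes the true labels first, then the predicted labels.
f1_train = f1_score(y_train, svm.predict(X_train_sd))
f1_test = f1_score(y_test, svm.predict(X_test_sd))

print(f"train f1: {f1_train:.3f}, test f1: {f1_test:.3f}")
```

Comparing the two scores is a quick overfitting check: a training score far above the testing score suggests the model has not generalized well.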