|
29 | 29 | "### Introduction\n", |
30 | 30 | "Classification algorithms and methods for machine learning are essential for pattern recognition and data mining applications. Well known techniques such as support vector machines and neural networks have blossomed over the last two decades as a result of the spectacular advances in classical hardware computational capabilities and speed. This progress in computer power made it possible to apply techniques, that were theoretically developed towards the middle of the 20th century, on classification problems that were becoming increasingly challenging.\n", |
31 | 31 | "\n", |
32 | | - "A key concept in classification methods is that of a kernel. Data cannot typically be separated by a hyperplane in its original space. A common technique used to find such a hyperplane consists on applying a non-linear transformation function to the data. This function is called a feature map, as it transforms the raw features, or measurable properties, of the phenomenon or subject under study. Classifying in this new feature space -and, as a matter of fact, also in any other space, including the raw original one- is nothing more than seeing how close data points are to each other. This is the same as computing the inner product for each pair of data in the set. So, in fact we do not need to compute the non-linear feature map for each datum, but only the inner product of each pair of data points in the new feature space. This collection of inner products is called the kernel and it is perfectly possible to have feature maps that are hard to compute but whose kernels are not.\n", |
| 32 | + "A key concept in classification methods is that of a kernel. Data cannot typically be separated by a hyperplane in its original space. A common technique used to find such a hyperplane consists of applying a non-linear transformation function to the data. This function is called a feature map, as it transforms the raw features, or measurable properties, of the phenomenon or subject under study. Classifying in this new feature space -and, as a matter of fact, also in any other space, including the raw original one- is nothing more than seeing how close data points are to each other. This is the same as computing the inner product for each pair of data points in the set. So, in fact we do not need to compute the non-linear feature map for each datum, but only the inner product of each pair of data points in the new feature space. This collection of inner products is called the kernel and it is perfectly possible to have feature maps that are hard to compute but whose kernels are not.\n", |
33 | 33 | "\n", |
34 | | - "In this notebook we provide an example of a classification problem that requires a feature map for which computing the kernel is not efficient classically -this means that the required computational resources are expected to scale exponentially with the size of the problem. We show how this can be solved in a quantum processor by a direct estimation of the kernel in the feature space. The method we used falls in the category of what is called supervised learning, consisting of a training phase (where the kernel is calculated and the support vectors obtained) and a test or classification phase (where new unlabelled data is classified according to the solution found in the training phase).\n", |
| 34 | + "In this notebook we provide an example of a classification problem that requires a feature map for which computing the kernel is not efficient classically -this means that the required computational resources are expected to scale exponentially with the size of the problem. We show how this can be solved in a quantum processor by a direct estimation of the kernel in the feature space. The method we used falls in the category of what is called supervised learning, consisting of a training phase (where the kernel is calculated and the support vectors obtained) and a test or classification phase (where new unlabeled data is classified according to the solution found in the training phase).\n", |
35 | 35 | "\n", |
36 | 36 | "References and additional details:\n", |
37 | 37 | "\n", |
|
64 | 64 | "metadata": {}, |
65 | 65 | "source": [ |
66 | 66 | "### [Optional] Setup token to run the experiment on a real device\n", |
67 | | - "If you would like to run the experiement on a real device, you need to setup your account first.\n", |
| 67 | + "If you would like to run the experiment on a real device, you need to setup your account first.\n", |
68 | 68 | "\n", |
69 | 69 | "Note: If you do not store your token yet, use `IBMQ.save_account('MY_API_TOKEN')` to store it first." |
70 | 70 | ] |
|
83 | 83 | "cell_type": "markdown", |
84 | 84 | "metadata": {}, |
85 | 85 | "source": [ |
86 | | - "First we prepare the dataset, which is used for training, testing and the finally prediction.\n", |
| 86 | + "First we prepare the dataset, which is used for training, testing and the final prediction.\n", |
87 | 87 | "\n", |
88 | 88 | "*Note: You can easily switch to a different dataset, such as the Breast Cancer dataset, by replacing 'ad_hoc_data' to 'Breast_cancer' below.*" |
89 | 89 | ] |
|
143 | 143 | "- the input dictionary (params) \n", |
144 | 144 | "- the input object containing the dataset info (algo_input).\n", |
145 | 145 | "\n", |
146 | | - "With everything setup, we can now run the algorithm.\n", |
| 146 | + "With everything set up, we can now run the algorithm.\n", |
147 | 147 | "\n", |
148 | 148 | "For the testing, the result includes the details and the success ratio.\n", |
149 | 149 | "\n", |
|
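Note on the introduction hunk: the claim that only pairwise inner products in the feature space are needed, not the feature map itself, can be checked with a small classical example. The sketch below is not taken from the notebook. It assumes a degree-2 polynomial feature map on 2-dimensional points, and the names `feature_map`, `kernel`, and `gram_matrix` are hypothetical, chosen only for illustration. It builds the matrix of inner products twice, once by mapping every point explicitly and once directly from the raw inputs, and the two agree.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def feature_map(x):
    """Illustrative (assumed) degree-2 feature map for a 2-D point."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def kernel(x, y):
    """Kernel matching feature_map: computed without mapping x or y."""
    return dot(x, y) ** 2

def gram_matrix(points, k):
    """Collection of k(p, q) for every pair of points."""
    return [[k(p, q) for q in points] for p in points]

points = [(1.0, 2.0), (0.5, -1.0), (-2.0, 0.25)]

# Route 1: map every point into feature space, then take inner products.
explicit = gram_matrix(points, lambda p, q: dot(feature_map(p), feature_map(q)))

# Route 2: evaluate the kernel directly on the raw points.
implicit = gram_matrix(points, kernel)

for row_e, row_i in zip(explicit, implicit):
    for e, i in zip(row_e, row_i):
        assert math.isclose(e, i, rel_tol=1e-9, abs_tol=1e-12)
```

Here both routes are cheap. The notebook concerns the opposite situation, where the kernel is expected to be hard to evaluate classically and is estimated on a quantum processor instead.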