This repository has been archived on 2021-10-31. You can view files and clone it, but cannot push or open issues or pull requests.
AICup/Lectures/Student_lecture 1.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## First Lab\n",
"\n",
"What we are going to do today:\n",
"- read TSP data\n",
"- define a Euclidean distance function\n",
"- define a `ProblemInstance` Python class\n",
"- store nodes in an instance of that class\n",
"- plot the raw data\n",
"- generate a naive solution\n",
"- check whether the solution is valid\n",
"- evaluate the solution\n",
"\n",
"NOTE: all the code that you will have to fill in is marked with a `# TODO` comment\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below simply imports a few libraries that we will need later."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import glob\n",
"import numpy as np\n",
"from matplotlib import pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read TSP data\n",
"In this Cup you will have to deal with a predefined set of problems. These problems are located in the `problems` folder.\n",
"\n",
"First, let's list them."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ch130.tsp\n",
"d198.tsp\n",
"eil76.tsp\n",
"fl1577.tsp\n",
"kroA100.tsp\n",
"lin318.tsp\n",
"pcb442.tsp\n",
"pr439.tsp\n",
"rat783.tsp\n",
"u1060.tsp\n"
]
}
],
"source": [
"problems = glob.glob('../problems/*.tsp')\n",
"# example_problem = [\"../problems/eil76.tsp\"]\n",
"for prob in problems:\n",
"    print(prob[12:])  # drop the 12-character '../problems/' prefix"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Checking by hand that all 10 problems are in the folder would be a waste of time, so we write a short piece of code that checks it for us."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n"
]
}
],
"source": [
"expected = {'fl1577.tsp', 'pr439.tsp', 'ch130.tsp', 'rat783.tsp', 'd198.tsp', 'kroA100.tsp', 'u1060.tsp', 'lin318.tsp', 'eil76.tsp', 'pcb442.tsp'}\n",
"# set equality is True only if exactly these 10 files were found\n",
"print({n[12:] for n in problems} == expected)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### File format\n",
"All the problems are stored in `.tsp` files (these are actually renamed `.txt` files, so you can open them with your favorite text editor).\n",
"\n",
"As we will see in a moment, every problem file is composed of the following sections:\n",
"* `NAME`: the shortened name of the problem\n",
"* `COMMENT`: a comment area that can contain the full name of the problem\n",
"* `TYPE`: the type of problem at hand; in our case it is always TSP\n",
"* `DIMENSION`: the problem dimension, i.e. the number of points\n",
"* `EDGE_WEIGHT_TYPE`: the type of weights applied to the edges; in our case it is always EUC_2D, i.e. the weights are given by the Euclidean distance in 2 dimensions\n",
"* `BEST_KNOWN`: the best known result obtained so far; note that, as the Prof said, you are unlikely to do better than this\n",
"* `NODE_COORD_SECTION`: finally, the section listing the triplets that define the problem's points. Each triplet is (point_number, x, y).\n",
"\n",
"Now that we know all of that, let's print the content of a single problem."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['NAME : eil76', 'COMMENT : 76-city problem (Christofides/Eilon)', 'TYPE : TSP', 'DIMENSION : 76', 'EDGE_WEIGHT_TYPE : EUC_2D', 'BEST_KNOWN : 538', 'NODE_COORD_SECTION', '1 22 22', '2 36 26', '3 21 45', '4 45 35', '5 55 20', '6 33 34', '7 50 50', '8 55 45', '9 26 59', '10 40 66', '11 55 65', '12 35 51', '13 62 35', '14 62 57', '15 62 24', '16 21 36', '17 33 44', '18 9 56', '19 62 48', '20 66 14', '21 44 13', '22 26 13', '23 11 28', '24 7 43', '25 17 64', '26 41 46', '27 55 34', '28 35 16', '29 52 26', '30 43 26', '31 31 76', '32 22 53', '33 26 29', '34 50 40', '35 55 50', '36 54 10', '37 60 15', '38 47 66', '39 30 60', '40 30 50', '41 12 17', '42 15 14', '43 16 19', '44 21 48', '45 50 30', '46 51 42', '47 50 15', '48 48 21', '49 12 38', '50 15 56', '51 29 39', '52 54 38', '53 55 57', '54 67 41', '55 10 70', '56 6 25', '57 65 27', '58 40 60', '59 70 64', '60 64 4', '61 36 6', '62 30 20', '63 20 30', '64 15 5', '65 50 70', '66 57 72', '67 45 42', '68 38 33', '69 50 4', '70 66 8', '71 59 5', '72 35 60', '73 27 24', '74 40 20', '75 40 37', '76 40 40', 'EOF']\n"
]
}
],
"source": [
"example_problem = \"../problems/eil76.tsp\"\n",
"with open(example_problem,\"r\") as exprob:\n",
"    print(exprob.read().splitlines())"
]
},
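{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration of the sections listed above, here is a minimal sketch that splits the lines of a `.tsp` file into a metadata dictionary and a list of (point_number, x, y) triplets. The helper name `parse_tsp_lines` and the shortened `sample` list are assumptions made for this example only; they are not part of the lab code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def parse_tsp_lines(lines):\n",
"    # hypothetical helper: returns (metadata dict, list of coordinate triplets)\n",
"    metadata, points = {}, []\n",
"    in_coords = False\n",
"    for line in lines:\n",
"        line = line.strip()\n",
"        if not line or line == \"EOF\":\n",
"            continue\n",
"        if line == \"NODE_COORD_SECTION\":\n",
"            in_coords = True  # everything after this header is a point\n",
"        elif in_coords:\n",
"            number, x, y = line.split()\n",
"            points.append((int(number), float(x), float(y)))\n",
"        else:\n",
"            key, value = line.split(\":\", 1)  # header lines look like 'KEY : value'\n",
"            metadata[key.strip()] = value.strip()\n",
"    return metadata, points\n",
"\n",
"# shortened sample copied from the eil76 listing above\n",
"sample = [\"NAME : eil76\", \"TYPE : TSP\", \"DIMENSION : 76\", \"EDGE_WEIGHT_TYPE : EUC_2D\",\n",
"          \"BEST_KNOWN : 538\", \"NODE_COORD_SECTION\", \"1 22 22\", \"2 36 26\", \"EOF\"]\n",
"meta, pts = parse_tsp_lines(sample)\n",
"print(meta[\"NAME\"], meta[\"DIMENSION\"], pts)"
]
},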
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Euclidean Distance\n",
"Since all of our problems use the Euclidean distance between points as edge weights, we will now define a function that computes it. This distance will also be used to build the distance matrix."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def distance_euc(point_i, point_j):  # TODO\n",
"    rounding = 0\n",
"    x_i = point_i[0]\n",
"    y_i = point_i[1]\n",
"    x_j, y_j = point_j[0], point_j[1]\n",
"    distance = np.sqrt((x_i - x_j) ** 2 + (y_i - y_j) ** 2)\n",
"    return round(distance, rounding)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's test it"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4.0"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"point_1 = (2, 2)\n",
"point_2 = (5, 5)\n",
"distance_euc(point_1, point_2)\n",
"# Expected output is 4.0 with rounding to 0 "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading and storing the data\n",
"We will now define a Class called `ProblemInstance`\n",
"\n",
"in the Constructor of the class (`__init__()`method of a class in Python) you will have to implement the code for:\n",
"* reading the raw data\n",
"* store the metadata\n",
"* read all the point and store them\n",
"* code the method that creates the distance matrix between points\n",
"* \\[optional\\] check if the problem loaded has an optimal and in that case store the optimal solution\n",
"* \\[optional\\] code the plotting method\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.utils import distance_euc\n",
"\n",
"class ProblemInstance:\n",
"\n",
"    def __init__(self, name_tsp):\n",
"        self.exist_opt = False\n",
"        self.optimal_tour = None\n",
"        self.dist_matrix = None\n",
"\n",
"        # read raw data\n",
"        # TODO\n",
"        with open(name_tsp) as f_o:\n",
"            data = f_o.read()\n",
"            self.lines = data.splitlines()\n",
"\n",
"        # equivalent version without the context manager:\n",
"        # file_object = open(name_tsp)\n",
"        # data = file_object.read()\n",
"        # file_object.close()\n",
"        # self.lines = data.splitlines()\n",
"\n",
"        # store metadata set information\n",
"        # TODO\n",
"        self.name = self.lines[0].split(' ')[2]\n",
"        # here we expect the name of the problem\n",
"        self.nPoints = int(self.lines[3].split(' ')[2])\n",
"        self.best_sol = float(self.lines[5].split(' ')[2])\n",
"        # here the length of the best solution\n",
"\n",
"        # read all data points and store them\n",
"        # TODO\n",
"        self.points = np.zeros((self.nPoints, 3))  # this is the structure where we will store the pts data\n",
"        for i in range(self.nPoints):\n",
"            line_i = self.lines[7 + i].split(' ')\n",
"            self.points[i, 0] = int(line_i[0])\n",
"            self.points[i, 1] = float(line_i[1])\n",
"            self.points[i, 2] = float(line_i[2])\n",
"\n",
"        self.create_dist_matrix()\n",
"\n",
"        # TODO [optional]\n",
"        # if the problem is one with an optimal solution, that solution is loaded\n",
"        if name_tsp in [\"../problems/eil76.tsp\", \"../problems/kroA100.tsp\"]:\n",
"            self.exist_opt = True\n",
"            file_object = open(name_tsp.replace(\".tsp\", \".opt.tour\"))\n",
"            data = file_object.read()\n",
"            file_object.close()\n",
"            lines = data.splitlines()\n",
"\n",
"            # read the optimal tour and store it with 0-based node indices\n",
"            self.optimal_tour = np.zeros(self.nPoints, dtype=int)\n",
"            for i in range(self.nPoints):\n",
"                line_i = lines[5 + i].split(' ')\n",
"                self.optimal_tour[i] = int(line_i[0]) - 1\n",
"\n",
"    def print_info(self):\n",
"        print(\"\\n#############################\\n\")\n",
"        print('name: ' + self.name)\n",
"        print('nPoints: ' + str(self.nPoints))\n",
"        print('best_sol: ' + str(self.best_sol))\n",
"        print('exist optimal: ' + str(self.exist_opt))\n",
"\n",
"    def plot_data(self, show_numbers=False):  # todo [optional]\n",
"        plt.figure(figsize=(8, 8))\n",
"        plt.title(self.name)\n",
"        plt.scatter(self.points[:, 1], self.points[:, 2])\n",
"        if show_numbers:\n",
"            for i, txt in enumerate(np.arange(self.nPoints)):  # tour_found[:-1]\n",
"                plt.annotate(txt, (self.points[i, 1], self.points[i, 2]))\n",
"        plt.show()\n",
"\n",
"    def create_dist_matrix(self):  # TODO\n",
"        self.dist_matrix = np.zeros((self.nPoints, self.nPoints))\n",
"\n",
"        # fill the upper triangle only, then mirror it (the diagonal stays 0)\n",
"        for i in range(self.nPoints):\n",
"            for j in range(i, self.nPoints):\n",
"                self.dist_matrix[i, j] = distance_euc(self.points[i][1:3], self.points[j][1:3])\n",
"        self.dist_matrix += self.dist_matrix.T\n"
]
},
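{
"cell_type": "markdown",
"metadata": {},
"source": [
"The distance matrix can also be computed without explicit Python loops by using NumPy broadcasting. The cell below is a sketch of that alternative, not the lab's reference solution: the function name `dist_matrix_vectorized` and the three sample points are assumptions for illustration, and the result is rounded to 0 decimals to mirror `distance_euc`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def dist_matrix_vectorized(coords):\n",
"    # coords: array of shape (n, 2) holding the x, y of each point\n",
"    diff = coords[:, None, :] - coords[None, :, :]  # pairwise differences, shape (n, n, 2)\n",
"    return np.round(np.sqrt((diff ** 2).sum(axis=-1)), 0)\n",
"\n",
"# three made-up points, only for this example\n",
"coords = np.array([[2.0, 2.0], [5.0, 5.0], [2.0, 6.0]])\n",
"print(dist_matrix_vectorized(coords))"
]
},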
{
"cell_type": "markdown",
"metadata": {},
"source": [
"------------------------\n",
"Now we can test our class with an example problem."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"example_problem = \"../problems/eil76.tsp\"\n",
"p_inst=ProblemInstance(example_problem)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p_inst.print_info()\n",
"p_inst.plot_data()\n",
"#Expected output\n",
"\"\"\"\n",
"#############################\n",
"\n",
"name: eil76\n",
"nPoints: 76\n",
"best_sol: 538.0\n",
"exist optimal: True\n",
"\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"p_inst.plot_data(show_numbers=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"-------------\n",
"### Random solver \n",
"Now we will code the random solver and test it with a class called `SolverTSP`, which takes the name of a solver and the problem instance and acts as a framework that computes the solution and gives us some additional information.\n",
"We will also need to code the `evaluate_solution` method of the `SolverTSP` class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def random_method(instance_):  # TODO\n",
"    return solution\n",
"available_methods = {\"random\": random_method}  # this is here because the SolverTSP will check for the available methods"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from time import time as t\n",
"\n",
"class SolverTSP:\n",
"    def __init__(self, algorithm_name, problem_instance):\n",
"        self.duration = np.inf\n",
"        self.found_length = np.inf\n",
"        self.algorithm_name = algorithm_name\n",
"        self.name_method = \"initialized with \" + algorithm_name\n",
"        self.solved = False\n",
"        self.problem_instance = problem_instance\n",
"        self.solution = None\n",
"\n",
"    def compute_solution(self, verbose=True, return_value=True):\n",
"        self.solved = False\n",
"        if verbose:\n",
"            print(f\"### solving with {self.algorithm_name} ####\")\n",
"        start_time = t()\n",
"        self.solution = available_methods[self.algorithm_name](self.problem_instance)\n",
"        assert self.check_if_solution_is_valid(self.solution), \"Error the solution is not valid\"\n",
"        end_time = t()\n",
"        self.duration = np.around(end_time - start_time, 3)\n",
"        if verbose:\n",
"            print(f\"### solved ####\")\n",
"        self.solved = True\n",
"        self.evaluate_solution()\n",
"        self._gap()\n",
"        if return_value:\n",
"            return self.solution\n",
"\n",
"    def plot_solution(self):\n",
"        assert self.solved, \"You can't plot the solution, you need to compute it first!\"\n",
"        plt.figure(figsize=(8, 8))\n",
"        self._gap()\n",
"        plt.title(f\"{self.problem_instance.name} solved with {self.name_method} solver, gap {self.gap}\")\n",
"        ordered_points = self.problem_instance.points[self.solution]\n",
"        plt.plot(ordered_points[:, 1], ordered_points[:, 2], 'b-')\n",
"        plt.show()\n",
"\n",
"    def check_if_solution_is_valid(self, solution):\n",
"        # a solution is valid when every node appears in it exactly once\n",
"        rights_values = np.sum([self.check_validation(i, solution) for i in np.arange(self.problem_instance.nPoints)])\n",
"        if rights_values == self.problem_instance.nPoints:\n",
"            return True\n",
"        else:\n",
"            return False\n",
"\n",
"    def check_validation(self, node, solution):\n",
"        if np.sum(solution == node) == 1:\n",
"            return 1\n",
"        else:\n",
"            return 0\n",
"\n",
"    def evaluate_solution(self, return_value=False):\n",
"        total_length = 0\n",
"        from_node = self.solution[0]  # starting_node\n",
"        # TODO\n",
"        # [...] compute total_length of the solution\n",
"        self.found_length = total_length\n",
"        if return_value:\n",
"            return total_length\n",
"\n",
"    def _gap(self):\n",
"        # percentage gap between the found length and the best known one\n",
"        self.evaluate_solution(return_value=False)\n",
"        self.gap = np.round(\n",
"            ((self.found_length - self.problem_instance.best_sol) / self.problem_instance.best_sol) * 100, 2)\n"
]
},
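{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the two missing pieces more concrete, here is a minimal sketch of a random tour and of a tour-length computation on a small hand-made distance matrix. The names `random_tour` and `tour_length` and the 4-point matrix are assumptions for this example; this is one possible way to approach the `# TODO` parts, not necessarily the intended one. The tour is a permutation of the node indices (each node exactly once, which is what `check_if_solution_is_valid` requires), and the sketch assumes that the length includes the edge going back to the starting node."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def random_tour(n_points, seed=None):\n",
"    # a random permutation visits every node exactly once\n",
"    rng = np.random.default_rng(seed)\n",
"    return rng.permutation(n_points)\n",
"\n",
"def tour_length(tour, dist_matrix):\n",
"    # sum the consecutive edges, then add the edge that closes the cycle\n",
"    total = 0.0\n",
"    for from_node, to_node in zip(tour[:-1], tour[1:]):\n",
"        total += dist_matrix[from_node, to_node]\n",
"    return total + dist_matrix[tour[-1], tour[0]]\n",
"\n",
"# hand-made symmetric matrix: the 4 corners of a 3 x 4 rectangle\n",
"dist = np.array([[0, 3, 5, 4],\n",
"                 [3, 0, 4, 5],\n",
"                 [5, 4, 0, 3],\n",
"                 [4, 5, 3, 0]], dtype=float)\n",
"tour = random_tour(4, seed=0)\n",
"print(tour, tour_length(tour, dist))\n",
"print(tour_length(np.array([0, 1, 2, 3]), dist))  # perimeter of the rectangle: 14.0"
]
},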
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----------------------------\n",
"Now we will test our code"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"solver_name = \"random\"\n",
"# here I'm repeating these two lines just to remind you which problem we are using\n",
"example_problem = \"../problems/eil76.tsp\"\n",
"p_inst = ProblemInstance(example_problem)\n",
"\n",
"# TODO\n",
"# create an instance of SolverTSP\n",
"# compute a solution\n",
"# print the information as for the output\n",
"# plot the solution\n",
"\n",
"# this is the output expected and after that the solution's plot\n",
"\"\"\"\n",
"### solving with random ####\n",
"### solved ####\n",
"the total length for the solution found is 2424.0\n",
"while the optimal length is 538.0\n",
"the gap is 350.56%\n",
"the solution is found in 0.0 seconds\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--------------------\n",
"Finally, since our example problem has an optimal solution, we can plot it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"solver = SolverTSP(\"optimal\", p_inst)\n",
"solver.solved = True\n",
"solver.solution = np.concatenate([p_inst.optimal_tour, [p_inst.optimal_tour[0]]])\n",
"solver.plot_solution()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "PyCharm (AI2020BsC)",
"language": "python",
"name": "pycharm-61970693"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}