mp1: done 2 and 4
parent
db29a75788
commit
d24c1f252d
19 changed files with 271 additions and 1 deletions
22
mp1/files_data/ex2.m
Normal file
|
@ -0,0 +1,22 @@
|
|||
%[U,G] = surfer('https://www.usi.ch',500);
|
||||
% pagerank(U,G);
|
||||
|
||||
% A = (1/40) * [1 1 1 35 1 1;
|
||||
% 18 1 1 1 1 1;
|
||||
% 18 18 1 1 1 1;
|
||||
% 1 18 35 1 1 1;
|
||||
% 1 1 1 1 1 35;
|
||||
% 1 1 1 1 35 1];
|
||||
A = (1/40) * [
|
||||
0 0 0 40 0 0;
|
||||
20 0 0 0 0 0;
|
||||
20 20 0 0 0 0;
|
||||
0 20 40 0 0 0;
|
||||
0 0 0 0 0 40;
|
||||
0 0 0 0 40 0];
|
||||
|
||||
[v,d] = eig(A);
|
||||
|
||||
display(d);
|
||||
display(v);
|
||||
|
BIN
mp1/files_data/run1.mat
Normal file
Binary file not shown.
BIN
mp1/files_data/run1rank.fig
Normal file
Binary file not shown.
32
mp1/files_data/run1rank.txt
Normal file
|
@ -0,0 +1,32 @@
|
|||
page-rank in out url
|
||||
360 0.0869 31 1 https://creativecommons.org/licenses/by-sa/3.0
|
||||
204 0.0406 8 1 https://forum.gitlab.com
|
||||
82 0.0189 117 18 https://www.mediawiki.org
|
||||
81 0.0188 117 4 https://wikimediafoundation.org
|
||||
87 0.0150 6 1 https://docs.gitea.io
|
||||
78 0.0145 114 9 https://www.mediawiki.org/wiki/Special:MyLanguage/How_to_contribute
|
||||
77 0.0132 77 13 https://foundation.wikimedia.org/wiki/Privacy_policy
|
||||
217 0.0127 40 8 https://bugs.archlinux.org
|
||||
80 0.0115 107 6 https://foundation.wikimedia.org/wiki/Cookie_statement
|
||||
215 0.0114 38 5 https://bbs.archlinux.org
|
||||
216 0.0114 38 8 https://wiki.archlinux.org
|
||||
218 0.0114 38 6 https://security.archlinux.org
|
||||
428 0.0107 9 1 https://www.dnb.de/kataloghilfe
|
||||
219 0.0102 38 7 https://aur.archlinux.org
|
||||
359 0.0098 9 1 https://creativecommons.org/publicdomain/zero/1.0
|
||||
366 0.0092 27 21 https://archive.org
|
||||
446 0.0089 24 5 https://foundation.wikimedia.org/wiki/Terms_of_Use
|
||||
83 0.0079 78 0 https:\/\/schema.org
|
||||
85 0.0074 77 0 https:\/\/www.wikimedia.org\/static\/images\/wmf-hor-googpub.png
|
||||
181 0.0066 8 2 https://gitlab.com
|
||||
95 0.0062 2 1 https://www.enable-javascript.com
|
||||
113 0.0061 13 1 https://www.britannica.com/topic/polenta
|
||||
417 0.0058 8 1 https://www.dnb.de/DE/Home/home_node.html
|
||||
429 0.0058 8 1 https://www.dnb.de/EN/Home/home_node.html
|
||||
432 0.0058 8 1 https://www.dnb.de/expertensuche
|
||||
379 0.0057 24 2 https://blog.archive.org
|
||||
213 0.0057 32 7 https://www.archlinux.org
|
||||
99 0.0056 3 1 https://www.usi.ch/it
|
||||
19 0.0051 4 1 https://creativecommons.org/licenses/by-nc-sa/4.0
|
||||
214 0.0050 31 5 https://www.archlinux.org/packages
|
||||
220 0.0050 31 6 https://www.archlinux.org/download
|
BIN
mp1/files_data/run2.mat
Normal file
Binary file not shown.
BIN
mp1/files_data/run2rank.fig
Normal file
Binary file not shown.
23
mp1/files_data/run2rank.txt
Normal file
|
@ -0,0 +1,23 @@
|
|||
page-rank in out url
|
||||
411 0.0249 42 1 https://twitter.com/mozilla
|
||||
63 0.0248 145 1 https://twitter.com/firefox
|
||||
68 0.0203 142 1 https://www.instagram.com/firefox
|
||||
412 0.0164 37 1 https://www.instagram.com/mozilla
|
||||
62 0.0080 21 1 https://github.com/mozilla/kitsune
|
||||
81 0.0070 110 2 https://www.apple.com
|
||||
384 0.0064 5 1 https://www.xfinity.com/privacy/policy/dns
|
||||
4 0.0064 32 0 https:
|
||||
377 0.0059 19 1 https://abouthome-snippets-service.readthedocs.io/en/latest/data_collection.html 1
|
||||
393 0.0059 19 1 https://www.adjust.com/terms/privacy-policy
|
||||
410 0.0057 16 1 https://wiki.mozilla.org/Firefox/Data_Collection
|
||||
400 0.0057 15 1 https://yandex.ru/legal/confidential
|
||||
396 0.0057 15 1 https://github.com/mozilla-mobile/firefox-ios/blob/master/Docs/MMA.md
|
||||
5 0.0056 31 0 https://ssl
|
||||
3 0.0054 36 0 https://www.iisbadoni.edu.it/sites/default/files/favicon.ico
|
||||
6 0.0054 36 0 https://www.iisbadoni.edu.it/sites/default/files/logo.png
|
||||
208 0.0054 159 0 https://schema.org
|
||||
74 0.0052 178 5 https://foundation.mozilla.org
|
||||
72 0.0052 33 32 https://www.mozilla.org/privacy/websites/#cookies
|
||||
23 0.0051 2 1 https://www.iisbadoni.edu.it/mad
|
||||
300 0.0051 157 0 https://accounts.firefox.com
|
||||
|
BIN
mp1/files_data/run3.mat
Normal file
Binary file not shown.
BIN
mp1/files_data/run3rank.fig
Normal file
Binary file not shown.
43
mp1/files_data/run3rank.txt
Normal file
|
@ -0,0 +1,43 @@
|
|||
page-rank in out url
|
||||
55 0.0741 354 1 https://www.instagram.com/usiuniversity
|
||||
53 0.0324 366 3 https://www.facebook.com/usiuniversity
|
||||
299 0.0248 6 1 https://twitter.com/usi_en
|
||||
329 0.0243 8 1 https://www.facebook.com/USIeLab
|
||||
308 0.0156 7 3 https://www.facebook.com/USIFinancialCommunication
|
||||
60 0.0155 316 2 https://www.swissuniversities.ch
|
||||
424 0.0144 96 1 https://it.bul.sbu.usi.ch
|
||||
330 0.0123 6 4 https://www.facebook.com/USI.ITDxC
|
||||
320 0.0122 7 1 https://www.facebook.com/usiimeg
|
||||
56 0.0107 320 0 https://www.youtube.com/usiuniversity
|
||||
5 0.0096 317 71 https://usi.ch
|
||||
62 0.0090 319 18 https://search.usi.ch
|
||||
337 0.0087 7 1 https://twitter.com/usisoftware
|
||||
63 0.0080 303 19 https://desk.usi.ch
|
||||
130 0.0077 25 0 https://www.swissuniversities.ch/it
|
||||
54 0.0072 208 0 https://twitter.com/USI_university
|
||||
323 0.0066 9 5 https://www.facebook.com/usiorientamento
|
||||
150 0.0062 12 1 https://www.innosuisse.ch/inno/it/home.html
|
||||
248 0.0061 10 1 https://www.facebook.com/usimdfc
|
||||
106 0.0060 132 8 https://newsletter.usi.ch/archive/en
|
||||
135 0.0057 201 0 https://schema.org
|
||||
326 0.0057 6 1 https://www.facebook.com/usialloggimendrisio
|
||||
322 0.0055 6 1 https://www.facebook.com/USImem
|
||||
366 0.0054 6 1 https://www.instagram.com/usi_ics_lugano
|
||||
212 0.0054 12 3 https://www.facebook.com/usimt
|
||||
7 0.0051 211 32 https://search.usi.ch/it
|
||||
6 0.0051 204 0 https://www.usi.ch/sites/all/themes/usiclean/img/bollino-usi.svg
|
||||
14 0.0051 204 62 https://www.usi.ch/originalnode/342
|
||||
15 0.0051 204 57 https://www.usi.ch/originalnode/358
|
||||
16 0.0051 204 62 https://www.usi.ch/originalnode/343
|
||||
17 0.0051 204 57 https://www.usi.ch/originalnode/344
|
||||
18 0.0051 204 58 https://www.usi.ch/en/originalnode/12174
|
||||
20 0.0051 204 60 https://www.usi.ch/originalnode/349
|
||||
21 0.0051 204 62 https://www.usi.ch/originalnode/8996
|
||||
22 0.0051 204 60 https://www.usi.ch/originalnode/348
|
||||
23 0.0051 204 59 https://www.usi.ch/originalnode/351
|
||||
24 0.0051 204 58 https://www.usi.ch/originalnode/350
|
||||
25 0.0051 204 61 https://www.usi.ch/originalnode/353
|
||||
27 0.0051 204 59 https://www.usi.ch/originalnode/8014
|
||||
26 0.0051 204 58 https://www.usi.ch/en/originalnode/354
|
||||
61 0.0051 204 0 https://www.usi.ch/sites/all/themes/usiclean/img/swissuniversities.svg
|
||||
57 0.0050 188 9 https://newsletter.usi.ch/archive
|
BIN
mp1/files_data/run3spy.fig
Normal file
Binary file not shown.
BIN
mp1/run1rank.pdf
Normal file
Binary file not shown.
BIN
mp1/run1spy.pdf
Normal file
Binary file not shown.
BIN
mp1/run2rank.pdf
Normal file
Binary file not shown.
BIN
mp1/run2spy.pdf
Normal file
Binary file not shown.
BIN
mp1/run3rank.pdf
Normal file
Binary file not shown.
BIN
mp1/run3spy.pdf
Normal file
Binary file not shown.
BIN
mp1/template.pdf
Binary file not shown.
152
mp1/template.tex
|
@ -1,5 +1,7 @@
|
|||
\documentclass[unicode,11pt,a4paper,oneside,numbers=endperiod,openany]{scrartcl}
|
||||
|
||||
\usepackage{graphicx}
|
||||
\usepackage{subcaption}
|
||||
\usepackage{amsmath}
|
||||
\input{assignment.sty}
|
||||
\begin{document}
|
||||
|
||||
|
@ -57,10 +59,158 @@ is the corresponding eigenvalue, while if $x$ is an eigenvector approximation, f
|
|||
|
||||
\subsection{Other webgraphs [10 points]}
|
||||
|
||||
The provided MATLAB PageRank implementation was run three times, starting from the websites \texttt{http://atelier.inf.usi.ch/\textasciitilde{}maggicl}, \texttt{https://www.iisbadoni.edu.it}, and \texttt{https://www.usi.ch}. The results are shown in Figure \ref{fig:run1}, Figure \ref{fig:run2}, and Figure \ref{fig:run3}, respectively.
|
||||
|
||||
One pattern that emerges in the first and third executions is the presence of 1s on the main diagonal of the connectivity matrix, which indicates that several of the pages found link to themselves. Another pattern, observable in all three executions, is the presence of contiguous rectangular regions filled with 1s, especially along the main diagonal. These regions may correspond to pages that belong to the same website: such pages share a common layout and therefore tend to link to a common set of internal pages (when the region is near the main diagonal) or external pages.
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run1spy}
|
||||
\caption{Spy plot of connectivity matrix}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run1rank}
|
||||
\caption{PageRank bar graph}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{\textwidth}
|
||||
\begin{verbatim}
|
||||
|
||||
360 0.0869 31 1 https://creativecommons.org/licenses/by-sa/3.0
|
||||
204 0.0406 8 1 https://forum.gitlab.com
|
||||
82 0.0189 117 18 https://www.mediawiki.org
|
||||
81 0.0188 117 4 https://wikimediafoundation.org
|
||||
87 0.0150 6 1 https://docs.gitea.io
|
||||
78 0.0145 114 9 https://www.mediawiki.org/wiki/Special:MyLanguage/
|
||||
How_to_contribute
|
||||
77 0.0132 77 13 https://foundation.wikimedia.org/wiki/Privacy_policy
|
||||
217 0.0127 40 8 https://bugs.archlinux.org
|
||||
80 0.0115 107 6 https://foundation.wikimedia.org/wiki/Cookie_statement
|
||||
215 0.0114 38 5 https://bbs.archlinux.org
|
||||
\end{verbatim}
|
||||
\caption{Top 10 webpages with highest PageRank (columns: page index, PageRank, in-links, out-links, URL)}
|
||||
\end{subfigure}
|
||||
\caption{Results of first PageRank calculation (for starting website \texttt{http://atelier.inf.usi.ch/\textasciitilde{}maggicl/})}
\label{fig:run1}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run2spy}
|
||||
\caption{Spy plot of connectivity matrix}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run2rank}
|
||||
\caption{PageRank bar graph}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{\textwidth}
|
||||
\begin{verbatim}
|
||||
|
||||
411 0.0249 42 1 https://twitter.com/mozilla
|
||||
63 0.0248 145 1 https://twitter.com/firefox
|
||||
68 0.0203 142 1 https://www.instagram.com/firefox
|
||||
412 0.0164 37 1 https://www.instagram.com/mozilla
|
||||
62 0.0080 21 1 https://github.com/mozilla/kitsune
|
||||
81 0.0070 110 2 https://www.apple.com
|
||||
384 0.0064 5 1 https://www.xfinity.com/privacy/policy/dns
|
||||
4 0.0064 32 0 https:
|
||||
377 0.0059 19 1 https://abouthome-snippets-service.readthedocs.io/en/
|
||||
latest/data_collection.html
|
||||
393 0.0059 19 1 https://www.adjust.com/terms/privacy-policy
|
||||
410 0.0057 16 1 https://wiki.mozilla.org/Firefox/Data_Collection
|
||||
\end{verbatim}
|
||||
\caption{Top 11 webpages with highest PageRank (columns: page index, PageRank, in-links, out-links, URL)}
|
||||
\end{subfigure}
|
||||
\caption{Results of second PageRank calculation (for starting website \texttt{https://www.iisbadoni.edu.it/})}
\label{fig:run2}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run3spy}
|
||||
\caption{Spy plot of connectivity matrix}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{0.49\textwidth}
|
||||
\centering
|
||||
\includegraphics[width = \textwidth]{run3rank}
|
||||
\caption{PageRank bar graph}
|
||||
\end{subfigure}
|
||||
\begin{subfigure}{\textwidth}
|
||||
\begin{verbatim}
|
||||
|
||||
55 0.0741 354 1 https://www.instagram.com/usiuniversity
|
||||
53 0.0324 366 3 https://www.facebook.com/usiuniversity
|
||||
299 0.0248 6 1 https://twitter.com/usi_en
|
||||
329 0.0243 8 1 https://www.facebook.com/USIeLab
|
||||
308 0.0156 7 3 https://www.facebook.com/USIFinancialCommunication
|
||||
60 0.0155 316 2 https://www.swissuniversities.ch
|
||||
424 0.0144 96 1 https://it.bul.sbu.usi.ch
|
||||
330 0.0123 6 4 https://www.facebook.com/USI.ITDxC
|
||||
320 0.0122 7 1 https://www.facebook.com/usiimeg
|
||||
56 0.0107 320 0 https://www.youtube.com/usiuniversity
|
||||
\end{verbatim}
|
||||
\caption{Top 10 webpages with highest PageRank (columns: page index, PageRank, in-links, out-links, URL)}
|
||||
\end{subfigure}
|
||||
\caption{Results of third PageRank calculation (for starting website \texttt{https://www.usi.ch/})}
\label{fig:run3}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Connectivity matrix and subcliques [10 points]}
|
||||
|
||||
\subsection{Connectivity matrix and disjoint subgraphs [10 points]}
|
||||
|
||||
\subsubsection{What is the connectivity matrix G (w.r.t. Figure 5)?}
|
||||
|
||||
The connectivity matrix $G$, with $U = \{\text{alpha}, \text{beta}, \text{gamma}, \text{delta}, \text{rho}, \text{sigma}\}$, is:
|
||||
|
||||
\[G = \begin{bmatrix}
0&0&0&1&0&0\\
1&0&0&0&0&0\\
1&1&0&0&0&0\\
0&1&1&0&0&0\\
0&0&0&0&0&1\\
0&0&0&0&1&0
\end{bmatrix}\]
|
||||
|
||||
\subsubsection{What are the PageRanks if the hyperlink transition probability $p$ is the default value 0.85?}
|
||||
|
||||
First we compute the matrix $A$, obtaining:
|
||||
|
||||
\[A = \frac{1}{40} \begin{bmatrix}
1 &1 &1 &35&1 &1 \\
18&1 &1 &1 &1 &1 \\
18&18&1 &1 &1 &1 \\
1 &18&35&1 &1 &1 \\
1 &1 &1 &1 &1 &35\\
1 &1 &1 &1 &35&1
\end{bmatrix}\]
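
As a check on these entries, each one can be worked out by hand. This is only a sketch: it assumes the usual formulation $A = pGD + \delta e e^T$ with $\delta = (1-p)/n$, where $D$ (the diagonal matrix of reciprocal out-degrees $1/c_j$) and $e$ (the vector of all ones) are taken from the standard PageRank setup and are not defined in this section. For $p = 0.85$ and $n = 6$:

\[
\delta = \frac{1 - 0.85}{6} = \frac{1}{40}, \qquad
a_{ij} =
\begin{cases}
\dfrac{0.85}{2} + \dfrac{1}{40} = \dfrac{18}{40} & \text{if } g_{ij} = 1 \text{ and } c_j = 2,\\[1.5ex]
\dfrac{0.85}{1} + \dfrac{1}{40} = \dfrac{35}{40} & \text{if } g_{ij} = 1 \text{ and } c_j = 1,\\[1.5ex]
\dfrac{1}{40} & \text{if } g_{ij} = 0.
\end{cases}
\]

Under this assumption every column of $A$ sums to 1, as expected for a transition matrix.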
|
||||
|
||||
We then compute the eigenvalues and eigenvectors of $A$ in MATLAB. The eigenvector associated with eigenvalue 1, i.e.\ the solution of $Ax = x$, normalized to unit 2-norm, is:
|
||||
|
||||
\[x \approx \begin{bmatrix}
0.4771\\
0.2630\\
0.3747\\
0.4905\\
0.4013\\
0.4013
\end{bmatrix}\]
|
||||
|
||||
Thus the PageRanks are the components of $x$, rescaled so that they sum to 1, in the order given by $U$.
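
MATLAB's \texttt{eig} returns eigenvectors with unit 2-norm, so the components of $x$ listed above do not sum to 1. If the PageRanks are to be read as probabilities, one option is to divide $x$ by the sum of its components. The following is a sketch computed from the rounded values above, with $\sum_i x_i \approx 2.4079$, so the last digit may be off:

\[
\frac{x}{\sum_i x_i} \approx
\begin{bmatrix}
0.1981\\
0.1092\\
0.1556\\
0.2037\\
0.1667\\
0.1667
\end{bmatrix}
\]

The ordering of the pages is unchanged by this rescaling: delta ranks highest, followed by alpha, then rho and sigma (tied at about $1/6$), gamma, and beta.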
|
||||
|
||||
\subsubsection{Describe what happens with this example to both the definition of PageRank and the computation done by pagerank in the limit $p \to 1$.}
|
||||
|
||||
As $p$ approaches 1, the probability that a web user jumps to a random page instead of following a link decreases, so the computation of PageRank gives more weight to the links between pages.
|
||||
|
||||
In the computation, increasing $p$ decreases $\delta$ (which represents the probability of a user visiting a given page at random), and $\delta$ becomes 0 when $p = 1$.
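
For this particular graph the limit can be written out explicitly. This is a sketch, assuming the formulation $A = pGD + \delta e e^T$, where $D$ is the diagonal matrix of reciprocal out-degrees and $e$ is the vector of all ones (neither is defined in this section). At $p = 1$ we have $\delta = 0$ and $A$ reduces to $GD$:

\[
A\big|_{p=1} = GD =
\begin{bmatrix}
0&0&0&1&0&0\\
\tfrac12&0&0&0&0&0\\
\tfrac12&\tfrac12&0&0&0&0\\
0&\tfrac12&1&0&0&0\\
0&0&0&0&0&1\\
0&0&0&0&1&0
\end{bmatrix}
\]

This matrix is block diagonal: the pages alpha, beta, gamma, delta form one subgraph and the pages rho, sigma form another, with no links between the two. Each block is column-stochastic on its own and therefore contributes its own eigenvector for eigenvalue 1. As a result $Ax = x$ no longer has a solution that is unique up to scaling, and the PageRank vector is not uniquely defined in the limit.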
|
||||
|
||||
\subsection{PageRanks by solving a sparse linear system [50 points]}
|
||||
|
||||
|
||||
|
|