soft-analytics-01/docs/sections/scraping.tex
Claudio Maggioni 07232eddcc Final version of the bug-triaging project
Commit history has been discarded to remove large files from the repo.
2024-01-03 15:22:56 +01:00

\section*{Issue Scraping}
To scrape the data from GitHub, we used the API that GitHub exposes to its users.
Authenticating with our GitHub token, we issued the requests needed to retrieve the issues.
The raw issues were saved as individual JSON files (one per issue) and compressed into a \verb|.tar.gz| archive.
Some downloaded issues, however, were blank JSON files.
We suspect that these issues were available at the time of listing but have since been deleted and are no longer available through the GitHub API; we therefore chose to ignore them.
The internal issue IDs for these issues were: \verb|111293876|, \verb|116791101|, \verb|116805010|, \verb|116805553|, \verb|116805977|, \verb|116901067|, \verb|117010737|, \verb|117065474|, \verb|117067419|, \verb|117068152|, \verb|117069931|, \verb|116803071|, \verb|116923175|, \verb|116989517|, \verb|117063475|, and \verb|117067644|.
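
The steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual scraper: the helper names \verb|build_issues_request|, \verb|pack_issues| and \verb|load_issues| are hypothetical, archive members are assumed to be named after the issue ID, and a ``blank'' file is assumed to be one that is empty or contains only whitespace. The request helper targets the public GitHub REST endpoint for listing repository issues and only builds the request; it does not send it.

```python
import io
import json
import tarfile
import urllib.request
from pathlib import Path

API_ROOT = "https://api.github.com"


def build_issues_request(owner, repo, token, page=1):
    """Build (without sending) an authenticated request for one page of issues.

    Hypothetical helper: owner, repo and token are supplied by the caller.
    """
    url = (
        f"{API_ROOT}/repos/{owner}/{repo}/issues"
        f"?state=all&per_page=100&page={page}"
    )
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    return urllib.request.Request(url, headers=headers)


def pack_issues(raw_issues, archive_path):
    """Store each raw issue as <id>.json inside a gzip-compressed tar archive.

    raw_issues maps an issue ID to the raw response text, which may be empty.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for issue_id, text in raw_issues.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=f"{issue_id}.json")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def load_issues(archive_path):
    """Parse every issue in the archive, skipping blank files.

    Returns (issues, skipped_ids). Assumption: a blank file is empty or
    whitespace-only.
    """
    issues, skipped = {}, []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            issue_id = Path(member.name).stem
            text = tar.extractfile(member).read().decode("utf-8")
            if not text.strip():
                skipped.append(issue_id)  # blank file: record it and move on
                continue
            issues[issue_id] = json.loads(text)
    return issues, skipped
```

Returning the skipped IDs alongside the parsed issues keeps the list of ignored issues reproducible, rather than dropping them silently.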