Generic installation instructions
How to run GHTorrent locally
Depending on the size of your local mirror, you can simplify the configuration in the following ways:
- You can skip MongoDB if you only need to query the relational database and/or you only need to run GHTorrent once.
- You can use SQLite3 instead of MySQL if your setup contains only a few (say, fewer than 1000) small projects.
Install Ruby and dependencies
Make sure you run the latest release of Ruby. On the main server, GHTorrent runs on Ruby 2. If you are on Mac or Linux, you can use RVM to manage Ruby versions.
Install the necessary dependencies:
sudo apt-get install build-essential curl libmysqlclient-dev
# Install RVM and Ruby 2.2
gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3
curl -L https://get.rvm.io | bash -s stable --ruby=2.2
rvm use 2.2
gem install bundler sqlite3 #or mysql2
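As a quick sanity check after the steps above, the following sketch reports which of the expected tools are still missing from the PATH. The helper name `missing_tools` is hypothetical; it is not part of GHTorrent or RVM.

```shell
# Hypothetical helper (not part of GHTorrent or RVM): print each command
# name given as an argument that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report anything still missing after the installation steps.
missing=$(missing_tools ruby gem bundle)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "ruby, gem and bundle are all available"
fi
```

If RVM was installed in the current terminal session, you may need to open a new shell (or source the RVM script) before the tools show up.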
Install the source code
Check out the latest version of the ghtorrent Gem from GitHub. By default, it will be checked out in the directory github-mirror. The released versions of the Gem represent good states in the project's lifetime; the main mirror always runs the bleeding edge, which contains bug fixes and updates to comply with changes to GitHub's API. You then need to install the dependencies:
cd github-mirror
bundle install
Alternatively, you can just install the latest version of the GHTorrent gem:
gem install ghtorrent
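The source checkout described above might look like the following sketch. The repository URL is an assumption, inferred from the `gousiosg github-mirror` owner/name pair used in the retrieval command in this guide; adjust it if the project has moved. The function name `fetch_ghtorrent` and the `GHT_REPO_URL` variable are hypothetical.

```shell
# Assumption: the URL is derived from the repository owner and name
# (gousiosg/github-mirror) used elsewhere in this guide.
GHT_REPO_URL="${GHT_REPO_URL:-https://github.com/gousiosg/github-mirror.git}"

# Hypothetical wrapper around the checkout and dependency steps.
fetch_ghtorrent() {
  # Clone only if the checkout does not exist yet.
  if [ ! -d github-mirror ]; then
    git clone "$GHT_REPO_URL" github-mirror || return 1
  fi
  # Install the gem dependencies listed in the Gemfile.
  (cd github-mirror && bundle install)
}
```

Calling `fetch_ghtorrent` from the directory where the checkout should live performs both steps; running it again skips the clone and only refreshes the dependencies.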
Configure
If you are using MySQL, you need to create a user and a database, like so:
# Log in as the MySQL root user
mysql> create user 'ghtorrentuser'@'localhost' identified by 'ghtorrentpassword';
mysql> create user 'ghtorrentuser'@'%' identified by 'ghtorrentpassword';
mysql> grant all privileges on *.* to 'ghtorrentuser'@'localhost';
mysql> grant all privileges on *.* to 'ghtorrentuser'@'%';
# Log in as the ghtorrent user
mysql> CREATE SCHEMA IF NOT EXISTS ghtorrent DEFAULT CHARACTER SET utf8;
If you are using MongoDB, you can just disable authentication (run mongod with --noauth). If you do want to create a user, the process is a bit more involved; see below:
> db.createUser(
{
user: "root",
pwd: "admin",
roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
}
)
> use ghtorrent
> db.createUser(
{
user: "ghtorrent",
pwd: "ghtorrent",
roles: [ { role: "dbOwner", db: "ghtorrent" } ]
}
)
Download the sample configuration file, save it as config.yaml and change options as necessary. Important things to configure are:
- The database connection string
- The MongoDB connection details (if you are using it)
- Your GitHub username/password or an API token. See instructions here on how to obtain an API key
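Before putting an API token into config.yaml, it can help to confirm that GitHub accepts it. The sketch below queries the rate-limit endpoint of GitHub's REST API with the token in an `Authorization: token` header. The helper name `check_github_token` is hypothetical and not part of GHTorrent.

```shell
# Hypothetical helper (not part of GHTorrent): ask GitHub's rate-limit
# endpoint whether the given API token is accepted.
check_github_token() {
  if [ -z "$1" ]; then
    echo "usage: check_github_token <api-token>" >&2
    return 2
  fi
  # -f makes curl exit non-zero on HTTP errors such as 401 (bad credentials);
  # -sS hides the progress meter but still prints error messages.
  curl -fsS -H "Authorization: token $1" https://api.github.com/rate_limit
}
```

A valid token should produce a JSON document describing the remaining request quota; an invalid one makes curl exit with an error instead.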
Run and profit
To download the data for your first project, run:
# Retrieve one repo
ruby -Ilib bin/ght-retrieve-repo -c config.yaml gousiosg github-mirror
You should see lots of output. After a while, you will have one or two databases (depending on whether you use MongoDB) full of data!
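To retrieve more than one project, the same command can be run in a loop. The following is a minimal sketch, assuming a text file with one `owner repo` pair per line. The function name `retrieve_repos` and the `GHT_RETRIEVE` override are hypothetical; only the `ght-retrieve-repo` invocation itself comes from the command shown above.

```shell
# Hypothetical helper: run ght-retrieve-repo once per "owner repo" line
# of the file given as the first argument.
# GHT_RETRIEVE can override the command, e.g. "echo" for a dry run.
retrieve_repos() {
  while read -r owner repo; do
    # Skip blank lines and lines starting with '#'.
    case "$owner" in ''|'#'*) continue ;; esac
    # </dev/null keeps the command from consuming the rest of the list.
    ${GHT_RETRIEVE:-ruby -Ilib bin/ght-retrieve-repo} \
      -c config.yaml "$owner" "$repo" </dev/null
  done < "$1"
}
```

For example, `GHT_RETRIEVE=echo retrieve_repos projects.txt` prints the arguments that would be passed for each project without contacting GitHub (projects.txt is a placeholder file name).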