How to install CKAN on RHEL 6

As the CKAN documentation for Ubuntu explains, you can stand up an instance of Linux using VirtualBox, Amazon’s EC2 or another method of your choosing. In our case, we used Amazon’s EC2. The process was identical to the standard instructions in the CKAN documentation for Ubuntu, except that we used an AMI for RHEL 6 instead of an AMI for Ubuntu to initialize the system. The AMI we used was named “RHEL-6.2-Starter-EBS-x86_64-4-Hourly2” (AMI ID ami-41d00528). This produced a Red Hat instance with a public address of ec2-23-20-186-177.compute-1.amazonaws.com.

Log on to the server

ssh -i cmapkey.pem root@ec2-23-20-186-177.compute-1.amazonaws.com

Update yum and make sure the required packages are installed

yum update

You will get a list of dependencies being resolved and packages to be installed, followed by a prompt asking you to confirm that you want to download the packages. Enter “y”, and you’ll see a list of lines that begin with “Updating” followed by the package name, and then a list of lines that begin with “Cleanup.” Finally, you’ll see the word “Complete!” followed by a command-line prompt.
  • Mercurial is a distributed revision control tool for software developers.
  • PostgreSQL is an object-relational database management system.
  • virtualenv is a tool to create isolated Python environments. It lets you create a new Python environment to run a Python app and install all package dependencies into the virtualenv without affecting your system’s site-packages.

yum install mercurial postgresql
easy_install pip
pip install virtualenv
yum install wget git-core subversion
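
Before moving on, it can help to confirm that the tools from the commands above actually ended up on the PATH. The following is a minimal sketch, not part of the original procedure; the function name missing_commands is made up for illustration, and the command names (hg for Mercurial, svn for Subversion, psql for the PostgreSQL client) are assumed to be what those packages provide.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Commands assumed to be provided by the packages installed above.
missing_commands hg psql pip virtualenv wget git svn
```

Any name printed by the last line points to a package that still needs to be installed.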

The following packages are not supported by YUM and therefore have to be installed using RPM:

  1. python-dev
  2. libpq-dev
  3. libxml2-dev
  4. libxslt-dev
  5. openjdk-6-jdk
  6. solr-jetty
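
The list above uses the Ubuntu package names from the standard CKAN instructions, while the steps that follow install RPMs with slightly different names. As a rough sketch, the correspondence used in this guide can be written down as a small lookup; the function name rhel_name is hypothetical, and only the four mappings that appear in this guide are included.

```shell
#!/bin/sh
# Map an Ubuntu development package name to the RPM name used in this guide.
# Returns non-zero when this guide gives no RPM name for the package.
rhel_name() {
    case "$1" in
        python-dev)  echo "python-devel" ;;
        libpq-dev)   echo "postgresql-devel" ;;
        libxml2-dev) echo "libxml2-devel" ;;
        libxslt-dev) echo "libxslt-devel" ;;
        *)           return 1 ;;
    esac
}

for pkg in python-dev libpq-dev libxml2-dev libxslt-dev openjdk-6-jdk solr-jetty; do
    if name=$(rhel_name "$pkg"); then
        echo "$pkg -> $name"
    else
        echo "$pkg -> (no RPM name given in this guide)"
    fi
done
```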

Install python-devel

Find a download of python-devel at http://rpm.pbone.net and then:

wget ftp://ftp.icm.edu.pl/vol/rzm2/linux-centos/6.2/os/x86_64/Packages/python-devel-2.6.6-29.el6.x86_64.rpm
rpm -ivh python-devel-2.6.6-29.el6.x86_64.rpm

Install postgresql 9.1

wget http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-redhat91-9.1-5.noarch.rpm
rpm -ivh pgdg-redhat91-9.1-5.noarch.rpm

Install libxml2 development version “libxml2-dev”

libxml2 is dependent upon zlib and zlib-devel.

wget ftp://ftp.pbone.net/mirror/ftp.sourceforge.net/pub/sourceforge/f/fu/fuduntu/yum/2012/STABLE/RPMS/zlib-1.2.5-2.fu14.x86_64.rpm
rpm -ivh zlib-1.2.5-2.fu14.x86_64.rpm
rpm -Uvh zlib-1.2.5-2.fu14.x86_64.rpm

wget ftp://ftp.pbone.net/mirror/ftp.sourceforge.net/pub/sourceforge/f/fu/fuduntu/yum/2012/STABLE/RPMS/zlib-devel-1.2.5-2.fu14.x86_64.rpm
rpm -ivh zlib-devel-1.2.5-2.fu14.x86_64.rpm
rpm -Uvh zlib-devel-1.2.5-2.fu14.x86_64.rpm

wget ftp://rpmfind.net/linux/centos/6.2/updates/x86_64/Packages/libxml2-devel-2.7.6-4.el6_2.4.x86_64.rpm
rpm -ivh libxml2-devel-2.7.6-4.el6_2.4.x86_64.rpm
rpm -Uvh libxml2-devel-2.7.6-4.el6_2.4.x86_64.rpm

Install libxslt development version “libxslt-devel”

wget ftp://rpmfind.net/linux/centos/6.2/os/x86_64/Packages/libgpg-error-devel-1.7-4.el6.x86_64.rpm
rpm -ivh libgpg-error-devel-1.7-4.el6.x86_64.rpm

wget ftp://rpmfind.net/linux/centos/6.2/os/x86_64/Packages/libgcrypt-devel-1.4.5-9.el6.x86_64.rpm
rpm -ivh libgcrypt-devel-1.4.5-9.el6.x86_64.rpm

wget ftp://rpmfind.net/linux/centos/6.2/os/x86_64/Packages/libxslt-devel-1.1.26-2.el6.x86_64.rpm
rpm -ivh libxslt-devel-1.1.26-2.el6.x86_64.rpm

Install virtual environment for python

virtualenv pyenv

Activate your virtual environment

To work with CKAN it is best to adjust your shell settings so that your shell uses the virtual environment you just created. (Note that the period and space at the beginning of the command are necessary.)

. pyenv/bin/activate

When the shell is activated you will see the prompt change to something like this:

(pyenv)[root@ip-10-190-46-61 ~]#
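
Besides looking at the prompt, a script can check whether a virtual environment is active. This is a small sketch that relies on the VIRTUAL_ENV variable, which the activate script exports; the function name in_virtualenv is made up for illustration.

```shell
#!/bin/sh
# Succeeds only when a virtualenv is active, that is, when VIRTUAL_ENV
# is set to a non-empty value.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "virtualenv active: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: . pyenv/bin/activate"
fi
```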

Install CKAN source code

Here is how to install the latest code (HEAD on the master branch):

pip install --ignore-installed -e git+https://github.com/okfn/ckan.git#egg=ckan

Install Additional Dependencies

CKAN has a set of dependencies it requires which you should install too. These are listed in three text files: requires/lucid_*.txt, followed by WebOb explicitly.

First we install two of the three lists of dependencies:

pip install --ignore-installed -r pyenv/src/ckan/requires/lucid_missing.txt -r pyenv/src/ckan/requires/lucid_conflict.txt
pip install webob==1.0.8

Install psycopg2 and pylons

  • Psycopg is the most popular PostgreSQL adapter for the Python programming language.
  • Pylons is a Web application framework written in Python.

yum install python-psycopg2
yum install python-pylons

Install cython for c compiler

  • Cython is a programming language to simplify writing C and C++ extension modules for the CPython Python runtime.

easy_install cython

(This takes a while.)

Install more required python modules

In the standard install instructions you would run: pip install --ignore-installed -r pyenv/src/ckan/requires/lucid_present.txt. However, it produces an error message on psycopg2, so instead we’ll install each package listed in lucid_present.txt manually (except for Psycopg and webob, which were already installed in the instructions above). Note that the install of the lxml package may take a long time.

  • Babel is a collection of tools for internationalizing Python applications.
  • lxml is a Pythonic binding for the C libraries libxml2 and libxslt. It provides a library for processing XML and HTML in Python.
  • repoze.who is an identification and authentication framework for WSGI applications.

pip install babel==0.9.4
pip install lxml==2.2.4
pip install Pylons==0.9.7
pip install repoze.who==1.0.19
pip install tempita==0.4
pip install zope.interface==3.5.3
pip install repoze.who.plugins.openid==0.5.3
pip install repoze.who-friendlyform==1.0.8
pip install routes==1.11
pip install paste==1.7.2
pip install pastescript==1.7.3
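
If you need to repeat this step on another machine, the same pins could be collected in a requirements file of your own. The sketch below only writes the file; the file name ckan-pins.txt is an arbitrary choice, and installing from it has not been tested as part of this walkthrough, so it may run into build errors that the one-by-one installs avoid.

```shell
#!/bin/sh
# Hypothetical file name and location; any writable path works.
req_file="${TMPDIR:-/tmp}/ckan-pins.txt"

# The same pinned versions as the pip commands above (psycopg2 and webob omitted).
cat > "$req_file" <<'EOF'
babel==0.9.4
lxml==2.2.4
Pylons==0.9.7
repoze.who==1.0.19
tempita==0.4
zope.interface==3.5.3
repoze.who.plugins.openid==0.5.3
repoze.who-friendlyform==1.0.8
routes==1.11
paste==1.7.2
pastescript==1.7.3
EOF

echo "wrote $req_file"
# To install from it (not run here):
#   pip install -r "$req_file"
```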

At this point you need to deactivate and then re-activate your virtual environment to ensure that all the scripts point to the correct locations.

deactivate

. pyenv/bin/activate

Set up a PostgreSQL database

Initialize and start Postgresql DB:

service postgresql initdb
service postgresql start

List existing databases:

sudo -u postgres psql -l

Create a user called ckanuser and enter “pass” for the password when prompted. (For actual production, you’ll want a more secure password).

sudo -u postgres createuser -S -D -R -P ckanuser

Now create the database (owned by ckanuser), which we’ll call ckantest:

sudo -u postgres createdb -O ckanuser ckantest

Install Postgresql development version

This will be used in setting up database using paster. Just execute the statement below.

yum install postgresql-devel

Then find pg_hba.conf and change “ident” to “trust”:

find / -name pg_hba.conf
nano /var/lib/pgsql/data/pg_hba.conf

Example lines in pg_hba.conf:

host all 127.0.0.1 255.255.255.255 trust
host all 127.0.0.1 255.255.255.255 trust
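
If you would rather not edit the file by hand, the change from “ident” to “trust” can be sketched with sed. This is an illustration, not the procedure we followed: the function name set_trust_auth is made up, the sample lines below are invented stand-ins for a real pg_hba.conf, and the demo runs on a temporary file rather than on /var/lib/pgsql/data/pg_hba.conf. Remember that “trust” disables password checks for those connections, so it is only suitable for a test machine.

```shell
#!/bin/sh
# Replace a trailing "ident" auth method with "trust" on every
# non-comment line of the given file. A backup is kept as <file>.bak.
set_trust_auth() {
    sed -i.bak '/^[[:space:]]*#/!s/[[:space:]]ident[[:space:]]*$/ trust/' "$1"
}

# Demo on a temporary file with invented sample content.
demo=$(mktemp)
printf '%s\n' \
    '# TYPE  DATABASE  USER  CIDR-ADDRESS   METHOD' \
    'local   all       all                  ident' \
    'host    all       all   127.0.0.1/32   ident' > "$demo"

set_trust_auth "$demo"
cat "$demo"
```

After changing the real file, PostgreSQL has to re-read it, for example with service postgresql reload.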

Disable firewall in Redhat

service iptables stop

Install and configure Solr

Installing Solr on Red Hat

Create a CKAN config file

cd ~/pyenv/src/ckan
paster make-config ckan development.ini

Modify the development.ini file, assigning the following values:

host = [DNS]
sqlalchemy.url = postgresql://ckanuser:pass@localhost/ckantest
ckan.simple_search = 1 (If Solr is installed, leave this line commented out.)
ckan.site_url: [DNS]
ckan.site_id = [DNS]

Example:

host = ec2-184-73-91-27.compute-1.amazonaws.com
sqlalchemy.url = postgresql://ckanuser:pass@localhost/ckantest
ckan.simple_search = 1
ckan.site_url: ec2-184-73-91-27.compute-1.amazonaws.com
ckan.site_id = ec2-184-73-91-27.compute-1.amazonaws.com
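
A quick way to double-check the edited values is to read them back from the file. The following sketch uses a hypothetical helper, ini_value, that prints the value of a “key = value” or “key: value” line; it is demonstrated on a temporary file holding a few of the example lines above, and you would point it at development.ini instead.

```shell
#!/bin/sh
# Print the value of the first "key = value" or "key: value" line
# for the given key in an ini-style file.
ini_value() {
    # Escape dots so that a key such as sqlalchemy.url matches literally.
    key_re=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^${key_re}[[:space:]]*[=:][[:space:]]*//p" "$1" | head -n 1
}

# Demo file with values copied from the example above.
demo_ini=$(mktemp)
cat > "$demo_ini" <<'EOF'
host = ec2-184-73-91-27.compute-1.amazonaws.com
sqlalchemy.url = postgresql://ckanuser:pass@localhost/ckantest
ckan.site_url: ec2-184-73-91-27.compute-1.amazonaws.com
EOF

for key in host sqlalchemy.url ckan.site_url; do
    echo "$key -> $(ini_value "$demo_ini" "$key")"
done
```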

Enable the synchronous_search Plugin

New datasets added via the web UI won’t show up in search results or on the datasets page until you enable the synchronous search plugin in your CKAN config file (e.g. development.ini). Find the plugins line and set it to something like this:

ckan.plugins = stats synchronous_search

Create database tables

Now that you have a configuration file that has the correct settings for your database, you’ll need to create the tables. Make sure you are still in an activated environment with (pyenv) at the front of the command prompt and then from the pyenv/src/ckan directory run these commands:

service postgresql reload
pip install psycopg2
paster --plugin=ckan db init

(If your config file is called something other than development.ini then you need to specify it in the final command, e.g. paster --plugin=ckan db init --config=test.ckan.net.ini.)

You will be prompted with this message if successful: “Initialising DB: SUCCESS.”

Create the cache and session directories

You need to create two directories for CKAN to put temporary files:

  • Pylons’ cache directory, specified by cache_dir in the config file.
  • Repoze.who’s OpenId session directory, specified by store_file_path in pyenv/src/ckan/who.ini

(from the pyenv/src/ckan directory or wherever your CKAN ini file you recently created is located):

mkdir data sstore

Link to who.ini

who.ini (the Repoze.who configuration) needs to be accessible in the same directory as your CKAN config file. So if your config file is not in pyenv/src/ckan, then cd to the directory with your config file and create a symbolic link to who.ini. e.g.:

ln -s pyenv/src/ckan/who.ini

(In this walkthrough the config file is already in pyenv/src/ckan, so this step is not actually necessary.)

Test the CKAN webserver

(from the pyenv/src/ckan directory):

paster serve development.ini

If this runs successfully, you’ll see a message such as “serving on http://10.190.102.209:5000”.

Browse the CKAN site at localhost

Install console browser to be able to browse localhost or 127.0.0.1

Launch a separate ssh connection to the server. Then enter:

yum install w3m
w3m http://10.190.102.209:5000

To exit the browser, press q and confirm with y.

To quit serving the site, press Ctrl-C in the console where you ran paster serve development.ini.

Create an Admin User

By default, CKAN has a set of locked-down permissions. To begin working with it you need to set up a user and some permissions. First create an admin account from the command line (you must be root, sudo -s):

paster --plugin=ckan user add admin --config=development.ini

When prompted, enter a password. This is the password you will use to log in to CKAN. In the resulting output, note that you will also get assigned a CKAN API key.

This command is your first introduction to some important CKAN concepts. paster is the script used to run CKAN commands. development.ini is the CKAN config file. You can change options in this file to configure CKAN.

For exploratory purposes, you might as well make the admin user a sysadmin. You obviously wouldn’t give most users these rights as they would then be able to do anything. You can make the admin user a sysadmin like this:

paster --plugin=ckan sysadmin add admin --config=development.ini

Restart the server

paster serve development.ini

Then visit the site from your web browser at your server’s domain and port:

http://ec2-23-20-186-177.compute-1.amazonaws.com:5000

Success!
