Urchin is a Web-based, customisable RSS aggregator and filter. Its primary purpose is to allow the generation of new RSS feeds by running queries against the collection of items in the Urchin database. However, other arbitrary output formats can be defined and generated using XSL transformations or HTML::Template templates.

In other words, the collection of Urchin Perl modules forms a foundation for building an RSS aggregation or portal service.
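The idea of deriving a new feed from a query over stored items can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Urchin's Perl API. It assumes an in-memory list in place of the Urchin database and a simple keyword match in place of Urchin's query language. The names Item, query_items and build_feed are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for one aggregated RSS item."""
    title: str
    link: str
    description: str


def query_items(items, keyword):
    """Return items whose title or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        item for item in items
        if needle in item.title.lower() or needle in item.description.lower()
    ]


def build_feed(title, link, items):
    """Serialise the given items as a minimal RSS 2.0 document (returned as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Items selected by a query"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item.title
        ET.SubElement(node, "link").text = item.link
        ET.SubElement(node, "description").text = item.description
    return ET.tostring(rss, encoding="unicode")
```

A caller would first select items with query_items and then pass the result to build_feed. A real service would run the query against the database rather than a Python list.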

Urchin also includes code to help administrators import RSS-like data from non-RSS sources. Currently there is code to create RSS objects either by regular-expression-based scraping of text files or from SQL queries against Perl DBI sources.
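As a rough illustration of regular-expression-based scraping, the sketch below turns lines of a text file into RSS-like records. It is written in Python for illustration only and does not reproduce Urchin's Perl import code. The "date | title | link" line format, the LINE_PATTERN expression and the scrape_items function are all assumptions made for this example.

```python
import re

# Assumed input format, one item per line: "YYYY-MM-DD | title | link"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s*\|\s*(?P<title>[^|]+?)\s*\|\s*(?P<link>\S+)\s*$"
)


def scrape_items(text):
    """Return one dict (date, title, link) per line that matches LINE_PATTERN."""
    items = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            # Lines that do not match the pattern are skipped silently.
            items.append(match.groupdict())
    return items
```

Each returned dict carries the fields an RSS item would need. Lines that do not fit the assumed format are ignored rather than raising an error.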

Urchin is written in Perl using, wherever possible, pre-existing Perl modules.

For a more detailed discussion of the internals of the software and function of the Urchin Perl modules, see The Design and Architecture of Urchin.

Changes in version 0.92:

System Requirements

The code has been tested on Red Hat Linux 8.0 (Apache 2.0.40, MySQL 4.0.13 and Perl 5.8.0) and on Libranet GNU/Linux 2.8 (Apache 1.3.27, MySQL 4.0.13 and Perl 5.8.0).

Download

A tarball can be downloaded from http://sourceforge.net/projects/urchin. The current release is Urchin-0.92.tar.gz.

Alternatively, installation from CVS is described in the installation instructions.

Note that there are currently (August 20, 2004) some problems with SourceForge CVS access. During the outage, you can get the latest development code snapshot by downloading Urchin-dev-20040820.tar.gz from the project download page.

Installation

These installation instructions currently only cover installation for users with root access to the host machine.

Usage

The Usage guidelines detail how to start using an Urchin installation, from the point of view of a webmaster wishing to provide RSS aggregation, filtering and transformation services.

Wishlist

There is a wishlist of not-yet-implemented features.

Review

As part of the JISC-funded project that led to Urchin, Ben Hammersley wrote an independent review of the software for JISC. You can read that review, which was based on version 0.81, here.

License

Urchin is Free Software. Portions of the code are licensed under the GNU General Public License, the rest under the GNU Lesser General Public License.

Credits

The code was written by Martin T. Flack (martin AT neoreality DOT com) of NeoReality, based on a design by Ben Lund (b.lund AT nature DOT com) and Timo Hannay (t.hannay AT nature DOT com) of Nature Publishing Group. The database schema was designed by Nicole Tindill (n.tindill AT nature DOT com), also of Nature Publishing Group.

Programming by NeoReality, along with other related costs, was funded by the UK Joint Information Systems Committee (JISC) as one of the PALS Metadata and Interoperability Group projects. Other staff time was provided by Nature Publishing Group.

Contact

For comments, questions, or contributions, please contact Ben Lund at b.lund AT nature DOT com.