Please note: Purple is no longer maintained.

What is it?

Purple is an RDF API for the Python programming language. It's heavily based on the Pyrple toolkit, written by Sean B. Palmer. Purple aims to be more extensible and to provide in-memory and MySQL storage support. Purple is a work in progress, hence some basic features are still missing. Purple is maintained by Andrea Peltrin.

Download

Download the latest snapshot of Purple.

License

Purple is released under the GPL 2 license.

Tutorial

This tutorial assumes a basic knowledge of RDF concepts.

Installation

Just unzip the Purple archive into Python's site-packages folder (or any other folder listed in your PYTHONPATH environment variable). If everything is set up correctly, typing import purple in your Python console should complete without errors, and dir(purple) should list the package contents:

PythonWin 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32.
Portions Copyright 1994-2001...
>>> import purple
>>> dir(purple)
['Graph', 'Literal', 'NTriplesParser', 'Namespace', 'Node', 'Triple', 'TurtleParser', 
'URI', 'Var', '__builtins__', '__doc__', '__file__', '__name__', '__path__', 'aliases', 
'bNode', 'graph', 'namespaces', 'node', 'parsers', 'quoting', 'serializers', 'triple', 'www']

Graph is your best friend

In Purple all the interesting stuff is done through Graph instances, backed by either in-memory or MySQL storage. You can create an in-memory RDF graph by passing 'memory' as the storage value.

from purple import Graph
#from purple.namespaces import FOAF, DC, VAR

G = Graph(storage='memory')

or use the shortcut:

G = Graph()

Now you have a Graph bound to the variable G and you are ready to rock.

We can add one or more triples to a graph instance using the add method:

from purple import Triple, URI, Literal
from purple.namespaces import DC
t = Triple(URI('http://www.deelan.com'), DC.description, Literal('Personal website of Andrea Peltrin'))
G.add([t])

The add method expects an iterable as its parameter, so we can pass either a list of triples or another graph instance.

F = Graph()
F.feedURI('http://www.deelan.com/foaf.rdf')
G.add(F)
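
Since add takes any iterable, a graph that can be iterated as a sequence of triples can be passed just like a list. Here is a minimal, self-contained sketch of that idea. It is not Purple's implementation: TinyGraph is a hypothetical stand-in, and triples are plain tuples of strings.

```python
# Hypothetical stand-in for illustration only; this is NOT Purple's Graph class.
class TinyGraph:
    def __init__(self):
        self._triples = set()

    def add(self, triples):
        # Accept any iterable of (subject, predicate, object) tuples,
        # including another TinyGraph, because graphs are iterable too.
        for triple in triples:
            self._triples.add(tuple(triple))

    def __iter__(self):
        return iter(self._triples)

    def __len__(self):
        return len(self._triples)


g = TinyGraph()
g.add([("http://www.deelan.com", "dc:description", "Personal website")])

f = TinyGraph()
f.add([("http://www.deelan.com", "dc:title", "deelan.com")])

g.add(f)  # a graph is passed the same way as a list of triples
```

Storing triples in a set also means that adding the same triple twice has no effect, which matches the RDF notion of a graph as a set of statements.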

Fundamental RDF building blocks such as URIs, literals and blank nodes are provided by the corresponding URI, Literal and bNode classes.

from purple import URI, Literal, bNode
from purple.namespaces import XSD

# URI resource
rez = URI('http://www.deelan.com/')

# Literals tagged with the Italian and English languages
litIt = Literal('Ciao mondo!', lang='it-it')
litEn = Literal('Hello world!', lang='en')

# An anonymous blank node
anon = bNode()
# A named blank node
blah = bNode('blah123')

# A literal with an XML Schema floating point datatype
floatLit = Literal('66.6', dtype=XSD.float)
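
To make the lang and dtype arguments more concrete, here is how such literals look once written out in N-Triples syntax. This is a minimal sketch: ntriples_literal is a hypothetical helper, not part of Purple, and it only escapes the most common special characters.

```python
def ntriples_literal(value, lang=None, dtype=None):
    """Render a literal in N-Triples syntax (hypothetical helper, not Purple's API)."""
    if lang and dtype:
        raise ValueError("a literal has either a language tag or a datatype, not both")
    # Escape backslashes first, then quotes and line breaks.
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    text = '"%s"' % escaped
    if lang:
        return text + "@" + lang
    if dtype:
        return text + "^^<" + dtype + ">"
    return text


XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"

print(ntriples_literal("Ciao mondo!", lang="it-it"))  # "Ciao mondo!"@it-it
print(ntriples_literal("66.6", dtype=XSD_FLOAT))
```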

You can also connect to a MySQL database and automagically stuff data into it (the MySQLdb module must already be installed on the system).

db = Graph(storage='mysql', host='localhost', db='foo', user='bar', password='secret')

We can also supply an already initialized connection object, using the connection keyword parameter.

# k is an already initialized connection object
db = Graph(storage='mysql', connection=k)

Sometimes it is more convenient or quicker to just scribble triples in Turtle or N-Triples syntax and then create an in-memory graph instance from that data.

s = """
@prefix dc:<http://purl.org/dc/elements/1.1/> .
@prefix foaf:<http://xmlns.com/foaf/0.1/> .

<http://www.deelan.com>
  dc:title "deelan.com" ;
  dc:creator [ foaf:nick "deelan" ; foaf:fname "Andrea Peltrin" ]
  .
"""
G = Graph.fromString('turtle', s)

fromString lets us create a Graph instance directly from raw data, without using the feed* methods. Since Purple needs to know the serialization format used to encode the RDF data, we pass a MIME type (or an alias) to fromString in order to invoke the correct parser.
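
The lookup from a MIME type or alias to a parser can be pictured as a small registry. The sketch below is an assumption about how such a dispatch might work, not Purple's actual code: ALIASES, PARSERS, select_parser, the parser stubs and the MIME type strings are all hypothetical.

```python
# Hypothetical registry; the names and MIME types are assumptions,
# not Purple's actual tables.
ALIASES = {
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",
}


def parse_turtle(data):
    # Stub: a real parser would return triples.
    return ("turtle", len(data))


def parse_ntriples(data):
    # Stub: a real parser would return triples.
    return ("ntriples", len(data))


PARSERS = {
    "text/turtle": parse_turtle,
    "application/n-triples": parse_ntriples,
}


def select_parser(format_name):
    """Resolve an alias to its MIME type, then look up the parser."""
    mime_type = ALIASES.get(format_name, format_name)
    try:
        return PARSERS[mime_type]
    except KeyError:
        raise ValueError("no parser registered for %r" % format_name)


parser = select_parser("turtle")  # same parser as select_parser("text/turtle")
```

Keeping aliases in a separate table means that a new serialization only needs one entry in PARSERS, plus any short names you care to give it.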

Namespaces

Purple defines a series of commonly used namespaces as instances of the Namespace class. They live in the purple.namespaces module.

RDF = Namespace('http://www.w3.org/1999/02/22-rdf-syntax-ns#')
RDFS = Namespace('http://www.w3.org/2000/01/rdf-schema#')
OWL = Namespace('http://www.w3.org/2002/07/owl#')
FOAF = Namespace('http://xmlns.com/foaf/0.1/')
DC = Namespace('http://purl.org/dc/elements/1.1/')
CC = Namespace('http://web.resource.org/cc/')
XSD = Namespace('http://www.w3.org/2001/XMLSchema#')

Here we import the FOAF namespace.

from purple.namespaces import FOAF

FOAF.homepage
<http://xmlns.com/foaf/0.1/homepage>

To create your own, use the Namespace class:

from purple.namespaces import Namespace

EX = Namespace('http://example.com/')
EX.foo
<http://example.com/foo>
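
The attribute-style access shown above can be built in Python with __getattr__. The following is a minimal sketch of the technique, assuming nothing about Purple's real Namespace class: SketchNamespace is a hypothetical name, and it returns plain strings rather than URI objects.

```python
class SketchNamespace:
    """Hypothetical illustration; this is NOT Purple's Namespace class."""

    def __init__(self, base):
        self.base = base

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so self.base is never routed through here.
        if name.startswith("__"):
            # Leave special-method lookups (copy, pickle, ...) alone.
            raise AttributeError(name)
        return "<%s%s>" % (self.base, name)


EX = SketchNamespace("http://example.com/")
print(EX.foo)  # <http://example.com/foo>
```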

Querying

db.feedURI('http://www.foafnaut.org/dump.rdf')
q=[
 Triple(URI('http://deelan.com/'), FOAF.knows, VAR.who),
 Triple(VAR.who, FOAF.nick, VAR.nick),
 Triple(VAR.who, FOAF.given, VAR.given),
 Triple(VAR.who, FOAF.mbox, VAR.mbox)
]

results = db.query(Graph(q))

Once you have filled the graph with some data, you can query it for specific values. VARs let us bind matching triple terms to a label, so that we can extract those terms later.

for r in results:
  print r[VAR.who], 'AKA', r[VAR.nick], 'mbox', r[VAR.mbox]

You can iterate over the results to retrieve the values that match your search criteria.
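
To see what binding variables across several patterns means, here is a small self-contained matcher over plain tuples. It is a sketch of the general technique, not Purple's query engine: Var, match_pattern, query and the sample data are all hypothetical.

```python
class Var:
    """A query variable (hypothetical stand-in for Purple's VAR terms)."""

    def __init__(self, name):
        self.name = name


def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple; return None on failure."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if isinstance(term, Var):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(term.name, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(triples, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    results = [{}]
    for pattern in patterns:
        extended = []
        for bindings in results:
            for triple in triples:
                match = match_pattern(pattern, triple, bindings)
                if match is not None:
                    extended.append(match)
        results = extended
    return results


data = [
    ("http://deelan.com/", "foaf:knows", "_:a"),
    ("_:a", "foaf:nick", "alice"),
    ("_:b", "foaf:nick", "bob"),
]
who, nick = Var("who"), Var("nick")
results = query(data, [
    ("http://deelan.com/", "foaf:knows", who),
    (who, "foaf:nick", nick),
])
for r in results:
    print(r["who"], "AKA", r["nick"])  # _:a AKA alice
```

Because who appears in both patterns, only the nick of a node that deelan.com actually knows ends up in the results.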

MySQL goodies

Graph instances with MySQL storage have more powerful query capabilities. The limit and offset query parameters let you be more selective about matching triples, and the exactMatch parameter turns on a full-fledged full-text search for literal nodes (at the moment there is no Google-like boolean search). The following query restricts the number of matching triples to 5:

q=[
 Triple(URI('http://deelan.com/'), FOAF.knows, VAR.who),
 Triple(VAR.who, FOAF.nick, VAR.nick),
 Triple(VAR.who, FOAF.given, VAR.given),
 Triple(VAR.who, FOAF.mbox, VAR.mbox)
]

results = db.query(Graph(q), limit=5)
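
The exact semantics of limit and offset are not spelled out here. Assuming they behave like SQL's LIMIT and OFFSET, the effect on a list of matches can be sketched as follows; paginate and the sample matches are hypothetical and not part of Purple.

```python
from itertools import islice


def paginate(matches, limit=None, offset=0):
    """Skip `offset` matches, then return at most `limit` of them."""
    stop = None if limit is None else offset + limit
    return list(islice(matches, offset, stop))


matches = ["match-%d" % i for i in range(12)]

first_page = paginate(matches, limit=5)             # match-0 .. match-4
second_page = paginate(matches, limit=5, offset=5)  # match-5 .. match-9
```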

If you want to be really fancy, you can store the queries themselves in the metabase.

import pickle, base64
s = base64.encodestring(pickle.dumps(q))

query = bNode()
t = [
  Triple(query, THIS.query, Literal(s)),
  Triple(query, DC.title, Literal('People I know')),
  Triple(query, DC.creator, URI('http://deelan.com/'))
]
G.add(t)
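
Getting a stored query back is the reverse trip: decode the base64 text, then unpickle it. Here is a standalone sketch of the round trip using only the standard library; the query is a plain list of tuples standing in for Purple triples, which is an assumption made to keep the example self-contained.

```python
import base64
import pickle

# Stand-in for a list of Purple triples (assumption for illustration).
query = [
    ("http://deelan.com/", "foaf:knows", "?who"),
    ("?who", "foaf:nick", "?nick"),
]

# base64.encodestring was removed in Python 3.9; b64encode works everywhere.
encoded = base64.b64encode(pickle.dumps(query)).decode("ascii")

# Only unpickle data you stored yourself: pickle can execute arbitrary code.
restored = pickle.loads(base64.b64decode(encoded))
```

The encoded value is plain ASCII text, so it fits in a literal node without any escaping problems.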

Metadata extraction

We can also extract metadata from popular media formats such as MP3:

G = Graph(storage='memory')
G.feedURI('http://deelan.com/sample.mp3')
print G

blah blah...

TODO list

Contribute

Feel free to contribute suggestions, ideas and code. Purple is pretty modular and it's easy to add parsers, serializers and additional storage implementations.

Written by Andrea Peltrin (aka deelan). Last updated on March 7, 2004.