Using Solr in Django for Full-Text Searching via Solango

I've been doing quite a bit of work with Solr lately, both at the office and at home, and, by golly, I love it! It's very powerful and simple to integrate with, regardless of your platform.

In this post I'll explain how to use Solr as a data-store-independent search provider for Django projects. I'll assume that you have a functional Solr install and generally understand how to use it. If that's not the case, Apache's Solr documentation can help.

Get Solango

The easiest way to get your project talking to Solr is via the Solango Django application. Grab the source from here and copy the solango sub-directory into a directory on your PYTHONPATH.

Configure Solango

Solango must now be configured. Jump into the solango directory that you copied above and edit the settings.py file. Modify the SEARCH_UPDATE_URL, SEARCH_SELECT_URL, and SEARCH_PING_URLS settings to match your Solr environment. For example, if your Solr instance were running locally on port 8080, your settings would look like:

SEARCH_UPDATE_URL = getattr(settings, "SEARCH_UPDATE_URL", "http://localhost:8080/solr/update")
SEARCH_SELECT_URL = getattr(settings, "SEARCH_SELECT_URL", "http://localhost:8080/solr/select")
SEARCH_PING_URLS = getattr(settings, "SEARCH_PING_URLS", ["http://localhost:8080/solr/admin/ping",])
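
Since all three URLs hang off the same Solr base address, one way to keep them consistent is to derive them from a single value. This is only a sketch in plain Python 3, not part of Solango: solr_urls and its base argument are hypothetical names used for illustration, and the paths simply mirror the example settings above.

```python
def solr_urls(base):
    """Derive the three Solango URL settings from one Solr base URL.

    Hypothetical helper for illustration; Solango does not ship it.
    """
    base = base.rstrip("/")  # tolerate a trailing slash on the base URL
    return {
        "SEARCH_UPDATE_URL": base + "/update",
        "SEARCH_SELECT_URL": base + "/select",
        "SEARCH_PING_URLS": [base + "/admin/ping"],
    }


urls = solr_urls("http://localhost:8080/solr/")
for name, value in urls.items():
    print(name, "=", value)
```

The resulting values could then be assigned in your project's settings.py, where Solango's getattr calls will pick them up in place of the defaults.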

Configure Your Application

With Solango configured, you can now add the solango app to your project's settings.py. For example:

INSTALLED_APPS = (
    'solango',
    # your other apps
)

Define the Model

Now I'll show you an example model that you could search on. I'll create a standard model named Article and a Solr document type named ArticleDocument. Essentially, ArticleDocuments are the Solr equivalent of Articles. The last line ties the model and the document together. Also note the copy setting on the document's fields: it instructs Solr to make them copy fields that target the default search field.

from django.db import models

import solango

# regular model, lives relationally
class Article(models.Model):
    id = models.AutoField(primary_key=True)
    title = models.CharField(max_length=1024)
    content = models.TextField()

# SearchDocument, lives in Solr
class ArticleDocument(solango.SearchDocument):
    title = solango.fields.CharField(copy=True)
    content = solango.fields.TextField(copy=True)

# tie them together
solango.register(Article, ArticleDocument)

Define the Schema

With the metadata defined at the Django level, we must now define it within Solr. Luckily, solango simplifies this. Drop to a shell and run the following command in your project's main directory:

python manage.py solr --fields

This will output field definitions that you'll have to add to Solr's schema.xml.

########## FIELDS ###########

<field name="title" type="string" indexed="true" stored="true" omitNorms="false"
    required="false" multiValued="false"/>
<field name="content" type="string" indexed="true" stored="true" omitNorms="false"
    required="false" multiValued="false"/>
<field name="model" type="string" indexed="true" stored="true" omitNorms="false"
    required="true" multiValued="false"/>
<field name="id" type="string" indexed="true" stored="true" omitNorms="false"
    required="true" multiValued="false"/>

######## COPY FIELDS ########

<copyField source="title" dest="text"/>
<copyField source="content" dest="text"/>

Index

Assuming you have article data in your main data store, it's simple to get it indexed with the following command. Keep in mind that I've had to specify nothing about connectivity to the relational data source above. Rather than using a data-store-specific Solr data import handler, solango will use the Django model.

python manage.py solr --reindex

Pow, let it run and your data is now indexed in Solr for speedy and powerful searching!
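
To make the indexing step less of a black box, here is a rough sketch of the kind of XML update message Solr accepts at its update URL. It is not Solango's actual implementation: build_add_message is a hypothetical helper, and the id and model values are made up for illustration (Solango's real values may well differ). The field names come from the schema output above.

```python
import xml.etree.ElementTree as ET


def build_add_message(rows):
    """Build a Solr <add> update message from an iterable of field dicts."""
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            # Each Solr field becomes <field name="...">value</field>
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Illustrative values only; in a real project these would come from Article rows.
message = build_add_message([
    {"id": "1", "model": "article", "title": "Hello Solr", "content": "First post."},
])
print(message)
```

A message like this is POSTed to the update URL and followed by a commit message before the documents become searchable. Solango takes care of that for you, which is the whole point of the reindex command.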

Search

Well, none of this would be useful unless it was actually searched, no? Below is an abbreviated view that queries Solr for ArticleDocuments. Thanks to duck typing, they can fit right in most places an Article model is used.

import solango

def search(search_string):
    # query Solr and hand back the matching ArticleDocuments
    articles = solango.connection.select(q = search_string).documents
    return articles

That's just a basic search. More advanced features like facets and highlighting are supported as well. For details check out this documentation.
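
For a feel of what facets and highlighting look like at the Solr level, the sketch below builds a raw select URL using Solr's standard facet, facet.field, hl and hl.fl query parameters. It deliberately bypasses Solango (whose own API for these features is covered in its documentation), and build_select_url is a hypothetical helper written in Python 3. The base URL matches the example settings earlier in the post.

```python
from urllib.parse import urlencode

SELECT_URL = "http://localhost:8080/solr/select"  # same as the example SEARCH_SELECT_URL


def build_select_url(query, facet_field=None, highlight_field=None):
    """Build a Solr select URL, optionally asking for facets and highlighting."""
    params = [("q", query)]
    if facet_field:
        # facet=true switches faceting on; facet.field names the field to count by
        params += [("facet", "true"), ("facet.field", facet_field)]
    if highlight_field:
        # hl=true switches highlighting on; hl.fl names the field to highlight
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    return SELECT_URL + "?" + urlencode(params)


url = build_select_url("django solr", facet_field="model", highlight_field="content")
print(url)
```

Pasting a URL like this into a browser is a handy way to see what Solr returns before wiring the same options through Solango.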

Conclusion

Well, it might not be as seamless as Ruby ActiveRecord's acts_as_solr, but solango certainly makes using Solr as a search provider for Django manageable. It decouples Solr from the data store. It provides manage.py-based management of Solr. It supports high level Solr features. It's not too shabby at all, no sir.

This post was but a brief introduction. The Django Solr Documentation can help complete the picture for you.

Created on 2010-01-01 00:48:00