POS Tagging with Python(Brill Tagger in Python)

In my previous post I demonstrated how to do POS Tagging with Perl. Being a fan of Python programming language I would like to discuss how the same can be done in Python. I downloaded Python implementation of the Brill Tagger by Jason Wiener . I just downloaded it. Nice one.

Download the program from http://wareseeker.com/Software-Development/simple-nlp-part-of-speech-tagger-in-python-1.0.zip/200a88782. Create a folder and move the .zip file to the folder. Unzip it. Open the terminal and run the MakeLex.py(Python MakeLex.py). It will create lexicon for the POS Tagger. Now you can run the NLPlib.py program(python NLPlib.py). It will show the below given result.

/Desktop/postag$ python NLPlib.py 
beginning test
unpickle the dictionary
Initialized lexHash from pickled data.
Tiger ( NNP )
Woods ( NNP )
finished ( VBD )
the ( DT )
big ( JJ )
tournament ( NN )
at ( IN )
par ( NN )

I just created a small python script to tag a given English text file. Before using the script you have to comment lines 119 to end in the NLPlib.py file. Otherwise always you will be getting the above given output.

#!/usr/bin/env python
import sys
from  NLPlib import *
nlp = NLPlib()

def tagText(t):
        text = open(t,'r').readlines()
        for line in text:
                tokens = nlp.tokenize(line)

                tagged = nlp.tag(tokens)
                for i in range(len(tokens)):
                        print tokens[i],"(",tagged[i],")"



inp = sys.argv[1]

tagText(inp)

Just copy the code in to an editor. Save it as tager.py. You should run the program in the same directory where you extracted the tagger. Run it like python tager.py

This software is licensed under the GNU GPL Licence. If you are interested to use it commercial you have to buy a licence.

Happy hacking !!!!!!!

Migrated from my old blog jaganadhg.freeflux.net

Written on July 17, 2009
The Opinions Expressed In This Post Are My Own And Not Necessarily Those Of My Employer.
[ Natural Language Processing  POS Tagging  NLPROC  ]