Python: removing punctuation from a string/text

Written by Jaganadh Gopinadhan

While performing text processing, one may need to remove punctuation from the text. There is an easy way to do this in Python using the ‘string’ module, e.g.

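#!/usr/bin/env python
import sys
import string

# Read the whole input file named on the command line.
with open(sys.argv[1], 'r') as infile:
    txt = infile.read()
# Strip each ASCII punctuation character in turn.
for punct in string.punctuation:
    txt = txt.replace(punct, "")
print(txt)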
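
The loop above makes one pass over the text for every punctuation character. As a minimal sketch of an alternative, assuming Python 3, the same result can be had in a single pass with str.translate. The function name strip_punctuation and the sample sentence are my own, only for illustration.

```python
import string

# Translation table that maps every ASCII punctuation character to None,
# which str.translate treats as "delete this character".
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text):
    """Return text with all characters in string.punctuation removed."""
    return text.translate(_PUNCT_TABLE)


if __name__ == "__main__":
    # Hypothetical sample input, not from a real file.
    print(strip_punctuation("Hello, world! (It's 100% done.)"))
```

This swaps the replace loop for a translation table; it is not the method used in the script above, just another way to get the same output.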

The punctuation characters contained in ‘string.punctuation’ are !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ which makes 32 punctuation marks in total.
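
If you want to check the list on your own interpreter, a quick sketch like this prints the characters and how many there are. It only uses the standard ‘string’ module.

```python
import string

# The exact set of characters that the loop removes.
print(string.punctuation)

# How many characters are in that set.
print(len(string.punctuation))
```

Note that this list covers ASCII punctuation only, so typographic marks such as curly quotes are not part of it.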

If you would like to retain some of the punctuation marks in the above list, you can modify the loop. For example, to retain “%” and “$” in the text, the code looks like this:

#!/usr/bin/env python
import sys
import string

with open(sys.argv[1], 'r') as infile:
    txt = infile.read()
# Punctuation marks to keep in the text.
exclude = ['$', '%']
for punct in string.punctuation:
    if punct not in exclude:
        txt = txt.replace(punct, "")
print(txt)
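
The same idea can be wrapped in a small reusable function that works on a string instead of a file. This is only a sketch: the name strip_punctuation_except, the keep parameter and the sample sentence are assumptions of mine, not part of any library.

```python
import string


def strip_punctuation_except(text, keep="$%"):
    """Remove ASCII punctuation from text, except characters listed in keep."""
    # Set difference gives the punctuation that should still be removed.
    to_remove = set(string.punctuation) - set(keep)
    return "".join(ch for ch in text if ch not in to_remove)


if __name__ == "__main__":
    # Hypothetical sample input, not from a real file.
    print(strip_punctuation_except("Price: $5.00 (up 10%)!"))
```

One thing to watch for: the decimal point is punctuation too, so "5.00" becomes "500" unless "." is added to keep.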

Happy Hacking !!!!!!!

Migrated from my old blog jaganadhg.freeflux.net

Written on September 16, 2009
The Opinions Expressed In This Post Are My Own And Not Necessarily Those Of My Employer.
[ Python  Text Processing  ]