The main library used is the urllib.request. In Python 2.x, it is just urllib.
When dealing with the challenge, I viewed the page source directly from Firefox and read the further hint: find rare characters in the mess below. With the hint, I copied the messy paragraph and tried to paste it in Vim for further text processing, but my Vim hung when I did the pasting. I had no idea about the hanging but didn't want to figure out the possible problem on the side of Vim. Instead, I decided to capture the messy paragraph within the code. It would be a better approach, I thought.
After getting the messy paragraph, I assumed that there were some readable letters embedded in the paragraph of big mess. Therefore I tried to pick out ascii characters from the messy paragraph. It worked. :-)
The original version of the my code (in Python 3) is as follows.
import urllib.request import string source = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/ocr.html") src_str = str(source.read()).replace('\\n','').split('<!--') target_str = src_str[-1] for c in target_str: if c in string.ascii_letters: print(c, end="")
No comments:
Post a Comment