Thursday, October 27, 2011

[Py] The Python Challenge -- Level 03

Here are my notes on how I solved Level 03 of The Python Challenge.

The main libraries used are urllib.request, for retrieving the web page source, and re, for regular expression operations.



At the beginning, I was totally lost. There was nothing obvious for me to process with Python. So I did a quick search on Google, trying not to look at any possible solution. I glanced at the re library and thought about the Level 02 challenge, in which the data to be processed was given in the web page source. Yes, in Level 03 the raw material is embedded in the page source again.

While writing the code, I happened to notice that there was a hint on the tab of the web page... Orz

I started with the re.search() function. It took me some time to figure out what groups() and group() give back on the match object returned by re.search(). I used them to get the string ``IQNlQSL'', but it was not the right answer for the link. Then I tried modifying the link with only the lowercase ``l'', and the web page responded with a message that read ``yes. but there are more.'' From that I concluded that I had to find all the matching pieces in the text.
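To make the difference between group(), groups() and re.findall() concrete, here is a small sketch. The sample string is made up for illustration (it is not the challenge data), and the pattern is the same one used in the code further down.

```python
import re

# Hypothetical sample text, NOT the real challenge data.
sample = "xABCdEFGy--mHIJkLMNo"
pattern = r'([^A-Z][A-Z]{3})([a-z])([A-Z]{3}[^A-Z])'

# re.search() stops at the first match and returns a match object.
first = re.search(pattern, sample)
whole = first.group(0)    # the entire matched text
parts = first.groups()    # a tuple holding every parenthesised group
letter = first.group(2)   # only the second group, the lowercase letter

# re.findall() returns one tuple of groups for every match in the text.
all_letters = [g[1] for g in re.findall(pattern, sample)]

print(whole, parts, letter, all_letters)
```

With re.search() alone only the first guarded letter is visible, which would explain the ``yes. but there are more.'' reply.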

The tool that finally worked was re.findall(). The original version of my code (in Python 3) is as follows.

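import urllib.request
import re

# Fetch the raw page source of the Level 03 page.
source = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/equality.html")

# str() on bytes gives the repr (b'...'), in which each newline shows up as
# the two characters \n; remove those, then keep the text after the last "<!--".
src_str = str(source.read()).replace('\\n','').split('<!--')
target_str = src_str[-1]

# A lowercase letter with exactly three capital letters on each side.
m = re.findall(r'([^A-Z][A-Z]{3})([a-z])([A-Z]{3}[^A-Z])', target_str)

# Each match is a tuple of the three groups; the middle one is the letter.
for c in m:
    print(c[1], end="")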
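For comparison, here is a sketch of a slightly tidier variant. It is not the code I originally ran. The helper names fetch_comment_block and find_guarded_letters are made up for this note, the page is assumed to be UTF-8, and the guard characters are replaced by lookarounds so that a match at the very start or end of the text is not lost.

```python
import re
import urllib.request

# Exactly three capitals on each side of one lowercase letter.
# The lookarounds reject a fourth capital without consuming a character.
GUARDED = re.compile(r'(?<![A-Z])[A-Z]{3}([a-z])[A-Z]{3}(?![A-Z])')


def fetch_comment_block(url):
    """Download the page and return the text after the last '<!--'.

    Assumes the page is UTF-8 encoded (an assumption, not checked here).
    """
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8")
    return html.split("<!--")[-1]


def find_guarded_letters(text):
    """Join every lowercase letter guarded by exactly three capitals per side."""
    return "".join(GUARDED.findall(text))


# Offline check on made-up text; no network access needed.
print(find_guarded_letters("xABCdEFGy--mHIJkLMNo"))
```

To run it against the real page, pass the result of fetch_comment_block("http://www.pythonchallenge.com/pc/def/equality.html") to find_guarded_letters().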
