I am scraping profiles on ask.fm for a research question. The problem is
that only the most recent questions are shown, and I have to
click "View more" to see the next 15.
The HTML for the "View more" button looks like this:
<input class="submit-button-more submit-button-more-active" name="commit" onclick="return Forms.More.allowSubmit(this)" type="submit" value="View more" />
What is an easy way to click this 4 times before scraping the page? I want the most recent 60 posts on the site. Python is preferable.
You could probably use Selenium to open the page in a real browser and click the button a few times. You can get it here:
https://pypi.python.org/pypi/selenium
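To make the Selenium suggestion concrete, here is a minimal sketch rather than a tested solution for ask.fm. It assumes the button can be found through its submit-button-more class (taken from the HTML in the question), that Firefox and its driver are installed, and that a short fixed pause is enough for each batch to load. The function names, the selector constant, and the pause length are made up for illustration.

```python
import time

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; using the
# literal keeps this helper importable even without Selenium installed.
CSS = "css selector"
# Assumed selector, built from the class on the button quoted in the question.
MORE_BUTTON = "input.submit-button-more"


def click_view_more(driver, clicks=4, pause=2.0):
    """Click "View more" up to `clicks` times; return how many clicks were made."""
    done = 0
    for _ in range(clicks):
        buttons = driver.find_elements(CSS, MORE_BUTTON)
        if not buttons:
            break  # no button left, so nothing more to load
        buttons[0].click()
        done += 1
        time.sleep(pause)  # crude fixed wait for the next batch (an assumption)
    return done


def fetch_profile_html(url, clicks=4):
    """Open `url` in Firefox, expand the list, and return the page HTML."""
    # Imported here so the helper above has no hard dependency on Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        click_view_more(driver, clicks)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling fetch_profile_html with a profile URL should return the expanded page as a string, which can then be handed to whatever HTML parser you already use. If the fixed pause turns out to be unreliable, Selenium's explicit waits are the usual replacement.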
Or you might be able to do it with mechanize:
http://wwwsearch.sourceforge.net/mechanize/
I have also heard good things about twill, but have never used it myself:
http://twill.idyll.org/
Source: http://stackoverflow.com/questions/19437782/scraping-dynamic-data