python - Pandas: Repeat function for current keyword if except
I have built a web scraper. The program enters a searchterm into a searchbox and grabs the results. Pandas goes through a spreadsheet line by line in a column to retrieve each searchterm.
Sometimes the page doesn't load properly, prompting a refresh.
I need a way to repeat the function and try the same searchterm if it fails. Right now, if I return, it will go on to the next line in the spreadsheet.
    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys

    df = pd.read_csv("searchterms.csv", delimiter=",")

    def scrape(searchterm):
        # loads url
        searchbox = driver.find_element_by_name("searchbox")
        searchbox.clear()
        searchbox.send_keys(searchterm)
        print("searching %s ..." % searchterm)
        no_result = True
        while no_result is True:
            try:
                # find results, grab them
                no_result = False
            except:
                # refresh page and do the above again for the current searchterm - how?
                driver.refresh()
        return pd.Series([col1, col2])

    # executes crawl for each line in csv
    df[["column 1", "column 2"]] = df["searchterm"].apply(scrape)
The try/except construct comes with an else clause. The else block is only executed if everything goes OK, that is, if no exception was raised in the try block:
    def scrape(searchterm):
        # loads url
        no_result = True
        while no_result:
            # find results, grab them
            searchbox = driver.find_element_by_name("searchbox")
            searchbox.clear()
            try:
                # assumes an exception is thrown if there are no results
                searchbox.send_keys(searchterm)
                print("searching %s ..." % searchterm)
            except:
                # refresh page and do the above again for the current searchterm
                driver.refresh()
            else:
                # executed only if no exceptions were thrown
                no_result = False
        # .. post-processing code here
        return pd.Series([col1, col2])
(There is also a finally block, which is executed no matter what. It is useful for cleanup tasks that don't depend on the success or failure of the preceding code.)
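To make the order of execution concrete, here is a minimal sketch that records which blocks run. The names `run`, `ok` and `fails` are made up for illustration, and `ValueError` simply stands in for whatever error a failed page load would raise.

```python
def run(action):
    """Return the names of the blocks that executed while calling action()."""
    trace = []
    try:
        action()
    except ValueError:
        trace.append("except")   # only when action() raised ValueError
    else:
        trace.append("else")     # only when action() raised nothing
    finally:
        trace.append("finally")  # always
    return trace

def ok():
    return 1

def fails():
    # Stand-in for a page that did not load properly.
    raise ValueError("page did not load")

print(run(ok))     # ['else', 'finally']
print(run(fails))  # ['except', 'finally']
```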
Also, note that an empty except catches all exceptions and is almost never a good idea. I'm not familiar with how selenium handles errors, but when catching exceptions, you should specify which exception you are expecting to handle. That way, if an unexpected exception occurs, your code will abort and you'll know that something bad happened.
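As a rough sketch of that advice applied to the retry problem, the loop below catches only one named exception and lets everything else propagate. `PageLoadError`, `retry`, `fetch` and `refresh` are hypothetical names, not part of selenium; in the real scraper you would substitute the exception selenium actually raises and pass in the code that grabs results and a function that calls `driver.refresh()`. The `max_attempts` cap is an addition of mine so that a permanently broken page cannot loop forever.

```python
class PageLoadError(Exception):
    """Hypothetical stand-in for the error raised when results are missing."""

def retry(fetch, refresh, max_attempts=3):
    """Call fetch() until it succeeds, calling refresh() after each PageLoadError."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = fetch()
        except PageLoadError:
            if attempt == max_attempts:
                raise        # give up and let the caller see the failure
            refresh()        # then loop around and try the same search again
        else:
            return result    # reached only when fetch() raised nothing

# Usage with fake functions that fail twice before succeeding.
calls = {"fetch": 0, "refresh": 0}

def flaky_fetch():
    calls["fetch"] += 1
    if calls["fetch"] < 3:
        raise PageLoadError("results missing")
    return ["col1", "col2"]

def fake_refresh():
    calls["refresh"] += 1

result = retry(flaky_fetch, fake_refresh)
print(result, calls)
```

Any other exception, such as a KeyError caused by a typo, passes straight through `retry` without triggering a refresh.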
That is also why you should try keeping as few lines as possible within the try block.
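A small sketch of why this matters, using made-up functions `wide`, `narrow` and `parse`, and assuming `ValueError` is the failure you expect from the risky call: when post-processing sits inside the try block, a bug in it is mistaken for the expected failure.

```python
def parse(raw):
    # Post-processing step; raises ValueError on non-numeric input.
    return int(raw)

def wide(fetch):
    """Too much inside try: a parse() failure looks like a fetch failure."""
    try:
        raw = fetch()
        return parse(raw)
    except ValueError:
        return "retry"

def narrow(fetch):
    """Only the risky call is guarded; parse() errors propagate."""
    try:
        raw = fetch()
    except ValueError:
        return "retry"
    else:
        return parse(raw)

print(wide(lambda: "abc"))   # retry  (the parsing bug is hidden)
print(narrow(lambda: "42"))  # 42
# narrow(lambda: "abc") raises ValueError, exposing the parsing problem
```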