selenium - Iterating over table rows in Selenium (Python)


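I have a web page with a table that only appears when I click 'Inspect Element' and is not visible in the View Source page. The table contains only two rows, each with the same number of cells: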


<table class="datadisplaytable">
  <tbody>
    <tr>
      <td class="dddefault">16759</td>
      <td class="dddefault">MATH</td>
      <td class="dddefault">123</td>
      <td class="dddefault">001</td>
      <td class="dddefault">Calculus</td>
      <td class="dddefault"></td>
      <td class="dddead"></td>
      <td class="dddead"></td>
    </tr>
    <tr>
      <td class="dddefault">16449</td>
      <td class="dddefault">PHY</td>
      <td class="dddefault">456</td>
      <td class="dddefault">002</td>
      <td class="dddefault">Physics</td>
      <td class="dddefault"></td>
      <td class="dddead"></td>
      <td class="dddead"></td>
    </tr>
  </tbody>
</table>



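What I want to do is loop through the rows and return the text contained in each cell. I can't manage this with Selenium because the elements have no IDs and I'm not sure how else to get at them. I'm not very familiar with XPath.

Here is a debugging attempt that returns a TypeError: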


def check_grades(self):
    table = []
    for i in self.driver.find_element_by_class_name("dddefault"):
        table.append(i)
    print(table)



What is a simple way to get the text from the table rows?


If you want to use XPath, you can use the following:


h = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</table>"""

from lxml import html

xml = html.fromstring(h)

# get the table
table = xml.xpath("//table[@class='datadisplaytable']")[0]

# iterate over all the rows
for row in table.xpath(".//tr"):
    # get the text from all the td's in each row; [text()] skips empty cells
    print([td.text for td in row.xpath(".//td[@class='dddefault'][text()]")])



Output:

['16759', 'MATH', '123', '001', 'Calculus']
['16449', 'PHY', '456', '002', 'Physics']



So, to do the same thing with Selenium:


table = driver.find_element_by_xpath("//table[@class='datadisplaytable']")

for row in table.find_elements_by_xpath(".//tr"):
    # [text()] keeps only the cells that actually contain text
    print([td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][text()]")])



For multiple tables:


def get_row_data(table):
    for row in table.find_elements_by_xpath(".//tr"):
        yield [td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][text()]")]

for table in driver.find_elements_by_xpath("//table[@class='datadisplaytable']"):
    for data in get_row_data(table):
        print(data)  # use the data
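The `find_element_by_*` / `find_elements_by_*` helpers used above were deprecated in Selenium 4 and removed in later 4.x releases, where locators are passed as `find_element(By.XPATH, ...)` instead. Below is a minimal sketch of the same multi-table loop in that style. It assumes a `driver` that has already loaded the page; the function name `iter_rows` and its `table_class` parameter are hypothetical, and the fallback `By` class exists only so the sketch can be imported when Selenium is not installed. Note that the singular `find_element` returns one element, which is why iterating over its result in the question raises a TypeError, while `find_elements` returns a list.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Fallback so this sketch imports without Selenium installed;
    # it mirrors the string value of Selenium's By.XPATH constant.
    class By:
        XPATH = "xpath"


def iter_rows(driver, table_class="datadisplaytable"):
    """Yield, for every row of every matching table, the non-empty
    text of its 'dddefault' cells (hypothetical helper)."""
    tables = driver.find_elements(By.XPATH, f"//table[@class='{table_class}']")
    for table in tables:
        for row in table.find_elements(By.XPATH, ".//tr"):
            cells = row.find_elements(By.XPATH, ".//td[@class='dddefault']")
            # Filter in Python instead of using the [text()] predicate
            yield [cell.text for cell in cells if cell.text]
```

With a live session this would be used as `for data in iter_rows(driver): print(data)`.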




Tested with Python 3.x


#!/usr/bin/python

h = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</table>"""

from lxml import html

xml = html.fromstring(h)

# get the table
table = xml.xpath("//table[@class='datadisplaytable']")[0]

# iterate over all the rows
for row in table.xpath(".//tr"):
    # get the text from all the td's in each row
    print([td.text for td in row.xpath(".//td[@class='dddefault']")])
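Without a `[text()]` predicate, the empty `dddefault` cell in each row is also selected and its `.text` comes back as `None`. As a minimal sketch of the same row-by-row extraction when lxml is not available, the standard library's `xml.etree.ElementTree` can parse this particular snippet because it happens to be well-formed XML (real-world HTML often is not). The shortened `SAMPLE` markup and the helper name `rows_from_table` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the table from the question (illustrative only)
SAMPLE = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault"></td>
<td class="dddead"></td>
</tr>
</table>"""


def rows_from_table(markup):
    """Return one list per <tr> holding the non-empty 'dddefault' cell texts."""
    table = ET.fromstring(markup)
    rows = []
    for row in table.iter("tr"):
        cells = row.findall("td[@class='dddefault']")
        # An empty <td> has .text == None, so filter those out here
        rows.append([td.text for td in cells if td.text])
    return rows


if __name__ == "__main__":
    for row in rows_from_table(SAMPLE):
        print(row)
```

Filtering in Python rather than in the XPath expression is a design choice here, since ElementTree supports only a limited XPath subset.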




XPath is brittle; it is better to use CSS selectors or class names:


mytable = driver.find_element_by_css_selector('table.datadisplaytable')

for row in mytable.find_elements_by_css_selector('tr'):
    for cell in row.find_elements_by_tag_name('td'):
        print(cell.text)





table = driver.find_element_by_xpath("//table[@class='datadisplaytable']")

for row in table.find_elements_by_xpath(".//tr"):
    print([td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault']")])



...