Hi,
How can I retrieve the contents of a table on a web page displayed in the Chrome browser please?
I don’t want to read all the text using OCR, as this is cumbersome for an entire table and can also be unreliable at times.
I also don’t want to resort to the “Get all text on web page” action, because it does not return the contents in a format that maps easily onto the structure of the table (e.g. every table cell comes back on a new line).
I can use “Find elements by tag in browser” to get an element returned (in this particular case there is only one table on the page). But from there I can’t figure out how to make use of that to extract what I need.
The value returned from the “Find elements by tag in browser” action is a list something like this:
[<selenium.webdriver.remote.webelement.WebElement (session=“f957e8979303469d6e3911430a48ab9d”, element=“6FCEB2BAF7941B384467003E96BAEAFA_element_8736”)>]
I don’t mind getting the HTML of the table - I can strip what I need out of that once I have the text of that HTML.
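To show what I mean by stripping out what I need, here is a rough sketch of that last step using only Python's standard library html.parser. The class name TableExtractor, the helper table_html_to_rows and the sample HTML are all made up for illustration; the real input would be the table's HTML, which is the part I can't get hold of yet. It also assumes a simple table with no nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of a simple HTML table as a list of rows.

    Hypothetical helper for illustration; nested tables are not handled.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_html_to_rows(table_html):
    """Return the table's cell text as a list of rows (lists of strings)."""
    parser = TableExtractor()
    parser.feed(table_html)
    parser.close()
    return parser.rows


# Made-up sample standing in for the HTML of the real table.
sample_html = (
    "<table>"
    "<tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apples</td><td> 3 </td></tr>"
    "<tr><td>Pears</td><td>5</td></tr>"
    "</table>"
)
rows = table_html_to_rows(sample_html)
```

With something like this, each entry in rows would be one table row, which is the structure I lose with “Get all text on web page”.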
I tried to use something along the lines of
chrome.find_element_by_id(element_id)
web_element.get_attribute('outerHTML')
but couldn’t figure out how to make that work either. The chrome object doesn’t recognise the find_element_by_id function.
Then, stepping back a bit, I tried the “Find element by ID in browser” action, just to see whether I could use the info returned by the “Find elements by tag in browser” action to identify the same element. But no matter how I slice and dice that info, it doesn’t seem to include the ID value required.
Phew! So, questions:
1) If you use the actions such as “Find elements by tag in browser”, then what can you actually use the output of that action for, and how?
2) How can I extract the HTML of a specific table from a web page?
Thanks very much.