Hi all,
I am trying to import some Python packages that I need for advanced document analysis and ran into some trouble that I was unfortunately not able to resolve myself.
As you explained in another issue, I changed my Python path to my Anaconda base environment using
import os
import sys
# Point the interpreter at the Anaconda base environment
python_path = user_path + '\\Anaconda3'
sys.path.append(os.path.join(python_path, 'Lib'))
sys.path.append(os.path.join(python_path, 'Lib', 'site-packages'))
When I now try to import
from docx.api import Document
everything works. However, when I then want to read the document, an error occurs:
word_doc = Document(file)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "docx\api.py", line 25, in Document
File "docx\opc\package.py", line 128, in open
File "docx\opc\pkgreader.py", line 32, in from_file
File "docx\opc\phys_pkg.py", line 101, in __init__
File "zipfile.py", line 1258, in __init__
File "zipfile.py", line 1321, in _RealGetContents
File "zipfile.py", line 259, in _EndRecData
ValueError: I/O operation on closed file.
This does not happen with my local Python interpreter (in the same environment).
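In case it helps to compare the two setups, here is a small sketch that could be run both locally and inside the Python node to see which interpreter and which zipfile module are actually in use. The function name interpreter_report is only made up for illustration, and I am assuming the node allows plain print output.

```python
import sys
import zipfile


def interpreter_report():
    """Collect details that identify which Python is actually running."""
    return {
        # Full version string of the running interpreter
        "version": sys.version,
        # Path of the interpreter binary (may be empty in embedded setups)
        "executable": sys.executable,
        # Where the standard-library zipfile module was loaded from
        "zipfile_module": getattr(zipfile, "__file__", None),
        # Copy of the module search path at the time of the call
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(key, "=", value)
```

If the version or the zipfile location differs between the two runs, the appended Anaconda folders are probably being mixed with a different interpreter, which might explain why the same packages behave differently.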
When trying to solve the problem via
with open(file, "r") as f:
    word_doc = Document(file)
I also get an error:
Traceback (most recent call last):
File "<string>", line 16, in read_table_word
File "docx\api.py", line 25, in Document
File "docx\opc\package.py", line 128, in open
File "docx\opc\pkgreader.py", line 32, in from_file
File "docx\opc\phys_pkg.py", line 101, in __init__
File "zipfile.py", line 1258, in __init__
File "zipfile.py", line 1325, in _RealGetContents
zipfile.BadZipFile: File is not a zip file
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "<string>", line 26, in read_table_word
AttributeError: '_io.TextIOWrapper' object has no attribute 'split'
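Since a .docx file is a zip container, opening it in text mode ("r") is probably part of what goes wrong in my second attempt. A minimal sketch of what I could try instead, using only the standard library: read the file in binary mode, keep it in memory, and check that it looks like a zip archive. The helper name load_docx_bytes is hypothetical, and I have not confirmed that this avoids the error inside the RPA tool.

```python
import io
import zipfile


def load_docx_bytes(path):
    """Read a .docx fully into memory and confirm it looks like a zip archive."""
    # Binary mode matters: a .docx is a zip container, not text.
    with open(path, "rb") as handle:
        data = handle.read()
    # The buffer stays usable after the real file has been closed.
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        raise ValueError(f"{path!r} does not look like a zip/.docx file")
    buffer.seek(0)  # rewind so the next reader starts at the beginning
    return buffer
```

With python-docx, the returned buffer could then be passed as Document(buffer), because Document accepts a file-like object as well as a path.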
I have similar issues with a couple of other packages (for example, requests or spacy). All of them work with my local interpreter but not within Liberty RPA.
Is this just a compatibility problem that cannot be resolved (i.e., would I have to find another workaround)? Or am I missing something?
Also, just as a side note: in case you are wondering why I’m not simply using the built-in read Word file node, I wanted to directly extract the tables from the Word file.
My environment has Python 3.9.7. If you need any other info on packages etc., please let me know.
Thank you,