Reading a CSV File

Hi,

I feel like I am missing something really simple and/or obvious. But, how do I simply read the contents of a .csv file?

  • There is no built in activity that I can see for working with csv files
  • Even though csv files are associated with Excel on the system, I can’t specify to work with the csv because the Excel Application/Excel File activities are restricted to files with .xlsx extension
  • Can’t use the “Read List from txt file” or “Read txt file” activities because again they are restricted to files with .txt extension

So, I guess at this point my options are to either:

  • Use the “Open a file” activity (which launches Excel) and then use mouse/keyboard actions to manipulate the application e.g. select all, copy to clipboard
  • Write some python code to read in the contents of the file

This is frustrating if these are indeed my options as it seems like a common business scenario (working with csv files) that should be simple to achieve without “manually” manipulating the file (it’s just a text file after all, and a common format), or having to write code…

Regards,
Jared Rumball

1 Like

Hello Jared, hope you’re doing well ! Thank you for raising this issue related to csv files. You are right currently we do not have an out of the box activity to read csv file.

Could you please let me know what sort of operations you want to perform once you read csv file ?

We’ll look to include CSV related activities in 24.1 release.

For now, you may use this code in python code node to convert a csv into excel to make progress.

import pandas as pd
# Read the csv file into df
csv_file = 'C:/Users/Jared.Rumball/input_file.csv'
df = pd.read_csv(csv_file)

# Write the df into an Excel file
excel_file = ('C:/Users/Jared.Rumball/input_file.xlsx')
df.to_excel(excel_file, index=False)

Regards
Manpreet

2 Likes

Thanks very much for the code snippet Manpreet, that definitely got me moving.

Apologies for the slow reply, both in regards to acknowledging the code snippet provided and answering the additional queries you posed around CSV activities.

As I have thought about this a bit more I think it would actually break down into two related but separate components:

  1. ease of interacting with csv files in terms of e.g. reading the contents
  2. ease of working with structured data

To put that another way, once I have read the contents of a CSV file into my process I am then not working with the CSV as such, but a data set - much like how the code snippet you provided reads the csv contents into a Pandas data frame. Once it’s in the data frame it then doesn’t matter what the source was to begin with.

1. CSV FILES
Pretty straight forward I think. You have other actions that will for instance read the contents of a text file, but these restrict you to files with a .txt extension. So, what would be great is:
a) potentially allow more flexibility in how we specify target files for various actions in some circumstances
b) provide action(s) that allow you to read the contents of a CSV file into text within the process (to then perform text manipulation etc.)
c) provide action(s) that allow you to read the contents of a CSV file into a data frame type of structure or similar
d) provide action(s) that allow you to output to a CSV file from the process (from text or data frame)
e) convert from csv text to data frame or vice versa

Note that “CSV” itself can imply some variations also i.e. a “CSV” that has a tab character as the separator rather than a comma. Or maybe a file has a .txt extension but we know it contains CSV formatted data, and so on.

2. Structured Data
I see we already have lists and dictionaries and some actions to work with those in the studio. Both are great, but also inherently limited when working with typical data in a business process. If the data frame structure, and the operations possible on them, could be surfaced through Liberty RPA actions I believe this would make the product much more powerful overall. The flexibility this would provide without having to write python code snippets would be hugely beneficial (in my opinion).

For instance, in the POC I am working on we read in a CSV file that contains transactions of various types for the previous day. I need to filter the data set by various attributes across multiple columns, calculate the sub-totals based on various filters, search for and report on particular data values/ scenarios, calculate a total for the entire file, compare these values to another data set obtained from another source, and so on.

I hope that helps to clarify things and give some useful info.

Thanks very much again for your prompt assistance on the original query.

Regards,
Jared Rumball

1 Like

Excellent, keep these new ideas coming. I’ll try and implement this in 2024.1 release.

Regards
Manpreet

1 Like