Calculate Total and Average Sales from Multiple CSV Files
School
Yale University**We aren't endorsed by this school
Course
ARCG 731
Subject
Computer Science
Date
Dec 12, 2024
Pages
2
Uploaded by zxrrmyt
import import import sy input_path = sys.argv[1] output_file = sys.argv([2] all_files = glob.glob(os.path.join(input_path, 'sales_*')) all_data_frames = [] for input_file in all_files: data_frame = pd.read_csv(input_file, index_col=None) total_cost = pd.DataFrame([float(str(value).strip('S$').replace(',',’"')) \ for value in data_frame.loc[:, 'Sale Amount']]).sum() average_cost = pd.DataFrame([float(str(value).strip('S$').replace(',’,'"')) \ for value in data_frame.loc[:, 'Sale Amount']]).mean() data = {'file_name': os.path,basename(input_file), 'total_sales': total_sales, 'average_sales': average_sales} all_data_frames.append(pd.DataFrame(data, \ columns=["'file_name', 'total_sales', 'average sales'])) data_frames_concat = pd.concat(all_data_frames, axis=0, ignore_index=True) data_frames_concat.to_csv(output_file, index = False) We use list comprehensions to convert the string dollar values in the Sale Amount column into floating-point numbers, and then we use the DataFrame function to convert the object into a DataFrame so we can use the two functions to calculate the sum and mean values for the column. Because each row in the output file should contain the input filename, sum, and mean for the Sale Amount column in the file, we combine these three pieces of data into a DataFrame, use the concat function to concatenate all of the DataFrames together into one DataFrame, and then write the DataFrame to an output file. To run the script, type the following on the command line and hit Enter: python pandas_ sum_average_from_multiple_files.py "C:\Users\Clinton\Desktop"\ output_files\pandas_output.csv You can then open the output file, pandas_output.csv, to review the results. We've covered a lot of ground in this chapter. We've discussed how to read and parse a CSV file, navigate rows and columns in a CSV file, process multiple CSV files, and calculate statistics for multiple CSV files. If you've followed along with the examples in this chapter, you have written 12 Python scripts. The best part about all of the work you have put into working through the examples in this chapter is that they are the basic building blocks for navigating and processing files. Having gone through the examples in this chapter, youre now well prepared to process Excel files, the topic of our next chapter. Sum and Average a Set of Values perFile | 99
Chapter Exercises 1. Modify one of the scripts that filters for rows based on conditions, sets, or regular expressions to print and write a different set of rows than the ones we filtered for in the examples. 2. Modify one of the scripts that filters for columns based on index values or column headings to print and write a different set of columns than the ones we filtered for in the examples. 3. Create a new set of CSV input files in a folder, create a separate output folder, and use one of the scripts that processes multiple files to process the new ihput-files and write the results to the output folder. 100 | Chapter2: Comma-Separated Values (CSV) Files