Try to automate pdf format report creation with Python

Introduction

・ In this article, we will use the data downloaded from the household account book application Zaim. For that matter, python is used to automatically generate a report in pdf format that summarizes the income and expenses for one year from the data.

-The source code and articles are probably poorly written and not beautiful, but please forgive me.

・ I started studying python with the phrase, "If you master python, you can automate it!" However, I don't think there are many concrete examples of practical automation that are friendly to beginners. (There is a big prejudice) So, especially for beginners, I'd be happy if you could know that there are things that can be automated with python.

What is the household account book app Zaim?

↓ Official homepage for the time being https://zaim.co.jp/

The household account book app that I love. (Paid members from around May 2020.) It is very convenient because you can analyze monthly income and expenses using pie charts and reserve bar graphs.

And if you become a paid member of Zaim, you can download the collected data in csv format. (I didn't know how to put the file in the Qiita article, so only the photo.)

zaim_data.JPG

Overview of each created file and folder

↓ Working folder structure Zaim_tree.JPG

Put simply, -The data file downloaded from Zaim is in the data folder. (File name: Zaim.2020.csv)

-The font_data folder contains font files for writing to pdf with python. (File name: GenShinGothic-Monospace-Medium.ttf)

-In analyze.py, first read the data file (Zaim.2020.csv) from Zaim with pandas etc. Next, use pandas, numpy, etc. to format the data, create a diagram, and output the formatted data to a new file.

-In the format_data folder, the data formatted by analyze.py is stored in csv format. (File name: diff_money.csv)

-In the graph folder, the image data created by analyze.py is stored in jpg format. (File name: in_money.jpg, out_money.jpg, money.jpg, difference_money.jpg)

-In report_pdf.py, first make detailed settings for the pdf report to be created. Next, paste the images of money.jpg and difference_money.jpg in the graph folder into the pdf. Then, from diff_money.csv in the format_data folder, read how much the balance has been in one year and write it in pdf. After that, it is output as a pdf file.

-In the report folder, the pdf data output by report_pdf.py is included as a pdf file. (File name: balance report.pdf)

source code for analyze.py

↓ Source code

analyze.Import of py library and appearance of graph



import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from matplotlib import rcParams
#Format the graph
rcParams['font.family'] = 'sans-serif' #Font to use
rcParams['font.sans-serif'] = ['Hiragino Maru Gothic Pro', 'Yu Gothic', 'Meirio', 'Takao', 'IPAexGothic', 'IPAPGothic', 'VL PGothic', 'Noto Sans CJK JP']
rcParams['xtick.direction'] = 'in'#x-axis scale line pointing inward('in')Or outward('out')Or bidirectional('inout')
rcParams['ytick.direction'] = 'in'#The y-axis scale line faces inward('in')Or outward('out')Or bidirectional('inout')
rcParams['xtick.major.width'] = 1.0#Line width of x-axis main scale line
rcParams['ytick.major.width'] = 1.0#Line width of y-axis main scale line
rcParams['font.size'] = 15 #Font size
rcParams['axes.linewidth'] = 1.0#Axis line width edge linewidth. Enclosure thickness

The appearance of the graph is full of good articles, so let's get some delicious things. https://qiita.com/qsnsr123/items/325d21621cfe9e553c17

analyze.py function definition



 #Transform the date into a shape that is easy to use every month
def devide_month(data):
    data = np.array(data)
    #Prepare a list to enter the data for each month
    month_01, month_02, month_03, month_04 = list(), list(), list(), list()
    month_05, month_06, month_07, month_08 = list(), list(), list(), list()
    month_09, month_10, month_11, month_12 = list(), list(), list(), list()
    #Get monthly date and key from data
    for i in range(len(data.T[1])):
        #January data month_Store in 01
        if data.T[1][i] == "01":
            #Date and dataframe key month_Store in 1
            month_01.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "02":
            month_02.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "03":
            month_03.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "04":
            month_04.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "05":
            month_05.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "06":
            month_06.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "07":
            month_07.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "08":
            month_08.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "09":
            month_09.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "10":
            month_10.append([data.T[2][i], data.T[3][i]])
        elif data.T[1][i] == "11":
            month_11.append([data.T[2][i], data.T[3][i]])
        else:
            month_12.append([data.T[2][i], data.T[3][i]])
    #Summarize monthly data
    date_month = [ month_01, month_02, month_03, month_04,
                   month_05, month_06, month_07, month_08,
                   month_09, month_10, month_11, month_12, ]
    return date_month

#Pulling a specific type of data from the dataframe
def collect_data_from_dataframe(dataframe, name, date_month):
    monthly_data = list()
    for i in range(len(date_month)):
        #If there is data in the month, get the data from the dataframe
        if date_month[i] != []:
            monthly_data.append(dataframe[name][int(date_month[i][0][1]):int(date_month[i][-1][1])+1].tolist())
        else:
            monthly_data.append(None)

    return monthly_data

#List the total amount of monthly spending
def monthly_sum_money(money):
    sum_list = list()
    for i in range(len(money)):
        if money[i] == None:
            sum_list.append(0)
        else:
            sum_list.append(np.sum(money[i]))
    return sum_list

analyze.Preparation of data for py graph




#Path of data downloaded from Zaim
read_data_path = "data/Zaim.2020.csv"
#Read the data and store it in pandas.
df = pd.read_csv(read_data_path, encoding="shift-jis")



key = 0
date_2020 = list()

#Extract date data
for date in df["date"]:
    #Divide dates by year, month, and day and store them in a list
    date = date.split("-")
    #Add a key to match the date with the original dataframe
    date.append(str(key))
    #Add the date and key to the list
    date_2020.append(date)
    key += 1

# date_month.shape: [Moon][Moonのデータの番号][Day,key]
date_month_2020 = devide_month(date_2020)

#Keep a list of your spending by month.
out_money = collect_data_from_dataframe(df, "Spending", date_month_2020)
#Calculate the monthly total of spending and keep it in a list.
out_money_monthly = monthly_sum_money(out_money)

#Collect income by month and keep it in a list.
in_money = collect_data_from_dataframe(df, "income", date_month_2020)
#Calculate the monthly total of income and keep it in the list.
in_money_monthly = monthly_sum_money(in_money)

#Calculate the balance and keep it in list format
difference_money = (np.array(in_money_monthly) - np.array(out_money_monthly)).tolist()

Create and save graphs


#Graph save destination
save_file_path = "./graph/"
#Annual spending graph
#Since there is no data completely before March of last year, I will omit it (April to December)
plt.plot(range(4, 13), np.array(out_money_monthly[3:])/10000, color="blue", label="Spending")
plt.xlabel('Moon')
plt.ylabel('Spending[Ten thousand yen]')
plt.grid(linestyle= '--')
plt.xlim(1,12)
plt.ylim(0,20)
#Settings such as axis scale spacing. In this case, the x-axis is 1 to 12 apart.
plt.xticks(np.arange(1, 13, 1))
plt.yticks(np.arange(0, 21, 1))
#Save graph
plt.savefig(save_file_path + "out_money.jpg ")
plt.cla()

#Annual income graph
plt.plot(range(4, 13), np.array(in_money_monthly[3:])/10000, color="red", label="income")
plt.xlabel('Moon')
plt.ylabel('income[Ten thousand yen]')
plt.grid(linestyle= '--')
plt.xlim(1,12)
plt.ylim(0,20)
plt.xticks(np.arange(1, 13, 1))
plt.yticks(np.arange(0, 21, 1))

plt.savefig(save_file_path + "in_money.jpg ")
plt.cla()


#Annual spending and income graph
#Income is shown in red and expenditure is shown in blue
plt.plot(range(4, 13), np.array(in_money_monthly[3:])/10000, color="red", label="income")
plt.plot(range(4, 13), np.array(out_money_monthly[3:])/10000, color="blue", label="Spending")
plt.legend(loc="lower right")
plt.xlabel('Moon')
plt.ylabel('Amount of money[Ten thousand yen]')
plt.grid(linestyle= '--')
plt.xlim(1,12)
plt.ylim(0,20)
plt.xticks(np.arange(1, 13, 1))
plt.yticks(np.arange(0, 21, 1))

plt.savefig(save_file_path + "money.jpg ")
plt.cla()


#Annual balance graph
#When expressing with a bar graph, I want to color-code the balance of each month by plus or minus, so I divide the data into two.
positive_money = list()
negative_money = list()
positive_x = list()
negative_x = list()
#Divide the data according to whether the difference between income and expenditure is positive or negative.
for i in range(len(difference_money)):
    if difference_money[i] >= 0:
        positive_money.append(difference_money[i])
        positive_x.append(i+1)
    else:
        negative_money.append(difference_money[i])
        negative_x.append(i+1)

#The positive balance is expressed in red, and the negative balance is expressed in blue.
plt.bar(positive_x, np.array(positive_money)/10000, color="red", label="Balance +")
plt.bar(negative_x, np.array(negative_money)/10000, color="blue", label="Balance of payments-")
plt.legend(loc="upper right")
plt.xlabel('Moon')
plt.ylabel('Balance of payments[Ten thousand yen]')
plt.grid(linestyle= '--')
plt.xlim(1,12)
plt.ylim(-8,8)
plt.xticks(np.arange(1, 13, 1))
plt.yticks(np.arange(-8, 9, 1))

plt.savefig(save_file_path + "difference_money.jpg ")
plt.cla()

#Output data to the outside for automatic creation of pdf report
format_data_path = "format_data/"
np.savetxt(format_data_path + "diff_money.csv", difference_money)

Source code for report_pdf.py

If I didn't read this article, I think I couldn't do it because I didn't know how to output (write?) To pdf. Overwhelming thanks! https://watlab-blog.com/2020/03/21/reportlab-pdf/

python



import numpy as np

from reportlab.pdfgen import canvas
from reportlab.lib.units import mm
from reportlab.pdfbase.pdfmetrics import registerFont
from reportlab.pdfbase.ttfonts import TTFont


#Register font
registerFont(TTFont('GenShinGothic',
                    './font_data/GenShinGothic-Monospace-Medium.ttf'))

file_path = 'report/balance report.pdf'  #Set output file name
graph_path_transition = "graph/money.jpg "
graph_path_balance = "graph/difference_money.jpg "


#Create a one-year income and expenditure report
paper = canvas.Canvas(file_path)  #Prepare a blank canvas
paper.saveState()  #Initialize
paper.setFont('GenShinGothic', 20)  #Set font

#Set the paper size for horizontal w and vertical h
w = 210 * mm
h = 297 * mm

paper.setPageSize((w, h))  #Set the paper size
paper.drawString(w/2 - 90, h - 50,  #Writing text
                 '1 year spending and income')


#Embed image(Image file path,Horizontal position,Vertical position,Image horizontal size,Image vertical size)
paper.drawInlineImage(graph_path_transition, 31*mm, h-130*mm, 148*mm, 111*mm)
paper.setFont('GenShinGothic', 15)  #Set font
paper.drawString(w/2 - 112.5, 160*mm, 'Figure 1 Changes in income and expenditure for one year')

#Embed the second image(Image file path,Horizontal position,Vertical position,Image horizontal size,Image vertical size)
paper.drawInlineImage(graph_path_balance, 31*mm, h-250*mm, 148*mm, 111*mm)
paper.setFont('GenShinGothic', 15)  #Set font
paper.drawString(w/2 - 67.5, 40*mm, 'Figure 2 One-year balance')

#Enter the result of income and expenditure
difference_money = np.loadtxt('format_data/diff_money.csv', delimiter=',') #Read the balance data
paper.setFont('GenShinGothic', 13)  #Set font
paper.drawString(w/2-100, 23*mm,
                 f'The balance of the whole year is{int(np.sum(difference_money))}It is a circle.')
paper.drawString(w/2-100, 14 * mm,
                 f'That is, per month{int(np.sum(difference_money)/len(difference_money))}It is a circle.')

paper.save()  #Save PDF


Execution result

Execution result of analyze.py

The result graph of executing analyze.py is in_money.jpg, out_money.jpg, money.jpg, difference_money.jpg, but two of them (in_money.jpg, out_money.jpg) are money.jpg. I will omit it because the contents are covered.

money.jpg Figure 1 Changes in income and expenditure for one year (money.jpg)

difference_money.jpg Figure 2 Changes in income and expenditure for one year (difference_money.jpg)

Execution result of report_pdf.py

The household account book report automatically created by report_pdf.py is as follows.

家計簿レポート.JPG

Recommended Posts

Try to automate pdf format report creation with Python
Try to automate the operation of network devices with Python
Try to reproduce color film with Python
Try logging in to qiita with Python
Library comparison summary to generate PDF with Python
Convert PDF to image (JPEG / PNG) with Python
Try to solve the man-machine chart with Python
Try to draw a life curve with python
Try to make a "cryptanalysis" cipher with Python
I tried to automate sushi making with python
Try to make a dihedral group with Python
Try to detect fish with python + OpenCV2.4 (unfinished)
Convert strings to character-by-character list format with python
Try scraping with Python.
Try to solve the programming challenge book with python3
[First API] Try to get Qiita articles with Python
Try to make a command standby tool with python
Try to solve the internship assignment problem with Python
Try to operate DB with Python and visualize with d3
Convert the image in .zip to PDF with Python
Try to factorial with recursion
Connect to BigQuery with Python
Try to understand Python self
Automate python testing with CircleCI
Post to slack with Python 3
Integrate PDF files with Python
Try running Python with Try Jupyter
Switch python to 2.7 with alternatives
Write to csv with Python
Try face recognition with Python
Format json with Vim (with python)
Automate sushi making with Python
String format with Python% operator
Try to display various information useful for debugging with python
[AWS] Try adding Python library to Layer with SAM + Lambda (Python)
Probably the easiest way to create a pdf with Python3
Experiment to make a self-catering PDF for Kindle with Python
Try to bring up a subwindow with PyQt5 and Python
It's Halloween so I'll try to hide it with Python
Try to decipher the garbled attachment file name with Python
Try scraping with Python + Beautiful Soup
Python: How to use async with
Link to get started with python
Convert markdown to PDF in Python
[Python] Write to csv file with Python
Create folders from '01' to '12' with python
Nice to meet you with python
Convert files written in python etc. to pdf with syntax highlighting
People who are accustomed to Android programs try multithreading with Python
Try to create a python environment with Visual Studio Code & WSL
Try to extract a character string from an image with Python3
Try to profile with ONNX Runtime
Convert list to DataFrame with python
MP3 to WAV conversion with Python
To do tail recursion with Python2
How to get started with Python
Try converting cloudmonkey CLI to python3 -1
What to do with PYTHON release?
Try face recognition with python + OpenCV
[Python] Explains how to use the format function with an example
Try to display google map and geospatial information authority map with python