import pandas as pd
data = pd.read_csv("sample.csv", encoding="UTF-8")
data
Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary
Unnecessary,Title A,Title B,Title C,Title D,Unnecessary
Unnecessary,10,20,30,40,Unnecessary
Unnecessary,100,200,300,400,Unnecessary
Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary
The sample imagines data from a Google spreadsheet that was saved as CSV for analysis. I think quite a few sheets contain unstructured memos and remarks around the actual values. You could probably select the range when saving, but this time I will tidy it up with pandas for practice.
data.columns = data.iloc[0]
data
data = data.iloc[1:3,1:5]
data
This is exactly what I wanted.
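As an alternative to trimming after loading, the same range can probably be selected while reading. The following is a minimal sketch, assuming the layout of the sample above; `csv_text` and `trimmed` are names made up for this example, and the CSV is inlined with `io.StringIO` so the snippet runs without a file.

```python
import io

import pandas as pd

# Same layout as the sample: a border of junk cells around a 2x4 block of values.
csv_text = """Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary
Unnecessary,Title A,Title B,Title C,Title D,Unnecessary
Unnecessary,10,20,30,40,Unnecessary
Unnecessary,100,200,300,400,Unnecessary
Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary,Unnecessary
"""

trimmed = pd.read_csv(
    io.StringIO(csv_text),
    header=1,              # use the second line of the file as the header row
    usecols=[1, 2, 3, 4],  # keep only the four "Title" columns (by position)
    nrows=2,               # read only the two data rows below the header
)
print(trimmed)
```

Because only numeric cells are read this way, pandas should infer integer columns directly, so the later type conversion would not be needed with this approach.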
data.describe()
I expected the mean and similar statistics to appear, but they don't.
This is because the values were read as strings rather than numbers, so `describe()` only reports count, unique, top and freq.
data = data.astype('int')
data
data.describe()
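The difference between the two `describe()` calls can be reproduced on a small frame. This is only a sketch: `raw` and `numeric` are hypothetical names, and the frame is built by hand with string values to mimic what the trimmed CSV contains.

```python
import pandas as pd

# Values stored as strings, as they are after slicing the raw CSV.
raw = pd.DataFrame({"Title A": ["10", "100"], "Title B": ["20", "200"]})

print(raw.dtypes)                     # non-numeric dtypes
print(raw.describe().index.tolist())  # no "mean", "std", etc. for string columns

# After conversion the numeric summary becomes available.
numeric = raw.astype("int")
print(numeric.describe().loc["mean"])
```

Checking `data.dtypes` first is a quick way to see why a summary looks different from what was expected.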
data.corr()
#### Remarks
I'm not sure what the 0 in the upper left of the output is.
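As far as I can tell, the 0 is the name of the column index: `data.iloc[0]` is a Series whose name is the row label 0, and assigning it to `data.columns` carries that name along. A minimal sketch, using a hand-built frame (`frame` and `cleaned` are names invented here), shows the effect and one way to clear it:

```python
import pandas as pd

# Row 0 holds the real titles, as in the sample data.
frame = pd.DataFrame([["Title A", "Title B"], ["10", "20"], ["100", "200"]])

# Promote row 0 to the header; the Series name (0) becomes the columns' name.
frame.columns = frame.iloc[0]
print(frame.columns.name)

# Drop the header row and clear the leftover name on the column axis.
cleaned = frame.iloc[1:].rename_axis(None, axis=1)
print(cleaned)
```

Setting `data.columns.name = None` directly should have the same effect.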
data.sum() #total
data.skew() #skewness
data.kurt() #kurtosis
data.var() #variance
data.cov() #Covariance matrix
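With only two rows, some of these statistics are not very informative. The sketch below uses a hand-built two-row frame (`values` is a name assumed for the example) to show that `var()` uses the sample variance (dividing by n - 1), and that `skew()` and `kurt()` return NaN because they need at least three and four observations respectively.

```python
import pandas as pd

# Two observations per column, like the trimmed sample data.
values = pd.DataFrame({"Title A": [10, 100], "Title B": [20, 200]})

variance = values.var()   # sample variance, ddof=1 by default
skewness = values.skew()  # NaN: fewer than 3 observations
kurtosis = values.kurt()  # NaN: fewer than 4 observations

print(variance)
print(skewness)
print(kurtosis)
```

For "Title A" the variance works out as ((10 - 55)^2 + (100 - 55)^2) / (2 - 1) = 4050.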
# Required to display plots inline in the notebook
%matplotlib inline
data.plot(kind='box')
#### Remarks
Japanese labels are not displayed by default, but they can be displayed by specifying a font that supports Japanese:
```
import matplotlib
matplotlib.rcParams['font.family'] = 'M+ 1c' # a font that can be specified
```
The fonts installed on the system can be listed with:
```
import matplotlib.font_manager as fm
fm.findSystemFonts()
```
I referred to http://qiita.com/hagino3000/items/1b54acc01483ccd0ac72.
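`findSystemFonts()` returns font file paths, while `font.family` expects a family name. One way to see the names matplotlib already knows about is to read them from its font manager. This is a sketch, not the method used in the referenced article; `family_names` is a name made up here.

```python
import matplotlib.font_manager as fm

# Collect the family names of the fonts registered with matplotlib.
family_names = sorted({entry.name for entry in fm.fontManager.ttflist})

# Print a few of them; any of these can be tried as 'font.family'.
for name in family_names[:10]:
    print(name)
```

Whether a given family actually contains Japanese glyphs still has to be checked by plotting a label with it.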
pd.concat([data,data])
pd.concat([data,data], axis=1)
data.pipe(lambda df: df / 2)
data['Title A'].sort_values(ascending = True)
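To make the effect of these four calls easier to see, here is a small sketch on a hand-built frame. The names `sample`, `stacked`, `side_by_side`, `halved` and `descending` are assumptions for the example, and `ignore_index=True` and `ascending=False` are variations added for illustration rather than what the lines above use.

```python
import pandas as pd

sample = pd.DataFrame({"Title A": [10, 100], "Title B": [20, 200]})

# Stack vertically; ignore_index renumbers the rows 0..3 instead of repeating 0, 1.
stacked = pd.concat([sample, sample], ignore_index=True)

# Place side by side; the column labels are duplicated.
side_by_side = pd.concat([sample, sample], axis=1)

# pipe passes the whole frame to a function and returns its result.
halved = sample.pipe(lambda df: df / 2)

# Sort one column in descending order.
descending = sample["Title A"].sort_values(ascending=False)

print(stacked.shape, side_by_side.shape)
print(halved)
print(descending)
```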