[Python 100-Day Progression: Data Analysis] Day 136 - Plotly Sunburst Charts: px.sunburst() Examples
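Table of Contents
- 4. Examples
- 4.1 Sunburst charts with plotly.express
- 4.1.1 Basic sunburst chart
- 4.1.2 Sunburst of a rectangular DataFrame with plotly.express
- 4.1.3 Changing the path order to change the parent-child relationships
- 4.1.4 Sunburst of a rectangular DataFrame with a continuous color argument in px.sunburst
- 4.1.5 Sunburst of a rectangular DataFrame with a discrete color argument in px.sunburst
- 4.1.6 Using an explicit mapping for discrete colors
- 4.1.7 Rectangular data with missing values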
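4. Examples
Sunburst charts visualize hierarchical data spanning radially outward from root to leaves. As with Icicle charts and Treemaps, the hierarchy is defined by the labels (names in px.sunburst and px.icicle) and parents attributes. The root starts at the center, and children are added to the outer rings.
4.1 Sunburst charts with plotly.express
Plotly Express is the easy-to-use, high-level interface to Plotly. It operates on a variety of data types and produces figures that are easy to style.
With px.sunburst, each row of the DataFrame is represented as one sector of the sunburst.
4.1.1 Basic sunburst chart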
import plotly.express as px
data = dict(
    character=["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
    parent=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"],
    value=[10, 14, 12, 10, 2, 6, 6, 4, 4],
)
fig = px.sunburst(
    data,
    names='character',
    parents='parent',
    values='value',
)
fig.show()
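As a rough illustration of how the names and parents lists above encode a tree, the following sketch follows each node's parent links back to the root and reports its depth, which corresponds to its ring in the sunburst. The helper node_depths is hypothetical and not part of Plotly; it only assumes that the root is marked by an empty-string parent, as in the data above.

```python
def node_depths(names, parents):
    """Return {name: depth}, where the root (parent == "") has depth 0."""
    parent_of = dict(zip(names, parents))
    depths = {}
    for name in names:
        depth, current = 0, name
        # Follow parent links until the root marker "" is reached.
        while parent_of[current] != "":
            current = parent_of[current]
            depth += 1
            if depth > len(names):
                raise ValueError(f"cycle detected starting at {name!r}")
        depths[name] = depth
    return depths


character = ["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"]
parent = ["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"]

depths = node_depths(character, parent)
print(depths["Eve"], depths["Seth"], depths["Enoch"])  # 0 1 2
```

A parent that does not appear in names raises a KeyError here, which makes it a simple sanity check to run before passing the lists to px.sunburst.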
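4.1.2 Sunburst of a rectangular DataFrame with plotly.express
Hierarchical data is often stored as a rectangular DataFrame, with different columns corresponding to different levels of the hierarchy. px.sunburst can take a path parameter corresponding to a list of columns. Note that ids and parents should not be provided if path is given.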
import plotly.express as px
df = px.data.tips()
print(df)
'''
     total_bill   tip     sex  smoker   day    time  size
0         16.99  1.01  Female      No   Sun  Dinner     2
1         10.34  1.66    Male      No   Sun  Dinner     3
2         21.01  3.50    Male      No   Sun  Dinner     3
3         23.68  3.31    Male      No   Sun  Dinner     2
4         24.59  3.61  Female      No   Sun  Dinner     4
..          ...   ...     ...     ...   ...     ...   ...
239       29.03  5.92    Male      No   Sat  Dinner     3
240       27.18  2.00  Female     Yes   Sat  Dinner     2
241       22.67  2.00    Male     Yes   Sat  Dinner     2
242       17.82  1.75    Male      No   Sat  Dinner     2
243       18.78  3.00  Female      No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig.show()
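To make the role of path more concrete, the sketch below groups a few rows by successive prefixes of the path and sums a value column per branch. This is not Plotly's internal implementation: the helper path_to_hierarchy, the tuple-based node ids, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def path_to_hierarchy(rows, path, value_key):
    """Aggregate rows into {node_id: total}; node_id is a tuple of path values."""
    totals = defaultdict(float)
    for row in rows:
        for level in range(1, len(path) + 1):
            # A node is identified by the full chain of values leading to it.
            node_id = tuple(row[col] for col in path[:level])
            totals[node_id] += row[value_key]
    return dict(totals)


# Hypothetical rows in the shape of the tips dataset (values are made up).
rows = [
    {"day": "Sun", "time": "Dinner", "sex": "Female", "total_bill": 10.0},
    {"day": "Sun", "time": "Dinner", "sex": "Male", "total_bill": 20.0},
    {"day": "Fri", "time": "Lunch", "sex": "Male", "total_bill": 5.0},
]

totals = path_to_hierarchy(rows, ["day", "time", "sex"], "total_bill")
for node_id, total in sorted(totals.items()):
    parent_id = node_id[:-1]  # an empty tuple means "top level"
    print(node_id, "parent:", parent_id, "total:", total)
```

Passing ["day", "sex", "time"] instead would make each sex node the parent of the time nodes, which is the effect of reordering path.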
4.1.3 Changing the path order to change the parent-child relationships
import plotly.express as px
df = px.data.tips()
fig = px.sunburst(df, path=['day', 'sex', 'time'], values='total_bill')
fig.show()
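4.1.4 Sunburst of a rectangular DataFrame with a continuous color argument in px.sunburst
If a color argument is passed, the color of a node is computed as the average of the color values of its children, weighted by their values.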
import plotly.express as px
import numpy as np
df = px.data.gapminder().query("year == 2007")
print(df)
'''
                 country  continent  year  ...     gdpPercap  iso_alpha  iso_num
11           Afghanistan       Asia  2007  ...    974.580338        AFG        4
23               Albania     Europe  2007  ...   5937.029526        ALB        8
35               Algeria     Africa  2007  ...   6223.367465        DZA       12
47                Angola     Africa  2007  ...   4797.231267        AGO       24
59             Argentina   Americas  2007  ...  12779.379640        ARG       32
...                  ...        ...   ...  ...           ...        ...      ...
1655             Vietnam       Asia  2007  ...   2441.576404        VNM      704
1667  West Bank and Gaza       Asia  2007  ...   3025.349798        PSE      275
1679         Yemen, Rep.       Asia  2007  ...   2280.769906        YEM      887
1691              Zambia     Africa  2007  ...   1271.211593        ZMB      894
1703            Zimbabwe     Africa  2007  ...    469.709298        ZWE      716

[142 rows x 8 columns]
'''
fig = px.sunburst(df, path=['continent', 'country'], values='pop',
                  color='lifeExp', hover_data=['iso_alpha'],
                  color_continuous_scale='RdBu',
                  color_continuous_midpoint=np.average(df['lifeExp'], weights=df['pop']))
fig.show()
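The weighted average described above can be sketched in plain Python. The numbers are made up, and the helper weighted_average is a hypothetical stand-in for the calculation; it mirrors what np.average(..., weights=...) computes for the color midpoint in the example.

```python
def weighted_average(values, weights):
    """Average of `values` weighted by `weights` (weights must not sum to zero)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical child sectors of one parent: life expectancy and population.
life_exp = [80.0, 60.0]
population = [1_000_000, 3_000_000]

parent_color_value = weighted_average(life_exp, population)
print(parent_color_value)  # 65.0
```

Because the second child carries three times the weight, the parent's value lands closer to 60 than to 80.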
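4.1.5 Sunburst of a rectangular DataFrame with a discrete color argument in px.sunburst
When the color argument corresponds to non-numerical data, discrete colors are used. If all children of a sector have the same value in the color column, the corresponding color is used; otherwise the first color of the discrete color sequence is used.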
import plotly.express as px
df = px.data.tips()
fig = px.sunburst(df, path=['sex', 'day', 'time'], values='total_bill', color='day')
fig.show()
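In the example below, the Saturday and Sunday sectors have the same color as Dinner, because there are only Dinner entries for Saturday and Sunday. However, Female -> Friday has both lunches and dinners, so the "mixed" color (blue here) is used.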
import plotly.express as px
df = px.data.tips()
fig = px.sunburst(df, path=['sex', 'day', 'time'], values='total_bill', color='time')
fig.show()
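The rule for parent sectors can be sketched as a small function: if every child shares one value, the parent takes it; otherwise the parent gets a placeholder for "mixed". The helper sector_color_label is hypothetical, and the placeholder "(?)" is an assumption taken from the key used in the color_discrete_map example in section 4.1.6.

```python
MIXED = "(?)"  # assumed placeholder label for sectors with mixed child values


def sector_color_label(child_labels):
    """Return the shared label if all children agree, otherwise MIXED."""
    unique = set(child_labels)
    return unique.pop() if len(unique) == 1 else MIXED


# Saturday-like sector: only Dinner entries.
print(sector_color_label(["Dinner", "Dinner"]))
# Friday-like sector: both Lunch and Dinner entries.
print(sector_color_label(["Lunch", "Dinner"]))
```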
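4.1.6 Using an explicit mapping for discrete colors
For more information about discrete colors, see the dedicated page in the Plotly documentation.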
import plotly.express as px
df = px.data.tips()
fig = px.sunburst(df,
                  path=['sex', 'day', 'time'],
                  values='total_bill',
                  color='time',
                  color_discrete_map={'(?)':'black', 'Lunch':'gold', 'Dinner':'darkblue'})
fig.show()
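4.1.7 Rectangular data with missing values
If the dataset is not fully rectangular, missing values should be supplied as None. Note that the parents of None entries must be leaves, i.e. a None entry cannot have children other than None; otherwise a ValueError is raised. The traceback below shows this error; judging by the column order it prints, it appears to come from a path in which vendors precedes sectors.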
raise ValueError(
ValueError: ('None entries cannot have not-None children', regions North
vendors None
sectors Other
Name: 4, dtype: object)
import plotly.express as px
import pandas as pd
vendors = ["A", "B", "C", "D", None, "E", "F", "G", "H", None]
sectors = ["Tech", "Tech", "Finance", "Finance", "Other",
           "Tech", "Tech", "Finance", "Finance", "Other"]
regions = ["North", "North", "North", "North", "North",
           "South", "South", "South", "South", "South"]
sales = [1, 3, 2, 4, 1, 2, 2, 1, 4, 1]
df = pd.DataFrame(
    dict(vendors=vendors, sectors=sectors, regions=regions, sales=sales)
)
print(df)
'''
  vendors  sectors regions  sales
0       A     Tech   North      1
1       B     Tech   North      3
2       C  Finance   North      2
3       D  Finance   North      4
4    None    Other   North      1
5       E     Tech   South      2
6       F     Tech   South      2
7       G  Finance   South      1
8       H  Finance   South      4
9    None    Other   South      1
'''
fig = px.sunburst(df, path=['regions', 'sectors', 'vendors'], values='sales')
fig.show()
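The constraint on None entries can be checked row by row before plotting. The sketch below is a simplified stand-in for that rule, not Plotly's own validation code: the helper check_none_is_leaf is hypothetical and only verifies that, once a None appears along a row's path, every later level is also None.

```python
def check_none_is_leaf(path_values):
    """Raise ValueError if a None entry is followed by a non-None entry."""
    seen_none = False
    for value in path_values:
        if value is None:
            seen_none = True
        elif seen_none:
            raise ValueError("None entries cannot have not-None children")
    return True


# Path order used above: regions -> sectors -> vendors (None only at the end).
print(check_none_is_leaf(["North", "Other", None]))

# Hypothetical reordering that puts vendors before sectors.
try:
    check_none_is_leaf(["North", None, "Other"])
except ValueError as exc:
    print("rejected:", exc)
```

With path=['regions', 'sectors', 'vendors'], the None vendors sit at the last level, so every row in the DataFrame above passes this check.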