飞道's Blog

Data Visualization: Plotting with seaborn

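A note on a statistics chart I drew today (gorgeous, though the data is cancer-related, which is not so pretty).

I used to use plt.plot for everything and was never very good at it,
but I have now found that seaborn really delivers.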

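import matplotlib.pyplot as plt
import seaborn as sns
palette = sns.color_palette("ocean", 2)
sns.kdeplot(data=cancer_data, x='Radius (worst)', fill=True, hue='Diagnosis', palette=palette)  # fill=True replaces the deprecated shade=True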

Adjusting colors: color link
Trends - A trend is defined as a pattern of change.

sns.lineplot - Line charts are best to show trends over a period of time, and multiple lines can be used to show trends in more than one group.
Example 1: plotting multiple lines

# Line chart showing the number of visitors to each museum over time
plt.figure(figsize=(16,6))
plt.title("Number of visitors to each museum over time")
# Wide-form DataFrame: one line is drawn per column
sns.lineplot(data=museum_data)
plt.xlabel('Date')


Example 2: plotting a single line:

plt.figure(figsize=(16,6))
# Passing a single column draws only that museum's line
sns.lineplot(data=museum_data['Avila Adobe'])

Relationship - There are many different chart types that you can use to understand relationships between variables in your data.

sns.barplot - Bar charts are useful for comparing quantities corresponding to different groups.
Example 1:

plt.figure(figsize=(8,6))
# Bar chart showing average score for racing games by platform
sns.barplot(x=ign_data['Racing'], y=ign_data.index)


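I can't praise this enough, it really looks great. I tried a different palette too: a chart with this many categories does not suit a single-hue palette, and this one already looks quite nice:
When I drew it with my favorite palette, Blues: (ugh, painfully ugly)

One thing to note: it is best to put the values on the horizontal axis and the category names on the vertical axis; otherwise the names pile up along the horizontal axis and look bad.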
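
The orientation tip can be checked on a tiny example. This is a sketch with made-up scores (the platform names and numbers are hypothetical, not the ign_data values): the numeric values go on x and the category names on y, so seaborn draws horizontal bars and the labels never overlap.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up average scores per platform (hypothetical data)
scores = pd.Series(
    {"PC": 6.8, "PlayStation 4": 7.4, "Wii": 6.1, "Xbox One": 7.0},
    name="Racing",
)

fig, ax = plt.subplots(figsize=(8, 6))
# Values on x, names on y: the bars run horizontally
sns.barplot(x=scores.values, y=scores.index, ax=ax)
ax.set_xlabel("Average score")
```

Swapping the two arguments gives vertical bars, which is fine for a handful of short names but gets crowded quickly.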

sns.heatmap - Heatmaps can be used to find color-coded patterns in tables of numbers.

plt.figure(figsize=(10,10))
# annot=True writes each cell's value inside the cell
sns.heatmap(data=ign_data, annot=True)


A note on this chart: if the numbers are hard to read, enlarge the figure with figsize.
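
A minimal sketch of that advice, on a small made-up table (the genres, platforms, and scores are hypothetical). Besides a larger figsize, the fmt parameter of sns.heatmap can shorten the annotations, which also helps readability.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up score table: rows are platforms, columns are genres (hypothetical)
scores = pd.DataFrame(
    {
        "Action": [6.5, 7.1, 5.9],
        "Racing": [6.8, 7.4, 6.1],
        "Sports": [7.2, 6.9, 6.4],
        "Puzzle": [7.0, 6.3, 6.6],
    },
    index=["PC", "PlayStation 4", "Wii"],
)

# A bigger canvas leaves more room for each annotated cell
fig, ax = plt.subplots(figsize=(10, 10))
# fmt=".1f" prints one decimal place per cell
sns.heatmap(data=scores, annot=True, fmt=".1f", ax=ax)
```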

sns.scatterplot - Scatter plots show the relationship between two continuous variables; if color-coded, we can also show the relationship with a third categorical variable.

sns.scatterplot(x='pricepercent',
                y='winpercent',
                hue='chocolate',
                data=candy_data)

Adding the hue parameter colors the points according to the categories of that column.

sns.regplot - Including a regression line in the scatter plot makes it easier to see any linear relationship between two variables.
This draws the regression line on top of the scatter plot, and the line itself looks great too.

sns.regplot(x='sugarpercent',
            y='winpercent',
            data=candy_data)

sns.lmplot - This command is useful for drawing multiple regression lines, if the scatter plot contains multiple, color-coded groups.
If you want multiple regression lines:

p = sns.color_palette('winter', 2)
# One regression line per value of the hue column
sns.lmplot(x='pricepercent',
           y='winpercent',
           hue='chocolate',
           data=candy_data,
           palette=p)


I really can't praise this plot enough.

sns.swarmplot - Categorical scatter plots show the relationship between a continuous variable and a categorical variable.

sns.swarmplot(x='chocolate',
             y='winpercent',
             data=candy_data)


This plot describes two features at once: one categorical and one continuous.

Distribution - We visualize distributions to show the possible values that we can expect to see in a variable, along with how likely they are.

sns.histplot - Histograms show the distribution of a single numerical variable.

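palette = sns.color_palette('Blues', 2)
# color= is dropped: it is ignored when hue is set, and 'Blues' is a palette name, not a color
sns.histplot(data=cancer_data, x='Area (mean)', hue='Diagnosis', palette=palette)
# This replaces seaborn's automatic legend; check the label order against the hue levels
plt.legend(['benign', 'malignant'])


If you also want a KDE curve:

palette = sns.color_palette('PuBu', 2)
sns.histplot(data=cancer_data, x='Area (mean)', hue='Diagnosis', palette=palette, kde=True)
# This replaces seaborn's automatic legend; check the label order against the hue levels
plt.legend(['benign', 'malignant'])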

sns.kdeplot - KDE plots (or 2D KDE plots) show an estimated, smooth distribution of a single numerical variable (or two numerical variables).
A KDE can be viewed as a smoothed version of a histogram.

palette = sns.color_palette("ocean", 2)
sns.kdeplot(data=cancer_data, x='Radius (worst)', fill=True, hue='Diagnosis', palette=palette)  # fill=True replaces the deprecated shade=True
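
To make the "smoothed histogram" idea concrete, here is a rough sketch in plain NumPy. It is not seaborn's implementation: gaussian_kde_1d is a hypothetical helper, the bandwidth is fixed by hand (seaborn picks one automatically), and the sample values are made up. Each data point contributes one small Gaussian bump, and the density is the average of the bumps.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at every grid point."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return bumps.mean(axis=1)

rng = np.random.default_rng(0)
radii = rng.normal(loc=15.0, scale=3.0, size=200)  # made-up values

grid = np.linspace(0.0, 30.0, 601)
density = gaussian_kde_1d(radii, grid, bandwidth=1.0)

# Histogram of the same data, normalized so the bar areas sum to 1
counts, edges = np.histogram(radii, bins=20, density=True)

# Both estimates describe the same distribution, so both integrate to about 1
kde_area = float(np.sum(density) * (grid[1] - grid[0]))
hist_area = float(np.sum(counts * np.diff(edges)))
```

A larger bandwidth gives a smoother but blurrier curve; a smaller one follows the histogram's bumps more closely.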


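Note the difference: a bar chart compares a quantity across discrete categories, whereas a histogram takes a single numerical variable and counts how many values fall into each bin.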
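
The distinction is easier to see in the numbers behind each chart. This sketch uses a tiny made-up table (the values are hypothetical, only the column idea is borrowed from the cancer data): the bar heights are one aggregate per category, which is the mean by default in sns.barplot, while the histogram heights are counts per bin.

```python
import numpy as np
import pandas as pd

# Made-up measurements (hypothetical data)
df = pd.DataFrame(
    {
        "Diagnosis": ["B", "B", "B", "M", "M", "M"],
        "Area": [300.0, 400.0, 500.0, 900.0, 1100.0, 1300.0],
    }
)

# Bar chart: one value per category (here the mean area per diagnosis)
bar_heights = df.groupby("Diagnosis")["Area"].mean()

# Histogram: how many values land in each bin, ignoring the categories
counts, edges = np.histogram(df["Area"], bins=[0, 600, 1200, 1800])

print(bar_heights.to_dict())  # {'B': 400.0, 'M': 1100.0}
print(counts.tolist())        # [3, 2, 1]
```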

sns.jointplot - This command is useful for simultaneously displaying a 2D KDE plot with the corresponding KDE plots for each individual variable.


Reposted from: https://blog.csdn.net/weixin_52205764/article/details/128508108