
Python Data Analysis Packages | Pandas-04: A Roundup of pandas Learning Resources

pythonic生物人 2022-09-11

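Post no. 83 from "pythonic生物人"

This post collects learning resources for Python pandas; take what you need.

Overview of this post

For more posts, follow the WeChat official account: pythonic生物人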

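1. The pandas website (recommended as a lifelong companion)

2. Python Data Science Handbook (recommended for a quick start)
Author bio: Jake VanderPlas
To get started with pandas, read Chapter 3
Where to find the book
English (the original is recommended): https://jakevdp.github.io/PythonDataScienceHandbook/

3. Python for Data Analysis (recommended for beginners and beyond)
Author bio: Wes McKinney
Suggested reading for beginners:
Where to find the book
English original (recommended): https://github.com/wesm/pydata-book
Chinese version: https://github.com/BrambleXu/pydata-notebook

4. The pandas tutorial that I, a mere amateur, put together from the two books and the pandas website (shameless plug)
5. A pile of other resources recommended by the pandas website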

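1. The pandas website (recommended as a lifelong companion)

This is the most complete resource. Use it as a cookbook for lookups rather than reading it from start to finish; every function comes with examples.
URL: https://pandas.pydata.org/pandas-docs/stable/index.html
For example, to check whether a pandas.DataFrame contains missing values (NA), use the isna method; four easy steps and you are done.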
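
As a minimal sketch of what such a check with isna might look like: the DataFrame below, including its column names `gene` and `expr`, is made-up illustration data and does not come from the original post or from the pandas documentation.

```python
import numpy as np
import pandas as pd

# Made-up illustration data (an assumption, not from the original post).
df = pd.DataFrame({
    "gene": ["A", "B", None],
    "expr": [1.5, np.nan, 3.0],
})

# Element-wise boolean mask: True wherever a value is missing.
mask = df.isna()

# Does the DataFrame contain any missing value at all?
has_na = bool(mask.any().any())

# Number of missing values in each column.
na_per_column = mask.sum()

print(has_na)
print(na_per_column)
```

Reducing the mask twice with `any()` answers the yes/no question, while `sum()` on the same mask shows where the missing values are concentrated.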


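Beyond that, I have read the two books below and strongly recommend them. If you only want to learn pandas, read the relevant chapters.

2. Python Data Science Handbook (recommended for a quick start)

  • Subtitle: Essential Tools for Working with Data
  • Chinese title: Python数据科学手册
  • Author bio: Jake VanderPlas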


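Jake VanderPlas is currently director of research in the physical sciences at the University of Washington's eScience Institute. He is an astronomer as well as an accomplished conference speaker, active at PyData conferences over the years, and is especially strong in Python scientific computing and data visualization. Jake has made notable contributions to data visualization: he created the visualization libraries altair, mpld3, and JSAnimation, and has contributed extensively to well-known Python libraries such as NumPy, Scikit-Learn, Scipy, Matplotlib, and IPython. I particularly like a figure from one of his PyData talks, which manages to tie together all the major Python libraries.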

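  • To get started with pandas, read Chapter 3

  • Where to find the book

English (the original is recommended): https://jakevdp.github.io/PythonDataScienceHandbook/

I have copied the table of contents here for you: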

1. IPython: Beyond Normal Python

  • Help and Documentation in IPython
  • Keyboard Shortcuts in the IPython Shell
  • IPython Magic Commands
  • Input and Output History
  • IPython and Shell Commands
  • Errors and Debugging
  • Profiling and Timing Code
  • More IPython Resources

2. Introduction to NumPy

  • Understanding Data Types in Python
  • The Basics of NumPy Arrays
  • Computation on NumPy Arrays: Universal Functions
  • Aggregations: Min, Max, and Everything In Between
  • Computation on Arrays: Broadcasting
  • Comparisons, Masks, and Boolean Logic
  • Fancy Indexing
  • Sorting Arrays
  • Structured Data: NumPy's Structured Arrays

3. Data Manipulation with Pandas

  • Introducing Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Combining Datasets: Concat and Append
  • Combining Datasets: Merge and Join
  • Aggregation and Grouping
  • Pivot Tables
  • Vectorized String Operations
  • Working with Time Series
  • High-Performance Pandas: eval() and query()
  • Further Resources

4. Visualization with Matplotlib

  • Simple Line Plots
  • Simple Scatter Plots
  • Visualizing Errors
  • Density and Contour Plots
  • Histograms, Binnings, and Density
  • Customizing Plot Legends
  • Customizing Colorbars
  • Multiple Subplots
  • Text and Annotation
  • Customizing Ticks
  • Customizing Matplotlib: Configurations and Stylesheets
  • Three-Dimensional Plotting in Matplotlib
  • Geographic Data with Basemap
  • Visualization with Seaborn
  • Further Resources

5. Machine Learning

  • What Is Machine Learning?
  • Introducing Scikit-Learn
  • Hyperparameters and Model Validation
  • Feature Engineering
  • In Depth: Naive Bayes Classification
  • In Depth: Linear Regression
  • In-Depth: Support Vector Machines
  • In-Depth: Decision Trees and Random Forests
  • In Depth: Principal Component Analysis
  • In-Depth: Manifold Learning
  • In Depth: k-Means Clustering
  • In Depth: Gaussian Mixture Models
  • In-Depth: Kernel Density Estimation
  • Application: A Face Detection Pipeline
  • Further Machine Learning Resources

PDF versions are widely available online; if you cannot find one, see the end of this post for how to get it.


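3. Python for Data Analysis (recommended for beginners)

This book is clearly more popular online than the one above, probably because it has more chapters on pandas and covers it more systematically.

  • Subtitle:
  • Chinese title:
  • Author bio: Wes McKinney

Creator of Python pandas (the father of pandas; need I say more?)

  • Suggested reading for beginners:

  • Where to find the book

English original (recommended): https://github.com/wesm/pydata-book

I have copied the chapter list here for you: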

  • Chapter 2: Python Language Basics, IPython, and Jupyter Notebooks
  • Chapter 3: Built-in Data Structures, Functions, and Files
  • Chapter 4: NumPy Basics: Arrays and Vectorized Computation
  • Chapter 5: Getting Started with pandas
  • Chapter 6: Data Loading, Storage, and File Formats
  • Chapter 7: Data Cleaning and Preparation
  • Chapter 8: Data Wrangling: Join, Combine, and Reshape
  • Chapter 9: Plotting and Visualization
  • Chapter 10: Data Aggregation and Group Operations
  • Chapter 11: Time Series
  • Chapter 12: Advanced pandas
  • Chapter 13: Introduction to Modeling Libraries in Python
  • Chapter 14: Data Analysis Examples
  • Appendix A: Advanced NumPy

Chinese version (partial): https://github.com/BrambleXu/pydata-notebook

I have copied its chapter list here as well:

  • Chapter 4: NumPy Basics: Arrays and Vectorized Computation(NumPy基础:数组和向量化计算)
  • Chapter 5: Getting Started with pandas(开始使用pandas)
  • Chapter 7: Data Cleaning and Preparation(数据清洗和准备)
  • Chapter 11: Time Series(时间序列)
  • Chapter 12: Advanced pandas(高级pandas用法)
  • Chapter 14: Data Analysis Examples(数据分析实例)

4. The pandas tutorial that I, a mere amateur, put together from the two books and the pandas website

5. A pile of other resources recommended by the pandas website:

Below are various videos, cheat sheets, and books; use as you see fit.

pandas’ own 10 Minutes to pandas. More complex recipes are in the Cookbook. A handy pandas cheat sheet.

Community guides

pandas Cookbook by Julia Evans

The goal of this 2015 cookbook (by Julia Evans) is to give you some concrete examples for getting started with pandas. These are examples with real-world data, and all the bugs and weirdness that entails. For the table of contents, see the pandas-cookbook GitHub repository.

Learn Pandas by Hernan Rojas

A set of lessons for new pandas users: https://bitbucket.org/hrojas/learn-pandas

Practical data analysis with Python

This guide is an introduction to the data analysis process using the Python data ecosystem and an interesting open dataset. There are four sections covering selected topics such as munging data, aggregating data, visualizing data and time series.

Exercises for new users

Practice your skills with real data sets and exercises. For more resources, please visit the main repository.

Modern pandas

Tutorial series written in 2016 by Tom Augspurger. The source may be found in the GitHub repository TomAugspurger/effective-pandas.

  • Modern Pandas
  • Method Chaining
  • Indexes
  • Performance
  • Tidy Data
  • Visualization
  • Timeseries

Excel charts with pandas, vincent and xlsxwriter

  • Using Pandas and XlsxWriter to create Excel charts

Video tutorials

  • Pandas From The Ground Up (2015) (2:24) GitHub repo
  • Introduction Into Pandas (2016) (1:28) GitHub repo
  • Pandas: .head() to .tail() (2016) (1:26) GitHub repo
  • Data analysis in Python with pandas (2016-2018) GitHub repo and Jupyter Notebook
  • Best practices with pandas (2018) GitHub repo and Jupyter Notebook

Various tutorials

  • Wes McKinney’s (pandas BDFL) blog
  • Statistical analysis made easy in Python with SciPy and pandas DataFrames, by Randal Olson
  • Statistical Data Analysis in Python, tutorial videos, by Christopher Fonnesbeck from SciPy 2013
  • [Financial analysis in Python, by Thomas Wiecki](<http://nbviewer.ipython.org/github/twiecki/financial-analysis-python-tutorial/blob/master/1. Pandas Basics.ipynb>)
  • Intro to pandas data structures, by Greg Reda
  • Pandas and Python: Top 10, by Manish Amde
  • Pandas DataFrames Tutorial, by Karlijn Willems
  • A concise tutorial with real life examples


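Other posts in this series

Python Data Analysis Packages | Pandas-01: DataFrame & Series
Python Data Analysis Packages | Pandas-02: Handling Missing Values (NA)
Python Data Analysis Packages | Pandas-03: Reading and Writing Tabular Data with pandas
Python Data Analysis Packages | Pandas & NumPy Cheat Sheet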
