
Powerful data structures for data analysis and statistics

Project description

pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal.

pandas is well suited for many different kinds of data:

  • Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet

  • Ordered and unordered (not necessarily fixed-frequency) time series data

  • Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels

  • Any other form of observational / statistical data set; the data need not be labeled at all to be placed into a pandas data structure

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.
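As a rough illustration of the two structures described above, the following sketch builds a small Series and DataFrame. It assumes a recent pandas release imported as `pd`; the API of the 0.4.2 release listed on this page may differ, and the labels and values are made up for the example.

```python
import pandas as pd

# A Series is a 1-dimensional array of values with a label for each entry.
prices = pd.Series([10.5, 11.0, 9.75], index=["a", "b", "c"])

# A DataFrame is a 2-dimensional table; each column may hold a different type.
table = pd.DataFrame(
    {
        "name": ["x", "y", "z"],       # strings
        "qty": [1, 2, 3],              # integers
        "price": [10.5, 11.0, 9.75],   # floats
    },
    index=["a", "b", "c"],
)

# Selecting a single column returns a Series that keeps the row labels.
qty = table["qty"]

print(prices["b"])   # look up a value by its label
print(table.shape)   # (number of rows, number of columns)
```

Both objects are indexed by label rather than only by position, which is what the alignment and slicing features listed below build on.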

Here are just a few of the things that pandas does well:

  • Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data

  • Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects

  • Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations

  • Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data

  • Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects

  • Intelligent label-based slicing, fancy indexing, and subsetting of large data sets

  • Intuitive merging and joining of data sets

  • Flexible reshaping and pivoting of data sets

  • Hierarchical labeling of axes (possible to have multiple labels per tick)

  • Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving / loading data from the ultrafast HDF5 format

  • Time series-specific functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging, etc.
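A few of the features listed above (automatic alignment, missing-data handling, and group by) can be sketched as follows. This is a minimal example assuming a recent pandas release imported as `pd`, not necessarily the 0.4.2 API; all labels and values are hypothetical.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Automatic alignment: values are matched by label, not by position.
# Labels present in only one operand ("a" and "d") produce NaN.
total = s1 + s2

# Missing-data handling: replace the NaN entries with a default value.
filled = total.fillna(0.0)

df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})

# Split-apply-combine, aggregating: one sum per group.
sums = df.groupby("key")["value"].sum()

# Split-apply-combine, transforming: each row receives its group's mean,
# so the result has the same length as the original column.
group_means = df.groupby("key")["value"].transform("mean")

print(total)
print(filled)
print(sums)
print(group_means)
```

Aggregation collapses each group to a single value, while a transform returns a result aligned with the original rows.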

Many of these features address shortcomings frequently experienced when using other languages and scientific research environments. For data scientists, working with data is typically divided into multiple stages: munging and cleaning data, analyzing / modeling it, then organizing the results of the analysis into a form suitable for plotting or tabular display. pandas is the ideal tool for all of these tasks.

Note

Windows binaries built against NumPy 1.6.1

Project details



This version

0.4.2

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandas-0.4.2.tar.gz (1.8 MB view details)

Uploaded Source

Built Distributions


pandas-0.4.2.win-amd64-py2.7.exe (741.2 kB view details)

Uploaded Source

pandas-0.4.2.win-amd64-py2.6.exe (741.0 kB view details)

Uploaded Source

pandas-0.4.2.win32-py2.7.exe (644.7 kB view details)

Uploaded Source

pandas-0.4.2.win32-py2.6.exe (644.4 kB view details)

Uploaded Source

pandas-0.4.2.win32-py2.5.exe (509.9 kB view details)

Uploaded Source

File details

Details for the file pandas-0.4.2.tar.gz.

File metadata

  • Download URL: pandas-0.4.2.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pandas-0.4.2.tar.gz
Algorithm Hash digest
SHA256 be6b49c71ab2d46f4ee00b4b58585f5ea7636b06b3c0533a836d603d9623a7eb
MD5 c9011e5490debd00479e31442c313430
BLAKE2b-256 752ced778aaab592f3ec2e9e9361eb9c23b8b63ee35200a6956b0cba8580c0e9

See more details on using hashes here.
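One way to use the published digests is to recompute the SHA256 of a downloaded file and compare it with the value listed above. The sketch below uses only the Python standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for pandas-0.4.2.tar.gz.
EXPECTED_SHA256 = "be6b49c71ab2d46f4ee00b4b58585f5ea7636b06b3c0533a836d603d9623a7eb"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_hash("pandas-0.4.2.tar.gz")` should return `True` for an intact copy of the source distribution saved under that name in the current directory.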

File details

Details for the file pandas-0.4.2.win-amd64-py2.7.exe.

File metadata

File hashes

Hashes for pandas-0.4.2.win-amd64-py2.7.exe
Algorithm Hash digest
SHA256 d1d2cb3b876bc8363e254f37f4e33bde4e1d3cbe82d350f74c784cac3cbe8c4e
MD5 0b6cd81bc5172922e8541745fce3c9bc
BLAKE2b-256 967dbdfdb91f5e9ff7517823f61b338f48a05b63ad0535aba53baac32726dc3d


File details

Details for the file pandas-0.4.2.win-amd64-py2.6.exe.

File metadata

File hashes

Hashes for pandas-0.4.2.win-amd64-py2.6.exe
Algorithm Hash digest
SHA256 49b2bc9ff01698faab2538cc8bfadae43c227d24f536483c147e176a49393c1e
MD5 4f4777a88b9d187f7aab29851292d838
BLAKE2b-256 5d6a9d8a796c5f737833f8e3bd798b20ff9d4b87a7241b5492d5e2cd69fa3b53


File details

Details for the file pandas-0.4.2.win32-py2.7.exe.

File metadata

File hashes

Hashes for pandas-0.4.2.win32-py2.7.exe
Algorithm Hash digest
SHA256 906fbf32cf011638f4da96cd62d7b33296ddb8d82f04d8ce4d0b303342780622
MD5 6c2bbd52b6a6e74952f1e30c80264c85
BLAKE2b-256 e611eda877cb1adec38dc8edc44a1eff6deb8a37f4f7670db11d0e1f8aaaca9d


File details

Details for the file pandas-0.4.2.win32-py2.6.exe.

File metadata

File hashes

Hashes for pandas-0.4.2.win32-py2.6.exe
Algorithm Hash digest
SHA256 96f057a95d936fff85b3635886f780ceb1a5ecf77d3d50d4476064c87a00e223
MD5 2fff0b7a6730ab32408d316fd9e6a85a
BLAKE2b-256 3de532405d77b14aea616b2496c153de68ed6046f9771f0192bb9e6d297f5354


File details

Details for the file pandas-0.4.2.win32-py2.5.exe.

File metadata

File hashes

Hashes for pandas-0.4.2.win32-py2.5.exe
Algorithm Hash digest
SHA256 10efe94342b1f73274ffeca7c7810dd609779bb9e36bbc66ab2ae520cec8d9e1
MD5 2a746efbdf80320be96393322b8e9ac9
BLAKE2b-256 e334dc80c940b4ff48b353cab68fdb83551d8b8e591de2fab5c71c1d78e80b9c

