Before we start: This Python tutorial is part of our series of Python package tutorials, which also covers other NumPy-related topics.

NumPy (Numerical Python) is one of the most commonly used packages for scientific computing in Python. It provides a multidimensional array object (the n-dimensional array, or ndarray), as well as variants such as masked arrays and matrices, which support a wide range of mathematical operations on numerical data types (dtypes). NumPy is compatible with, and used by, many other popular Python packages, including pandas and matplotlib.
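As a minimal sketch of the objects mentioned above (the variable names and sample values are illustrative assumptions, not part of the original tutorial), the following creates an ndarray, requests an explicit dtype, and builds a masked array:

```python
import numpy as np

# A 2-D ndarray built from nested lists.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim, grid.shape)  # 2 (2, 3)

# The dtype can be requested explicitly instead of being inferred.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64

# A masked array hides the entries whose mask value is True.
readings = np.ma.masked_array(
    [1.0, 2.0, 3.0, 4.0],
    mask=[False, True, False, False],
)
mean_unmasked = readings.mean()  # averages 1.0, 3.0 and 4.0 only
```

The masked entry (2.0) is skipped by the mean, which is the typical reason to reach for a masked array when some values are missing or invalid.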

Why is NumPy such a popular Python library, even with beginners? Quite simply, because it is faster than regular Python lists. Lists lack NumPy’s optimized, pre-compiled C code, which does the heavy lifting when creating, transposing, iterating over, and reshaping arrays. NumPy arrays store elements of a single dtype (for example, integers, floats, or booleans), which is part of what makes this speed possible. Another reason is that NumPy array and arithmetic operations are vectorized, which means the code needs no explicit looping or indexing. This makes the code not only more readable, but also closer to standard mathematical notation.
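The array operations named above can be sketched in a few lines. This is an illustrative example only; the names values, table, flipped, and doubled are assumptions chosen for the demonstration:

```python
import numpy as np

values = np.arange(6)         # one-dimensional array: 0, 1, 2, 3, 4, 5
table = values.reshape(2, 3)  # same data viewed as 2 rows, 3 columns
flipped = table.T             # transpose: 3 rows, 2 columns

# Arithmetic applies to every element without an explicit loop.
doubled = table * 2

print(flipped.shape)     # (3, 2)
print(doubled.tolist())  # [[0, 2, 4], [6, 8, 10]]
```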

The following example illustrates the vectorization difference between standard Python and the numpy library.

For two arrays a and b of the same size, an element-wise multiplication in standard Python requires an explicit loop:

# a and b are sequences of equal length
c = []
for i in range(len(a)):
    c.append(a[i] * b[i])

In numpy, if a and b are ndarrays, this can be done with a single line of code:

c = a*b
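To make the comparison runnable end to end, here is a small self-contained sketch. The sample values and the names a_list, b_list, and c_loop are assumptions added for illustration; both versions should produce the same products:

```python
import numpy as np

a_list = [1, 2, 3]
b_list = [4, 5, 6]

# Standard Python: explicit loop over the indices.
c_loop = []
for i in range(len(a_list)):
    c_loop.append(a_list[i] * b_list[i])

# NumPy: the same element-wise product, with the loop
# handled inside NumPy's compiled code.
a = np.array(a_list)
b = np.array(b_list)
c = a * b

print(c_loop)      # [4, 10, 18]
print(c.tolist())  # [4, 10, 18]
```

Note that the * operator on two plain Python lists raises a TypeError, so the one-line form only works once the data is held in ndarrays.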

Numpy makes many mathematical functions used widely in scientific computing fast and easy to use, such as:

  • Vector-Vector multiplication
  • Matrix-matrix and matrix-vector multiplication
  • Element-wise operations on vectors and matrices (e.g., adding, subtracting, multiplying, and dividing by a number)
  • Element-wise or array-wise comparisons
  • Applying functions element-wise to a vector or matrix (such as pow, log, and exp)
  • A wide range of linear algebra operations, found in numpy.linalg
  • Reduction, statistics, and much more
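
The operations in the list above might look like this in practice. This is a hedged sketch with made-up sample data (v, w, and m are illustrative names), showing one common way to express each item rather than the only one:

```python
import numpy as np

v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0])
m = np.array([[2.0, 0.0], [0.0, 4.0]])

dot_vw = v @ w               # vector-vector product: 1*3 + 2*4 = 11.0
m_times_v = m @ v            # matrix-vector product: [2.0, 8.0]
m_times_m = m @ m            # matrix-matrix product: [[4, 0], [0, 16]]
scaled = m / 2               # element-wise division by a number
larger = v > 1.5             # element-wise comparison: [False, True]
same = np.array_equal(v, w)  # array-wise comparison: False
squares = np.power(v, 2)     # element-wise power: [1.0, 4.0]
m_inv = np.linalg.inv(m)     # linear algebra: inverse of m
total = m.sum()              # reduction: 6.0
average = m.mean()           # statistics: 1.5
```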
 

The other tutorials in this series provide step-by-step instructions on how to work with NumPy.

Get a version of Python that’s pre-compiled for Data Science

While the open source distribution of Python may be satisfactory for an individual, it doesn’t always meet the support, security, or platform requirements of large organizations.

This is why organizations choose ActivePython for their data science, big data processing and statistical analysis needs.

Pre-bundled with the most important packages Data Scientists need, ActivePython is pre-compiled so you and your team don’t have to waste time configuring the open source distribution. You can focus on what’s important: spending more time building algorithms and predictive models against your big data sources, and less time on system configuration.

Some Popular Python Packages for Data Science/Big Data/Machine Learning You Get Pre-compiled – with ActivePython

  • pandas (data analysis)
  • NumPy (multi-dimensional arrays)
  • SciPy (algorithms to use with numpy)
  • HDF5 (store & manipulate data)
  • Matplotlib (data visualization)
  • Jupyter (research collaboration)
  • PyTables (managing HDF5 datasets)
  • HDFS (C/C++ wrapper for Hadoop)
  • pymongo (MongoDB driver)
  • SQLAlchemy (Python SQL Toolkit)