Pandas Series

A Pandas Series is a one-dimensional labeled array. It can hold values of any type (mixed types are stored with the object dtype). We can create a Series in three different ways:

  • With a Python dictionary – the dictionary keys become the index (in insertion order in recent Pandas versions; older versions sorted the keys)
  • With a NumPy array – unless we pass an index, the index values are integers counting up from 0
  • With a single scalar value – the value is repeated for every label in the index we supply

Note:
When creating a Series, we can hand the constructor a list of axis labels (index). The index is an optional parameter.
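
The three construction routes listed above can be sketched as follows. This is a minimal illustration, not taken from the data file used in this article; the variable names and values are made up for the example.

```python
import numpy as np
import pandas as pd

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({"b": 2, "a": 1, "c": 3})

# From a NumPy array: with no index given, labels 0, 1, 2, ... are generated.
from_array = pd.Series(np.array([10.0, 20.0, 30.0]))

# From a scalar: the value is repeated once per supplied index label.
from_scalar = pd.Series(7, index=["x", "y", "z"])

print(from_dict)
print(from_array)
print(from_scalar)
```

Passing index=... to any of the three calls overrides the default labels.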

Example:

Selecting the first column (Country) in the datafile

First we select the Country column, which is the first column of the DataFrame df, and then print the types of df and of the selected column.

country_col = df["Country"]
print("Type df", type(df))
print("Type country col", type(country_col))

The output will be:

Type df <class 'pandas.core.frame.DataFrame'>
Type country col <class 'pandas.core.series.Series'>

Attributes of Pandas Series

print("Series shape", country_col.shape)
print("Series index", country_col.index)
print("Series values", country_col.values)
print("Series name", country_col.name)

And the output will be:

Series shape (202,)
Series index RangeIndex(start=0, stop=202, step=1)
Series values ['Afghanistan' 'Albania' 'Algeria' 'Andorra' 'Angola' 'Antigua and Barbuda'
 'Argentina' 'Armenia' 'Australia' 'Austria' 'Azerbaijan' 'Bahamas'
 'Bahrain' 'Bangladesh' 'Barbados' 'Belarus' 'Belgium' 'Belize' 'Benin'
...
 'Trinidad and Tobago' 'Tunisia' 'Turkey' 'Turkmenistan' 'Tuvalu' 'Uganda'
 'Ukraine' 'United Arab Emirates' 'United Kingdom'
 'United States of America' 'Uruguay' 'Uzbekistan' 'Vanuatu' 'Venezuela'
 'Vietnam' 'West Bank and Gaza' 'Yemen' 'Zambia' 'Zimbabwe']
Series name Country

Slicing of a Pandas Series

print("Last 2 countries", country_col[-2:])
print("Last 2 countries type", type(country_col[-2:]))

And the output will be:

Last 2 countries 200      Zambia
201    Zimbabwe
Name: Country, dtype: object
Last 2 countries type <class 'pandas.core.series.Series'>
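
As the output above shows, slicing a Series returns another Series that keeps the original labels. The sketch below compares the plain slice with the explicit iloc (by position) and loc (by label) accessors. The Series s and its labels are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical Series with string labels instead of the default integers.
s = pd.Series(
    ["Yemen", "Zambia", "Zimbabwe"],
    index=["YE", "ZM", "ZW"],
    name="Country",
)

last_two = s[-2:]             # integer slice: positional, like a Python list
same_by_iloc = s.iloc[-2:]    # explicit positional slice
by_label = s.loc["ZM":"ZW"]   # label slice: both endpoints are included

print(last_two)
print(type(last_two))
```

Using iloc or loc makes it explicit whether positions or labels are meant, which matters most when the index itself contains integers.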

NumPy functions on Pandas DataFrames and Series

NumPy functions can operate on Pandas DataFrames and Series. As an example, we apply the NumPy sign() function, which returns 1 for positive numbers, -1 for negative numbers, and 0 for zeros. Missing values (NaN) stay NaN, as row 3 of the output shows.

Example:

import numpy as np

last_col = df.columns[-1]  # name of the last column in df
print("Last df column signs:\n", last_col, np.sign(df[last_col]), "\n")

And the output will be:

Last df column signs:
 Urban_population_pct_of_total 0      1.0
1      1.0
2      1.0
3      NaN
4      1.0
      ... 
198    1.0
199    1.0
200    1.0
201    1.0
Name: Urban_population_pct_of_total, Length: 202, dtype: float64
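
The same behaviour can be checked on a small Series without the data file. The values below are invented for the illustration and cover all four cases: positive, negative, zero, and missing.

```python
import numpy as np
import pandas as pd

# Made-up values: positive, negative, zero, and a missing entry.
values = pd.Series([22.9, -3.5, 0.0, np.nan], name="example")

# np.sign works element-wise and hands back a Series,
# so the index and the name are preserved.
signs = np.sign(values)

print(signs)
print(type(signs))
```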

To print only the values of the last column, without the index and name, use the values attribute:

print(df[last_col].values)

The output will be:

[  22.9   45.4   63.3    nan   53.3   39.1   90.1   64.1   88.2   66.
   51.5   90.4   96.5   25.1   52.7   72.2   97.2   48.3   40.1  100.
   11.1   64.2   45.7   57.4   84.2   73.5   70.    18.3   10.    19.7   
...
   40.8   73.9   24.1   84.2   75.2   50.6    nan   24.7   24.2   32.3
    nan   40.1   24.    12.2   65.3   67.3   46.2    nan   12.6   67.8
   76.7   89.7   80.8   92.    36.7   23.5   93.4   26.4   71.6   27.3
   35.    35.9]

Note:
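Arithmetic between DataFrames, Series, and NumPy arrays works element-wise with the usual operators (+, -, *, /). When both operands are Pandas objects, values are matched by index label rather than by position.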
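
A short sketch of such operations follows. The Series names, labels, and numbers are assumptions made for the example and do not come from the data file.

```python
import numpy as np
import pandas as pd

# Hypothetical percentages with made-up labels.
urban_pct = pd.Series([22.9, 45.4, 63.3], index=["A", "B", "C"])

# Series and scalar: the scalar is applied to every element.
as_fraction = urban_pct / 100

# Series and NumPy array of the same length: combined position by position,
# and the result keeps the index of the Series.
offsets = np.array([1.0, 2.0, 3.0])
shifted = urban_pct + offsets

# Series and Series: values are matched by label.
# Label "A" is missing from `other`, so the result there is NaN.
other = pd.Series([10.0, 20.0], index=["B", "C"])
aligned = urban_pct + other

print(as_fraction)
print(shifted)
print(aligned)
```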
