Python is an open source, general-purpose programming language used in many different applications. A large community backs it, which has helped make it very popular. Much of its power comes from the large number of libraries available for it, a number that keeps growing through the community.

The main downside of the language is that it is interpreted at run time, so programs generally need more processing power than equivalent compiled code. However, Python is easy to learn and quick to write, and the programming time saved often compensates for the slower execution.


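Python was developed by Guido van Rossum in the Netherlands, with the first release in the early nineties. Python is copyrighted. Its source code is available under the Python Software Foundation License, an open source license that is compatible with the GNU General Public License (GPL).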

  • Easy to Read-Learn-Maintain
  • Supports functional and structured programming methods as well as OOP
  • Can be used as a scripting language
  • Supports automatic garbage collection
  • Provides high level dynamic data types and supports dynamic data type checking
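A few of the features listed above can be sketched in code. The example below is only an illustration: the names total_structured, total_functional and Accumulator are made up for this post and are not part of Python itself.

```python
from functools import reduce

# Dynamic typing: the same name can refer to values of different types.
value = 42
first_type = type(value).__name__    # 'int'
value = 'forty-two'
second_type = type(value).__name__   # 'str'

# Structured style: a plain function with a loop.
def total_structured(numbers):
    total = 0
    for n in numbers:
        total += n
    return total

# Functional style: the same sum expressed with reduce and a lambda.
def total_functional(numbers):
    return reduce(lambda a, b: a + b, numbers, 0)

# Object-oriented style: a small class that keeps a running total.
class Accumulator:
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n

acc = Accumulator()
for n in [1, 2, 3]:
    acc.add(n)

print(first_type, second_type)
print(total_structured([1, 2, 3]), total_functional([1, 2, 3]), acc.total)
```

All three styles give the same total, so which one to use is mostly a matter of readability for the problem at hand.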

Setting up Environment:

There are two ways to set up the Python environment on a local machine for practicing the language.

The second option is much simpler and very easy to use. I have installed Anaconda and use Jupyter Notebook for writing Python code.

Warm up:

  • Hello World !

If you have installed Jupyter along with Python using Anaconda, open Jupyter Notebook, which opens in a web browser.

Type the following:

print ('Hello Python !')

Click on the execute button at the top or use the keyboard shortcut Shift + Enter.

  • Python is case sensitive, hence Python and python are two different identifiers
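As a small illustration of case sensitivity, the two variable names below differ only in the case of the first letter, yet Python keeps them as separate variables. The values assigned are arbitrary.

```python
# Two names that differ only by case are two different variables.
python = 'lowercase name'
Python = 'capitalised name'

print(python)
print(Python)
print(python == Python)   # False, the two variables hold different values
```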

Lines and Indentation:

Python does not use braces to indicate blocks of code or flow control. Blocks are indicated by line indentation instead.


if True:
      print ('First')
      print ('True')
      print ('Second')
      print ('False')
print ('Program Ends !')

However, the code below will generate an error:

if True:
	print ('First')
        print ('True')
	print ('Second')
       	print ('False')

This is because every statement within a block must be indented by the same amount. Using different numbers of spaces, or mixing tabs and spaces, within one block is an error.
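One way to see the indentation rule in action without crashing the notebook is to hand the source text to the built-in compile() function and catch the error. This is only a sketch: check_indentation is a helper name invented for this post.

```python
# Source text with consistent indentation inside the block.
good_source = "if True:\n    print('First')\n    print('Second')\n"

# Source text where the second statement is indented further than the first.
bad_source = "if True:\n    print('First')\n      print('Second')\n"

def check_indentation(source):
    """Return 'ok' if the source compiles, otherwise describe the error."""
    try:
        compile(source, '<example>', 'exec')
        return 'ok'
    except IndentationError as err:
        return 'IndentationError: ' + err.msg

print(check_indentation(good_source))
print(check_indentation(bad_source))
```

The first call reports ok, while the second reports an IndentationError because the two statements in the block do not line up.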


Python accepts single ('), double (") and triple (''' or """) quotes to denote strings. Triple quotes are usually used for strings that span multiple lines.

word = 'World'
line = "This is double quotes"
multi_line = """This is being used to
demonstrate multi-line sentences using triple quotes"""


To comment a line, Python uses the hash symbol (#), and anything after the hash on the same line is ignored. There is no separate multi-line comment symbol in Python, so a comment that spans several lines needs one hash per line.

#This is a comment
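A short sketch of the comment rules described above. The variable names radius and label are arbitrary and only serve the example.

```python
# This comment spans
# several lines, and each line
# starts with its own hash symbol.
radius = 2            # an inline comment after a statement
label = 'Item #1'     # a hash inside quotes is part of the string, not a comment

print(radius)
print(label)
```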

Python Data Structures:

Below are some of the data structures used in Python. They are very important to understand before continuing with Data Science learning.

A list contains items separated by commas and enclosed in square brackets ([ ]). The items in a list can be of different data types.

The values in a list can be accessed using the slice operator, with indexes starting at 0 at the beginning of the list and running up to the length of the list minus 1 for the last item. The plus (+) sign is the list concatenation operator, and the asterisk (*) is the repetition operator.


list = [12, 48, 33, 'Paul', 90, 'Harris']   # Note: the name "list" hides the built-in list type, so use it only in a short demo
list_1 = ['Ram', 121]
print (list)          # Prints the complete list
print (list[0])       # Prints 12 (first element of the list)
print (list[1:3])     # Prints [48, 33] (2nd and 3rd elements)
print (list[2:])      # Prints [33, 'Paul', 90, 'Harris'] (from the 3rd element to the end)
print (list_1 * 2)    # Prints ['Ram', 121, 'Ram', 121] (list repeated two times)
print (list + list_1) # Prints [12, 48, 33, 'Paul', 90, 'Harris', 'Ram', 121] (concatenated lists)
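Indexes can also be counted from the end of a list using negative numbers, and a slice can take a step. The sketch below uses the name items instead of list so that the built-in list type stays available.

```python
items = [12, 48, 33, 'Paul', 90, 'Harris']

print(len(items))    # 6
print(items[-1])     # Harris (last element)
print(items[-2])     # 90 (second element from the end)
print(items[:2])     # [12, 48] (first two elements)
print(items[-3:])    # ['Paul', 90, 'Harris'] (last three elements)
print(items[::2])    # [12, 33, 90] (every second element)
```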


Tuples are another sequence data type similar to lists. A tuple consists of a number of values, possibly of different data types, separated by commas. A tuple is written and displayed within parentheses ( ), unlike a list, which uses square brackets [ ].

The main difference is that tuples are immutable, which means we cannot assign, modify or change their values. They are used to hold data that does not change during the program.

The following works:

tuple = (12, 48, 33, 'Paul', 90, 'Harris')
tuple_1 = ('Ram', 121)
print (tuple)            # Prints the complete tuple
print (tuple[0])         # Prints 12 (first element of the tuple)
print (tuple[1:3])       # Prints (48, 33) (2nd and 3rd elements)
print (tuple[2:])        # Prints (33, 'Paul', 90, 'Harris') (from the 3rd element to the end)
print (tuple_1 * 2)      # Prints ('Ram', 121, 'Ram', 121) (tuple repeated two times)
print (tuple + tuple_1)  # Prints (12, 48, 33, 'Paul', 90, 'Harris', 'Ram', 121) (concatenated tuples)


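The following assignment works for the list but not for the tuple:

list[3] = 99     # Works: lists are mutable
print (list)     # Prints [12, 48, 33, 99, 90, 'Harris']
tuple[3] = 99    # Fails with a TypeError: tuples do not support item assignment
print (tuple)    # Never reached because of the error above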
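To see the immutability of tuples without stopping the notebook on the error, the assignment can be wrapped in try/except. This is a minimal sketch, and the names point and updated are illustrative only.

```python
point = (12, 48, 33)

# Assigning to an element of a tuple raises a TypeError.
try:
    point[0] = 99
except TypeError as err:
    message = str(err)
    print('Assignment failed:', message)

# To "change" a value, build a new tuple from pieces of the old one.
updated = (99,) + point[1:]
print(updated)   # (99, 48, 33)
print(point)     # (12, 48, 33), the original tuple is untouched
```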

Dictionaries are like associative arrays and consist of key-value pairs. A dictionary is enclosed in curly braces ({ }), and values can be assigned and accessed using square brackets ([ ]).


dictionary = {}
dictionary['one'] = "This is one"
dictionary[2] = "This is two"
tinydict = {'name': 'harris','code':3433, 'dept': 'IT'}
print (dictionary)              # Prints all the key-value pairs
print (dictionary['one'])       # Prints value for 'one' key
print (dictionary[2])           # Prints value for 2 key
print (tinydict)                # Prints complete dictionary
print (tinydict.keys())         # Prints all the keys
print (tinydict.values())       # Prints all the values
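A few more everyday dictionary operations, sketched with a made-up employee record. The get() method is handy because it returns a default value instead of raising an error when a key is missing.

```python
employee = {'name': 'harris', 'code': 3433, 'dept': 'IT'}

# Loop over the key-value pairs.
for key, value in employee.items():
    print(key, '->', value)

print('dept' in employee)          # True (membership test looks at keys)
print(employee.get('salary'))      # None (key is missing, no error raised)
print(employee.get('salary', 0))   # 0 (explicit default for a missing key)

employee['dept'] = 'Finance'       # Update the value for an existing key
del employee['code']               # Remove a key-value pair
print(employee)                    # {'name': 'harris', 'dept': 'Finance'}
```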

Python Iteration:

  • While Loop: This executes a target statement repeatedly as long as the condition is true


count = 0
while (count < 9):
    print ('The count is:', count)
    count = count + 1
print ('Good bye!')

As you can see in the above code it identifies the code block by indentation (number of spaces)
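The condition of a while loop is checked before every pass, so the loop ends as soon as the condition becomes false, and it does not run at all if the condition is false to begin with. A small sketch, with the names seen and runs chosen only for this example:

```python
count = 0
seen = []

# Runs while count is 0, 1 and 2, then stops when count reaches 3.
while count < 3:
    seen.append(count)
    count = count + 1

print(seen)    # [0, 1, 2]
print(count)   # 3

# The condition is already false here, so the body never executes.
runs = 0
while count < 3:
    runs = runs + 1

print(runs)    # 0
```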

  • For Loop: This has the ability to iterate over the items in any sequence, such as a list, a tuple or a string.

for letter in 'Python': # First Example
    print ('Current Letter :', letter)

colors = ['red','orange','green','purple']
for col in colors:
    print ('Current Color: ', col)


Current Letter : P
Current Letter : y
Current Letter : t
Current Letter : h
Current Letter : o
Current Letter : n
Current Color:  red
Current Color:  orange
Current Color:  green
Current Color:  purple
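The same for loop works on a tuple as well. When the position of each item is also needed, the built-in enumerate() function supplies it. The tuple and the lengths list below are made up for this sketch.

```python
colors = ('red', 'orange', 'green')

# Iterate over a tuple exactly as over a list or a string.
for col in colors:
    print('Current Color:', col)

# enumerate() yields the index together with each item.
for index, col in enumerate(colors):
    print(index, col)

# Collect a result while looping.
lengths = []
for col in colors:
    lengths.append(len(col))

print(lengths)   # [3, 6, 5]
```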

Python Numbers:

Python 3 supports three numeric data types: integers (int), floating point real values (float) and complex numbers (complex). Python 2 also had a separate long integer type (long); in Python 3 it is merged into int, which can hold integers of any size.

While learning, we will mostly use the int and float data types.

We do not need to explicitly specify the data type in Python. When a value is assigned to a variable, Python automatically creates the variable with the data type required to hold that value.


n_int = 10
n_float = 10.23
print ('Data Type for n_int: ', type(n_int))
print ('Data Type for n_float: ', type(n_float))


Data Type for n_int:  <class 'int'>
Data Type for n_float:  <class 'float'>
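A short sketch of how the numeric types behave together in Python 3. The values are arbitrary and chosen so that the results are easy to check by hand.

```python
n_int = 10
n_float = 10.23
n_complex = 3 + 4j
big = 2 ** 100                           # int has no fixed size limit in Python 3

print(type(big).__name__)                # int
print(type(n_complex).__name__)          # complex
print(type(n_int + n_float).__name__)    # float (int is widened to float)

print(7 / 2)             # 3.5 (true division always gives a float)
print(7 // 2)            # 3 (floor division)
print(int(n_float))      # 10 (conversion drops the fractional part)
print(float(n_int))      # 10.0
print(abs(n_complex))    # 5.0 (magnitude of 3 + 4j)
```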

Python Strings:

Strings are among the most popular and frequently used types in Python. A string can be created simply by enclosing characters in single or double quotes. Since strings are sequences, like lists, they can be indexed and sliced as described earlier for lists.


word = 'this is python string'
print ('String Entered: ', word)
print ('First 7 characters of String Entered: ', word[0:7])
print ('String modified: ', word + ' modified')


String Entered:  this is python string
First 7 characters of String Entered:  this is
String modified:  this is python string modified
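Strings can be sliced like lists, but unlike lists they are immutable, so a changed string is always a new string. The sketch below also shows a couple of commonly used string methods. The name capitalised is chosen only for this example.

```python
word = 'this is python string'

print(len(word))       # 21
print(word[-6:])       # string (last six characters)
print(word.upper())    # THIS IS PYTHON STRING
print(word.split())    # ['this', 'is', 'python', 'string']

# Assigning to a single character fails because strings are immutable.
try:
    word[0] = 'T'
except TypeError:
    print('Strings cannot be changed in place')

# Build a new string instead.
capitalised = 'T' + word[1:]
print(capitalised)     # This is python string
```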

These are some basics of the Python language that are useful for further coding in Data Science.

Practice, practice, practice and more practice….

There is no use in learning if you don't practice what you learn. I would encourage you to start writing some basic Python programs. Below is a good site to start putting your coding skills to practice.

I will write another post on Data Exploration using Python.

-Hari Mindi

