Notice: session_start(): A session had already been started - ignoring in /home/u614325693/domains/onedoubt.com/public_html/post.php on line 3
HTML Elements - ONEDOUBT.COM

HTML Elements


An HTML element is defined by a start tag, some content, and an end tag.


HTML Elements

The HTML element is everything from the start tag to the end tag:

<tagname>Content goes here...</tagname>

Examples of some HTML elements:

<h1>My First Heading</h1>
<p>My first paragraph.</p>
Start tag Element content End tag
<h1> My First Heading </h1>
<p> My first paragraph. </p>
<br> none none

Note: Some HTML elements have no content (like the <br> element). These elements are called empty elements. Empty elements do not have an end tag!


Nested HTML Elements

HTML elements can be nested (this means that elements can contain other elements).

All HTML documents consist of nested HTML elements.

The following example contains four HTML elements (<html>, <body>, <h1> and <p>):

Example

<!DOCTYPE html>
<html>
<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>
</html>

Example Explained

The <html> element is the root element and it defines the whole HTML document.

It has a start tag <html> and an end tag </html>.

Then, inside the <html> element there is a <body> element:

<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>

The <body> element defines the document's body.

It has a start tag <body> and an end tag </body>.

Then, inside the <body> element there are two other elements: <h1> and <p>:

<h1>My First Heading</h1>
<p>My first paragraph.</p>

The <h1> element defines a heading.

It has a start tag <h1> and an end tag </h1>:

<h1>My First Heading</h1>

The <p> element defines a paragraph.

It has a start tag <p> and an end tag </p>:

<p>My first paragraph.</p>


Never Skip the End Tag

Some HTML elements will display correctly, even if you forget the end tag:

Example

<html>
<body>

<p>This is a paragraph
<p>This is a paragraph

</body>
</html>

However, never rely on this! Unexpected results and errors may occur if you forget the end tag!


Empty HTML Elements

HTML elements with no content are called empty elements.

The <br> tag defines a line break, and is an empty element without a closing tag:

Example

<p>This is a <br> paragraph with a line break.</p>

HTML is Not Case Sensitive

HTML tags are not case sensitive: <P> means the same as <p>.

The HTML standard does not require lowercase tags, but W3C recommends lowercase in HTML, and demands lowercase for stricter document types like XHTML.

At W3Schools we always use lowercase tag names.


HTML Tag Reference

W3Schools' tag reference contains additional information about these tags and their attributes.

Tag Description
Defines the root of an HTML document
Defines the document's body
Defines HTML headings

For a complete list of all available HTML tags, visit our .


Python Tutorial

Learn Python

Python is a popular programming language.

Python can be used on a server to create web applications.


Learning by Examples

With our "Try it Yourself" editor, you can edit Python code and view the result.

Example

print("Hello, World!")

Click on the "Try it Yourself" button to see how it works.


Python File Handling

In our File Handling section you will learn how to open, read, write, and delete files.


Python Database Handling

In our database section you will learn how to access and work with MySQL and MongoDB databases:


Python Introduction


What is Python?

Python is a popular programming language. It was created by Guido van Rossum, and released in 1991.

It is used for:

  • web development (server-side),
  • software development,
  • mathematics,
  • system scripting.

What can Python do?

  • Python can be used on a server to create web applications.
  • Python can be used alongside software to create workflows.
  • Python can connect to database systems. It can also read and modify files.
  • Python can be used to handle big data and perform complex mathematics.
  • Python can be used for rapid prototyping, or for production-ready software development.

Why Python?

  • Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
  • Python has a simple syntax similar to the English language.
  • Python has syntax that allows developers to write programs with fewer lines than some other programming languages.
  • Python runs on an interpreter system, meaning that code can be executed as soon as it is written. This means that prototyping can be very quick.
  • Python can be treated in a procedural way, an object-oriented way or a functional way.

Good to know

  • The most recent major version of Python is Python 3, which we shall be using in this tutorial. However, Python 2, although not being updated with anything other than security updates, is still quite popular.
  • In this tutorial Python will be written in a text editor. It is possible to write Python in an Integrated Development Environment, such as Thonny, Pycharm, Netbeans or Eclipse which are particularly useful when managing larger collections of Python files.

Python Syntax compared to other programming languages

  • Python was designed for readability, and has some similarities to the English language with influence from mathematics.
  • Python uses new lines to complete a command, as opposed to other programming languages which often use semicolons or parentheses.
  • Python relies on indentation, using whitespace, to define scope; such as the scope of loops, functions and classes. Other programming languages often use curly-brackets for this purpose.

Example

print("Hello, World!")

Python Getting Started


Python Install

Many PCs and Macs will have python already installed.

To check if you have python installed on a Windows PC, search in the start bar for Python or run the following on the Command Line (cmd.exe):

C:UsersYour Name>python --version

To check if you have python installed on a Linux or Mac, then on linux open the command line or on Mac open the Terminal and type:

python --version

If you find that you do not have Python installed on your computer, then you can download it for free from the following website:


Python Quickstart

Python is an interpreted programming language, this means that as a developer you write Python (.py) files in a text editor and then put those files into the python interpreter to be executed.

The way to run a python file is like this on the command line:

C:UsersYour Name>python helloworld.py

Where "helloworld.py" is the name of your python file.

Let's write our first Python file, called helloworld.py, which can be done in any text editor.

helloworld.py

print("Hello, World!")

Simple as that. Save your file. Open your command line, navigate to the directory where you saved your file, and run:

C:UsersYour Name>python helloworld.py

The output should read:

Hello, World!

Congratulations, you have written and executed your first Python program.



The Python Command Line

To test a short amount of code in python sometimes it is quickest and easiest not to write the code in a file. This is made possible because Python can be run as a command line itself.

Type the following on the Windows, Mac or Linux command line:

C:UsersYour Name>python
Or, if the "python" command did not work, you can try "py":
C:UsersYour Name>py

From there you can write any python, including our hello world example from earlier in the tutorial:

C:UsersYour Name>python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> print("Hello, World!")

Which will write "Hello, World!" in the command line:

C:UsersYour Name>python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello, World!")

Hello, World!

Whenever you are done in the python command line, you can simply type the following to quit the python command line interface:

exit()

Python Syntax


Execute Python Syntax

As we learned in the previous page, Python syntax can be executed by writing directly in the Command Line:

>>> print("Hello, World!")
Hello, World!

Or by creating a python file on the server, using the .py file extension, and running it in the Command Line:

C:UsersYour Name>python myfile.py

Python Indentation

Indentation refers to the spaces at the beginning of a code line.

Where in other programming languages the indentation in code is for readability only, the indentation in Python is very important.

Python uses indentation to indicate a block of code.

Example

if 5 > 2:
  print("Five is greater than two!")

Python will give you an error if you skip the indentation:

Example

Syntax Error:

if 5 > 2:
print("Five is greater than two!")

The number of spaces is up to you as a programmer, the most common use is four, but it has to be at least one.

Example

if 5 > 2:
 print("Five is greater than two!") 
if 5 > 2:
        print("Five is greater than two!") 

You have to use the same number of spaces in the same block of code, otherwise Python will give you an error:

Example

Syntax Error:

if 5 > 2:
 print("Five is greater than two!")
        print("Five is greater than two!")





Python Variables

In Python, variables are created when you assign a value to it:

Example

Variables in Python:

x = 5
y = "Hello, World!"

Python has no command for declaring a variable.

You will learn more about variables in the chapter.


Comments

Python has commenting capability for the purpose of in-code documentation.

Comments start with a #, and Python will render the rest of the line as a comment:

Example

Comments in Python:

#This is a comment.
print("Hello, World!")

Test Yourself With Exercises

Exercise:

Insert the missing part of the code below to output "Hello World".

("Hello World")


Python Comments


Comments can be used to explain Python code.

Comments can be used to make the code more readable.

Comments can be used to prevent execution when testing code.


Creating a Comment

Comments starts with a #, and Python will ignore them:

Example

#This is a comment
print("Hello, World!")

Comments can be placed at the end of a line, and Python will ignore the rest of the line:

Example

print("Hello, World!") #This is a comment

A comment does not have to be text that explains the code, it can also be used to prevent Python from executing code:

Example

#print("Hello, World!")
print("Cheers, Mate!")


Multiline Comments

Python does not really have a syntax for multiline comments.

To add a multiline comment you could insert a # for each line:

Example

#This is a comment
#written in
#more than just one line
print("Hello, World!")

Or, not quite as intended, you can use a multiline string.

Since Python will ignore string literals that are not assigned to a variable, you can add a multiline string (triple quotes) in your code, and place your comment inside it:

Example

"""
This is a comment
written in
more than just one line
"""
print("Hello, World!")

As long as the string is not assigned to a variable, Python will read the code, but then ignore it, and you have made a multiline comment.


Test Yourself With Exercises

Exercise:

Comments in Python are written with a special character, which one?

This is a comment


Python Variables


Variables

Variables are containers for storing data values.


Creating Variables

Python has no command for declaring a variable.

A variable is created the moment you first assign a value to it.

Example

x = 5
y = "John"
print(x)
print(y)

Variables do not need to be declared with any particular type, and can even change type after they have been set.

Example

x = 4       # x is of type int
x = "Sally" # x is now of type str
print(x)

Casting

If you want to specify the data type of a variable, this can be done with casting.

Example

x = str(3)    # x will be '3'
y = int(3)    # y will be 3
z = float(3)  # z will be 3.0


Get the Type

You can get the data type of a variable with the type() function.

Example

x = 5
y = "John"
print(type(x))
print(type(y))
You will learn more about and later in this tutorial.

Single or Double Quotes?

String variables can be declared either by using single or double quotes:

Example

x = "John"
# is the same as
x = 'John'

Case-Sensitive

Variable names are case-sensitive.

Example

This will create two variables:

a = 4
A = "Sally"
#A will not overwrite a

Python Variables


Variables

Variables are containers for storing data values.


Creating Variables

Python has no command for declaring a variable.

A variable is created the moment you first assign a value to it.

Example

x = 5
y = "John"
print(x)
print(y)

Variables do not need to be declared with any particular type, and can even change type after they have been set.

Example

x = 4       # x is of type int
x = "Sally" # x is now of type str
print(x)

Casting

If you want to specify the data type of a variable, this can be done with casting.

Example

x = str(3)    # x will be '3'
y = int(3)    # y will be 3
z = float(3)  # z will be 3.0


Get the Type

You can get the data type of a variable with the type() function.

Example

x = 5
y = "John"
print(type(x))
print(type(y))
You will learn more about and later in this tutorial.

Single or Double Quotes?

String variables can be declared either by using single or double quotes:

Example

x = "John"
# is the same as
x = 'John'

Case-Sensitive

Variable names are case-sensitive.

Example

This will create two variables:

a = 4
A = "Sally"
#A will not overwrite a

Python - Variable Names


Variable Names

A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume). Rules for Python variables:
  • A variable name must start with a letter or the underscore character
  • A variable name cannot start with a number
  • A variable name can only contain alpha-numeric characters and underscores (A-z, 0-9, and _ )
  • Variable names are case-sensitive (age, Age and AGE are three different variables)
  • A variable name cannot be any of the .

Example

Legal variable names:

myvar = "John"
my_var = "John"
_my_var = "John"
myVar = "John"
MYVAR = "John"
myvar2 = "John"

Example

Illegal variable names:

2myvar = "John"
my-var = "John"
my var = "John"

Remember that variable names are case-sensitive



Multi Words Variable Names

Variable names with more than one word can be difficult to read.

There are several techniques you can use to make them more readable:

Camel Case

Each word, except the first, starts with a capital letter:

myVariableName = "John"

Pascal Case

Each word starts with a capital letter:

MyVariableName = "John"

Snake Case

Each word is separated by an underscore character:

my_variable_name = "John"

Python Variables - Assign Multiple Values


Many Values to Multiple Variables

Python allows you to assign values to multiple variables in one line:

Example

x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)

Note: Make sure the number of variables matches the number of values, or else you will get an error.


One Value to Multiple Variables

And you can assign the same value to multiple variables in one line:

Example

x = y = z = "Orange"
print(x)
print(y)
print(z)

Unpack a Collection

If you have a collection of values in a list, tuple etc. Python allows you to extract the values into variables. This is called unpacking.

Example

Unpack a list:

fruits = ["apple", "banana", "cherry"]
x, y, z = fruits
print(x)
print(y)
print(z)

Learn more about unpacking in our Chapter.


Python - Output Variables


Output Variables

The Python print() function is often used to output variables.

Example

x = "Python is awesome"
print(x)

In the print() function, you output multiple variables, separated by a comma:

Example

x = "Python"
y = "is"
z = "awesome"
print(x, y, z)

You can also use the + operator to output multiple variables:

Example

x = "Python "
y = "is "
z = "awesome"
print(x + y + z)

Notice the space character after "Python " and "is ", without them the result would be "Pythonisawesome".

For numbers, the + character works as a mathematical operator:

Example

x = 5
y = 10
print(x + y)

In the print() function, when you try to combine a string and a number with the + operator, Python will give you an error:

Example

x = 5
y = "John"
print(x + y)

The best way to output multiple variables in the print() function is to separate them with commas, which even support different data types:

Example

x = 5
y = "John"
print(x, y)

Python - Global Variables


Global Variables

Variables that are created outside of a function (as in all of the examples above) are known as global variables.

Global variables can be used by everyone, both inside of functions and outside.

Example

Create a variable outside of a function, and use it inside the function

x = "awesome"

def myfunc():
  print("Python is " + x)

myfunc()

If you create a variable with the same name inside a function, this variable will be local, and can only be used inside the function. The global variable with the same name will remain as it was, global and with the original value.

Example

Create a variable inside a function, with the same name as the global variable

x = "awesome"

def myfunc():
  x = "fantastic"
  print("Python is " + x)

myfunc()

print("Python is " + x)


The global Keyword

Normally, when you create a variable inside a function, that variable is local, and can only be used inside that function.

To create a global variable inside a function, you can use the global keyword.

Example

If you use the global keyword, the variable belongs to the global scope:

def myfunc():
  global x
  x = "fantastic"

myfunc()

print("Python is " + x)

Also, use the global keyword if you want to change a global variable inside a function.

Example

To change the value of a global variable inside a function, refer to the variable by using the global keyword:

x = "awesome"

def myfunc():
  global x
  x = "fantastic"

myfunc()

print("Python is " + x)

Python - Variable Exercises


Test Yourself With Exercises

Now you have learned a lot about variables, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:

Exercise:

Create a variable named carname and assign the value Volvo to it.

 = ""

Go to the Exercise section and test all of our Python Variable Exercises:


Python Data Types


Built-in Data Types

In programming, data type is an important concept.

Variables can store data of different types, and different types can do different things.

Python has the following data types built-in by default, in these categories:

Text Type: str
Numeric Types: int, float, complex
Sequence Types: list, tuple, range
Mapping Type: dict
Set Types: set, frozenset
Boolean Type: bool
Binary Types: bytes, bytearray, memoryview
None Type: NoneType

Getting the Data Type

You can get the data type of any object by using the type() function:

Example

Print the data type of the variable x:

x = 5
print(type(x))

Setting the Data Type

In Python, the data type is set when you assign a value to a variable:

Example Data Type Try it
x = "Hello World" str
x = 20 int
x = 20.5 float
x = 1j complex
x = ["apple", "banana", "cherry"] list
x = ("apple", "banana", "cherry") tuple
x = range(6) range
x = {"name" : "John", "age" : 36} dict
x = {"apple", "banana", "cherry"} set
x = frozenset({"apple", "banana", "cherry"}) frozenset
x = True bool
x = b"Hello" bytes
x = bytearray(5) bytearray
x = memoryview(bytes(5)) memoryview
x = None NoneType


Setting the Specific Data Type

If you want to specify the data type, you can use the following constructor functions:

Example Data Type Try it
x = str("Hello World") str
x = int(20) int
x = float(20.5) float
x = complex(1j) complex
x = list(("apple", "banana", "cherry")) list
x = tuple(("apple", "banana", "cherry")) tuple
x = range(6) range
x = dict(name="John", age=36) dict
x = set(("apple", "banana", "cherry")) set
x = frozenset(("apple", "banana", "cherry")) frozenset
x = bool(5) bool
x = bytes(5) bytes
x = bytearray(5) bytearray
x = memoryview(bytes(5)) memoryview

Test Yourself With Exercises

Exercise:

The following code example would print the data type of x, what data type would that be?

x = 5
print(type(x))



Python Numbers


Python Numbers

There are three numeric types in Python:

  • int
  • float
  • complex

Variables of numeric types are created when you assign a value to them:

Example

x = 1    # int
y = 2.8  # float
z = 1j   # complex

To verify the type of any object in Python, use the type() function:

Example

print(type(x))
print(type(y))
print(type(z))

Int

Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.

Example

Integers:

x = 1
y = 35656222554887711
z = -3255522

print(type(x))
print(type(y))
print(type(z))

Float

Float, or "floating point number" is a number, positive or negative, containing one or more decimals.

Example

Floats:

x = 1.10
y = 1.0
z = -35.59

print(type(x))
print(type(y))
print(type(z))

Float can also be scientific numbers with an "e" to indicate the power of 10.

Example

Floats:

x = 35e3
y = 12E4
z = -87.7e100

print(type(x))
print(type(y))
print(type(z))


Complex

Complex numbers are written with a "j" as the imaginary part:

Example

Complex:

x = 3+5j
y = 5j
z = -5j

print(type(x))
print(type(y))
print(type(z))

Type Conversion

You can convert from one type to another with the int(), float(), and complex() methods:

Example

Convert from one type to another:

x = 1    # int
y = 2.8  # float
z = 1j   # complex

#convert from int to float:
a = float(x)

#convert from float to int:
b = int(y)

#convert from int to complex:
c = complex(x)

print(a)
print(b)
print(c)

print(type(a))
print(type(b))
print(type(c))

Note: You cannot convert complex numbers into another number type.


Random Number

Python does not have a random() function to make a random number, but Python has a built-in module called random that can be used to make random numbers:

Example

Import the random module, and display a random number between 1 and 9:

import random

print(random.randrange(1, 10))

In our you will learn more about the Random module.


Test Yourself With Exercises

Exercise:

Insert the correct syntax to convert x into a floating point number.

x = 5
x = (x)



Python Casting


Specify a Variable Type

There may be times when you want to specify a type on to a variable. This can be done with casting. Python is an object-orientated language, and as such it uses classes to define data types, including its primitive types.

Casting in python is therefore done using constructor functions:

  • int() - constructs an integer number from an integer literal, a float literal (by removing all decimals), or a string literal (providing the string represents a whole number)
  • float() - constructs a float number from an integer literal, a float literal or a string literal (providing the string represents a float or an integer)
  • str() - constructs a string from a wide variety of data types, including strings, integer literals and float literals

Example

Integers:

x = int(1)   # x will be 1
y = int(2.8) # y will be 2
z = int("3") # z will be 3

Example

Floats:

x = float(1)     # x will be 1.0
y = float(2.8)   # y will be 2.8
z = float("3")   # z will be 3.0
w = float("4.2") # w will be 4.2

Example

Strings:

x = str("s1") # x will be 's1'
y = str(2)    # y will be '2'
z = str(3.0)  # z will be '3.0'

Python Strings


Strings

Strings in python are surrounded by either single quotation marks, or double quotation marks.

'hello' is the same as "hello".

You can display a string literal with the print() function:

Example

print("Hello")
print('Hello')

Assign String to a Variable

Assigning a string to a variable is done with the variable name followed by an equal sign and the string:

Example

a = "Hello"
print(a)

Multiline Strings

You can assign a multiline string to a variable by using three quotes:

Example

You can use three double quotes:

a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)

Or three single quotes:

Example

a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)

Note: in the result, the line breaks are inserted at the same position as in the code.



Strings are Arrays

Like many other popular programming languages, strings in Python are arrays of bytes representing unicode characters.

However, Python does not have a character data type, a single character is simply a string with a length of 1.

Square brackets can be used to access elements of the string.

Example

Get the character at position 1 (remember that the first character has the position 0):

a = "Hello, World!"
print(a[1])

Looping Through a String

Since strings are arrays, we can loop through the characters in a string, with a for loop.

Example

Loop through the letters in the word "banana":

for x in "banana":
  print(x)

Learn more about For Loops in our chapter.


String Length

To get the length of a string, use the len() function.

Example

The len() function returns the length of a string:

a = "Hello, World!"
print(len(a))

Check String

To check if a certain phrase or character is present in a string, we can use the keyword in.

Example

Check if "free" is present in the following text:

txt = "The best things in life are free!"
print("free" in txt)

Use it in an if statement:

Example

Print only if "free" is present:

txt = "The best things in life are free!"
if "free" in txt:
  print("Yes, 'free' is present.")

Learn more about If statements in our chapter.


Check if NOT

To check if a certain phrase or character is NOT present in a string, we can use the keyword not in.

Example

Check if "expensive" is NOT present in the following text:

txt = "The best things in life are free!"
print("expensive" not in txt)

Use it in an if statement:

Example

print only if "expensive" is NOT present:

txt = "The best things in life are free!"
if "expensive" not in txt:
  print("No, 'expensive' is NOT present.")

Python Strings


Strings

Strings in python are surrounded by either single quotation marks, or double quotation marks.

'hello' is the same as "hello".

You can display a string literal with the print() function:

Example

print("Hello")
print('Hello')

Assign String to a Variable

Assigning a string to a variable is done with the variable name followed by an equal sign and the string:

Example

a = "Hello"
print(a)

Multiline Strings

You can assign a multiline string to a variable by using three quotes:

Example

You can use three double quotes:

a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)

Or three single quotes:

Example

a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)

Note: in the result, the line breaks are inserted at the same position as in the code.



Strings are Arrays

Like many other popular programming languages, strings in Python are arrays of bytes representing unicode characters.

However, Python does not have a character data type, a single character is simply a string with a length of 1.

Square brackets can be used to access elements of the string.

Example

Get the character at position 1 (remember that the first character has the position 0):

a = "Hello, World!"
print(a[1])

Looping Through a String

Since strings are arrays, we can loop through the characters in a string, with a for loop.

Example

Loop through the letters in the word "banana":

for x in "banana":
  print(x)

Learn more about For Loops in our chapter.


String Length

To get the length of a string, use the len() function.

Example

The len() function returns the length of a string:

a = "Hello, World!"
print(len(a))

Check String

To check if a certain phrase or character is present in a string, we can use the keyword in.

Example

Check if "free" is present in the following text:

txt = "The best things in life are free!"
print("free" in txt)

Use it in an if statement:

Example

Print only if "free" is present:

txt = "The best things in life are free!"
if "free" in txt:
  print("Yes, 'free' is present.")

Learn more about If statements in our chapter.


Check if NOT

To check if a certain phrase or character is NOT present in a string, we can use the keyword not in.

Example

Check if "expensive" is NOT present in the following text:

txt = "The best things in life are free!"
print("expensive" not in txt)

Use it in an if statement:

Example

print only if "expensive" is NOT present:

txt = "The best things in life are free!"
if "expensive" not in txt:
  print("No, 'expensive' is NOT present.")

Python - Slicing Strings


Slicing

You can return a range of characters by using the slice syntax.

Specify the start index and the end index, separated by a colon, to return a part of the string.

Example

Get the characters from position 2 to position 5 (not included):

b = "Hello, World!"
print(b[2:5])

Note: The first character has index 0.


Slice From the Start

By leaving out the start index, the range will start at the first character:

Example

Get the characters from the start to position 5 (not included):

b = "Hello, World!"
print(b[:5])


Slice To the End

By leaving out the end index, the range will go to the end:

Example

Get the characters from position 2, and all the way to the end:

b = "Hello, World!"
print(b[2:])

Negative Indexing

Use negative indexes to start the slice from the end of the string:

Example

Get the characters:

From: "o" in "World!" (position -5)

To, but not included: "d" in "World!" (position -2):

b = "Hello, World!"
print(b[-5:-2])

Python - Modify Strings


Python has a set of built-in methods that you can use on strings.


Upper Case

Example

The upper() method returns the string in upper case:

a = "Hello, World!"
print(a.upper())

Lower Case

Example

The lower() method returns the string in lower case:

a = "Hello, World!"
print(a.lower())

Remove Whitespace

Whitespace is the space before and/or after the actual text, and very often you want to remove this space.

Example

The strip() method removes any whitespace from the beginning or the end:

a = " Hello, World! "
print(a.strip()) # returns "Hello, World!"


Replace String

Example

The replace() method replaces a string with another string:

a = "Hello, World!"
print(a.replace("H", "J"))

Split String

The split() method returns a list where the text between the specified separator becomes the list items.

Example

The split() method splits the string into substrings if it finds instances of the separator:

a = "Hello, World!"
print(a.split(",")) # returns ['Hello', ' World!']

Learn more about Lists in our chapter.


String Methods

Learn more about String Methods with our


Python - String Concatenation


String Concatenation

To concatenate, or combine, two strings you can use the + operator.

Example

Merge variable a with variable b into variable c:

a = "Hello"
b = "World"
c = a + b
print(c)

Example

To add a space between them, add a " ":

a = "Hello"
b = "World"
c = a + " " + b
print(c)

Python - Format - Strings


String Format

As we learned in the Python Variables chapter, we cannot combine strings and numbers like this:

Example

age = 36
txt = "My name is John, I am " + age
print(txt)

But we can combine strings and numbers by using the format() method!

The format() method takes the passed arguments, formats them, and places them in the string where the placeholders {} are:

Example

Use the format() method to insert numbers into strings:

age = 36
txt = "My name is John, and I am {}"
print(txt.format(age))

The format() method takes unlimited number of arguments, and are placed into the respective placeholders:

Example

quantity = 3
itemno = 567
price = 49.95
myorder = "I want {} pieces of item {} for {} dollars."
print(myorder.format(quantity, itemno, price))

You can use index numbers {0} to be sure the arguments are placed in the correct placeholders:

Example

quantity = 3
itemno = 567
price = 49.95
myorder = "I want to pay {2} dollars for {0} pieces of item {1}."
print(myorder.format(quantity, itemno, price))

Learn more about String Formatting in our chapter.


Python - Escape Characters


Escape Character

To insert characters that are illegal in a string, use an escape character.

An escape character is a backslash followed by the character you want to insert.

An example of an illegal character is a double quote inside a string that is surrounded by double quotes:

Example

You will get an error if you use double quotes inside a string that is surrounded by double quotes:

txt = "We are the so-called "Vikings" from the north."

To fix this problem, use the escape character ":

Example

The escape character allows you to use double quotes when you normally would not be allowed:

txt = "We are the so-called "Vikings" from the north."

Escape Characters

Other escape characters used in Python:

Code Result Try it
' Single Quote
\ Backslash
n New Line
r Carriage Return
t Tab
b Backspace
f Form Feed
ooo Octal value
xhh Hex value

Python - String Methods


String Methods

Python has a set of built-in methods that you can use on strings.

Note: All string methods return new values. They do not change the original string.

Method Description
Converts the first character to upper case
Converts string into lower case
Returns a centered string
Returns the number of times a specified value occurs in a string
Returns an encoded version of the string
Returns true if the string ends with the specified value
Sets the tab size of the string
Searches the string for a specified value and returns the position of where it was found
Formats specified values in a string
format_map() Formats specified values in a string
Searches the string for a specified value and returns the position of where it was found
Returns True if all characters in the string are alphanumeric
Returns True if all characters in the string are in the alphabet
Returns True if all characters in the string are ascii characters
Returns True if all characters in the string are decimals
Returns True if all characters in the string are digits
Returns True if the string is an identifier
Returns True if all characters in the string are lower case
Returns True if all characters in the string are numeric
Returns True if all characters in the string are printable
Returns True if all characters in the string are whitespaces
Returns True if the string follows the rules of a title
Returns True if all characters in the string are upper case
Joins the elements of an iterable to the end of the string
Returns a left justified version of the string
Converts a string into lower case
Returns a left trim version of the string
Returns a translation table to be used in translations
Returns a tuple where the string is parted into three parts
Returns a string where a specified value is replaced with a specified value
Searches the string for a specified value and returns the last position of where it was found
Searches the string for a specified value and returns the last position of where it was found
Returns a right justified version of the string
Returns a tuple where the string is parted into three parts
Splits the string at the specified separator, and returns a list
Returns a right trim version of the string
Splits the string at the specified separator, and returns a list
Splits the string at line breaks and returns a list
Returns true if the string starts with the specified value
Returns a trimmed version of the string
Swaps cases, lower case becomes upper case and vice versa
Converts the first character of each word to upper case
Returns a translated string
Converts a string into upper case
Fills the string with a specified number of 0 values at the beginning

Python - String Exercises


Test Yourself With Exercises

Now you have learned a lot about Strings, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:

Test Yourself With Exercises

Exercise:

Use the len method to print the length of the string.

x = "Hello World"
print()

Go to the Exercise section and test all of our Python Strings Exercises:


Python Booleans


Booleans represent one of two values: True or False.


Boolean Values

In programming you often need to know if an expression is True or False.

You can evaluate any expression in Python, and get one of two answers, True or False.

When you compare two values, the expression is evaluated and Python returns the Boolean answer:

Example

print(10 > 9)
print(10 == 9)
print(10 < 9)

When you run a condition in an if statement, Python returns True or False:

Example

Print a message based on whether the condition is True or False:

a = 200
b = 33

if b > a:
  print("b is greater than a")
else:
  print("b is not greater than a")

Evaluate Values and Variables

The bool() function allows you to evaluate any value, and give you True or False in return,

Example

Evaluate a string and a number:

print(bool("Hello"))
print(bool(15))

Example

Evaluate two variables:

x = "Hello"
y = 15

print(bool(x))
print(bool(y))


Most Values are True

Almost any value is evaluated to True if it has some sort of content.

Any string is True, except empty strings.

Any number is True, except 0.

Any list, tuple, set, and dictionary are True, except empty ones.

Example

The following will return True:

bool("abc")
bool(123)
bool(["apple", "cherry", "banana"])

Some Values are False

In fact, there are not many values that evaluate to False, except empty values, such as (), [], {}, "", the number 0, and the value None. And of course the value False evaluates to False.

Example

The following will return False:

bool(False)
bool(None)
bool(0)
bool("")
bool(())
bool([])
bool({})

One more value, or object in this case, evaluates to False, and that is if you have an object that is made from a class with a __len__ function that returns 0 or False:

Example

class myclass():
  def __len__(self):
    return 0

myobj = myclass()
print(bool(myobj))

Functions can Return a Boolean

You can create functions that returns a Boolean Value:

Example

Print the answer of a function:

def myFunction() :
  return True

print(myFunction())

You can execute code based on the Boolean answer of a function:

Example

Print "YES!" if the function returns True, otherwise print "NO!":

def myFunction() :
  return True

if myFunction():
  print("YES!")
else:
  print("NO!")

Python also has many built-in functions that return a boolean value, like the isinstance() function, which can be used to determine if an object is of a certain data type:

Example

Check if an object is an integer or not:

x = 200
print(isinstance(x, int))

Test Yourself With Exercises

Exercise:

The statement below would print a Boolean value, which one?

print(10 > 9)




Python Operators


Python Operators

Operators are used to perform operations on variables and values.

In the example below, we use the + operator to add together two values:

Example

print(10 + 5)

Python divides the operators in the following groups:

  • Arithmetic operators
  • Assignment operators
  • Comparison operators
  • Logical operators
  • Identity operators
  • Membership operators
  • Bitwise operators

Python Arithmetic Operators

Arithmetic operators are used with numeric values to perform common mathematical operations:

Operator Name Example Try it
+ Addition x + y
- Subtraction x - y
* Multiplication x * y
/ Division x / y
% Modulus x % y
** Exponentiation x ** y
// Floor division x // y

Python Assignment Operators

Assignment operators are used to assign values to variables:

Operator Example Same As Try it
= x = 5 x = 5
+= x += 3 x = x + 3
-= x -= 3 x = x - 3
*= x *= 3 x = x * 3
/= x /= 3 x = x / 3
%= x %= 3 x = x % 3
//= x //= 3 x = x // 3
**= x **= 3 x = x ** 3
&= x &= 3 x = x & 3
|= x |= 3 x = x | 3
^= x ^= 3 x = x ^ 3
>>= x >>= 3 x = x >> 3
<<= x <<= 3 x = x << 3


Python Comparison Operators

Comparison operators are used to compare two values:

Operator Name Example Try it
== Equal x == y
!= Not equal x != y
> Greater than x > y
< Less than x < y
>= Greater than or equal to x >= y
<= Less than or equal to x <= y

Python Logical Operators

Logical operators are used to combine conditional statements:

Operator Description Example Try it
and  Returns True if both statements are true x < 5 and  x < 10
or Returns True if one of the statements is true x < 5 or x < 4
not Reverse the result, returns False if the result is true not(x < 5 and x < 10)

Python Identity Operators

Identity operators are used to compare the objects, not if they are equal, but if they are actually the same object, with the same memory location:

Operator Description Example Try it
is  Returns True if both variables are the same object x is y
is not Returns True if both variables are not the same object x is not y

Python Membership Operators

Membership operators are used to test if a sequence is presented in an object:

Operator Description Example Try it
in  Returns True if a sequence with the specified value is present in the object x in y
not in Returns True if a sequence with the specified value is not present in the object x not in y

Python Bitwise Operators

Bitwise operators are used to compare (binary) numbers:

Operator Name Description Example Try it
&  AND Sets each bit to 1 if both bits are 1 x & y
| OR Sets each bit to 1 if one of two bits is 1 x | y
^ XOR Sets each bit to 1 if only one of two bits is 1 x ^ y
~ NOT Inverts all the bits ~x
<< Zero fill left shift Shift left by pushing zeros in from the right and let the leftmost bits fall off x << 2
>> Signed right shift Shift right by pushing copies of the leftmost bit in from the left, and let the rightmost bits fall off x >> 2

Operator Precedence

Operator precedence describes the order in which operations are performed.

Example

Parentheses has the highest precedence, meaning that expressions inside parentheses must be evaluated first:

print((6 + 3) - (6 + 3))

Example

Multiplication * has higher precedence than addition +, and therefor multiplications are evaluated before additions:

print(100 + 5 * 3)

The precedence order is described in the table below, starting with the highest precedence at the top:

Operator Description Try it
() Parentheses
** Exponentiation
+x  -x  ~x Unary plus, unary minus, and bitwise NOT
*  /  //  % Multiplication, division, floor division, and modulus
+  - Addition and subtraction
<<  >> Bitwise left and right shifts
& Bitwise AND
^ Bitwise XOR
| Bitwise OR
==  !=  >  >=  <  <=  is  is not  in  not in  Comparisons, identity, and membership operators
not Logical NOT
and AND
or OR

If two operators have the same precedence, the expression is evaluated from left to right.

Example

Addition + and subtraction - has the same precedence, and therefor we evaluate the expression from left to right:

print(5 + 4 - 7 + 3)

Test Yourself With Exercises

Exercise:

Multiply 10 with 5, and print the result.

print(10  5)


Python Lists


mylist = ["apple", "banana", "cherry"]

List

Lists are used to store multiple items in a single variable.

Lists are one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

Lists are created using square brackets:

Example

Create a List:

thislist = ["apple", "banana", "cherry"]
print(thislist)

List Items

List items are ordered, changeable, and allow duplicate values.

List items are indexed, the first item has index [0], the second item has index [1] etc.


Ordered

When we say that lists are ordered, it means that the items have a defined order, and that order will not change.

If you add new items to a list, the new items will be placed at the end of the list.

Note: There are some that will change the order, but in general: the order of the items will not change.


Changeable

The list is changeable, meaning that we can change, add, and remove items in a list after it has been created.


Allow Duplicates

Since lists are indexed, lists can have items with the same value:

Example

Lists allow duplicate values:

thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)


List Length

To determine how many items a list has, use the len() function:

Example

Print the number of items in the list:

thislist = ["apple", "banana", "cherry"]
print(len(thislist))

List Items - Data Types

List items can be of any data type:

Example

String, int and boolean data types:

list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]

A list can contain different data types:

Example

A list with strings, integers and boolean values:

list1 = ["abc", 34, True, 40, "male"]

type()

From Python's perspective, lists are defined as objects with the data type 'list':

<class 'list'>

Example

What is the data type of a list?

mylist = ["apple", "banana", "cherry"]
print(type(mylist))

The list() Constructor

It is also possible to use the list() constructor when creating a new list.

Example

Using the list() constructor to make a List:

thislist = list(("apple", "banana", "cherry")) # note the double round-brackets
print(thislist)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • List is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python Lists


mylist = ["apple", "banana", "cherry"]

List

Lists are used to store multiple items in a single variable.

Lists are one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

Lists are created using square brackets:

Example

Create a List:

thislist = ["apple", "banana", "cherry"]
print(thislist)

List Items

List items are ordered, changeable, and allow duplicate values.

List items are indexed, the first item has index [0], the second item has index [1] etc.


Ordered

When we say that lists are ordered, it means that the items have a defined order, and that order will not change.

If you add new items to a list, the new items will be placed at the end of the list.

Note: There are some that will change the order, but in general: the order of the items will not change.


Changeable

The list is changeable, meaning that we can change, add, and remove items in a list after it has been created.


Allow Duplicates

Since lists are indexed, lists can have items with the same value:

Example

Lists allow duplicate values:

thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)


List Length

To determine how many items a list has, use the len() function:

Example

Print the number of items in the list:

thislist = ["apple", "banana", "cherry"]
print(len(thislist))

List Items - Data Types

List items can be of any data type:

Example

String, int and boolean data types:

list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]

A list can contain different data types:

Example

A list with strings, integers and boolean values:

list1 = ["abc", 34, True, 40, "male"]

type()

From Python's perspective, lists are defined as objects with the data type 'list':

<class 'list'>

Example

What is the data type of a list?

mylist = ["apple", "banana", "cherry"]
print(type(mylist))

The list() Constructor

It is also possible to use the list() constructor when creating a new list.

Example

Using the list() constructor to make a List:

thislist = list(("apple", "banana", "cherry")) # note the double round-brackets
print(thislist)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • List is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.



Access Items

List items are indexed and you can access them by referring to the index number:

Example

Print the second item of the list:

thislist = ["apple", "banana", "cherry"]
print(thislist[1])

Note: The first item has index 0.

Negative Indexing

Negative indexing means start from the end

-1 refers to the last item, -2 refers to the second last item etc.

Example

Print the last item of the list:

thislist = ["apple", "banana", "cherry"]
print(thislist[-1])

Range of Indexes

You can specify a range of indexes by specifying where to start and where to end the range.

When specifying a range, the return value will be a new list with the specified items.

Example

Return the third, fourth, and fifth item:

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[2:5])

Note: The search will start at index 2 (included) and end at index 5 (not included).

Remember that the first item has index 0.

By leaving out the start value, the range will start at the first item:

Example

This example returns the items from the beginning to, but NOT including, "kiwi":

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[:4])

By leaving out the end value, the range will go on to the end of the list:

Example

This example returns the items from "cherry" to the end:

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[2:])


Range of Negative Indexes

Specify negative indexes if you want to start the search from the end of the list:

Example

This example returns the items from "orange" (-4) to, but NOT including "mango" (-1):

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[-4:-1])

Check if Item Exists

To determine if a specified item is present in a list use the in keyword:

Example

Check if "apple" is present in the list:

thislist = ["apple", "banana", "cherry"]
if "apple" in thislist:
  print("Yes, 'apple' is in the fruits list")

Python - Change List Items


Change Item Value

To change the value of a specific item, refer to the index number:

Example

Change the second item:

thislist = ["apple", "banana", "cherry"]
thislist[1] = "blackcurrant"
print(thislist)

Change a Range of Item Values

To change the value of items within a specific range, define a list with the new values, and refer to the range of index numbers where you want to insert the new values:

Example

Change the values "banana" and "cherry" with the values "blackcurrant" and "watermelon":

thislist = ["apple", "banana", "cherry", "orange", "kiwi", "mango"]
thislist[1:3] = ["blackcurrant", "watermelon"]
print(thislist)

If you insert more items than you replace, the new items will be inserted where you specified, and the remaining items will move accordingly:

Example

Change the second value by replacing it with two new values:

thislist = ["apple", "banana", "cherry"]
thislist[1:2] = ["blackcurrant", "watermelon"]
print(thislist)

Note: The length of the list will change when the number of items inserted does not match the number of items replaced.

If you insert less items than you replace, the new items will be inserted where you specified, and the remaining items will move accordingly:

Example

Change the second and third value by replacing it with one value:

thislist = ["apple", "banana", "cherry"]
thislist[1:3] = ["watermelon"]
print(thislist)


Insert Items

To insert a new list item, without replacing any of the existing values, we can use the insert() method.

The insert() method inserts an item at the specified index:

Example

Insert "watermelon" as the third item:

thislist = ["apple", "banana", "cherry"]
thislist.insert(2, "watermelon")
print(thislist)

Note: As a result of the example above, the list will now contain 4 items.



Append Items

To add an item to the end of the list, use the append() method:

Example

Using the append() method to append an item:

thislist = ["apple", "banana", "cherry"]
thislist.append("orange")
print(thislist)

Insert Items

To insert a list item at a specified index, use the insert() method.

The insert() method inserts an item at the specified index:

Example

Insert an item as the second position:

thislist = ["apple", "banana", "cherry"]
thislist.insert(1, "orange")
print(thislist)

Note: As a result of the examples above, the lists will now contain 4 items.



Extend List

To append elements from another list to the current list, use the extend() method.

Example

Add the elements of tropical to thislist:

thislist = ["apple", "banana", "cherry"]
tropical = ["mango", "pineapple", "papaya"]
thislist.extend(tropical)
print(thislist)

The elements will be added to the end of the list.


Add Any Iterable

The extend() method does not have to append lists, you can add any iterable object (tuples, sets, dictionaries etc.).

Example

Add elements of a tuple to a list:

thislist = ["apple", "banana", "cherry"]
thistuple = ("kiwi", "orange")
thislist.extend(thistuple)
print(thislist)


Remove Specified Item

The remove() method removes the specified item.

Example

Remove "banana":

thislist = ["apple", "banana", "cherry"]
thislist.remove("banana")
print(thislist)

If there are more than one item with the specified value, the remove() method removes the first occurance:

Example

Remove the first occurance of "banana":

thislist = ["apple", "banana", "cherry", "banana", "kiwi"]
thislist.remove("banana")
print(thislist)

Remove Specified Index

The pop() method removes the specified index.

Example

Remove the second item:

thislist = ["apple", "banana", "cherry"]
thislist.pop(1)
print(thislist)

If you do not specify the index, the pop() method removes the last item.

Example

Remove the last item:

thislist = ["apple", "banana", "cherry"]
thislist.pop()
print(thislist)

The del keyword also removes the specified index:

Example

Remove the first item:

thislist = ["apple", "banana", "cherry"]
del thislist[0]
print(thislist)

The del keyword can also delete the list completely.

Example

Delete the entire list:

thislist = ["apple", "banana", "cherry"]
del thislist

Clear the List

The clear() method empties the list.

The list still remains, but it has no content.

Example

Clear the list content:

thislist = ["apple", "banana", "cherry"]
thislist.clear()
print(thislist)


Loop Through a List

You can loop through the list items by using a for loop:

Example

Print all items in the list, one by one:

thislist = ["apple", "banana", "cherry"]
for x in thislist:
  print(x)

Learn more about for loops in our Chapter.


Loop Through the Index Numbers

You can also loop through the list items by referring to their index number.

Use the range() and len() functions to create a suitable iterable.

Example

Print all items by referring to their index number:

thislist = ["apple", "banana", "cherry"]
for i in range(len(thislist)):
  print(thislist[i])

The iterable created in the example above is [0, 1, 2].



Using a While Loop

You can loop through the list items by using a while loop.

Use the len() function to determine the length of the list, then start at 0 and loop your way through the list items by referring to their indexes.

Remember to increase the index by 1 after each iteration.

Example

Print all items, using a while loop to go through all the index numbers

thislist = ["apple", "banana", "cherry"]
i = 0
while i < len(thislist):
  print(thislist[i])
  i = i + 1

Learn more about while loops in our Chapter.


Looping Using List Comprehension

List Comprehension offers the shortest syntax for looping through lists:

Example

A short hand for loop that will print all items in a list:

thislist = ["apple", "banana", "cherry"]
[print(x) for x in thislist]

Learn more about list comprehension in the next chapter: .



List Comprehension

List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list.

Example:

Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in the name.

Without list comprehension you will have to write a for statement with a conditional test inside:

Example

fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
  if "a" in x:
    newlist.append(x)

print(newlist)

With list comprehension you can do all that with only one line of code:

Example

fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x for x in fruits if "a" in x]

print(newlist)


The Syntax

newlist = [expression for item in iterable if condition == True]

The return value is a new list, leaving the old list unchanged.


Condition

The condition is like a filter that only accepts the items that valuate to True.

Example

Only accept items that are not "apple":

newlist = [x for x in fruits if x != "apple"]

The condition if x != "apple"  will return True for all elements other than "apple", making the new list contain all fruits except "apple".

The condition is optional and can be omitted:

Example

With no if statement:

newlist = [x for x in fruits]

Iterable

The iterable can be any iterable object, like a list, tuple, set etc.

Example

You can use the range() function to create an iterable:

newlist = [x for x in range(10)]

Same example, but with a condition:

Example

Accept only numbers lower than 5:

newlist = [x for x in range(10) if x < 5]

Expression

The expression is the current item in the iteration, but it is also the outcome, which you can manipulate before it ends up like a list item in the new list:

Example

Set the values in the new list to upper case:

newlist = [x.upper() for x in fruits]

You can set the outcome to whatever you like:

Example

Set all values in the new list to 'hello':

newlist = ['hello' for x in fruits]

The expression can also contain conditions, not like a filter, but as a way to manipulate the outcome:

Example

Return "orange" instead of "banana":

newlist = [x if x != "banana" else "orange" for x in fruits]

The expression in the example above says:

"Return the item if it is not banana, if it is banana return orange".


Python - Sort Lists


Sort List Alphanumerically

List objects have a sort() method that will sort the list alphanumerically, ascending, by default:

Example

Sort the list alphabetically:

thislist = ["orange", "mango", "kiwi", "pineapple", "banana"]
thislist.sort()
print(thislist)

Example

Sort the list numerically:

thislist = [100, 50, 65, 82, 23]
thislist.sort()
print(thislist)

Sort Descending

To sort descending, use the keyword argument reverse = True:

Example

Sort the list descending:

thislist = ["orange", "mango", "kiwi", "pineapple", "banana"]
thislist.sort(reverse = True)
print(thislist)

Example

Sort the list descending:

thislist = [100, 50, 65, 82, 23]
thislist.sort(reverse = True)
print(thislist)


Customize Sort Function

You can also customize your own function by using the keyword argument key = function.

The function will return a number that will be used to sort the list (the lowest number first):

Example

Sort the list based on how close the number is to 50:

def myfunc(n):
  return abs(n - 50)

thislist = [100, 50, 65, 82, 23]
thislist.sort(key = myfunc)
print(thislist)

Case Insensitive Sort

By default the sort() method is case sensitive, resulting in all capital letters being sorted before lower case letters:

Example

Case sensitive sorting can give an unexpected result:

thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.sort()
print(thislist)

Luckily we can use built-in functions as key functions when sorting a list.

So if you want a case-insensitive sort function, use str.lower as a key function:

Example

Perform a case-insensitive sort of the list:

thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.sort(key = str.lower)
print(thislist)

Reverse Order

What if you want to reverse the order of a list, regardless of the alphabet?

The reverse() method reverses the current sorting order of the elements.

Example

Reverse the order of the list items:

thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.reverse()
print(thislist)


Copy a List

You cannot copy a list simply by typing list2 = list1, because: list2 will only be a reference to list1, and changes made in list1 will automatically also be made in list2.

There are ways to make a copy, one way is to use the built-in List method copy().

Example

Make a copy of a list with the copy() method:

thislist = ["apple", "banana", "cherry"]
mylist = thislist.copy()
print(mylist)

Another way to make a copy is to use the built-in method list().

Example

Make a copy of a list with the list() method:

thislist = ["apple", "banana", "cherry"]
mylist = list(thislist)
print(mylist)

Python - Join Lists


Join Two Lists

There are several ways to join, or concatenate, two or more lists in Python.

One of the easiest ways are by using the + operator.

Example

Join two list:

list1 = ["a", "b", "c"]
list2 = [1, 2, 3]

list3 = list1 + list2
print(list3)

Another way to join two lists is by appending all the items from list2 into list1, one by one:

Example

Append list2 into list1:

list1 = ["a", "b" , "c"]
list2 = [1, 2, 3]

for x in list2:
  list1.append(x)

print(list1)

Or you can use the extend() method, where the purpose is to add elements from one list to another list:

Example

Use the extend() method to add list2 at the end of list1:

list1 = ["a", "b" , "c"]
list2 = [1, 2, 3]

list1.extend(list2)
print(list1)

Python - List Methods


List Methods

Python has a set of built-in methods that you can use on lists.

Method Description
Adds an element at the end of the list
Removes all the elements from the list
Returns a copy of the list
Returns the number of elements with the specified value
Add the elements of a list (or any iterable), to the end of the current list
Returns the index of the first element with the specified value
Adds an element at the specified position
Removes the element at the specified position
Removes the item with the specified value
Reverses the order of the list
Sorts the list

Python List Exercises


Test Yourself With Exercises

Now you have learned a lot about lists, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:

Exercise:

Print the second item in the fruits list.

fruits = ["apple", 
"banana",
"cherry"] print()

Go to the Exercise section and test all of our Python List Exercises:


Python Tuples


mytuple = ("apple", "banana", "cherry")

Tuple

Tuples are used to store multiple items in a single variable.

Tuple is one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

A tuple is a collection which is ordered and unchangeable.

Tuples are written with round brackets.

Example

Create a Tuple:

thistuple = ("apple", "banana", "cherry")
print(thistuple)

Tuple Items

Tuple items are ordered, unchangeable, and allow duplicate values.

Tuple items are indexed, the first item has index [0], the second item has index [1] etc.


Ordered

When we say that tuples are ordered, it means that the items have a defined order, and that order will not change.


Unchangeable

Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple has been created.


Allow Duplicates

Since tuples are indexed, they can have items with the same value:

Example

Tuples allow duplicate values:

thistuple = ("apple", "banana", "cherry", "apple", "cherry")
print(thistuple)


Tuple Length

To determine how many items a tuple has, use the len() function:

Example

Print the number of items in the tuple:

thistuple = ("apple", "banana", "cherry")
print(len(thistuple))

Create Tuple With One Item

To create a tuple with only one item, you have to add a comma after the item, otherwise Python will not recognize it as a tuple.

Example

One item tuple, remember the comma:

thistuple = ("apple",)
print(type(thistuple))

#NOT a tuple
thistuple = ("apple")
print(type(thistuple))

Tuple Items - Data Types

Tuple items can be of any data type:

Example

String, int and boolean data types:

tuple1 = ("apple", "banana", "cherry")
tuple2 = (1, 5, 7, 9, 3)
tuple3 = (True, False, False)

A tuple can contain different data types:

Example

A tuple with strings, integers and boolean values:

tuple1 = ("abc", 34, True, 40, "male")

type()

From Python's perspective, tuples are defined as objects with the data type 'tuple':

<class 'tuple'>

Example

What is the data type of a tuple?

mytuple = ("apple", "banana", "cherry")
print(type(mytuple))

The tuple() Constructor

It is also possible to use the tuple() constructor to make a tuple.

Example

Using the tuple() method to make a tuple:

thistuple = tuple(("apple", "banana", "cherry")) # note the double round-brackets
print(thistuple)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python Tuples


mytuple = ("apple", "banana", "cherry")

Tuple

Tuples are used to store multiple items in a single variable.

Tuple is one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

A tuple is a collection which is ordered and unchangeable.

Tuples are written with round brackets.

Example

Create a Tuple:

thistuple = ("apple", "banana", "cherry")
print(thistuple)

Tuple Items

Tuple items are ordered, unchangeable, and allow duplicate values.

Tuple items are indexed, the first item has index [0], the second item has index [1] etc.


Ordered

When we say that tuples are ordered, it means that the items have a defined order, and that order will not change.


Unchangeable

Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple has been created.


Allow Duplicates

Since tuples are indexed, they can have items with the same value:

Example

Tuples allow duplicate values:

thistuple = ("apple", "banana", "cherry", "apple", "cherry")
print(thistuple)


Tuple Length

To determine how many items a tuple has, use the len() function:

Example

Print the number of items in the tuple:

thistuple = ("apple", "banana", "cherry")
print(len(thistuple))

Create Tuple With One Item

To create a tuple with only one item, you have to add a comma after the item, otherwise Python will not recognize it as a tuple.

Example

One item tuple, remember the comma:

thistuple = ("apple",)
print(type(thistuple))

#NOT a tuple
thistuple = ("apple")
print(type(thistuple))

Tuple Items - Data Types

Tuple items can be of any data type:

Example

String, int and boolean data types:

tuple1 = ("apple", "banana", "cherry")
tuple2 = (1, 5, 7, 9, 3)
tuple3 = (True, False, False)

A tuple can contain different data types:

Example

A tuple with strings, integers and boolean values:

tuple1 = ("abc", 34, True, 40, "male")

type()

From Python's perspective, tuples are defined as objects with the data type 'tuple':

<class 'tuple'>

Example

What is the data type of a tuple?

mytuple = ("apple", "banana", "cherry")
print(type(mytuple))

The tuple() Constructor

It is also possible to use the tuple() constructor to make a tuple.

Example

Using the tuple() method to make a tuple:

thistuple = tuple(("apple", "banana", "cherry")) # note the double round-brackets
print(thistuple)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python - Access Tuple Items


Access Tuple Items

You can access tuple items by referring to the index number, inside square brackets:

Example

Print the second item in the tuple:

thistuple = ("apple", "banana", "cherry")
print(thistuple[1])

Note: The first item has index 0.


Negative Indexing

Negative indexing means start from the end.

-1 refers to the last item, -2 refers to the second last item etc.

Example

Print the last item of the tuple:

thistuple = ("apple", "banana", "cherry")
print(thistuple[-1])

Range of Indexes

You can specify a range of indexes by specifying where to start and where to end the range.

When specifying a range, the return value will be a new tuple with the specified items.

Example

Return the third, fourth, and fifth item:

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
print(thistuple[2:5])

Note: The search will start at index 2 (included) and end at index 5 (not included).

Remember that the first item has index 0.

By leaving out the start value, the range will start at the first item:

Example

This example returns the items from the beginning to, but NOT included, "kiwi":

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
print(thistuple[:4])

By leaving out the end value, the range will go on to the end of the list:

Example

This example returns the items from "cherry" and to the end:

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
print(thistuple[2:])


Range of Negative Indexes

Specify negative indexes if you want to start the search from the end of the tuple:

Example

This example returns the items from index -4 (included) to index -1 (excluded)

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
print(thistuple[-4:-1])

Check if Item Exists

To determine if a specified item is present in a tuple use the in keyword:

Example

Check if "apple" is present in the tuple:

thistuple = ("apple", "banana", "cherry")
if "apple" in thistuple:
  print("Yes, 'apple' is in the fruits tuple")

Python - Update Tuples


Tuples are unchangeable, meaning that you cannot change, add, or remove items once the tuple is created.

But there are some workarounds.


Change Tuple Values

Once a tuple is created, you cannot change its values. Tuples are unchangeable, or immutable as it also is called.

But there is a workaround. You can convert the tuple into a list, change the list, and convert the list back into a tuple.

Example

Convert the tuple into a list to be able to change it:

x = ("apple", "banana", "cherry")
y = list(x)
y[1] = "kiwi"
x = tuple(y)

print(x)

Add Items

Since tuples are immutable, they do not have a built-in append() method, but there are other ways to add items to a tuple.

1. Convert into a list: Just like the workaround for changing a tuple, you can convert it into a list, add your item(s), and convert it back into a tuple.

Example

Convert the tuple into a list, add "orange", and convert it back into a tuple:

thistuple = ("apple", "banana", "cherry")
y = list(thistuple)
y.append("orange")
thistuple = tuple(y)

2. Add tuple to a tuple. You are allowed to add tuples to tuples, so if you want to add one item, (or many), create a new tuple with the item(s), and add it to the existing tuple:

Example

Create a new tuple with the value "orange", and add that tuple:

thistuple = ("apple", "banana", "cherry")
y = ("orange",)
thistuple += y

print(thistuple)

Note: When creating a tuple with only one item, remember to include a comma after the item, otherwise it will not be identified as a tuple.



Remove Items

Note: You cannot remove items in a tuple.

Tuples are unchangeable, so you cannot remove items from it, but you can use the same workaround as we used for changing and adding tuple items:

Example

Convert the tuple into a list, remove "apple", and convert it back into a tuple:

thistuple = ("apple", "banana", "cherry")
y = list(thistuple)
y.remove("apple")
thistuple = tuple(y)

Or you can delete the tuple completely:

Example

The del keyword can delete the tuple completely:

thistuple = ("apple", "banana", "cherry")
del thistuple
print(thistuple) #this will raise an error because the tuple no longer exists

Python - Unpack Tuples


Unpacking a Tuple

When we create a tuple, we normally assign values to it. This is called "packing" a tuple:

Example

Packing a tuple:

fruits = ("apple", "banana", "cherry")

But, in Python, we are also allowed to extract the values back into variables. This is called "unpacking":

Example

Unpacking a tuple:

fruits = ("apple", "banana", "cherry")

(green, yellow, red) = fruits

print(green)
print(yellow)
print(red)

Note: The number of variables must match the number of values in the tuple, if not, you must use an asterisk to collect the remaining values as a list.



Using Asterisk*

If the number of variables is less than the number of values, you can add an * to the variable name and the values will be assigned to the variable as a list:

Example

Assign the rest of the values as a list called "red":

fruits = ("apple", "banana", "cherry", "strawberry", "raspberry")

(green, yellow, *red) = fruits

print(green)
print(yellow)
print(red)

If the asterisk is added to another variable name than the last, Python will assign values to the variable until the number of values left matches the number of variables left.

Example

Add a list of values the "tropic" variable:

fruits = ("apple", "mango", "papaya", "pineapple", "cherry")

(green, *tropic, red) = fruits

print(green)
print(tropic)
print(red)

Python - Loop Tuples


Loop Through a Tuple

You can loop through the tuple items by using a for loop.

Example

Iterate through the items and print the values:

thistuple = ("apple", "banana", "cherry")
for x in thistuple:
  print(x)

Learn more about for loops in our Chapter.


Loop Through the Index Numbers

You can also loop through the tuple items by referring to their index number.

Use the range() and len() functions to create a suitable iterable.

Example

Print all items by referring to their index number:

thistuple = ("apple", "banana", "cherry")
for i in range(len(thistuple)):
  print(thistuple[i])


Using a While Loop

You can loop through the tuple items by using a while loop.

Use the len() function to determine the length of the tuple, then start at 0 and loop your way through the tuple items by referring to their indexes.

Remember to increase the index by 1 after each iteration.

Example

Print all items, using a while loop to go through all the index numbers:

thistuple = ("apple", "banana", "cherry")
i = 0
while i < len(thistuple):
  print(thistuple[i])
  i = i + 1

Learn more about while loops in our Chapter.


Python - Join Tuples


Join Two Tuples

To join two or more tuples you can use the + operator:

Example

Join two tuples:

tuple1 = ("a", "b" , "c")
tuple2 = (1, 2, 3)

tuple3 = tuple1 + tuple2
print(tuple3)

Multiply Tuples

If you want to multiply the content of a tuple a given number of times, you can use the * operator:

Example

Multiply the fruits tuple by 2:

fruits = ("apple", "banana", "cherry")
mytuple = fruits * 2

print(mytuple)

Python - Tuple Methods


Tuple Methods

Python has two built-in methods that you can use on tuples.

Method Description
Returns the number of times a specified value occurs in a tuple
Searches the tuple for a specified value and returns the position of where it was found

Python - Tuple Exercises


Test Yourself With Exercises

Now you have learned a lot about tuples, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:

Exercise:

Print the first item in the fruits tuple.

fruits = ("apple", 
"banana",
"cherry") print()

Go to the Exercise section and test all of our Python Tuple Exercises:


Python Sets


myset = {"apple", "banana", "cherry"}

Set

Sets are used to store multiple items in a single variable.

Set is one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

A set is a collection which is unordered, unchangeable*, and unindexed.

* Note: Set items are unchangeable, but you can remove items and add new items.

Sets are written with curly brackets.

Example

Create a Set:

thisset = {"apple", "banana", "cherry"}
print(thisset)

Note: Sets are unordered, so you cannot be sure in which order the items will appear.


Set Items

Set items are unordered, unchangeable, and do not allow duplicate values.


Unordered

Unordered means that the items in a set do not have a defined order.

Set items can appear in a different order every time you use them, and cannot be referred to by index or key.


Unchangeable

Set items are unchangeable, meaning that we cannot change the items after the set has been created.

Once a set is created, you cannot change its items, but you can remove items and add new items.


Duplicates Not Allowed

Sets cannot have two items with the same value.

Example

Duplicate values will be ignored:

thisset = {"apple", "banana", "cherry", "apple"}

print(thisset)

Note: The values True and 1 are considered the same value in sets, and are treated as duplicates:

Example

True and 1 is considered the same value:

thisset = {"apple", "banana", "cherry", True, 1, 2}

print(thisset)


Get the Length of a Set

To determine how many items a set has, use the len() function.

Example

Get the number of items in a set:

thisset = {"apple", "banana", "cherry"}

print(len(thisset))

Set Items - Data Types

Set items can be of any data type:

Example

String, int and boolean data types:

set1 = {"apple", "banana", "cherry"}
set2 = {1, 5, 7, 9, 3}
set3 = {True, False, False}

A set can contain different data types:

Example

A set with strings, integers and boolean values:

set1 = {"abc", 34, True, 40, "male"}

type()

From Python's perspective, sets are defined as objects with the data type 'set':

<class 'set'>

Example

What is the data type of a set?

myset = {"apple", "banana", "cherry"}
print(type(myset))

The set() Constructor

It is also possible to use the set() constructor to make a set.

Example

Using the set() constructor to make a set:

thisset = set(("apple", "banana", "cherry")) # note the double round-brackets
print(thisset)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • Set is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove items and add new items.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python Sets


myset = {"apple", "banana", "cherry"}

Set

Sets are used to store multiple items in a single variable.

Set is one of 4 built-in data types in Python used to store collections of data, the other 3 are , , and , all with different qualities and usage.

A set is a collection which is unordered, unchangeable*, and unindexed.

* Note: Set items are unchangeable, but you can remove items and add new items.

Sets are written with curly brackets.

Example

Create a Set:

thisset = {"apple", "banana", "cherry"}
print(thisset)

Note: Sets are unordered, so you cannot be sure in which order the items will appear.


Set Items

Set items are unordered, unchangeable, and do not allow duplicate values.


Unordered

Unordered means that the items in a set do not have a defined order.

Set items can appear in a different order every time you use them, and cannot be referred to by index or key.


Unchangeable

Set items are unchangeable, meaning that we cannot change the items after the set has been created.

Once a set is created, you cannot change its items, but you can remove items and add new items.


Duplicates Not Allowed

Sets cannot have two items with the same value.

Example

Duplicate values will be ignored:

thisset = {"apple", "banana", "cherry", "apple"}

print(thisset)

Note: The values True and 1 are considered the same value in sets, and are treated as duplicates:

Example

True and 1 is considered the same value:

thisset = {"apple", "banana", "cherry", True, 1, 2}

print(thisset)


Get the Length of a Set

To determine how many items a set has, use the len() function.

Example

Get the number of items in a set:

thisset = {"apple", "banana", "cherry"}

print(len(thisset))

Set Items - Data Types

Set items can be of any data type:

Example

String, int and boolean data types:

set1 = {"apple", "banana", "cherry"}
set2 = {1, 5, 7, 9, 3}
set3 = {True, False, False}

A set can contain different data types:

Example

A set with strings, integers and boolean values:

set1 = {"abc", 34, True, 40, "male"}

type()

From Python's perspective, sets are defined as objects with the data type 'set':

<class 'set'>

Example

What is the data type of a set?

myset = {"apple", "banana", "cherry"}
print(type(myset))

The set() Constructor

It is also possible to use the set() constructor to make a set.

Example

Using the set() constructor to make a set:

thisset = set(("apple", "banana", "cherry")) # note the double round-brackets
print(thisset)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • Set is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove items and add new items.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python - Access Set Items


Access Items

You cannot access items in a set by referring to an index or a key.

But you can loop through the set items using a for loop, or ask if a specified value is present in a set, by using the in keyword.

Example

Loop through the set, and print the values:

thisset = {"apple", "banana", "cherry"}

for x in thisset:
  print(x)

Example

Check if "banana" is present in the set:

thisset = {"apple", "banana", "cherry"}

print("banana" in thisset)

Change Items

Once a set is created, you cannot change its items, but you can add new items.


Python - Add Set Items


Add Items

Once a set is created, you cannot change its items, but you can add new items.

To add one item to a set use the add() method.

Example

Add an item to a set, using the add() method:

thisset = {"apple", "banana", "cherry"}

thisset.add("orange")

print(thisset)

Add Sets

To add items from another set into the current set, use the update() method.

Example

Add elements from tropical into thisset:

thisset = {"apple", "banana", "cherry"}
tropical = {"pineapple", "mango", "papaya"}

thisset.update(tropical)

print(thisset)

Add Any Iterable

The object in the update() method does not have to be a set, it can be any iterable object (tuples, lists, dictionaries etc.).

Example

Add elements of a list to at set:

thisset = {"apple", "banana", "cherry"}
mylist = ["kiwi", "orange"]

thisset.update(mylist)

print(thisset)

Python - Remove Set Items


Remove Item

To remove an item in a set, use the remove(), or the discard() method.

Example

Remove "banana" by using the remove() method:

thisset = {"apple", "banana", "cherry"}

thisset.remove("banana")

print(thisset)

Note: If the item to remove does not exist, remove() will raise an error.

Example

Remove "banana" by using the discard() method:

thisset = {"apple", "banana", "cherry"}

thisset.discard("banana")

print(thisset)

Note: If the item to remove does not exist, discard() will NOT raise an error.

You can also use the pop() method to remove an item, but this method will remove a random item, so you cannot be sure what item that gets removed.

The return value of the pop() method is the removed item.

Example

Remove a random item by using the pop() method:

thisset = {"apple", "banana", "cherry"}

x = thisset.pop()

print(x)

print(thisset)

Note: Sets are unordered, so when using the pop() method, you do not know which item that gets removed.

Example

The clear() method empties the set:

thisset = {"apple", "banana", "cherry"}

thisset.clear()

print(thisset)

Example

The del keyword will delete the set completely:

thisset = {"apple", "banana", "cherry"}

del thisset

print(thisset)

Python - Loop Sets


Loop Items

You can loop through the set items by using a for loop:

Example

Loop through the set, and print the values:

thisset = {"apple", "banana", "cherry"}

for x in thisset:
  print(x)

Python - Join Sets


Join Two Sets

There are several ways to join two or more sets in Python.

You can use the union() method that returns a new set containing all items from both sets, or the update() method that inserts all the items from one set into another:

Example

The union() method returns a new set with all items from both sets:

set1 = {"a", "b" , "c"}
set2 = {1, 2, 3}

set3 = set1.union(set2)
print(set3)

Example

The update() method inserts the items in set2 into set1:

set1 = {"a", "b" , "c"}
set2 = {1, 2, 3}

set1.update(set2)
print(set1)

Note: Both union() and update() will exclude any duplicate items.



Keep ONLY the Duplicates

The intersection_update() method will keep only the items that are present in both sets.

Example

Keep the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

x.intersection_update(y)

print(x)

The intersection() method will return a new set, that only contains the items that are present in both sets.

Example

Return a set that contains the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

z = x.intersection(y)

print(z)

Keep All, But NOT the Duplicates

The symmetric_difference_update() method will keep only the elements that are NOT present in both sets.

Example

Keep the items that are not present in both sets:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

x.symmetric_difference_update(y)

print(x)

The symmetric_difference() method will return a new set, that contains only the elements that are NOT present in both sets.

Example

Return a set that contains all items from both sets, except items that are present in both:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

z = x.symmetric_difference(y)

print(z)

Note: The values True and 1 are considered the same value in sets, and are treated as duplicates:

Example

True and 1 is considered the same value:

x = {"apple", "banana", "cherry", True}
y = {"google", 1, "apple", 2}

z = x.symmetric_difference(y)

Python - Set Methods


Set Methods

Python has a set of built-in methods that you can use on sets.

Method Description
Adds an element to the set
Removes all the elements from the set
Returns a copy of the set
Returns a set containing the difference between two or more sets
Removes the items in this set that are also included in another, specified set
Remove the specified item
Returns a set, that is the intersection of two other sets
Removes the items in this set that are not present in other, specified set(s)
Returns whether two sets have a intersection or not
Returns whether another set contains this set or not
Returns whether this set contains another set or not
Removes an element from the set
Removes the specified element
Returns a set with the symmetric differences of two sets
inserts the symmetric differences from this set and another
Return a set containing the union of sets
Update the set with the union of this set and others

Python - Set Exercises


Test Yourself With Exercises

Now you have learned a lot about sets, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:


Exercise:

Check if "apple" is present in the fruits set.

fruits = {"apple", 
"banana",
"cherry"} if "apple" fruits: print("Yes, apple is a fruit!")

Go to the Exercise section and test all of our Python Set Exercises:


Python Dictionaries


thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

Dictionary

Dictionaries are used to store data values in key:value pairs.

A dictionary is a collection which is ordered*, changeable and do not allow duplicates.

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

Dictionaries are written with curly brackets, and have keys and values:

Example

Create and print a dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(thisdict)

Dictionary Items

Dictionary items are ordered, changeable, and does not allow duplicates.

Dictionary items are presented in key:value pairs, and can be referred to by using the key name.

Example

Print the "brand" value of the dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(thisdict["brand"])

Ordered or Unordered?

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When we say that dictionaries are ordered, it means that the items have a defined order, and that order will not change.

Unordered means that the items does not have a defined order, you cannot refer to an item by using an index.


Changeable

Dictionaries are changeable, meaning that we can change, add or remove items after the dictionary has been created.


Duplicates Not Allowed

Dictionaries cannot have two items with the same key:

Example

Duplicate values will overwrite existing values:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964,
  "year": 2020
}
print(thisdict)


Dictionary Length

To determine how many items a dictionary has, use the len() function:

Example

Print the number of items in the dictionary:

print(len(thisdict))

Dictionary Items - Data Types

The values in dictionary items can be of any data type:

Example

String, int, boolean, and list data types:

thisdict = {
  "brand": "Ford",
  "electric": False,
  "year": 1964,
  "colors": ["red", "white", "blue"]
}

type()

From Python's perspective, dictionaries are defined as objects with the data type 'dict':

<class 'dict'>

Example

Print the data type of a dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(type(thisdict))

The dict() Constructor

It is also possible to use the dict() constructor to make a dictionary.

Example

Using the dict() method to make a dictionary:

thisdict = dict(name = "John", age = 36, country = "Norway")
print(thisdict)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • Dictionary is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python Dictionaries


thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

Dictionary

Dictionaries are used to store data values in key:value pairs.

A dictionary is a collection which is ordered*, changeable and do not allow duplicates.

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

Dictionaries are written with curly brackets, and have keys and values:

Example

Create and print a dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(thisdict)

Dictionary Items

Dictionary items are ordered, changeable, and does not allow duplicates.

Dictionary items are presented in key:value pairs, and can be referred to by using the key name.

Example

Print the "brand" value of the dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(thisdict["brand"])

Ordered or Unordered?

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When we say that dictionaries are ordered, it means that the items have a defined order, and that order will not change.

Unordered means that the items does not have a defined order, you cannot refer to an item by using an index.


Changeable

Dictionaries are changeable, meaning that we can change, add or remove items after the dictionary has been created.


Duplicates Not Allowed

Dictionaries cannot have two items with the same key:

Example

Duplicate values will overwrite existing values:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964,
  "year": 2020
}
print(thisdict)


Dictionary Length

To determine how many items a dictionary has, use the len() function:

Example

Print the number of items in the dictionary:

print(len(thisdict))

Dictionary Items - Data Types

The values in dictionary items can be of any data type:

Example

String, int, boolean, and list data types:

thisdict = {
  "brand": "Ford",
  "electric": False,
  "year": 1964,
  "colors": ["red", "white", "blue"]
}

type()

From Python's perspective, dictionaries are defined as objects with the data type 'dict':

<class 'dict'>

Example

Print the data type of a dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(type(thisdict))

The dict() Constructor

It is also possible to use the dict() constructor to make a dictionary.

Example

Using the dict() method to make a dictionary:

thisdict = dict(name = "John", age = 36, country = "Norway")
print(thisdict)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

  • is a collection which is ordered and changeable. Allows duplicate members.
  • is a collection which is ordered and unchangeable. Allows duplicate members.
  • is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
  • Dictionary is a collection which is ordered** and changeable. No duplicate members.

*Set items are unchangeable, but you can remove and/or add items whenever you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the right type for a particular data set could mean retention of meaning, and, it could mean an increase in efficiency or security.


Python - Access Dictionary Items


Accessing Items

You can access the items of a dictionary by referring to its key name, inside square brackets:

Example

Get the value of the "model" key:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
x = thisdict["model"]

There is also a method called get() that will give you the same result:

Example

Get the value of the "model" key:

x = thisdict.get("model")

Get Keys

The keys() method will return a list of all the keys in the dictionary.

Example

Get a list of the keys:

x = thisdict.keys()

The list of the keys is a view of the dictionary, meaning that any changes done to the dictionary will be reflected in the keys list.

Example

Add a new item to the original dictionary, and see that the keys list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.keys()

print(x) #before the change

car["color"] = "white"

print(x) #after the change


Get Values

The values() method will return a list of all the values in the dictionary.

Example

Get a list of the values:

x = thisdict.values()

The list of the values is a view of the dictionary, meaning that any changes done to the dictionary will be reflected in the values list.

Example

Make a change in the original dictionary, and see that the values list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.values()

print(x) #before the change

car["year"] = 2020

print(x) #after the change

Example

Add a new item to the original dictionary, and see that the values list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.values()

print(x) #before the change

car["color"] = "red"

print(x) #after the change

Get Items

The items() method will return each item in a dictionary, as tuples in a list.

Example

Get a list of the key:value pairs

x = thisdict.items()

The returned list is a view of the items of the dictionary, meaning that any changes done to the dictionary will be reflected in the items list.

Example

Make a change in the original dictionary, and see that the items list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.items()

print(x) #before the change

car["year"] = 2020

print(x) #after the change

Example

Add a new item to the original dictionary, and see that the items list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.items()

print(x) #before the change

car["color"] = "red"

print(x) #after the change

Check if Key Exists

To determine if a specified key is present in a dictionary use the in keyword:

Example

Check if "model" is present in the dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
if "model" in thisdict:
  print("Yes, 'model' is one of the keys in the thisdict dictionary")

Python - Change Dictionary Items


Change Values

You can change the value of a specific item by referring to its key name:

Example

Change the "year" to 2018:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict["year"] = 2018

Update Dictionary

The update() method will update the dictionary with the items from the given argument.

The argument must be a dictionary, or an iterable object with key:value pairs.

Example

Update the "year" of the car by using the update() method:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict.update({"year": 2020})

Python - Add Dictionary Items


Adding Items

Adding an item to the dictionary is done by using a new index key and assigning a value to it:

Example

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict["color"] = "red"
print(thisdict)

Update Dictionary

The update() method will update the dictionary with the items from a given argument. If the item does not exist, the item will be added.

The argument must be a dictionary, or an iterable object with key:value pairs.

Example

Add a color item to the dictionary by using the update() method:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict.update({"color": "red"})

Python - Remove Dictionary Items


Removing Items

There are several methods to remove items from a dictionary:

Example

The pop() method removes the item with the specified key name:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict.pop("model")
print(thisdict)

Example

The popitem() method removes the last inserted item (in versions before 3.7, a random item is removed instead):

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict.popitem()
print(thisdict)

Example

The del keyword removes the item with the specified key name:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
del thisdict["model"]
print(thisdict)

Example

The del keyword can also delete the dictionary completely:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
del thisdict
print(thisdict) #this will cause an error because "thisdict" no longer exists.

Example

The clear() method empties the dictionary:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
thisdict.clear()
print(thisdict)

Python - Loop Dictionaries


Loop Through a Dictionary

You can loop through a dictionary by using a for loop.

When looping through a dictionary, the return value are the keys of the dictionary, but there are methods to return the values as well.

Example

Print all key names in the dictionary, one by one:

for x in thisdict:
  print(x)

Example

Print all values in the dictionary, one by one:

for x in thisdict:
  print(thisdict[x])

Example

You can also use the values() method to return values of a dictionary:

for x in thisdict.values():
  print(x)

Example

You can use the keys() method to return the keys of a dictionary:

for x in thisdict.keys():
  print(x)

Example

Loop through both keys and values, by using the items() method:

for x, y in thisdict.items():
  print(x, y)

Python - Copy Dictionaries


Copy a Dictionary

You cannot copy a dictionary simply by typing dict2 = dict1, because: dict2 will only be a reference to dict1, and changes made in dict1 will automatically also be made in dict2.

There are ways to make a copy, one way is to use the built-in Dictionary method copy().

Example

Make a copy of a dictionary with the copy() method:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
mydict = thisdict.copy()
print(mydict)

Another way to make a copy is to use the built-in function dict().

Example

Make a copy of a dictionary with the dict() function:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
mydict = dict(thisdict)
print(mydict)

Python - Nested Dictionaries


Nested Dictionaries

A dictionary can contain dictionaries, this is called nested dictionaries.

Example

Create a dictionary that contain three dictionaries:

myfamily = {
  "child1" : {
    "name" : "Emil",
    "year" : 2004
  },
  "child2" : {
    "name" : "Tobias",
    "year" : 2007
  },
  "child3" : {
    "name" : "Linus",
    "year" : 2011
  }
}

Or, if you want to add three dictionaries into a new dictionary:

Example

Create three dictionaries, then create one dictionary that will contain the other three dictionaries:

child1 = {
  "name" : "Emil",
  "year" : 2004
}
child2 = {
  "name" : "Tobias",
  "year" : 2007
}
child3 = {
  "name" : "Linus",
  "year" : 2011
}

myfamily = {
  "child1" : child1,
  "child2" : child2,
  "child3" : child3
}

Access Items in Nested Dictionaries

To access items from a nested dictionary, you use the name of the dictionaries, starting with the outer dictionary:

Example

Print the name of child 2:

print(myfamily["child2"]["name"])

Python Dictionary Methods


Dictionary Methods

Python has a set of built-in methods that you can use on dictionaries.

Method Description
Removes all the elements from the dictionary
Returns a copy of the dictionary
Returns a dictionary with the specified keys and value
Returns the value of the specified key
Returns a list containing a tuple for each key value pair
Returns a list containing the dictionary's keys
Removes the element with the specified key
Removes the last inserted key-value pair
Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
Updates the dictionary with the specified key-value pairs
Returns a list of all the values in the dictionary

Python Dictionary Exercises


Test Yourself With Exercises

Now you have learned a lot about dictionaries, and how to use them in Python.

Are you ready for a test?

Try to insert the missing part to make the code work as expected:


Test Yourself With Exercises

Exercise:

Use the get method to print the value of the "model" key of the car dictionary.

car =	{
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print()

Go to the Exercise section and test all of our Python Dictionary Exercises:


Python If ... Else


Python Conditions and If statements

Python supports the usual logical conditions from mathematics:

  • Equals: a == b
  • Not Equals: a != b
  • Less than: a < b
  • Less than or equal to: a <= b
  • Greater than: a > b
  • Greater than or equal to: a >= b

These conditions can be used in several ways, most commonly in "if statements" and loops.

An "if statement" is written by using the if keyword.

Example

If statement:

a = 33
b = 200
if b > a:
  print("b is greater than a")

In this example we use two variables, a and b, which are used as part of the if statement to test whether b is greater than a. As a is 33, and b is 200, we know that 200 is greater than 33, and so we print to screen that "b is greater than a".

Indentation

Python relies on indentation (whitespace at the beginning of a line) to define scope in the code. Other programming languages often use curly-brackets for this purpose.

Example

If statement, without indentation (will raise an error):

a = 33
b = 200
if b > a:
print("b is greater than a") # you will get an error


Elif

The elif keyword is Python's way of saying "if the previous conditions were not true, then try this condition".

Example

a = 33
b = 33
if b > a:
  print("b is greater than a")
elif a == b:
  print("a and b are equal")

In this example a is equal to b, so the first condition is not true, but the elif condition is true, so we print to screen that "a and b are equal".


Else

The else keyword catches anything which isn't caught by the preceding conditions.

Example

a = 200
b = 33
if b > a:
  print("b is greater than a")
elif a == b:
  print("a and b are equal")
else:
  print("a is greater than b")

In this example a is greater than b, so the first condition is not true, also the elif condition is not true, so we go to the else condition and print to screen that "a is greater than b".

You can also have an else without the elif:

Example

a = 200
b = 33
if b > a:
  print("b is greater than a")
else:
  print("b is not greater than a")

Short Hand If

If you have only one statement to execute, you can put it on the same line as the if statement.

Example

One line if statement:

if a > b: print("a is greater than b")

Short Hand If ... Else

If you have only one statement to execute, one for if, and one for else, you can put it all on the same line:

Example

One line if else statement:

a = 2
b = 330
print("A") if a > b else print("B")

This technique is known as Ternary Operators, or Conditional Expressions.

You can also have multiple else statements on the same line:

Example

One line if else statement, with 3 conditions:

a = 330
b = 330
print("A") if a > b else print("=") if a == b else print("B")

And

The and keyword is a logical operator, and is used to combine conditional statements:

Example

Test if a is greater than b, AND if c is greater than a:

a = 200
b = 33
c = 500
if a > b and c > a:
  print("Both conditions are True")

Or

The or keyword is a logical operator, and is used to combine conditional statements:

Example

Test if a is greater than b, OR if a is greater than c:

a = 200
b = 33
c = 500
if a > b or a > c:
  print("At least one of the conditions is True")

Not

The not keyword is a logical operator, and is used to reverse the result of the conditional statement:

Example

Test if a is NOT greater than b:

a = 33
b = 200
if not a > b:
  print("a is NOT greater than b")

Nested If

You can have if statements inside if statements, this is called nested if statements.

Example

x = 41

if x > 10:
  print("Above ten,")
  if x > 20:
    print("and also above 20!")
  else:
    print("but not above 20.")

The pass Statement

if statements cannot be empty, but if you for some reason have an if statement with no content, put in the pass statement to avoid getting an error.

Example

a = 33
b = 200

if b > a:
  pass

Test Yourself With Exercises

Exercise:

Print "Hello World" if a is greater than b.

a = 50
b = 10
 a  b
  print("Hello World")



Python While Loops


Python Loops

Python has two primitive loop commands:

  • while loops
  • for loops

The while Loop

With the while loop we can execute a set of statements as long as a condition is true.

Example

Print i as long as i is less than 6:

i = 1
while i < 6:
  print(i)
  i += 1

Note: remember to increment i, or else the loop will continue forever.

The while loop requires relevant variables to be ready, in this example we need to define an indexing variable, i, which we set to 1.


The break Statement

With the break statement we can stop the loop even if the while condition is true:

Example

Exit the loop when i is 3:

i = 1
while i < 6:
  print(i)
  if i == 3:
    break
  i += 1


The continue Statement

With the continue statement we can stop the current iteration, and continue with the next:

Example

Continue to the next iteration if i is 3:

i = 0
while i < 6:
  i += 1
  if i == 3:
    continue
  print(i)

The else Statement

With the else statement we can run a block of code once when the condition no longer is true:

Example

Print a message once the condition is false:

i = 1
while i < 6:
  print(i)
  i += 1
else:
  print("i is no longer less than 6")

Test Yourself With Exercises

Exercise:

Print i as long as i is less than 6.

i = 1
 i < 6
  print(i)
  i += 1



Python For Loops


Python For Loops

A for loop is used for iterating over a sequence (that is either a list, a tuple, a dictionary, a set, or a string).

This is less like the for keyword in other programming languages, and works more like an iterator method as found in other object-orientated programming languages.

With the for loop we can execute a set of statements, once for each item in a list, tuple, set etc.

Example

Print each fruit in a fruit list:

fruits = ["apple", "banana", "cherry"]
for x in fruits:
  print(x)

The for loop does not require an indexing variable to set beforehand.


Looping Through a String

Even strings are iterable objects, they contain a sequence of characters:

Example

Loop through the letters in the word "banana":

for x in "banana":
  print(x)

The break Statement

With the break statement we can stop the loop before it has looped through all the items:

Example

Exit the loop when x is "banana":

fruits = ["apple", "banana", "cherry"]
for x in fruits:
  print(x)
  if x == "banana":
    break

Example

Exit the loop when x is "banana", but this time the break comes before the print:

fruits = ["apple", "banana", "cherry"]
for x in fruits:
  if x == "banana":
    break
  print(x)


The continue Statement

With the continue statement we can stop the current iteration of the loop, and continue with the next:

Example

Do not print banana:

fruits = ["apple", "banana", "cherry"]
for x in fruits:
  if x == "banana":
    continue
  print(x)

The range() Function

To loop through a set of code a specified number of times, we can use the range() function,

The range() function returns a sequence of numbers, starting from 0 by default, and increments by 1 (by default), and ends at a specified number.

Example

Using the range() function:

for x in range(6):
  print(x)

Note that range(6) is not the values of 0 to 6, but the values 0 to 5.

The range() function defaults to 0 as a starting value, however it is possible to specify the starting value by adding a parameter: range(2, 6), which means values from 2 to 6 (but not including 6):

Example

Using the start parameter:

for x in range(2, 6):
  print(x)

The range() function defaults to increment the sequence by 1, however it is possible to specify the increment value by adding a third parameter: range(2, 30, 3):

Example

Increment the sequence with 3 (default is 1):

for x in range(2, 30, 3):
  print(x)

Else in For Loop

The else keyword in a for loop specifies a block of code to be executed when the loop is finished:

Example

Print all numbers from 0 to 5, and print a message when the loop has ended:

for x in range(6):
  print(x)
else:
  print("Finally finished!")

Note: The else block will NOT be executed if the loop is stopped by a break statement.

Example

Break the loop when x is 3, and see what happens with the else block:

for x in range(6):
  if x == 3: break
  print(x)
else:
  print("Finally finished!")

Nested Loops

A nested loop is a loop inside a loop.

The "inner loop" will be executed one time for each iteration of the "outer loop":

Example

Print each adjective for every fruit:

adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for x in adj:
  for y in fruits:
    print(x, y)

The pass Statement

for loops cannot be empty, but if you for some reason have a for loop with no content, put in the pass statement to avoid getting an error.

Example

for x in [0, 1, 2]:
  pass

Test Yourself With Exercises

Exercise:

Loop through the items in the fruits list.

fruits = ["apple", 
"banana", "cherry"] x fruits print(x)


Python Functions


A function is a block of code which only runs when it is called.

You can pass data, known as parameters, into a function.

A function can return data as a result.


Creating a Function

In Python a function is defined using the def keyword:

Example

def my_function():
  print("Hello from a function")

Calling a Function

To call a function, use the function name followed by parenthesis:

Example

def my_function():
  print("Hello from a function")

my_function()

Arguments

Information can be passed into functions as arguments.

Arguments are specified after the function name, inside the parentheses. You can add as many arguments as you want, just separate them with a comma.

The following example has a function with one argument (fname). When the function is called, we pass along a first name, which is used inside the function to print the full name:

Example

def my_function(fname):
  print(fname + " Refsnes")

my_function("Emil")
my_function("Tobias")
my_function("Linus")

Arguments are often shortened to args in Python documentations.



Parameters or Arguments?

The terms parameter and argument can be used for the same thing: information that are passed into a function.

From a function's perspective:

A parameter is the variable listed inside the parentheses in the function definition.

An argument is the value that is sent to the function when it is called.


Number of Arguments

By default, a function must be called with the correct number of arguments. Meaning that if your function expects 2 arguments, you have to call the function with 2 arguments, not more, and not less.

Example

This function expects 2 arguments, and gets 2 arguments:

def my_function(fname, lname):
  print(fname + " " + lname)

my_function("Emil", "Refsnes")
If you try to call the function with 1 or 3 arguments, you will get an error:

Example

This function expects 2 arguments, but gets only 1:

def my_function(fname, lname):
  print(fname + " " + lname)

my_function("Emil")

Arbitrary Arguments, *args

If you do not know how many arguments that will be passed into your function, add a * before the parameter name in the function definition.

This way the function will receive a tuple of arguments, and can access the items accordingly:

Example

If the number of arguments is unknown, add a * before the parameter name:

def my_function(*kids):
  print("The youngest child is " + kids[2])

my_function("Emil", "Tobias", "Linus")

Arbitrary Arguments are often shortened to *args in Python documentations.


Keyword Arguments

You can also send arguments with the key = value syntax.

This way the order of the arguments does not matter.

Example

def my_function(child3, child2, child1):
  print("The youngest child is " + child3)

my_function(child1 = "Emil", child2 = "Tobias", child3 = "Linus")

The phrase Keyword Arguments are often shortened to kwargs in Python documentations.


Arbitrary Keyword Arguments, **kwargs

If you do not know how many keyword arguments that will be passed into your function, add two asterisk: ** before the parameter name in the function definition.

This way the function will receive a dictionary of arguments, and can access the items accordingly:

Example

If the number of keyword arguments is unknown, add a double ** before the parameter name:

def my_function(**kid):
  print("His last name is " + kid["lname"])

my_function(fname = "Tobias", lname = "Refsnes")

Arbitrary Kword Arguments are often shortened to **kwargs in Python documentations.


Default Parameter Value

The following example shows how to use a default parameter value.

If we call the function without argument, it uses the default value:

Example

def my_function(country = "Norway"):
  print("I am from " + country)

my_function("Sweden")
my_function("India")
my_function()
my_function("Brazil")

Passing a List as an Argument

You can send any data types of argument to a function (string, number, list, dictionary etc.), and it will be treated as the same data type inside the function.

E.g. if you send a List as an argument, it will still be a List when it reaches the function:

Example

def my_function(food):
  for x in food:
    print(x)

fruits = ["apple", "banana", "cherry"]

my_function(fruits)

Return Values

To let a function return a value, use the return statement:

Example

def my_function(x):
  return 5 * x

print(my_function(3))
print(my_function(5))
print(my_function(9))

The pass Statement

function definitions cannot be empty, but if you for some reason have a function definition with no content, put in the pass statement to avoid getting an error.

Example

def myfunction():
  pass

Recursion

Python also accepts function recursion, which means a defined function can call itself.

Recursion is a common mathematical and programming concept. It means that a function calls itself. This has the benefit of meaning that you can loop through data to reach a result.

The developer should be very careful with recursion as it can be quite easy to slip into writing a function which never terminates, or one that uses excess amounts of memory or processor power. However, when written correctly recursion can be a very efficient and mathematically-elegant approach to programming.

In this example, tri_recursion() is a function that we have defined to call itself ("recurse"). We use the k variable as the data, which decrements (-1) every time we recurse. The recursion ends when the condition is not greater than 0 (i.e. when it is 0).

To a new developer it can take some time to work out how exactly this works, best way to find out is by testing and modifying it.

Example

Recursion Example

def tri_recursion(k):
  if(k > 0):
    result = k + tri_recursion(k - 1)
    print(result)
  else:
    result = 0
  return result

print("nnRecursion Example Results")
tri_recursion(6)

Test Yourself With Exercises

Exercise:

Create a function named my_function.

:
  print("Hello from a function")


Python Lambda


A lambda function is a small anonymous function.

A lambda function can take any number of arguments, but can only have one expression.


Syntax

lambda arguments : expression

The expression is executed and the result is returned:

Example

Add 10 to argument a, and return the result:

x = lambda a : a + 10
print(x(5))

Lambda functions can take any number of arguments:

Example

Multiply argument a with argument b and return the result:

x = lambda a, b : a * b
print(x(5, 6))

Example

Summarize argument a, b, and c and return the result:

x = lambda a, b, c : a + b + c
print(x(5, 6, 2))


Why Use Lambda Functions?

The power of lambda is better shown when you use them as an anonymous function inside another function.

Say you have a function definition that takes one argument, and that argument will be multiplied with an unknown number:

def myfunc(n):
  return lambda a : a * n

Use that function definition to make a function that always doubles the number you send in:

Example

def myfunc(n):
  return lambda a : a * n

mydoubler = myfunc(2)

print(mydoubler(11))

Or, use the same function definition to make a function that always triples the number you send in:

Example

def myfunc(n):
  return lambda a : a * n

mytripler = myfunc(3)

print(mytripler(11))

Or, use the same function definition to make both functions, in the same program:

Example

def myfunc(n):
  return lambda a : a * n

mydoubler = myfunc(2)
mytripler = myfunc(3)

print(mydoubler(11))
print(mytripler(11))

Use lambda functions when an anonymous function is required for a short period of time.


Test Yourself With Exercises

Exercise:

Create a lambda function that takes one parameter (a) and returns it.

x =    


Python Arrays


Note: Python does not have built-in support for Arrays, but can be used instead.


Arrays

Note: This page shows you how to use LISTS as ARRAYS, however, to work with arrays in Python you will have to import a library, like the .

Arrays are used to store multiple values in one single variable:

Example

Create an array containing car names:

cars = ["Ford", "Volvo", "BMW"]

What is an Array?

An array is a special variable, which can hold more than one value at a time.

If you have a list of items (a list of car names, for example), storing the cars in single variables could look like this:

car1 = "Ford"
car2 = "Volvo"
car3 = "BMW"

However, what if you want to loop through the cars and find a specific one? And what if you had not 3 cars, but 300?

The solution is an array!

An array can hold many values under a single name, and you can access the values by referring to an index number.


Access the Elements of an Array

You refer to an array element by referring to the index number.

Example

Get the value of the first array item:

x = cars[0]

Example

Modify the value of the first array item:

cars[0] = "Toyota"

The Length of an Array

Use the len() method to return the length of an array (the number of elements in an array).

Example

Return the number of elements in the cars array:

x = len(cars)

Note: The length of an array is always one more than the highest array index.



Looping Array Elements

You can use the for in loop to loop through all the elements of an array.

Example

Print each item in the cars array:

for x in cars:
  print(x)

Adding Array Elements

You can use the append() method to add an element to an array.

Example

Add one more element to the cars array:

cars.append("Honda")

Removing Array Elements

You can use the pop() method to remove an element from the array.

Example

Delete the second element of the cars array:

cars.pop(1)

You can also use the remove() method to remove an element from the array.

Example

Delete the element that has the value "Volvo":

cars.remove("Volvo")

Note: The list's remove() method only removes the first occurrence of the specified value.


Array Methods

Python has a set of built-in methods that you can use on lists/arrays.

Method Description
Adds an element at the end of the list
Removes all the elements from the list
Returns a copy of the list
Returns the number of elements with the specified value
Add the elements of a list (or any iterable), to the end of the current list
Returns the index of the first element with the specified value
Adds an element at the specified position
Removes the element at the specified position
Removes the first item with the specified value
Reverses the order of the list
Sorts the list

Note: Python does not have built-in support for Arrays, but Python Lists can be used instead.


Python Classes and Objects


Python Classes/Objects

Python is an object oriented programming language.

Almost everything in Python is an object, with its properties and methods.

A Class is like an object constructor, or a "blueprint" for creating objects.


Create a Class

To create a class, use the keyword class:

Example

Create a class named MyClass, with a property named x:

class MyClass:
  x = 5

Create Object

Now we can use the class named MyClass to create objects:

Example

Create an object named p1, and print the value of x:

p1 = MyClass()
print(p1.x)

The __init__() Function

The examples above are classes and objects in their simplest form, and are not really useful in real life applications.

To understand the meaning of classes we have to understand the built-in __init__() function.

All classes have a function called __init__(), which is always executed when the class is being initiated.

Use the __init__() function to assign values to object properties, or other operations that are necessary to do when the object is being created:

Example

Create a class named Person, use the __init__() function to assign values for name and age:

class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

p1 = Person("John", 36)

print(p1.name)
print(p1.age)

Note: The __init__() function is called automatically every time the class is being used to create a new object.



The __str__() Function

The __str__() function controls what should be returned when the class object is represented as a string.

If the __str__() function is not set, the string representation of the object is returned:

Example

The string representation of an object WITHOUT the __str__() function:

class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

p1 = Person("John", 36)

print(p1)

Example

The string representation of an object WITH the __str__() function:

class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

  def __str__(self):
    return f"{self.name}({self.age})"

p1 = Person("John", 36)

print(p1)

Object Methods

Objects can also contain methods. Methods in objects are functions that belong to the object.

Let us create a method in the Person class:

Example

Insert a function that prints a greeting, and execute it on the p1 object:

class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

  def myfunc(self):
    print("Hello my name is " + self.name)

p1 = Person("John", 36)
p1.myfunc()

Note: The self parameter is a reference to the current instance of the class, and is used to access variables that belong to the class.


The self Parameter

The self parameter is a reference to the current instance of the class, and is used to access variables that belongs to the class.

It does not have to be named self , you can call it whatever you like, but it has to be the first parameter of any function in the class:

Example

Use the words mysillyobject and abc instead of self:

class Person:
  def __init__(mysillyobject, name, age):
    mysillyobject.name = name
    mysillyobject.age = age

  def myfunc(abc):
    print("Hello my name is " + abc.name)

p1 = Person("John", 36)
p1.myfunc()

Modify Object Properties

You can modify properties on objects like this:

Example

Set the age of p1 to 40:

p1.age = 40

Delete Object Properties

You can delete properties on objects by using the del keyword:

Example

Delete the age property from the p1 object:

del p1.age

Delete Objects

You can delete objects by using the del keyword:

Example

Delete the p1 object:

del p1

The pass Statement

class definitions cannot be empty, but if you for some reason have a class definition with no content, put in the pass statement to avoid getting an error.

Example

class Person:
  pass

Test Yourself With Exercises

Exercise:

Create a class named MyClass:

 MyClass:
  x = 5

Python Inheritance


Python Inheritance

Inheritance allows us to define a class that inherits all the methods and properties from another class.

Parent class is the class being inherited from, also called base class.

Child class is the class that inherits from another class, also called derived class.


Create a Parent Class

Any class can be a parent class, so the syntax is the same as creating any other class:

Example

Create a class named Person, with firstname and lastname properties, and a printname method:

class Person:
  def __init__(self, fname, lname):
    self.firstname = fname
    self.lastname = lname

  def printname(self):
    print(self.firstname, self.lastname)

#Use the Person class to create an object, and then execute the printname method:

x = Person("John", "Doe")
x.printname()

Create a Child Class

To create a class that inherits the functionality from another class, send the parent class as a parameter when creating the child class:

Example

Create a class named Student, which will inherit the properties and methods from the Person class:

class Student(Person):
  pass

Note: Use the pass keyword when you do not want to add any other properties or methods to the class.

Now the Student class has the same properties and methods as the Person class.

Example

Use the Student class to create an object, and then execute the printname method:

x = Student("Mike", "Olsen")
x.printname()


Add the __init__() Function

So far we have created a child class that inherits the properties and methods from its parent.

We want to add the __init__() function to the child class (instead of the pass keyword).

Note: The __init__() function is called automatically every time the class is being used to create a new object.

Example

Add the __init__() function to the Student class:

class Student(Person):
  def __init__(self, fname, lname):
    #add properties etc.

When you add the __init__() function, the child class will no longer inherit the parent's __init__() function.

Note: The child's __init__() function overrides the inheritance of the parent's __init__() function.

To keep the inheritance of the parent's __init__() function, add a call to the parent's __init__() function:

Example

class Student(Person):
  def __init__(self, fname, lname):
    Person.__init__(self, fname, lname)

Now we have successfully added the __init__() function, and kept the inheritance of the parent class, and we are ready to add functionality in the __init__() function.


Use the super() Function

Python also has a super() function that will make the child class inherit all the methods and properties from its parent:

Example

class Student(Person):
  def __init__(self, fname, lname):
    super().__init__(fname, lname)

By using the super() function, you do not have to use the name of the parent element, it will automatically inherit the methods and properties from its parent.


Add Properties

Example

Add a property called graduationyear to the Student class:

class Student(Person):
  def __init__(self, fname, lname):
    super().__init__(fname, lname)
    self.graduationyear = 2019

In the example below, the year 2019 should be a variable, and passed into the Student class when creating student objects. To do so, add another parameter in the __init__() function:

Example

Add a year parameter, and pass the correct year when creating objects:

class Student(Person):
  def __init__(self, fname, lname, year):
    super().__init__(fname, lname)
    self.graduationyear = year

x = Student("Mike", "Olsen", 2019)

Add Methods

Example

Add a method called welcome to the Student class:

class Student(Person):
  def __init__(self, fname, lname, year):
    super().__init__(fname, lname)
    self.graduationyear = year

  def welcome(self):
    print("Welcome", self.firstname, self.lastname, "to the class of", self.graduationyear)

If you add a method in the child class with the same name as a function in the parent class, the inheritance of the parent method will be overridden.


Test Yourself With Exercises

Exercise:

What is the correct syntax to create a class named Student that will inherit properties and methods from a class named Person?

class :


Python Iterators


Python Iterators

An iterator is an object that contains a countable number of values.

An iterator is an object that can be iterated upon, meaning that you can traverse through all the values.

Technically, in Python, an iterator is an object which implements the iterator protocol, which consist of the methods __iter__() and __next__().


Iterator vs Iterable

Lists, tuples, dictionaries, and sets are all iterable objects. They are iterable containers which you can get an iterator from.

All these objects have a iter() method which is used to get an iterator:

Example

Return an iterator from a tuple, and print each value:

mytuple = ("apple", "banana", "cherry")
myit = iter(mytuple)

print(next(myit))
print(next(myit))
print(next(myit))

Even strings are iterable objects, and can return an iterator:

Example

Strings are also iterable objects, containing a sequence of characters:

mystr = "banana"
myit = iter(mystr)

print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))
print(next(myit))

Looping Through an Iterator

We can also use a for loop to iterate through an iterable object:

Example

Iterate the values of a tuple:

mytuple = ("apple", "banana", "cherry")

for x in mytuple:
  print(x)

Example

Iterate the characters of a string:

mystr = "banana"

for x in mystr:
  print(x)

The for loop actually creates an iterator object and executes the next() method for each loop.



Create an Iterator

To create an object/class as an iterator you have to implement the methods __iter__() and __next__() to your object.

As you have learned in the chapter, all classes have a function called __init__(), which allows you to do some initializing when the object is being created.

The __iter__() method acts similar, you can do operations (initializing etc.), but must always return the iterator object itself.

The __next__() method also allows you to do operations, and must return the next item in the sequence.

Example

Create an iterator that returns numbers, starting with 1, and each sequence will increase by one (returning 1,2,3,4,5 etc.):

class MyNumbers:
  def __iter__(self):
    self.a = 1
    return self

  def __next__(self):
    x = self.a
    self.a += 1
    return x

myclass = MyNumbers()
myiter = iter(myclass)

print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))

StopIteration

The example above would continue forever if you had enough next() statements, or if it was used in a for loop.

To prevent the iteration from going on forever, we can use the StopIteration statement.

In the __next__() method, we can add a terminating condition to raise an error if the iteration is done a specified number of times:

Example

Stop after 20 iterations:

class MyNumbers:
  def __iter__(self):
    self.a = 1
    return self

  def __next__(self):
    if self.a <= 20:
      x = self.a
      self.a += 1
      return x
    else:
      raise StopIteration

myclass = MyNumbers()
myiter = iter(myclass)

for x in myiter:
  print(x)


Python Polymorphism


The word "polymorphism" means "many forms", and in programming it refers to methods/functions/operators with the same name that can be executed on many objects or classes.


Function Polymorphism

An example of a Python function that can be used on different objects is the len() function.

String

For strings len() returns the number of characters:

Example

x = "Hello World!"

print(len(x))

Tuple

For tuples len() returns the number of items in the tuple:

Example

mytuple = ("apple", "banana", "cherry")

print(len(mytuple))

Dictionary

For dictionaries len() returns the number of key/value pairs in the dictionary:

Example

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}

print(len(thisdict))


Class Polymorphism

Polymorphism is often used in Class methods, where we can have multiple classes with the same method name.

For example, say we have three classes: Car, Boat, and Plane, and they all have a method called move():

Example

Different classes with the same method:

class Car:
  def __init__(self, brand, model):
    self.brand = brand
    self.model = model

  def move(self):
    print("Drive!")

class Boat:
  def __init__(self, brand, model):
    self.brand = brand
    self.model = model

  def move(self):
    print("Sail!")

class Plane:
  def __init__(self, brand, model):
    self.brand = brand
    self.model = model

  def move(self):
    print("Fly!")

car1 = Car("Ford", "Mustang")       #Create a Car class
boat1 = Boat("Ibiza", "Touring 20") #Create a Boat class
plane1 = Plane("Boeing", "747")     #Create a Plane class

for x in (car1, boat1, plane1):
  x.move()

Look at the for loop at the end. Because of polymorphism we can execute the same method for all three classes.


Inheritance Class Polymorphism

What about classes with child classes with the same name? Can we use polymorphism there?

Yes. If we use the example above and make a parent class called Vehicle, and make Car, Boat, Plane child classes of Vehicle, the child classes inherits the Vehicle methods, but can override them:

Example

Create a class called Vehicle and make Car, Boat, Plane child classes of Vehicle:

class Vehicle:
  def __init__(self, brand, model):
    self.brand = brand
    self.model = model

  def move(self):
    print("Move!")

class Car(Vehicle):
  pass

class Boat(Vehicle):
  def move(self):
    print("Sail!")

class Plane(Vehicle):
  def move(self):
    print("Fly!")

car1 = Car("Ford", "Mustang") #Create a Car object
boat1 = Boat("Ibiza", "Touring 20") #Create a Boat object
plane1 = Plane("Boeing", "747") #Create a Plane object

for x in (car1, boat1, plane1):
  print(x.brand)
  print(x.model)
  x.move()

Child classes inherits the properties and methods from the parent class.

In the example above you can see that the Car class is empty, but it inherits brand, model, and move() from Vehicle.

The Boat and Plane classes also inherit brand, model, and move() from Vehicle, but they both override the move() method.

Because of polymorphism we can execute the same method for all classes.


Python Scope


A variable is only available from inside the region it is created. This is called scope.


Local Scope

A variable created inside a function belongs to the local scope of that function, and can only be used inside that function.

Example

A variable created inside a function is available inside that function:

def myfunc():
  x = 300
  print(x)

myfunc()

Function Inside Function

As explained in the example above, the variable x is not available outside the function, but it is available for any function inside the function:

Example

The local variable can be accessed from a function within the function:

def myfunc():
  x = 300
  def myinnerfunc():
    print(x)
  myinnerfunc()

myfunc()


Global Scope

A variable created in the main body of the Python code is a global variable and belongs to the global scope.

Global variables are available from within any scope, global and local.

Example

A variable created outside of a function is global and can be used by anyone:

x = 300

def myfunc():
  print(x)

myfunc()

print(x)

Naming Variables

If you operate with the same variable name inside and outside of a function, Python will treat them as two separate variables, one available in the global scope (outside the function) and one available in the local scope (inside the function):

Example

The function will print the local x, and then the code will print the global x:

x = 300

def myfunc():
  x = 200
  print(x)

myfunc()

print(x)

Global Keyword

If you need to create a global variable, but are stuck in the local scope, you can use the global keyword.

The global keyword makes the variable global.

Example

If you use the global keyword, the variable belongs to the global scope:

def myfunc():
  global x
  x = 300

myfunc()

print(x)

Also, use the global keyword if you want to make a change to a global variable inside a function.

Example

To change the value of a global variable inside a function, refer to the variable by using the global keyword:

x = 300

def myfunc():
  global x
  x = 200

myfunc()

print(x)

Python Modules


What is a Module?

Consider a module to be the same as a code library.

A file containing a set of functions you want to include in your application.


Create a Module

To create a module just save the code you want in a file with the file extension .py:

Example

Save this code in a file named mymodule.py

def greeting(name):
  print("Hello, " + name)

Use a Module

Now we can use the module we just created, by using the import statement:

Example

Import the module named mymodule, and call the greeting function:

import mymodule

mymodule.greeting("Jonathan")

Note: When using a function from a module, use the syntax: module_name.function_name.


Variables in Module

The module can contain functions, as already described, but also variables of all types (arrays, dictionaries, objects etc):

Example

Save this code in the file mymodule.py

person1 = {
  "name": "John",
  "age": 36,
  "country": "Norway"
}

Example

Import the module named mymodule, and access the person1 dictionary:

import mymodule

a = mymodule.person1["age"]
print(a)


Naming a Module

You can name the module file whatever you like, but it must have the file extension .py

Re-naming a Module

You can create an alias when you import a module, by using the as keyword:

Example

Create an alias for mymodule called mx:

import mymodule as mx

a = mx.person1["age"]
print(a)

Built-in Modules

There are several built-in modules in Python, which you can import whenever you like.

Example

Import and use the platform module:

import platform

x = platform.system()
print(x)

Using the dir() Function

There is a built-in function to list all the function names (or variable names) in a module. The dir() function:

Example

List all the defined names belonging to the platform module:

import platform

x = dir(platform)
print(x)

Note: The dir() function can be used on all modules, also the ones you create yourself.


Import From Module

You can choose to import only parts from a module, by using the from keyword.

Example

The module named mymodule has one function and one dictionary:

def greeting(name):
  print("Hello, " + name)

person1 = {
  "name": "John",
  "age": 36,
  "country": "Norway"
}

Example

Import only the person1 dictionary from the module:

from mymodule import person1

print (person1["age"])

Note: When importing using the from keyword, do not use the module name when referring to elements in the module. Example: person1["age"], not mymodule.person1["age"]


Test Yourself With Exercises

Exercise:

What is the correct syntax to import a module named "mymodule"?

 mymodule


Python Datetime


Python Dates

A date in Python is not a data type of its own, but we can import a module named datetime to work with dates as date objects.

Example

Import the datetime module and display the current date:

import datetime

x = datetime.datetime.now()
print(x)

Date Output

When we execute the code from the example above the result will be:

The date contains year, month, day, hour, minute, second, and microsecond.

The datetime module has many methods to return information about the date object.

Here are a few examples, you will learn more about them later in this chapter:

Example

Return the year and name of weekday:

import datetime

x = datetime.datetime.now()

print(x.year)
print(x.strftime("%A"))

Creating Date Objects

To create a date, we can use the datetime() class (constructor) of the datetime module.

The datetime() class requires three parameters to create a date: year, month, day.

Example

Create a date object:

import datetime

x = datetime.datetime(2020, 5, 17)

print(x)

The datetime() class also takes parameters for time and timezone (hour, minute, second, microsecond, tzone), but they are optional, and has a default value of 0, (None for timezone).



The strftime() Method

The datetime object has a method for formatting date objects into readable strings.

The method is called strftime(), and takes one parameter, format, to specify the format of the returned string:

Example

Display the name of the month:

import datetime

x = datetime.datetime(2018, 6, 1)

print(x.strftime("%B"))

A reference of all the legal format codes:

Directive Description Example Try it
%a Weekday, short version Wed
%A Weekday, full version Wednesday
%w Weekday as a number 0-6, 0 is Sunday 3
%d Day of month 01-31 31
%b Month name, short version Dec
%B Month name, full version December
%m Month as a number 01-12 12
%y Year, short version, without century 18
%Y Year, full version 2018
%H Hour 00-23 17
%I Hour 00-12 05
%p AM/PM PM
%M Minute 00-59 41
%S Second 00-59 08
%f Microsecond 000000-999999 548513
%z UTC offset +0100
%Z Timezone CST
%j Day number of year 001-366 365
%U Week number of year, Sunday as the first day of week, 00-53 52
%W Week number of year, Monday as the first day of week, 00-53 52
%c Local version of date and time Mon Dec 31 17:41:00 2018
%C Century 20
%x Local version of date 12/31/18
%X Local version of time 17:41:00
%% A % character %
%G ISO 8601 year 2018
%u ISO 8601 weekday (1-7) 1
%V ISO 8601 weeknumber (01-53) 01

Python Math


Python has a set of built-in math functions, including an extensive math module, that allows you to perform mathematical tasks on numbers.


Built-in Math Functions

The min() and max() functions can be used to find the lowest or highest value in an iterable:

Example

x = min(5, 10, 25)
y = max(5, 10, 25)

print(x)
print(y)

The abs() function returns the absolute (positive) value of the specified number:

Example

x = abs(-7.25)

print(x)

The pow(x, y) function returns the value of x to the power of y (xy).

Example

Return the value of 4 to the power of 3 (same as 4 * 4 * 4):

x = pow(4, 3)

print(x)


The Math Module

Python has also a built-in module called math, which extends the list of mathematical functions.

To use it, you must import the math module:

import math

When you have imported the math module, you can start using methods and constants of the module.

The math.sqrt() method for example, returns the square root of a number:

Example

import math

x = math.sqrt(64)

print(x)

The math.ceil() method rounds a number upwards to its nearest integer, and the math.floor() method rounds a number downwards to its nearest integer, and returns the result:

Example

import math

x = math.ceil(1.4)
y = math.floor(1.4)

print(x) # returns 2
print(y) # returns 1

The math.pi constant, returns the value of PI (3.14...):

Example

import math

x = math.pi

print(x)

Complete Math Module Reference

In our you will find a complete reference of all methods and constants that belongs to the Math module.


Python JSON


JSON is a syntax for storing and exchanging data.

JSON is text, written with JavaScript object notation.


JSON in Python

Python has a built-in package called json, which can be used to work with JSON data.

Example

Import the json module:

import json

Parse JSON - Convert from JSON to Python

If you have a JSON string, you can parse it by using the json.loads() method.

The result will be a .

Example

Convert from JSON to Python:

import json

# some JSON:
x =  '{ "name":"John", "age":30, "city":"New York"}'

# parse x:
y = json.loads(x)

# the result is a Python dictionary:
print(y["age"])

Convert from Python to JSON

If you have a Python object, you can convert it into a JSON string by using the json.dumps() method.

Example

Convert from Python to JSON:

import json

# a Python object (dict):
x = {
  "name": "John",
  "age": 30,
  "city": "New York"
}

# convert into JSON:
y = json.dumps(x)

# the result is a JSON string:
print(y)


You can convert Python objects of the following types, into JSON strings:

  • dict
  • list
  • tuple
  • string
  • int
  • float
  • True
  • False
  • None

Example

Convert Python objects into JSON strings, and print the values:

import json

print(json.dumps({"name": "John", "age": 30}))
print(json.dumps(["apple", "bananas"]))
print(json.dumps(("apple", "bananas")))
print(json.dumps("hello"))
print(json.dumps(42))
print(json.dumps(31.76))
print(json.dumps(True))
print(json.dumps(False))
print(json.dumps(None))

When you convert from Python to JSON, Python objects are converted into the JSON (JavaScript) equivalent:

Python JSON
dict Object
list Array
tuple Array
str String
int Number
float Number
True true
False false
None null

Example

Convert a Python object containing all the legal data types:

import json

x = {
  "name": "John",
  "age": 30,
  "married": True,
  "divorced": False,
  "children": ("Ann","Billy"),
  "pets": None,
  "cars": [
    {"model": "BMW 230", "mpg": 27.5},
    {"model": "Ford Edge", "mpg": 24.1}
  ]
}

print(json.dumps(x))

Format the Result

The example above prints a JSON string, but it is not very easy to read, with no indentations and line breaks.

The json.dumps() method has parameters to make it easier to read the result:

Example

Use the indent parameter to define the numbers of indents:

json.dumps(x, indent=4)

You can also define the separators, default value is (", ", ": "), which means using a comma and a space to separate each object, and a colon and a space to separate keys from values:

Example

Use the separators parameter to change the default separator:

json.dumps(x, indent=4, separators=(". ", " = "))

Order the Result

The json.dumps() method has parameters to order the keys in the result:

Example

Use the sort_keys parameter to specify if the result should be sorted or not:

json.dumps(x, indent=4, sort_keys=True)

Python RegEx


A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check if a string contains the specified search pattern.


RegEx Module

Python has a built-in package called re, which can be used to work with Regular Expressions.

Import the re module:

import re

RegEx in Python

When you have imported the re module, you can start using regular expressions:

Example

Search the string to see if it starts with "The" and ends with "Spain":

import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

RegEx Functions

The re module offers a set of functions that allows us to search a string for a match:

Function Description
Returns a list containing all matches
Returns a if there is a match anywhere in the string
Returns a list where the string has been split at each match
Replaces one or many matches with a string


Metacharacters

Metacharacters are characters with a special meaning:

Character Description Example Try it
[] A set of characters "[a-m]"
Signals a special sequence (can also be used to escape special characters) "d"
. Any character (except newline character) "he..o"
^ Starts with "^hello"
$ Ends with "planet$"
* Zero or more occurrences "he.*o"
+ One or more occurrences "he.+o"
? Zero or one occurrences "he.?o"
{} Exactly the specified number of occurrences "he.{2}o"
| Either or "falls|stays"
() Capture and group    

Special Sequences

A special sequence is a followed by one of the characters in the list below, and has a special meaning:

Character Description Example Try it
A Returns a match if the specified characters are at the beginning of the string "AThe"
b Returns a match where the specified characters are at the beginning or at the end of a word
(the "r" in the beginning is making sure that the string is being treated as a "raw string")
r"bain"
r"ainb"

B Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word
(the "r" in the beginning is making sure that the string is being treated as a "raw string")
r"Bain"
r"ainB"

d Returns a match where the string contains digits (numbers from 0-9) "d"
D Returns a match where the string DOES NOT contain digits "D"
s Returns a match where the string contains a white space character "s"
S Returns a match where the string DOES NOT contain a white space character "S"
w Returns a match where the string contains any word characters (characters from a to Z, digits from 0-9, and the underscore _ character) "w"
W Returns a match where the string DOES NOT contain any word characters "W"
Z Returns a match if the specified characters are at the end of the string "SpainZ"

Sets

A set is a set of characters inside a pair of square brackets [] with a special meaning:

Set Description Try it
[arn] Returns a match where one of the specified characters (a, r, or n) is present
[a-n] Returns a match for any lower case character, alphabetically between a and n
[^arn] Returns a match for any character EXCEPT a, r, and n
[0123] Returns a match where any of the specified digits (0, 1, 2, or 3) are present
[0-9] Returns a match for any digit between 0 and 9
[0-5][0-9] Returns a match for any two-digit numbers from 00 and 59
[a-zA-Z] Returns a match for any character alphabetically between a and z, lower case OR upper case
[+] In sets, +, *, ., |, (), $,{} has no special meaning, so [+] means: return a match for any + character in the string

The findall() Function

The findall() function returns a list containing all matches.

Example

Print a list of all matches:

import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

The list contains the matches in the order they are found.

If no matches are found, an empty list is returned:

Example

Return an empty list if no match was found:

import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)

The search() Function

The search() function searches the string for a match, and returns a if there is a match.

If there is more than one match, only the first occurrence of the match will be returned:

Example

Search for the first white-space character in the string:

import re

txt = "The rain in Spain"
x = re.search("s", txt)

print("The first white-space character is located in position:", x.start())

If no matches are found, the value None is returned:

Example

Make a search that returns no match:

import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)

The split() Function

The split() function returns a list where the string has been split at each match:

Example

Split at each white-space character:

import re

txt = "The rain in Spain"
x = re.split("s", txt)
print(x)

You can control the number of occurrences by specifying the maxsplit parameter:

Example

Split the string only at the first occurrence:

import re

txt = "The rain in Spain"
x = re.split("s", txt, 1)
print(x)

The sub() Function

The sub() function replaces the matches with the text of your choice:

Example

Replace every white-space character with the number 9:

import re

txt = "The rain in Spain"
x = re.sub("s", "9", txt)
print(x)

You can control the number of replacements by specifying the count parameter:

Example

Replace the first 2 occurrences:

import re

txt = "The rain in Spain"
x = re.sub("s", "9", txt, 2)
print(x)

Match Object

A Match Object is an object containing information about the search and the result.

Note: If there is no match, the value None will be returned, instead of the Match Object.

Example

Do a search that will return a Match Object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x) #this will print an object

The Match object has properties and methods used to retrieve information about the search, and the result:

.span() returns a tuple containing the start-, and end positions of the match.
.string returns the string passed into the function
.group() returns the part of the string where there was a match

Example

Print the position (start- and end-position) of the first match occurrence.

The regular expression looks for any words that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"bSw+", txt)
print(x.span())

Example

Print the string passed into the function:

import re

txt = "The rain in Spain"
x = re.search(r"bSw+", txt)
print(x.string)

Example

Print the part of the string where there was a match.

The regular expression looks for any words that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"bSw+", txt)
print(x.group())

Note: If there is no match, the value None will be returned, instead of the Match Object.


Python PIP


What is PIP?

PIP is a package manager for Python packages, or modules if you like.

Note: If you have Python version 3.4 or later, PIP is included by default.


What is a Package?

A package contains all the files you need for a module.

Modules are Python code libraries you can include in your project.


Check if PIP is Installed

Navigate your command line to the location of Python's script directory, and type the following:

Example

Check PIP version:

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>pip --version

Install PIP

If you do not have PIP installed, you can download and install it from this page:


Download a Package

Downloading a package is very easy.

Open the command line interface and tell PIP to download the package you want.

Navigate your command line to the location of Python's script directory, and type the following:

Example

Download a package named "camelcase":

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>pip install camelcase

Now you have downloaded and installed your first package!



Using a Package

Once the package is installed, it is ready to use.

Import the "camelcase" package into your project.

Example

Import and use "camelcase":

import camelcase

c = camelcase.CamelCase()

txt = "hello world"

print(c.hump(txt))

Find Packages

Find more packages at .


Remove a Package

Use the uninstall command to remove a package:

Example

Uninstall the package named "camelcase":

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>pip uninstall camelcase

The PIP Package Manager will ask you to confirm that you want to remove the camelcase package:

Uninstalling camelcase-02.1:
  Would remove:
    c:usersYour Nameappdatalocalprogramspythonpython36-32libsite-packagescamelcase-0.2-py3.6.egg-info
    c:usersYour Nameappdatalocalprogramspythonpython36-32libsite-packagescamelcase*
Proceed (y/n)?

Press y and the package will be removed.


List Packages

Use the list command to list all the packages installed on your system:

Example

List installed packages:

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>pip list

Result:

Package         Version
-----------------------
camelcase       0.2
mysql-connector 2.1.6
pip             18.1
pymongo         3.6.1
setuptools      39.0.1

Python Try Except


The try block lets you test a block of code for errors.

The except block lets you handle the error.

The else block lets you execute code when there is no error.

The finally block lets you execute code, regardless of the result of the try- and except blocks.


Exception Handling

When an error occurs, or exception as we call it, Python will normally stop and generate an error message.

These exceptions can be handled using the try statement:

Example

The try block will generate an exception, because x is not defined:

try:
  print(x)
except:
  print("An exception occurred")

Since the try block raises an error, the except block will be executed.

Without the try block, the program will crash and raise an error:

Example

This statement will raise an error, because x is not defined:

print(x)

Many Exceptions

You can define as many exception blocks as you want, e.g. if you want to execute a special block of code for a special kind of error:

Example

Print one message if the try block raises a NameError and another for other errors:

try:
  print(x)
except NameError:
  print("Variable x is not defined")
except:
  print("Something else went wrong")


Else

You can use the else keyword to define a block of code to be executed if no errors were raised:

Example

In this example, the try block does not generate any error:

try:
  print("Hello")
except:
  print("Something went wrong")
else:
  print("Nothing went wrong")

Finally

The finally block, if specified, will be executed regardless if the try block raises an error or not.

Example

try:
  print(x)
except:
  print("Something went wrong")
finally:
  print("The 'try except' is finished")

This can be useful to close objects and clean up resources:

Example

Try to open and write to a file that is not writable:

try:
  f = open("demofile.txt")
  try:
    f.write("Lorum Ipsum")
  except:
    print("Something went wrong when writing to the file")
  finally:
    f.close()
except:
  print("Something went wrong when opening the file")

The program can continue, without leaving the file object open.


Raise an exception

As a Python developer you can choose to throw an exception if a condition occurs.

To throw (or raise) an exception, use the raise keyword.

Example

Raise an error and stop the program if x is lower than 0:

x = -1

if x < 0:
  raise Exception("Sorry, no numbers below zero")

The raise keyword is used to raise an exception.

You can define what kind of error to raise, and the text to print to the user.

Example

Raise a TypeError if x is not an integer:

x = "hello"

if not type(x) is int:
  raise TypeError("Only integers are allowed")

Python User Input


User Input

Python allows for user input.

That means we are able to ask the user for input.

The method is a bit different in Python 3.6 than Python 2.7.

Python 3.6 uses the input() method.

Python 2.7 uses the raw_input() method.

The following example asks for the username, and when you entered the username, it gets printed on the screen:

Python 3.6

username = input("Enter username:")
print("Username is: " + username)

Python 2.7

username = raw_input("Enter username:")
print("Username is: " + username)

Python stops executing when it comes to the input() function, and continues when the user has given some input.


Python String Formatting


To make sure a string will display as expected, we can format the result with the format() method.


String format()

The format() method allows you to format selected parts of a string.

Sometimes there are parts of a text that you do not control, maybe they come from a database, or user input?

To control such values, add placeholders (curly brackets {}) in the text, and run the values through the format() method:

Example

Add a placeholder where you want to display the price:

price = 49
txt = "The price is {} dollars"
print(txt.format(price))

You can add parameters inside the curly brackets to specify how to convert the value:

Example

Format the price to be displayed as a number with two decimals:

txt = "The price is {:.2f} dollars"

Check out all formatting types in our .


Multiple Values

If you want to use more values, just add more values to the format() method:

print(txt.format(price, itemno, count))

And add more placeholders:

Example

quantity = 3
itemno = 567
price = 49
myorder = "I want {} pieces of item number {} for {:.2f} dollars."
print(myorder.format(quantity, itemno, price))


Index Numbers

You can use index numbers (a number inside the curly brackets {0}) to be sure the values are placed in the correct placeholders:

Example

quantity = 3
itemno = 567
price = 49
myorder = "I want {0} pieces of item number {1} for {2:.2f} dollars."
print(myorder.format(quantity, itemno, price))

Also, if you want to refer to the same value more than once, use the index number:

Example

age = 36
name = "John"
txt = "His name is {1}. {1} is {0} years old."
print(txt.format(age, name))

Named Indexes

You can also use named indexes by entering a name inside the curly brackets {carname}, but then you must use names when you pass the parameter values txt.format(carname = "Ford"):

Example

myorder = "I have a {carname}, it is a {model}."
print(myorder.format(carname = "Ford", model = "Mustang"))

Python File Open


File handling is an important part of any web application.

Python has several functions for creating, reading, updating, and deleting files.


File Handling

The key function for working with files in Python is the open() function.

The open() function takes two parameters; filename, and mode.

There are four different methods (modes) for opening a file:

"r" - Read - Default value. Opens a file for reading, error if the file does not exist

"a" - Append - Opens a file for appending, creates the file if it does not exist

"w" - Write - Opens a file for writing, creates the file if it does not exist

"x" - Create - Creates the specified file, returns an error if the file exists

In addition you can specify if the file should be handled as binary or text mode

"t" - Text - Default value. Text mode

"b" - Binary - Binary mode (e.g. images)


Syntax

To open a file for reading it is enough to specify the name of the file:

f = open("demofile.txt")

The code above is the same as:

f = open("demofile.txt", "rt")

Because "r" for read, and "t" for text are the default values, you do not need to specify them.

Note: Make sure the file exists, or else you will get an error.


Python File Open


Open a File on the Server

Assume we have the following file, located in the same folder as Python:

demofile.txt

Hello! Welcome to demofile.txt
This file is for testing purposes.
Good Luck!

To open the file, use the built-in open() function.

The open() function returns a file object, which has a read() method for reading the content of the file:

Example

f = open("demofile.txt", "r")
print(f.read())

If the file is located in a different location, you will have to specify the file path, like this:

Example

Open a file on a different location:

f = open("D:\myfileswelcome.txt", "r")
print(f.read())

Read Only Parts of the File

By default the read() method returns the whole text, but you can also specify how many characters you want to return:

Example

Return the 5 first characters of the file:

f = open("demofile.txt", "r")
print(f.read(5))


Read Lines

You can return one line by using the readline() method:

Example

Read one line of the file:

f = open("demofile.txt", "r")
print(f.readline())

By calling readline() two times, you can read the two first lines:

Example

Read two lines of the file:

f = open("demofile.txt", "r")
print(f.readline())
print(f.readline())

By looping through the lines of the file, you can read the whole file, line by line:

Example

Loop through the file line by line:

f = open("demofile.txt", "r")
for x in f:
  print(x)

Close Files

It is a good practice to always close the file when you are done with it.

Example

Close the file when you are finish with it:

f = open("demofile.txt", "r")
print(f.readline())
f.close()

Note: You should always close your files, in some cases, due to buffering, changes made to a file may not show until you close the file.


Python File Write


Write to an Existing File

To write to an existing file, you must add a parameter to the open() function:

"a" - Append - will append to the end of the file

"w" - Write - will overwrite any existing content

Example

Open the file "demofile2.txt" and append content to the file:

f = open("demofile2.txt", "a")
f.write("Now the file has more content!")
f.close()

#open and read the file after the appending:
f = open("demofile2.txt", "r")
print(f.read())

Example

Open the file "demofile3.txt" and overwrite the content:

f = open("demofile3.txt", "w")
f.write("Woops! I have deleted the content!")
f.close()

#open and read the file after the overwriting:
f = open("demofile3.txt", "r")
print(f.read())

Note: the "w" method will overwrite the entire file.


Create a New File

To create a new file in Python, use the open() method, with one of the following parameters:

"x" - Create - will create a file, returns an error if the file exist

"a" - Append - will create a file if the specified file does not exist

"w" - Write - will create a file if the specified file does not exist

Example

Create a file called "myfile.txt":

f = open("myfile.txt", "x")

Result: a new empty file is created!

Example

Create a new file if it does not exist:

f = open("myfile.txt", "w")

Python Delete File


Delete a File

To delete a file, you must import the OS module, and run its os.remove() function:

Example

Remove the file "demofile.txt":

import os
os.remove("demofile.txt")

Check if File exist:

To avoid getting an error, you might want to check if the file exists before you try to delete it:

Example

Check if file exists, then delete it:

import os
if os.path.exists("demofile.txt"):
  os.remove("demofile.txt")
else:
  print("The file does not exist")

Delete Folder

To delete an entire folder, use the os.rmdir() method:

Example

Remove the folder "myfolder":

import os
os.rmdir("myfolder")

Note: You can only remove empty folders.


NumPy Tutorial

[+:

NumPy is a Python library.

NumPy is used for working with arrays.

NumPy is short for "Numerical Python".

Learning by Reading

We have created 43 tutorial pages for you to learn more about NumPy.

Starting with a basic introduction and ends up with creating and plotting random data sets, and working with NumPy functions:

Basic

 
 
 
 
 
 
 
 
 
 
 
 
 
 

Random

 
 
 
 
 
 
 
 
 
 
 
 
 
 

ufunc

 
 
 
 
 
 
 
 
 
 
 
 


Learning by Quiz Test

Test your NumPy skills with a quiz test.


Learning by Exercises

Pandas Tutorial

[+:

Pandas is a Python library.

Pandas is used to analyze data.

Learning by Reading

We have created 14 tutorial pages for you to learn more about Pandas.

Starting with a basic introduction and ends up with cleaning and plotting data:

Basic

 

 

 

 

 

 

Cleaning Data


 

 

 

 

Advanced


 


Learning by Quiz Test

Test your Pandas skills with a quiz test.


Learning by Exercises

SciPy Tutorial

[+:

SciPy is a scientific computation library that uses underneath.

SciPy stands for Scientific Python.

Learning by Reading

We have created 10 tutorial pages for you to learn the fundamentals of SciPy:

Basic SciPy

 

 

 

 

 

 

 

 

 


Learning by Quiz Test

Test your SciPy skills with a quiz test.


Learning by Exercises

Matplotlib Tutorial



What is Matplotlib?

Matplotlib is a low level graph plotting library in python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and we can use it freely.

Matplotlib is mostly written in python, a few segments are written in C, Objective-C and Javascript for Platform compatibility.


Where is the Matplotlib Codebase?

The source code for Matplotlib is located at this github repository


Matplotlib Getting Started


Installation of Matplotlib

If you have and already installed on a system, then installation of Matplotlib is very easy.

Install it using this command:

C:UsersYour Name>pip install matplotlib

If this command fails, then use a python distribution that already has Matplotlib installed,  like Anaconda, Spyder etc.


Import Matplotlib

Once Matplotlib is installed, import it in your applications by adding the import module statement:

import matplotlib

Now Matplotlib is imported and ready to use:


Checking Matplotlib Version

The version string is stored under __version__ attribute.

Example

import matplotlib

print(matplotlib.__version__)

Note: two underscore characters are used in __version__.


Matplotlib Pyplot


Pyplot

Most of the Matplotlib utilities lies under the pyplot submodule, and are usually imported under the plt alias:

import matplotlib.pyplot as plt

Now the Pyplot package can be referred to as plt.

Example

Draw a line in a diagram from position (0,0) to position (6,250):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([0, 6])
ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()

Result:

You will learn more about drawing (plotting) in the next chapters.


Matplotlib Plotting


Plotting x and y points

The plot() function is used to draw points (markers) in a diagram.

By default, the plot() function draws a line from point to point.

The function takes parameters for specifying points in the diagram.

Parameter 1 is an array containing the points on the x-axis.

Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays [1, 8] and [3, 10] to the plot function.

Example

Draw a line in a diagram from position (1, 3) to position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()

Result:

The x-axis is the horizontal axis.

The y-axis is the vertical axis.



Plotting Without Line

To plot only the markers, you can use shortcut string notation parameter 'o', which means 'rings'.

Example

Draw two points in the diagram, one at position (1, 3) and one in position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints, 'o')
plt.show()

Result:

You will learn more about markers in the next chapter.


Multiple Points

You can plot as many points as you like, just make sure you have the same number of points in both axis.

Example

Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and finally to position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()

Result:


Default X-Points

If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 (etc., depending on the length of the y-points.

So, if we take the same example as above, and leave out the x-points, the diagram will look like this:

Example

Plotting without x-points:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10, 5, 7])

plt.plot(ypoints)
plt.show()

Result:

The x-points in the example above are [0, 1, 2, 3, 4, 5].


Matplotlib Markers


Markers

You can use the keyword argument marker to emphasize each point with a specified marker:

Example

Mark each point with a circle:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o')
plt.show()

Result:

Example

Mark each point with a star:

...
plt.plot(ypoints, marker = '*')
...

Result:



Marker Reference

You can choose any of these markers:

Marker Description
'o' Circle
'*' Star
'.' Point
',' Pixel
'x' X
'X' X (filled)
'+' Plus
'P' Plus (filled)
's' Square
'D' Diamond
'd' Diamond (thin)
'p' Pentagon
'H' Hexagon
'h' Hexagon
'v' Triangle Down
'^' Triangle Up
'<' Triangle Left
'>' Triangle Right
'1' Tri Down
'2' Tri Up
'3' Tri Left
'4' Tri Right
'|' Vline
'_' Hline

Format Strings fmt

You can also use the shortcut string notation parameter to specify the marker.

This parameter is also called fmt, and is written with this syntax:

marker|line|color

Example

Mark each point with a circle:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, 'o:r')
plt.show()

Result:

The marker value can be anything from the Marker Reference above.

The line value can be one of the following:

Line Reference

Line Syntax Description
'-' Solid line
':' Dotted line
'--' Dashed line
'-.' Dashed/dotted line

Note: If you leave out the line value in the fmt parameter, no line will be plotted.

The short color value can be one of the following:

Color Reference

Color Syntax Description
'r' Red
'g' Green
'b' Blue
'c' Cyan
'm' Magenta
'y' Yellow
'k' Black
'w' White

Marker Size

You can use the keyword argument markersize or the shorter version, ms to set the size of the markers:

Example

Set the size of the markers to 20:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20)
plt.show()

Result:


Marker Color

You can use the keyword argument markeredgecolor or the shorter mec to set the color of the edge of the markers:

Example

Set the EDGE color to red:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')
plt.show()

Result:

You can use the keyword argument markerfacecolor or the shorter mfc to set the color inside the edge of the markers:

Example

Set the FACE color to red:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')
plt.show()

Result:

Use both the mec and mfc arguments to color the entire marker:

Example

Set the color of both the edge and the face to red:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r', mfc = 'r')
plt.show()

Result:

You can also use :

Example

Mark each point with a beautiful green color:

...
plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')
...

Result:

Or any of the .

Example

Mark each point with the color named "hotpink":

...
plt.plot(ypoints, marker = 'o', ms = 20, mec = 'hotpink', mfc = 'hotpink')
...

Result:


Matplotlib Line


Linestyle

You can use the keyword argument linestyle, or shorter ls, to change the style of the plotted line:

Example

Use a dotted line:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, linestyle = 'dotted')
plt.show()

Result:

Example

Use a dashed line:


plt.plot(ypoints, linestyle = 'dashed')

Result:



Shorter Syntax

The line style can be written in a shorter syntax:

linestyle can be written as ls.

dotted can be written as :.

dashed can be written as --.

Example

Shorter syntax:

plt.plot(ypoints, ls = ':')

Result:


Line Styles

You can choose any of these styles:

Style Or
'solid' (default) '-'
'dotted' ':'
'dashed' '--'
'dashdot' '-.'
'None' '' or ' '

Line Color

You can use the keyword argument color or the shorter c to set the color of the line:

Example

Set the line color to red:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, color = 'r')
plt.show()

Result:

You can also use :

Example

Plot with a beautiful green line:

...
plt.plot(ypoints, c = '#4CAF50')
...

Result:

Or any of the .

Example

Plot with the color named "hotpink":

...
plt.plot(ypoints, c = 'hotpink')
...

Result:


Line Width

You can use the keyword argument linewidth or the shorter lw to change the width of the line.

The value is a floating number, in points:

Example

Plot with a 20.5pt wide line:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, linewidth = '20.5')
plt.show()

Result:


Multiple Lines

You can plot as many lines as you like by simply adding more plt.plot() functions:

Example

Draw two lines by specifying a plt.plot() function for each line:

import matplotlib.pyplot as plt
import numpy as np

y1 = np.array([3, 8, 1, 10])
y2 = np.array([6, 2, 7, 11])

plt.plot(y1)
plt.plot(y2)

plt.show()

Result:

You can also plot many lines by adding the points for the x- and y-axis for each line in the same plt.plot() function.

(In the examples above we only specified the points on the y-axis, meaning that the points on the x-axis got the the default values (0, 1, 2, 3).)

The x- and y- values come in pairs:

Example

Draw two lines by specifiyng the x- and y-point values for both lines:

import matplotlib.pyplot as plt
import numpy as np

x1 = np.array([0, 1, 2, 3])
y1 = np.array([3, 8, 1, 10])
x2 = np.array([0, 1, 2, 3])
y2 = np.array([6, 2, 7, 11])

plt.plot(x1, y1, x2, y2)
plt.show()

Result:


Matplotlib Labels and Title


Create Labels for a Plot

With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.

Example

Add labels to the x- and y-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()

Result:


Create a Title for a Plot

With Pyplot, you can use the title() function to set a title for the plot.

Example

Add a plot title and labels for the x- and y-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()

Result:



Set Font Properties for Title and Labels

You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font properties for the title and labels.

Example

Set font properties for the title and labels:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

font1 = {'family':'serif','color':'blue','size':20}
font2 = {'family':'serif','color':'darkred','size':15}

plt.title("Sports Watch Data", fontdict = font1)
plt.xlabel("Average Pulse", fontdict = font2)
plt.ylabel("Calorie Burnage", fontdict = font2)

plt.plot(x, y)
plt.show()

Result:


Position the Title

You can use the loc parameter in title() to position the title.

Legal values are: 'left', 'right', and 'center'. Default value is 'center'.

Example

Position the title to the left:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data", loc = 'left')
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)
plt.show()

Result:


Matplotlib Adding Grid Lines


Add Grid Lines to a Plot

With Pyplot, you can use the grid() function to add grid lines to the plot.

Example

Add grid lines to the plot:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid()

plt.show()

Result:



Specify Which Grid Lines to Display

You can use the axis parameter in the grid() function to specify which grid lines to display.

Legal values are: 'x', 'y', and 'both'. Default value is 'both'.

Example

Display only grid lines for the x-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(axis = 'x')

plt.show()

Result:

Example

Display only grid lines for the y-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(axis = 'y')

plt.show()

Result:


Set Line Properties for the Grid

You can also set the line properties of the grid, like this: grid(color = 'color', linestyle = 'linestyle', linewidth = number).

Example

Set the line properties of the grid:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(color = 'green', linestyle = '--', linewidth = 0.5)

plt.show()

Result:


Matplotlib Subplot


Display Multiple Plots

With the subplot() function you can draw multiple plots in one figure:

Example

Draw 2 plots:

import matplotlib.pyplot as plt
import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x,y)

#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x,y)

plt.show()

Result:


The subplot() Function

The subplot() function takes three arguments that describes the layout of the figure.

The layout is organized in rows and columns, which are represented by the first and second argument.

The third argument represents the index of the current plot.

plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.

plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.

So, if we want a figure with 2 rows an 1 column (meaning that the two plots will be displayed on top of each other instead of side-by-side), we can write the syntax like this:

Example

Draw 2 plots on top of each other:

import matplotlib.pyplot as plt
import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(2, 1, 1)
plt.plot(x,y)

#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(2, 1, 2)
plt.plot(x,y)

plt.show()

Result:

You can draw as many plots you like on one figure, just descibe the number of rows, columns, and the index of the plot.

Example

Draw 6 plots:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(2, 3, 1)
plt.plot(x,y)

x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(2, 3, 2)
plt.plot(x,y)

x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(2, 3, 3)
plt.plot(x,y)

x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(2, 3, 4)
plt.plot(x,y)

x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(2, 3, 5)
plt.plot(x,y)

x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(2, 3, 6)
plt.plot(x,y)

plt.show()

Result:



Title

You can add a title to each plot with the title() function:

Example

2 plots, with titles:

import matplotlib.pyplot as plt
import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x,y)
plt.title("SALES")

#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.title("INCOME")

plt.show()

Result:


Super Title

You can add a title to the entire figure with the suptitle() function:

Example

Add a title for the entire figure:

import matplotlib.pyplot as plt
import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x,y)
plt.title("SALES")

#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.title("INCOME")

plt.suptitle("MY SHOP")
plt.show()

Result:


Matplotlib Scatter


Creating Scatter Plots

With Pyplot, you can use the scatter() function to draw a scatter plot.

The scatter() function plots one dot for each observation. It needs two arrays of the same length, one for the values of the x-axis, and one for values on the y-axis:

Example

A simple scatter plot:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])

plt.scatter(x, y)
plt.show()

Result:

The observation in the example above is the result of 13 cars passing by.

The X-axis shows how old the car is.

The Y-axis shows the speed of the car when it passes.

Are there any relationships between the observations?

It seems that the newer the car, the faster it drives, but that could be a coincidence, after all we only registered 13 cars.


Compare Plots

In the example above, there seems to be a relationship between speed and age, but what if we plot the observations from another day as well? Will the scatter plot tell us something else?

Example

Draw two plots on the same figure:

import matplotlib.pyplot as plt
import numpy as np

#day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)

#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)

plt.show()

Result:

Note: The two plots are plotted with two different colors, by default blue and orange, you will learn how to change colors later in this chapter.

By comparing the two plots, I think it is safe to say that they both gives us the same conclusion: the newer the car, the faster it drives.



Colors

You can set your own color for each scatter plot with the color or the c argument:

Example

Set your own color of the markers:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'hotpink')

x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')

plt.show()

Result:


Color Each Dot

You can even set a specific color for each dot by using an array of colors as value for the c argument:

Note: You cannot use the color argument for this, only the c argument.

Example

Set your own color of the markers:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])

plt.scatter(x, y, c=colors)

plt.show()

Result:


ColorMap

The Matplotlib module has a number of available colormaps.

A colormap is like a list of colors, where each color has a value that ranges from 0 to 100.

Here is an example of a colormap:

This colormap is called 'viridis' and as you can see it ranges from 0, which is a purple color, up to 100, which is a yellow color.

How to Use the ColorMap

You can specify the colormap with the keyword argument cmap with the value of the colormap, in this case 'viridis' which is one of the built-in colormaps available in Matplotlib.

In addition you have to create an array with values (from 0 to 100), one value for each point in the scatter plot:

Example

Create a color array, and specify a colormap in the scatter plot:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])

plt.scatter(x, y, c=colors, cmap='viridis')

plt.show()

Result:

You can include the colormap in the drawing by including the plt.colorbar() statement:

Example

Include the actual colormap:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])

plt.scatter(x, y, c=colors, cmap='viridis')

plt.colorbar()

plt.show()

Result:

Available ColorMaps

You can choose any of the built-in colormaps:

Name   Reverse
Accent   Accent_r
Blues   Blues_r
BrBG   BrBG_r
BuGn   BuGn_r
BuPu   BuPu_r
CMRmap   CMRmap_r
Dark2   Dark2_r
GnBu   GnBu_r
Greens   Greens_r
Greys   Greys_r
OrRd   OrRd_r
Oranges   Oranges_r
PRGn   PRGn_r
Paired   Paired_r
Pastel1   Pastel1_r
Pastel2   Pastel2_r
PiYG   PiYG_r
PuBu   PuBu_r
PuBuGn   PuBuGn_r
PuOr   PuOr_r
PuRd   PuRd_r
Purples   Purples_r
RdBu   RdBu_r
RdGy   RdGy_r
RdPu   RdPu_r
RdYlBu   RdYlBu_r
RdYlGn   RdYlGn_r
Reds   Reds_r
Set1   Set1_r
Set2   Set2_r
Set3   Set3_r
Spectral   Spectral_r
Wistia   Wistia_r
YlGn   YlGn_r
YlGnBu   YlGnBu_r
YlOrBr   YlOrBr_r
YlOrRd   YlOrRd_r
afmhot   afmhot_r
autumn   autumn_r
binary   binary_r
bone   bone_r
brg   brg_r
bwr   bwr_r
cividis   cividis_r
cool   cool_r
coolwarm   coolwarm_r
copper   copper_r
cubehelix   cubehelix_r
flag   flag_r
gist_earth   gist_earth_r
gist_gray   gist_gray_r
gist_heat   gist_heat_r
gist_ncar   gist_ncar_r
gist_rainbow   gist_rainbow_r
gist_stern   gist_stern_r
gist_yarg   gist_yarg_r
gnuplot   gnuplot_r
gnuplot2   gnuplot2_r
gray   gray_r
hot   hot_r
hsv   hsv_r
inferno   inferno_r
jet   jet_r
magma   magma_r
nipy_spectral   nipy_spectral_r
ocean   ocean_r
pink   pink_r
plasma   plasma_r
prism   prism_r
rainbow   rainbow_r
seismic   seismic_r
spring   spring_r
summer   summer_r
tab10   tab10_r
tab20   tab20_r
tab20b   tab20b_r
tab20c   tab20c_r
terrain   terrain_r
twilight   twilight_r
twilight_shifted   twilight_shifted_r
viridis   viridis_r
winter   winter_r

Size

You can change the size of the dots with the s argument.

Just like colors, make sure the array for sizes has the same length as the arrays for the x- and y-axis:

Example

Set your own size for the markers:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes)

plt.show()

Result:


Alpha

You can adjust the transparency of the dots with the alpha argument.

Just like colors, make sure the array for sizes has the same length as the arrays for the x- and y-axis:

Example

Set your own size for the markers:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes, alpha=0.5)

plt.show()

Result:


Combine Color Size and Alpha

You can combine a colormap with different sizes of the dots. This is best visualized if the dots are transparent:

Example

Create random arrays with 100 values for x-points, y-points, colors and sizes:

import matplotlib.pyplot as plt
import numpy as np

x = np.random.randint(100, size=(100))
y = np.random.randint(100, size=(100))
colors = np.random.randint(100, size=(100))
sizes = 10 * np.random.randint(100, size=(100))

plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='nipy_spectral')

plt.colorbar()

plt.show()

Result:


Matplotlib Bars


Creating Bars

With Pyplot, you can use the bar() function to draw bar graphs:

Example

Draw 4 bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x,y)
plt.show()

Result:

The bar() function takes arguments that describes the layout of the bars.

The categories and their values represented by the first and second argument as arrays.

Example

x = ["APPLES", "BANANAS"]
y = [400, 350]
plt.bar(x, y)



Horizontal Bars

If you want the bars to be displayed horizontally instead of vertically, use the barh() function:

Example

Draw 4 horizontal bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.barh(x, y)
plt.show()

Result:


Bar Color

The bar() and barh() take the keyword argument color to set the color of the bars:

Example

Draw 4 red bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y, color = "red")
plt.show()

Result:

Color Names

You can use any of the .

Example

Draw 4 "hot pink" bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y, color = "hotpink")
plt.show()

Result:

Color Hex

Or you can use :

Example

Draw 4 bars with a beautiful green color:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y, color = "#4CAF50")
plt.show()

Result:


Bar Width

The bar() takes the keyword argument width to set the width of the bars:

Example

Draw 4 very thin bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y, width = 0.1)
plt.show()

Result:

The default width value is 0.8

Note: For horizontal bars, use height instead of width.


Bar Height

The barh() takes the keyword argument height to set the height of the bars:

Example

Draw 4 very thin bars:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.barh(x, y, height = 0.1)
plt.show()

Result:

The default height value is 0.8


Matplotlib Histograms


Histogram

A histogram is a graph showing frequency distributions.

It is a graph showing the number of observations within each given interval.

Example: Say you ask for the height of 250 people, you might end up with a histogram like this:

You can read from the histogram that there are approximately:

2 people from 140 to 145cm
5 people from 145 to 150cm
15 people from 151 to 156cm
31 people from 157 to 162cm
46 people from 163 to 168cm
53 people from 168 to 173cm
45 people from 173 to 178cm
28 people from 179 to 184cm
21 people from 185 to 190cm
4 people from 190 to 195cm


Create Histogram

In Matplotlib, we use the hist() function to create histograms.

The hist() function will use an array of numbers to create a histogram, the array is sent into the function as an argument.

For simplicity we use NumPy to randomly generate an array with 250 values, where the values will concentrate around 170, and the standard deviation is 10. Learn more about in our .

Example

A Normal Data Distribution by NumPy:

import numpy as np

x = np.random.normal(170, 10, 250)

print(x)

Result:

This will generate a random result, and could look like this:


  [167.62255766 175.32495609 152.84661337 165.50264047 163.17457988
   162.29867872 172.83638413 168.67303667 164.57361342 180.81120541
   170.57782187 167.53075749 176.15356275 176.95378312 158.4125473
   187.8842668  159.03730075 166.69284332 160.73882029 152.22378865
   164.01255164 163.95288674 176.58146832 173.19849526 169.40206527
   166.88861903 149.90348576 148.39039643 177.90349066 166.72462233
   177.44776004 170.93335636 173.26312881 174.76534435 162.28791953
   166.77301551 160.53785202 170.67972019 159.11594186 165.36992993
   178.38979253 171.52158489 173.32636678 159.63894401 151.95735707
   175.71274153 165.00458544 164.80607211 177.50988211 149.28106703
   179.43586267 181.98365273 170.98196794 179.1093176  176.91855744
   168.32092784 162.33939782 165.18364866 160.52300507 174.14316386
   163.01947601 172.01767945 173.33491959 169.75842718 198.04834503
   192.82490521 164.54557943 206.36247244 165.47748898 195.26377975
   164.37569092 156.15175531 162.15564208 179.34100362 167.22138242
   147.23667125 162.86940215 167.84986671 172.99302505 166.77279814
   196.6137667  159.79012341 166.5840824  170.68645637 165.62204521
   174.5559345  165.0079216  187.92545129 166.86186393 179.78383824
   161.0973573  167.44890343 157.38075812 151.35412246 171.3107829
   162.57149341 182.49985133 163.24700057 168.72639903 169.05309467
   167.19232875 161.06405208 176.87667712 165.48750185 179.68799986
   158.7913483  170.22465411 182.66432721 173.5675715  176.85646836
   157.31299754 174.88959677 183.78323508 174.36814558 182.55474697
   180.03359793 180.53094948 161.09560099 172.29179934 161.22665588
   171.88382477 159.04626132 169.43886536 163.75793589 157.73710983
   174.68921523 176.19843414 167.39315397 181.17128255 174.2674597
   186.05053154 177.06516302 171.78523683 166.14875436 163.31607668
   174.01429569 194.98819875 169.75129209 164.25748789 180.25773528
   170.44784934 157.81966006 171.33315907 174.71390637 160.55423274
   163.92896899 177.29159542 168.30674234 165.42853878 176.46256226
   162.61719142 166.60810831 165.83648812 184.83238352 188.99833856
   161.3054697  175.30396693 175.28109026 171.54765201 162.08762813
   164.53011089 189.86213299 170.83784593 163.25869004 198.68079225
   166.95154328 152.03381334 152.25444225 149.75522816 161.79200594
   162.13535052 183.37298831 165.40405341 155.59224806 172.68678385
   179.35359654 174.19668349 163.46176882 168.26621173 162.97527574
   192.80170974 151.29673582 178.65251432 163.17266558 165.11172588
   183.11107905 169.69556831 166.35149789 178.74419135 166.28562032
   169.96465166 178.24368042 175.3035525  170.16496554 158.80682882
   187.10006553 178.90542991 171.65790645 183.19289193 168.17446717
   155.84544031 177.96091745 186.28887898 187.89867406 163.26716924
   169.71242393 152.9410412  158.68101969 171.12655559 178.1482624
   187.45272185 173.02872935 163.8047623  169.95676819 179.36887054
   157.01955088 185.58143864 170.19037101 157.221245   168.90639755
   178.7045601  168.64074373 172.37416382 165.61890535 163.40873027
   168.98683006 149.48186389 172.20815568 172.82947206 173.71584064
   189.42642762 172.79575803 177.00005573 169.24498561 171.55576698
   161.36400372 176.47928342 163.02642822 165.09656415 186.70951892
   153.27990317 165.59289527 180.34566865 189.19506385 183.10723435
   173.48070474 170.28701875 157.24642079 157.9096498  176.4248199 ]

The hist() function will read the array and produce a histogram:

Example

A simple histogram:

import matplotlib.pyplot as plt
import numpy as np

x = np.random.normal(170, 10, 250)

plt.hist(x)
plt.show() 

Result:


Matplotlib Pie Charts


Creating Pie Charts

With Pyplot, you can use the pie() function to draw pie charts:

Example

A simple pie chart:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show() 

Result:

As you can see the pie chart draws one piece (called a wedge) for each value in the array (in this case [35, 25, 25, 15]).

By default the plotting of the first wedge starts from the x-axis and moves counterclockwise:

Note: The size of each wedge is determined by comparing the value with all the other values, by using this formula:

The value divided by the sum of all values: x/sum(x)



Labels

Add labels to the pie chart with the label parameter.

The label parameter must be an array with one label for each wedge:

Example

A simple pie chart:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
plt.show() 

Result:


Start Angle

As mentioned the default start angle is at the x-axis, but you can change the start angle by specifying a startangle parameter.

The startangle parameter is defined with an angle in degrees, default angle is 0:

Example

Start the first wedge at 90 degrees:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels, startangle = 90)
plt.show() 

Result:


Explode

Maybe you want one of the wedges to stand out? The explode parameter allows you to do that.

The explode parameter, if specified, and not None, must be an array with one value for each wedge.

Each value represents how far from the center each wedge is displayed:

Example

Pull the "Apples" wedge 0.2 from the center of the pie:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]

plt.pie(y, labels = mylabels, explode = myexplode)
plt.show() 

Result:


Shadow

Add a shadow to the pie chart by setting the shadows parameter to True:

Example

Add a shadow:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]

plt.pie(y, labels = mylabels, explode = myexplode, shadow = True)
plt.show() 

Result:


Colors

You can set the color of each wedge with the colors parameter.

The colors parameter, if specified, must be an array with one value for each wedge:

Example

Specify a new color for each wedge:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
mycolors = ["black", "hotpink", "b", "#4CAF50"]

plt.pie(y, labels = mylabels, colors = mycolors)
plt.show() 

Result:

You can use , any of the , or one of these shortcuts:

'r' - Red
'g' - Green
'b' - Blue
'c' - Cyan
'm' - Magenta
'y' - Yellow
'k' - Black
'w' - White


Legend

To add a list of explanation for each wedge, use the legend() function:

Example

Add a legend:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
plt.legend()
plt.show() 

Result:

Legend With Header

To add a header to the legend, add the title parameter to the legend function.

Example

Add a legend with a header:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
plt.legend(title = "Four Fruits:")
plt.show() 

Result:


Machine Learning is making the computer learn from studying data and statistics.

Machine Learning is a step into the direction of artificial intelligence (AI).

Machine Learning is a program that analyses data and learns to predict the outcome.

Where To Start?

In this tutorial we will go back to mathematics and study statistics, and how to calculate important numbers based on data sets.

We will also learn how to use various Python modules to get the answers we need.

And we will learn how to make functions that are able to predict the outcome based on what we have learned.


Data Set

In the mind of a computer, a data set is any collection of data. It can be anything from an array to a complete database.

Example of an array:

[99,86,87,88,111,86,103,87,94,78,77,85,86]

Example of a database:

Carname Color Age Speed AutoPass
BMW red 5 99 Y
Volvo black 7 86 Y
VW gray 8 87 N
VW white 7 88 Y
Ford white 2 111 Y
VW white 17 86 Y
Tesla red 2 103 Y
BMW black 9 87 Y
Volvo gray 4 94 N
Ford white 11 78 N
Toyota gray 12 77 N
VW white 9 85 N
Toyota blue 6 86 Y

By looking at the array, we can guess that the average value is probably around 80 or 90, and we are also able to determine the highest value and the lowest value, but what else can we do?

And by looking at the database we can see that the most popular color is white, and the oldest car is 17 years, but what if we could predict if a car had an AutoPass, just by looking at the other values?

That is what Machine Learning is for! Analyzing data and predicting the outcome!

In Machine Learning it is common to work with very large data sets. In this tutorial we will try to make it as easy as possible to understand the different concepts of machine learning, and we will work with small easy-to-understand data sets.



Data Types

To analyze data, it is important to know what type of data we are dealing with.

We can split the data types into three main categories:

  • Numerical
  • Categorical
  • Ordinal

Numerical data are numbers, and can be split into two numerical categories:

  • Discrete Data
    - numbers that are limited to integers. Example: The number of cars passing by.
  • Continuous Data
    - numbers that are of infinite value. Example: The price of an item, or the size of an item

Categorical data are values that cannot be measured up against each other. Example: a color value, or any yes/no values.

Ordinal data are like categorical data, but can be measured up against each other. Example: school grades where A is better than B and so on.

By knowing the data type of your data source, you will be able to know what technique to use when analyzing them.

You will learn more about statistics and analyzing data in the next chapters.



Mean, Median, and Mode

What can we learn from looking at a group of numbers?

In Machine Learning (and in mathematics) there are often three values that interests us:

  • Mean - The average value
  • Median - The mid point value
  • Mode - The most common value

Example: We have registered the speed of 13 cars:

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

What is the average, the middle, or the most common speed value?


Mean

The mean value is the average value.

To calculate the mean, find the sum of all values, and divide the sum by the number of values:

(99+86+87+88+111+86+103+87+94+78+77+85+86) / 13 = 89.77

The NumPy module has a method for this. Learn about the NumPy module in our .

Example

Use the NumPy mean() method to find the average speed:

import numpy

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

x = numpy.mean(speed)

print(x)


Median

The median value is the value in the middle, after you have sorted all the values:

77, 78, 85, 86, 86, 86, 87, 87, 88, 94, 99, 103, 111

It is important that the numbers are sorted before you can find the median.

The NumPy module has a method for this:

Example

Use the NumPy median() method to find the middle value:

import numpy

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

x = numpy.median(speed)

print(x)

If there are two numbers in the middle, divide the sum of those numbers by two.

77, 78, 85, 86, 86, 86, 87, 87, 94, 98, 99, 103

(86 + 87) / 2 =
86.5

Example

Using the NumPy module:

import numpy

speed = [99,86,87,88,86,103,87,94,78,77,85,86]

x = numpy.median(speed)

print(x)

Mode

The Mode value is the value that appears the most number of times:

99,86, 87, 88, 111,86, 103, 87, 94, 78, 77, 85,86 = 86

The SciPy module has a method for this. Learn about the SciPy module in our .

Example

Use the SciPy mode() method to find the number that appears the most:

from scipy import stats

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

x = stats.mode(speed)

print(x)

Chapter Summary

The Mean, Median, and Mode are techniques that are often used in Machine Learning, so it is important to understand the concept behind them.



What is Standard Deviation?

Standard deviation is a number that describes how spread out the values are.

A low standard deviation means that most of the numbers are close to the mean (average) value.

A high standard deviation means that the values are spread out over a wider range.

Example: This time we have registered the speed of 7 cars:

speed = [86,87,88,86,87,85,86]

The standard deviation is:

0.9

Meaning that most of the values are within the range of 0.9 from the mean value, which is 86.4.

Let us do the same with a selection of numbers with a wider range:

speed = [32,111,138,28,59,77,97]

The standard deviation is:

37.85

Meaning that most of the values are within the range of 37.85 from the mean value, which is 77.4.

As you can see, a higher standard deviation indicates that the values are spread out over a wider range.

The NumPy module has a method to calculate the standard deviation:

Example

Use the NumPy std() method to find the standard deviation:

import numpy

speed = [86,87,88,86,87,85,86]

x = numpy.std(speed)

print(x)

Example

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.std(speed)

print(x)

Learn to Filter Data in Python Like a Data Analyst

Try a hands-on training sessions with step-by-step guidance from an expert. Try the guided project made in collaboration with Coursera now!


Variance

Variance is another number that indicates how spread out the values are.

In fact, if you take the square root of the variance, you get the standard deviation!

Or the other way around, if you multiply the standard deviation by itself, you get the variance!

To calculate the variance you have to do as follows:

1. Find the mean:

(32+111+138+28+59+77+97) / 7 = 77.4

2. For each value: find the difference from the mean:

 32 - 77.4 = -45.4
111 - 77.4 =  33.6
138 - 77.4 =  60.6
 28 - 77.4 = -49.4
 59 - 77.4 = -18.4
 77 - 77.4 = - 0.4
 97 - 77.4 =  19.6

3. For each difference: find the square value:

(-45.4)2 = 2061.16
 (33.6)2 = 1128.96
 (60.6)2 = 3672.36
(-49.4)2 = 2440.36
(-18.4)2 =  338.56
(- 0.4)2 =    0.16
 (19.6)2 =  384.16

4. The variance is the average number of these squared differences:

(2061.16+1128.96+3672.36+2440.36+338.56+0.16+384.16) / 7 = 1432.2

Luckily, NumPy has a method to calculate the variance:

Example

Use the NumPy var() method to find the variance:

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.var(speed)

print(x)

Standard Deviation

As we have learned, the formula to find the standard deviation is the square root of the variance:

√1432.25 = 37.85

Or, as in the example from before, use the NumPy to calculate the standard deviation:

Example

Use the NumPy std() method to find the standard deviation:

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.std(speed)

print(x)

Symbols

Standard Deviation is often represented by the symbol Sigma: σ

Variance is often represented by the symbol Sigma Squared: σ2


Chapter Summary

The Standard Deviation and Variance are terms that are often used in Machine Learning, so it is important to understand how to get them, and the concept behind them.



What are Percentiles?

Percentiles are used in statistics to give you a number that describes the value that a given percent of the values are lower than.

Example: Let's say we have an array of the ages of all the people that live in a street.

ages = [5,31,43,48,50,41,7,11,15,39,80,82,32,2,8,6,25,36,27,61,31]

What is the 75. percentile? The answer is 43, meaning that 75% of the people are 43 or younger.

The NumPy module has a method for finding the specified percentile:

Example

Use the NumPy percentile() method to find the percentiles:

import numpy

ages = [5,31,43,48,50,41,7,11,15,39,80,82,32,2,8,6,25,36,27,61,31]

x = numpy.percentile(ages, 75)

print(x)

Example

What is the age that 90% of the people are younger than?

import numpy

ages = [5,31,43,48,50,41,7,11,15,39,80,82,32,2,8,6,25,36,27,61,31]

x = numpy.percentile(ages, 90)

print(x)


Data Distribution

Earlier in this tutorial we have worked with very small amounts of data in our examples, just to understand the different concepts.

In the real world, the data sets are much bigger, but it can be difficult to gather real world data, at least at an early stage of a project.

How Can we Get Big Data Sets?

To create big data sets for testing, we use the Python module NumPy, which comes with a number of methods to create random data sets, of any size.

Example

Create an array containing 250 random floats between 0 and 5:

import numpy

x = numpy.random.uniform(0.0, 5.0, 250)

print(x)

Histogram

To visualize the data set we can draw a histogram with the data we collected.

We will use the Python module Matplotlib to draw a histogram.

Learn about the Matplotlib module in our .

Example

Draw a histogram:

import numpy
import matplotlib.pyplot as plt

x = numpy.random.uniform(0.0, 5.0, 250)

plt.hist(x, 5)
plt.show()

Result:

Histogram Explained

We use the array from the example above to draw a histogram with 5 bars.

The first bar represents how many values in the array are between 0 and 1.

The second bar represents how many values are between 1 and 2.

Etc.

Which gives us this result:

  • 52 values are between 0 and 1
  • 48 values are between 1 and 2
  • 49 values are between 2 and 3
  • 51 values are between 3 and 4
  • 50 values are between 4 and 5

Note: The array values are random numbers and will not show the exact same result on your computer.

Big Data Distributions

An array containing 250 values is not considered very big, but now you know how to create a random set of values, and by changing the parameters, you can create the data set as big as you want.

Example

Create an array with 100000 random numbers, and display them using a histogram with 100 bars:

import numpy
import matplotlib.pyplot as plt

x = numpy.random.uniform(0.0, 5.0, 100000)

plt.hist(x, 100)
plt.show()


Normal Data Distribution

In the previous chapter we learned how to create a completely random array, of a given size, and between two given values.

In this chapter we will learn how to create an array where the values are concentrated around a given value.

In probability theory this kind of data distribution is known as the normal data distribution, or the Gaussian data distribution, after the mathematician Carl Friedrich Gauss who came up with the formula of this data distribution.

Example

A typical normal data distribution:

import numpy
import matplotlib.pyplot as plt

x = numpy.random.normal(5.0, 1.0, 100000)

plt.hist(x, 100)
plt.show()

Result:

Note: A normal distribution graph is also known as the bell curve because of it's characteristic shape of a bell.

Histogram Explained

We use the array from the numpy.random.normal() method, with 100000 values,  to draw a histogram with 100 bars.

We specify that the mean value is 5.0, and the standard deviation is 1.0.

Meaning that the values should be concentrated around 5.0, and rarely further away than 1.0 from the mean.

And as you can see from the histogram, most values are between 4.0 and 6.0, with a top at approximately 5.0.



Scatter Plot

A scatter plot is a diagram where each value in the data set is represented by a dot.

The Matplotlib module has a method for drawing scatter plots, it needs two arrays of the same length, one for the values of the x-axis, and one for the values of the y-axis:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]

y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

The x array represents the age of each car.

The y array represents the speed of each car.

Example

Use the scatter() method to draw a scatter plot diagram:

import matplotlib.pyplot as plt

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

plt.scatter(x, y)
plt.show()

Result:

Scatter Plot Explained

The x-axis represents ages, and the y-axis represents speeds.

What we can read from the diagram is that the two fastest cars were both 2 years old, and the slowest car was 12 years old.

Note: It seems that the newer the car, the faster it drives, but that could be a coincidence, after all we only registered 13 cars.



Random Data Distributions

In Machine Learning the data sets can contain thousands-, or even millions, of values.

You might not have real world data when you are testing an algorithm, you might have to use randomly generated values.

As we have learned in the previous chapter, the NumPy module can help us with that!

Let us create two arrays that are both filled with 1000 random numbers from a normal data distribution.

The first array will have the mean set to 5.0 with a standard deviation of 1.0.

The second array will have the mean set to 10.0 with a standard deviation of 2.0:

Example

A scatter plot with 1000 dots:

import numpy
import matplotlib.pyplot as plt

x = numpy.random.normal(5.0, 1.0, 1000)
y = numpy.random.normal(10.0, 2.0, 1000)

plt.scatter(x, y)
plt.show()

Result:

Scatter Plot Explained

We can see that the dots are concentrated around the value 5 on the x-axis, and 10 on the y-axis.

We can also see that the spread is wider on the y-axis than on the x-axis.



Regression

The term regression is used when you try to find the relationship between variables.

In Machine Learning, and in statistical modeling, that relationship is used to predict the outcome of future events.


Linear Regression

Linear regression uses the relationship between the data-points to draw a straight line through all them.

This line can be used to predict future values.

In Machine Learning, predicting the future is very important.


How Does it Work?

Python has methods for finding a relationship between data-points and to draw a line of linear regression. We will show you how to use these methods instead of going through the mathematic formula.

In the example below, the x-axis represents age, and the y-axis represents speed. We have registered the age and speed of 13 cars as they were passing a tollbooth. Let us see if the data we collected could be used in a linear regression:

Example

Start by drawing a scatter plot:

import matplotlib.pyplot as plt

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

plt.scatter(x, y)
plt.show()

Result:

Example

Import scipy and draw the line of Linear Regression:

import matplotlib.pyplot as plt
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
  return slope * x + intercept

mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()

Result:

Example Explained

Import the modules you need.

You can learn about the Matplotlib module in our .

You can learn about the SciPy module in our .

import matplotlib.pyplot as plt
from scipy import stats

Create the arrays that represent the values of the x and y axis:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

Execute a method that returns some important key values of Linear Regression:

slope, intercept, r, p, std_err = stats.linregress(x, y)

Create a function that uses the slope and intercept values to return a new value. This new value represents where on the y-axis the corresponding x value will be placed:

def myfunc(x):
  return slope * x + intercept

Run each value of the x array through the function. This will result in a new array with new values for the y-axis:

mymodel = list(map(myfunc, x))

Draw the original scatter plot:

plt.scatter(x, y)

Draw the line of linear regression:

plt.plot(x, mymodel)

Display the diagram:

plt.show()



R for Relationship

It is important to know how the relationship between the values of the x-axis and the values of the y-axis is, if there are no relationship the linear regression can not be used to predict anything.

This relationship - the coefficient of correlation - is called r.

The r value ranges from -1 to 1, where 0 means no relationship, and 1 (and -1) means 100% related.

Python and the Scipy module will compute this value for you, all you have to do is feed it with the x and y values.

Example

How well does my data fit in a linear regression?

from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

print(r)

Note: The result -0.76 shows that there is a relationship, not perfect, but it indicates that we could use linear regression in future predictions.


Predict Future Values

Now we can use the information we have gathered to predict future values.

Example: Let us try to predict the speed of a 10 years old car.

To do so, we need the same myfunc() function from the example above:

def myfunc(x):
  return slope * x + intercept

Example

Predict the speed of a 10 years old car:

from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
  return slope * x + intercept

speed = myfunc(10)

print(speed)

The example predicted a speed at 85.6, which we also could read from the diagram:


Bad Fit?

Let us create an example where linear regression would not be the best method to predict future values.

Example

These values for the x- and y-axis should result in a very bad fit for linear regression:

import matplotlib.pyplot as plt
from scipy import stats

x = [89,43,36,36,95,10,66,34,38,20,26,29,48,64,6,5,36,66,72,40]
y = [21,46,3,35,67,95,53,72,58,10,26,34,90,33,38,20,56,2,47,15]

slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
  return slope * x + intercept

mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()

Result:

And the r for relationship?

Example

You should get a very low r value.

import numpy
from scipy import stats

x = [89,43,36,36,95,10,66,34,38,20,26,29,48,64,6,5,36,66,72,40]
y = [21,46,3,35,67,95,53,72,58,10,26,34,90,33,38,20,56,2,47,15]

slope, intercept, r, p, std_err = stats.linregress(x, y)

print(r)

The result: 0.013 indicates a very bad relationship, and tells us that this data set is not suitable for linear regression.



Polynomial Regression

If your data points clearly will not fit a linear regression (a straight line through all data points), it might be ideal for polynomial regression.

Polynomial regression, like linear regression, uses the relationship between the variables x and y to find the best way to draw a line through the data points.


How Does it Work?

Python has methods for finding a relationship between data-points and to draw a line of polynomial regression. We will show you how to use these methods instead of going through the mathematic formula.

In the example below, we have registered 18 cars as they were passing a certain tollbooth.

We have registered the car's speed, and the time of day (hour) the passing occurred.

The x-axis represents the hours of the day and the y-axis represents the speed:

Example

Start by drawing a scatter plot:

import matplotlib.pyplot as plt

x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22]
y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100]

plt.scatter(x, y)
plt.show()

Result:

Example

Import numpy and matplotlib then draw the line of Polynomial Regression:

import numpy
import matplotlib.pyplot as plt

x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22]
y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100]

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

myline = numpy.linspace(1, 22, 100)

plt.scatter(x, y)
plt.plot(myline, mymodel(myline))
plt.show()

Result:

Example Explained

Import the modules you need.

You can learn about the NumPy module in our .

You can learn about the SciPy module in our .

import numpy
import matplotlib.pyplot as plt

Create the arrays that represent the values of the x and y axis:

x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22]
y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100]

NumPy has a method that lets us make a polynomial model:

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

Then specify how the line will display, we start at position 1, and end at position 22:

myline = numpy.linspace(1, 22, 100)

Draw the original scatter plot:

plt.scatter(x, y)

Draw the line of polynomial regression:

plt.plot(myline, mymodel(myline))

Display the diagram:

plt.show()



R-Squared

It is important to know how well the relationship between the values of the x- and y-axis is, if there are no relationship the polynomial regression can not be used to predict anything.

The relationship is measured with a value called the r-squared.

The r-squared value ranges from 0 to 1, where 0 means no relationship, and 1 means 100% related.

Python and the Sklearn module will compute this value for you, all you have to do is feed it with the x and y arrays:

Example

How well does my data fit in a polynomial regression?

import numpy
from sklearn.metrics import r2_score

x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22]
y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100]

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

print(r2_score(y, mymodel(x)))

Note: The result 0.94 shows that there is a very good relationship, and we can use polynomial regression in future predictions.


Predict Future Values

Now we can use the information we have gathered to predict future values.

Example: Let us try to predict the speed of a car that passes the tollbooth at around the time 17:00:

To do so, we need the same mymodel array from the example above:

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

Example

Predict the speed of a car passing at 17:00:

import numpy
from sklearn.metrics import r2_score

x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22]
y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100]

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

speed = mymodel(17)
print(speed)

The example predicted a speed to be 88.87, which we also could read from the diagram:


Bad Fit?

Let us create an example where polynomial regression would not be the best method to predict future values.

Example

These values for the x- and y-axis should result in a very bad fit for polynomial regression:

import numpy
import matplotlib.pyplot as plt

x = [89,43,36,36,95,10,66,34,38,20,26,29,48,64,6,5,36,66,72,40]
y = [21,46,3,35,67,95,53,72,58,10,26,34,90,33,38,20,56,2,47,15]

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

myline = numpy.linspace(2, 95, 100)

plt.scatter(x, y)
plt.plot(myline, mymodel(myline))
plt.show()

Result:

And the r-squared value?

Example

You should get a very low r-squared value.

import numpy
from sklearn.metrics import r2_score

x = [89,43,36,36,95,10,66,34,38,20,26,29,48,64,6,5,36,66,72,40]
y = [21,46,3,35,67,95,53,72,58,10,26,34,90,33,38,20,56,2,47,15]

mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

print(r2_score(y, mymodel(x)))

The result: 0.00995 indicates a very bad relationship, and tells us that this data set is not suitable for polynomial regression.



Multiple Regression

Multiple regression is like , but with more than one independent value, meaning that we try to predict a value based on two or more variables.

Take a look at the data set below, it contains some information about cars.

Car Model Volume Weight CO2
Toyota Aygo 1000 790 99
Mitsubishi Space Star 1200 1160 95
Skoda Citigo 1000 929 95
Fiat 500 900 865 90
Mini Cooper 1500 1140 105
VW Up! 1000 929 105
Skoda Fabia 1400 1109 90
Mercedes A-Class 1500 1365 92
Ford Fiesta 1500 1112 98
Audi A1 1600 1150 99
Hyundai I20 1100 980 99
Suzuki Swift 1300 990 101
Ford Fiesta 1000 1112 99
Honda Civic 1600 1252 94
Hundai I30 1600 1326 97
Opel Astra 1600 1330 97
BMW 1 1600 1365 99
Mazda 3 2200 1280 104
Skoda Rapid 1600 1119 104
Ford Focus 2000 1328 105
Ford Mondeo 1600 1584 94
Opel Insignia 2000 1428 99
Mercedes C-Class 2100 1365 99
Skoda Octavia 1600 1415 99
Volvo S60 2000 1415 99
Mercedes CLA 1500 1465 102
Audi A4 2000 1490 104
Audi A6 2000 1725 114
Volvo V70 1600 1523 109
BMW 5 2000 1705 114
Mercedes E-Class 2100 1605 115
Volvo XC70 2000 1746 117
Ford B-Max 1600 1235 104
BMW 2 1600 1390 108
Opel Zafira 1600 1405 109
Mercedes SLK 2500 1395 120

We can predict the CO2 emission of a car based on the size of the engine, but with multiple regression we can throw in more variables, like the weight of the car, to make the prediction more accurate.


How Does it Work?

In Python we have modules that will do the work for us. Start by importing the Pandas module.

import pandas

Learn about the Pandas module in our .

The Pandas module allows us to read csv files and return a DataFrame object.

The file is meant for testing purposes only, you can download it here:

df = pandas.read_csv("data.csv")

Then make a list of the independent values and call this variable X.

Put the dependent values in a variable called y.

X = df[['Weight', 'Volume']]
y = df['CO2']

Tip: It is common to name the list of independent values with a upper case X, and the list of dependent values with a lower case y.

We will use some methods from the sklearn module, so we will have to import that module as well:

from sklearn import linear_model

From the sklearn module we will use the LinearRegression() method to create a linear regression object.

This object has a method called fit() that takes the independent and dependent values as parameters and fills the regression object with data that describes the relationship:

regr = linear_model.LinearRegression()
regr.fit(X, y)

Now we have a regression object that are ready to predict CO2 values based on a car's weight and volume:

#predict the CO2 emission of a car where the weight is 2300kg, and the volume is 1300cm3:
predictedCO2 = regr.predict([[2300, 1300]])

Example

See the whole example in action:

import pandas
from sklearn import linear_model

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']

regr = linear_model.LinearRegression()
regr.fit(X, y)

#predict the CO2 emission of a car where the weight is 2300kg, and the volume is 1300cm3:
predictedCO2 = regr.predict([[2300, 1300]])

print(predictedCO2)

Result:

[107.2087328]

We have predicted that a car with 1.3 liter engine, and a weight of 2300 kg, will release approximately 107 grams of CO2 for every kilometer it drives.



Coefficient

The coefficient is a factor that describes the relationship with an unknown variable.

Example: if x is a variable, then 2x is x two times. x is the unknown variable, and the number 2 is the coefficient.

In this case, we can ask for the coefficient value of weight against CO2, and for volume against CO2. The answer(s) we get tells us what would happen if we increase, or decrease, one of the independent values.

Example

Print the coefficient values of the regression object:

import pandas
from sklearn import linear_model

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']

regr = linear_model.LinearRegression()
regr.fit(X, y)

print(regr.coef_)

Result:

[0.00755095 0.00780526]

Result Explained

The result array represents the coefficient values of weight and volume.

Weight: 0.00755095
Volume: 0.00780526

These values tell us that if the weight increase by 1kg, the CO2 emission increases by 0.00755095g.

And if the engine size (Volume) increases by 1 cm3, the CO2 emission increases by 0.00780526 g.

I think that is a fair guess, but let test it!

We have already predicted that if a car with a 1300cm3 engine weighs 2300kg, the CO2 emission will be approximately 107g.

What if we increase the weight with 1000kg?

Example

Copy the example from before, but change the weight from 2300 to 3300:

import pandas
from sklearn import linear_model

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']

regr = linear_model.LinearRegression()
regr.fit(X, y)

predictedCO2 = regr.predict([[3300, 1300]])

print(predictedCO2)

Result:

[114.75968007]

We have predicted that a car with 1.3 liter engine, and a weight of 3300 kg, will release approximately 115 grams of CO2 for every kilometer it drives.

Which shows that the coefficient of 0.00755095 is correct:

107.2087328 + (1000 * 0.00755095) = 114.75968



Scale Features

When your data has different values, and even different measurement units, it can be difficult to compare them. What is kilograms compared to meters? Or altitude compared to time?

The answer to this problem is scaling. We can scale data into new values that are easier to compare.

Take a look at the table below, it is the same data set that we used in the , but this time the volume column contains values in liters instead of cm3 (1.0 instead of 1000).

Car Model Volume Weight CO2
Toyota Aygo 1.0 790 99
Mitsubishi Space Star 1.2 1160 95
Skoda Citigo 1.0 929 95
Fiat 500 0.9 865 90
Mini Cooper 1.5 1140 105
VW Up! 1.0 929 105
Skoda Fabia 1.4 1109 90
Mercedes A-Class 1.5 1365 92
Ford Fiesta 1.5 1112 98
Audi A1 1.6 1150 99
Hyundai I20 1.1 980 99
Suzuki Swift 1.3 990 101
Ford Fiesta 1.0 1112 99
Honda Civic 1.6 1252 94
Hundai I30 1.6 1326 97
Opel Astra 1.6 1330 97
BMW 1 1.6 1365 99
Mazda 3 2.2 1280 104
Skoda Rapid 1.6 1119 104
Ford Focus 2.0 1328 105
Ford Mondeo 1.6 1584 94
Opel Insignia 2.0 1428 99
Mercedes C-Class 2.1 1365 99
Skoda Octavia 1.6 1415 99
Volvo S60 2.0 1415 99
Mercedes CLA 1.5 1465 102
Audi A4 2.0 1490 104
Audi A6 2.0 1725 114
Volvo V70 1.6 1523 109
BMW 5 2.0 1705 114
Mercedes E-Class 2.1 1605 115
Volvo XC70 2.0 1746 117
Ford B-Max 1.6 1235 104
BMW 2 1.6 1390 108
Opel Zafira 1.6 1405 109
Mercedes SLK 2.5 1395 120

It can be difficult to compare the volume 1.0 with the weight 790, but if we scale them both into comparable values, we can easily see how much one value is compared to the other.

There are different methods for scaling data, in this tutorial we will use a method called standardization.

The standardization method uses this formula:

z = (x - u) / s

Where z is the new value, x is the original value, u is the mean and s is the standard deviation.

If you take the weight column from the data set above, the first value is 790, and the scaled value will be:

(790 - ) / = -2.1

If you take the volume column from the data set above, the first value is 1.0, and the scaled value will be:

(1.0 - ) / = -1.59

Now you can compare -2.1 with -1.59 instead of comparing 790 with 1.0.

You do not have to do this manually, the Python sklearn module has a method called StandardScaler() which returns a Scaler object with methods for transforming data sets.

Example

Scale all values in the Weight and Volume columns:

import pandas
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
scale = StandardScaler()

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]

scaledX = scale.fit_transform(X)

print(scaledX)

Result:

Note that the first two values are -2.1 and -1.59, which corresponds to our calculations:

[[-2.10389253 -1.59336644]
 [-0.55407235 -1.07190106]
 [-1.52166278 -1.59336644]
 [-1.78973979 -1.85409913]
 [-0.63784641 -0.28970299]
 [-1.52166278 -1.59336644]
 [-0.76769621 -0.55043568]
 [ 0.3046118  -0.28970299]
 [-0.7551301  -0.28970299]
 [-0.59595938 -0.0289703 ]
 [-1.30803892 -1.33263375]
 [-1.26615189 -0.81116837]
 [-0.7551301  -1.59336644]
 [-0.16871166 -0.0289703 ]
 [ 0.14125238 -0.0289703 ]
 [ 0.15800719 -0.0289703 ]
 [ 0.3046118  -0.0289703 ]
 [-0.05142797  1.53542584]
 [-0.72580918 -0.0289703 ]
 [ 0.14962979  1.01396046]
 [ 1.2219378  -0.0289703 ]
 [ 0.5685001   1.01396046]
 [ 0.3046118   1.27469315]
 [ 0.51404696 -0.0289703 ]
 [ 0.51404696  1.01396046]
 [ 0.72348212 -0.28970299]
 [ 0.8281997   1.01396046]
 [ 1.81254495  1.01396046]
 [ 0.96642691 -0.0289703 ]
 [ 1.72877089  1.01396046]
 [ 1.30990057  1.27469315]
 [ 1.90050772  1.01396046]
 [-0.23991961 -0.0289703 ]
 [ 0.40932938 -0.0289703 ]
 [ 0.47215993 -0.0289703 ]
 [ 0.4302729   2.31762392]]



Predict CO2 Values

The task in the was to predict the CO2 emission from a car when you only knew its weight and volume.

When the data set is scaled, you will have to use the scale when you predict values:

Example

Predict the CO2 emission from a 1.3 liter car that weighs 2300 kilograms:

import pandas
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
scale = StandardScaler()

df = pandas.read_csv("data.csv")

X = df[['Weight', 'Volume']]
y = df['CO2']

scaledX = scale.fit_transform(X)

regr = linear_model.LinearRegression()
regr.fit(scaledX, y)

scaled = scale.transform([[2300, 1.3]])

predictedCO2 = regr.predict([scaled[0]])
print(predictedCO2)

Result:

[107.2087328]



Evaluate Your Model

In Machine Learning we create models to predict the outcome of certain events, like in the previous chapter where we predicted the CO2 emission of a car when we knew the weight and engine size.

To measure if the model is good enough, we can use a method called Train/Test.


What is Train/Test

Train/Test is a method to measure the accuracy of your model.

It is called Train/Test because you split the data set into two sets: a training set and a testing set.

80% for training, and 20% for testing.

You train the model using the training set.

You test the model using the testing set.

Train the model means create the model.

Test the model means test the accuracy of the model.


Start With a Data Set

Start with a data set you want to test.

Our data set illustrates 100 customers in a shop, and their shopping habits.

Example

import numpy
import matplotlib.pyplot as plt
numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

plt.scatter(x, y)
plt.show()

Result:

The x axis represents the number of minutes before making a purchase.

The y axis represents the amount of money spent on the purchase.



Split Into Train/Test

The training set should be a random selection of 80% of the original data.

The testing set should be the remaining 20%.

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]


Display the Training Set

Display the same scatter plot with the training set:

Example

plt.scatter(train_x, train_y)
plt.show()

Result:

It looks like the original data set, so it seems to be a fair selection:


Display the Testing Set

To make sure the testing set is not completely different, we will take a look at the testing set as well.

Example

plt.scatter(test_x, test_y)
plt.show()

Result:

The testing set also looks like the original data set:


Fit the Data Set

What does the data set look like? In my opinion I think the best fit would be a , so let us draw a line of polynomial regression.

To draw a line through the data points, we use the plot() method of the matplotlib module:

Example

Draw a polynomial regression line through the data points:

import numpy
import matplotlib.pyplot as plt
numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()

Result:

The result can back my suggestion of the data set fitting a polynomial regression, even though it would give us some weird results if we try to predict values outside of the data set. Example: the line indicates that a customer spending 6 minutes in the shop would make a purchase worth 200. That is probably a sign of overfitting.

But what about the R-squared score? The R-squared score is a good indicator of how well my data set is fitting the model.


R2

Remember R2, also known as R-squared?

It measures the relationship between the x axis and the y axis, and the value ranges from 0 to 1, where 0 means no relationship, and 1 means totally related.

The sklearn module has a method called r2_score() that will help us find this relationship.

In this case we would like to measure the relationship between the minutes a customer stays in the shop and how much money they spend.

Example

How well does my training data fit in a polynomial regression?

import numpy
from sklearn.metrics import r2_score
numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

r2 = r2_score(train_y, mymodel(train_x))

print(r2)

Note: The result 0.799 shows that there is a OK relationship.

Bring in the Testing Set

Now we have made a model that is OK, at least when it comes to training data.

Now we want to test the model with the testing data as well, to see if gives us the same result.

Example

Let us find the R2 score when using testing data:

import numpy
from sklearn.metrics import r2_score
numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

r2 = r2_score(test_y, mymodel(test_x))

print(r2)

Note: The result 0.809 shows that the model fits the testing set as well, and we are confident that we can use the model to predict future values.


Predict Values

Now that we have established that our model is OK, we can start predicting new values.

Example

How much money will a buying customer spend, if she or he stays in the shop for 5 minutes?

print(mymodel(5))

The example predicted the customer to spend 22.88 dollars, as seems to correspond to the diagram:




Decision Tree

In this chapter we will show you how to make a "Decision Tree". A Decision Tree is a Flow Chart, and can help you make decisions based on previous experience.

In the example, a person will try to decide if he/she should go to a comedy show or not.

Luckily our example person has registered every time there was a comedy show in town, and registered some information about the comedian, and also registered if he/she went or not.

Age Experience Rank Nationality Go
36 10 9 UK NO
42 12 4 USA NO
23 4 6 N NO
52 4 4 USA NO
43 21 8 USA YES
44 14 5 UK NO
66 3 7 N YES
35 14 9 UK YES
52 13 7 N YES
35 5 9 N YES
24 3 5 USA NO
18 3 7 UK YES
45 9 9 UK YES

Now, based on this data set, Python can create a decision tree that can be used to decide if any new shows are worth attending to.



How Does it Work?

First, read the dataset with pandas:

Example

Read and print the data set:

import pandas

df = pandas.read_csv("data.csv")

print(df)

To make a decision tree, all data has to be numerical.

We have to convert the non numerical columns 'Nationality' and 'Go' into numerical values.

Pandas has a map() method that takes a dictionary with information on how to convert the values.

{'UK': 0, 'USA': 1, 'N': 2}

Means convert the values 'UK' to 0, 'USA' to 1, and 'N' to 2.

Example

Change string values into numerical values:

d = {'UK': 0, 'USA': 1, 'N': 2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES': 1, 'NO': 0}
df['Go'] = df['Go'].map(d)

print(df)

Then we have to separate the feature columns from the target column.

The feature columns are the columns that we try to predict from, and the target column is the column with the values we try to predict.

Example

X is the feature columns, y is the target column:

features = ['Age', 'Experience', 'Rank', 'Nationality']

X = df[features]
y = df['Go']

print(X)
print(y)

Now we can create the actual decision tree, fit it with our details. Start by importing the modules we need:

Example

Create and display a Decision Tree:

import pandas
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt

df = pandas.read_csv("data.csv")

d = {'UK': 0, 'USA': 1, 'N': 2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES': 1, 'NO': 0}
df['Go'] = df['Go'].map(d)

features = ['Age', 'Experience', 'Rank', 'Nationality']

X = df[features]
y = df['Go']

dtree = DecisionTreeClassifier()
dtree = dtree.fit(X, y)

tree.plot_tree(dtree, feature_names=features)


Result Explained

The decision tree uses your earlier decisions to calculate the odds for you to wanting to go see a comedian or not.

Let us read the different aspects of the decision tree:

Rank

Rank <= 6.5 means that every comedian with a rank of 6.5 or lower will follow the True arrow (to the left), and the rest will follow the False arrow (to the right).

gini = 0.497 refers to the quality of the split, and is always a number between 0.0 and 0.5, where 0.0 would mean all of the samples got the same result, and 0.5 would mean that the split is done exactly in the middle.

samples = 13 means that there are 13 comedians left at this point in the decision, which is all of them since this is the first step.

value = [6, 7] means that of these 13 comedians, 6 will get a "NO", and 7 will get a "GO".

Gini

There are many ways to split the samples, we use the GINI method in this tutorial.

The Gini method uses this formula:

Gini = 1 - (x/n)2 - (y/n)2

Where x is the number of positive answers("GO"), n is the number of samples, and y is the number of negative answers ("NO"), which gives us this calculation:

1 - (7 / 13)2 - (6 / 13)2 = 0.497

The next step contains two boxes, one box for the comedians with a 'Rank' of 6.5 or lower, and one box with the rest.

True - 5 Comedians End Here:

gini = 0.0 means all of the samples got the same result.

samples = 5 means that there are 5 comedians left in this branch (5 comedian with a Rank of 6.5 or lower).

value = [5, 0] means that 5 will get a "NO" and 0 will get a "GO".

False - 8 Comedians Continue:

Nationality

Nationality <= 0.5 means that the comedians with a nationality value of less than 0.5 will follow the arrow to the left (which means everyone from the UK, ), and the rest will follow the arrow to the right.

gini = 0.219 means that about 22% of the samples would go in one direction.

samples = 8 means that there are 8 comedians left in this branch (8 comedian with a Rank higher than 6.5).

value = [1, 7] means that of these 8 comedians, 1 will get a "NO" and 7 will get a "GO".




True - 4 Comedians Continue:

Age

Age <= 35.5 means that comedians at the age of 35.5 or younger will follow the arrow to the left, and the rest will follow the arrow to the right.

gini = 0.375 means that about 37,5% of the samples would go in one direction.

samples = 4 means that there are 4 comedians left in this branch (4 comedians from the UK).

value = [1, 3] means that of these 4 comedians, 1 will get a "NO" and 3 will get a "GO".

False - 4 Comedians End Here:

gini = 0.0 means all of the samples got the same result.

samples = 4 means that there are 4 comedians left in this branch (4 comedians not from the UK).

value = [0, 4] means that of these 4 comedians, 0 will get a "NO" and 4 will get a "GO".




True - 2 Comedians End Here:

gini = 0.0 means all of the samples got the same result.

samples = 2 means that there are 2 comedians left in this branch (2 comedians at the age 35.5 or younger).

value = [0, 2] means that of these 2 comedians, 0 will get a "NO" and 2 will get a "GO".

False - 2 Comedians Continue:

Experience

Experience <= 9.5 means that comedians with 9.5 years of experience, or less, will follow the arrow to the left, and the rest will follow the arrow to the right.

gini = 0.5 means that 50% of the samples would go in one direction.

samples = 2 means that there are 2 comedians left in this branch (2 comedians older than 35.5).

value = [1, 1] means that of these 2 comedians, 1 will get a "NO" and 1 will get a "GO".




True - 1 Comedian Ends Here:

gini = 0.0 means all of the samples got the same result.

samples = 1 means that there is 1 comedian left in this branch (1 comedian with 9.5 years of experience or less).

value = [0, 1] means that 0 will get a "NO" and 1 will get a "GO".

False - 1 Comedian Ends Here:

gini = 0.0 means all of the samples got the same result.

samples = 1 means that there is 1 comedians left in this branch (1 comedian with more than 9.5 years of experience).

value = [1, 0] means that 1 will get a "NO" and 0 will get a "GO".


Predict Values

We can use the Decision Tree to predict new values.

Example: Should I go see a show starring a 40 years old American comedian, with 10 years of experience, and a comedy ranking of 7?

Example

Use predict() method to predict new values:

print(dtree.predict([[40, 10, 7, 1]]))

Example

What would the answer be if the comedy rank was 6?

print(dtree.predict([[40, 10, 6, 1]]))


Different Results

You will see that the Decision Tree gives you different results if you run it enough times, even if you feed it with the same data.

That is because the Decision Tree does not give us a 100% certain answer. It is based on the probability of an outcome, and the answer will vary.



On this page, W3schools.com collaborates with , to deliver digital training content to our students.


What is a confusion matrix?

It is a table that is used in classification problems to assess where errors in the model were made.

The rows represent the actual classes the outcomes should have been. While the columns represent the predictions we have made. Using this table it is easy to see which predictions are wrong.

Creating a Confusion Matrix

Confusion matrixes can be created by predictions made from a logistic regression.

For now we will generate actual and predicted values by utilizing NumPy:

import numpy

Next we will need to generate the numbers for "actual" and "predicted" values.

actual = numpy.random.binomial(1, 0.9, size = 1000)
predicted = numpy.random.binomial(1, 0.9, size = 1000)

In order to create the confusion matrix we need to import metrics from the sklearn module.

from sklearn import metrics

Once metrics is imported we can use the confusion matrix function on our actual and predicted values.

confusion_matrix = metrics.confusion_matrix(actual, predicted)

To create a more interpretable visual display we need to convert the table into a confusion matrix display.

cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = [False, True])

Vizualizing the display requires that we import pyplot from matplotlib.

import matplotlib.pyplot as plt

Finally to display the plot we can use the functions plot() and show() from pyplot.

cm_display.plot()
plt.show()

See the whole example in action:

Example

import matplotlib.pyplot as plt
import numpy
from sklearn import metrics

actual = numpy.random.binomial(1,.9,size = 1000)
predicted = numpy.random.binomial(1,.9,size = 1000)

confusion_matrix = metrics.confusion_matrix(actual, predicted)

cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = [False, True])

cm_display.plot()
plt.show()

Result

Results Explained

The Confusion Matrix created has four different quadrants:

True Negative (Top-Left Quadrant)
False Positive (Top-Right Quadrant)
False Negative (Bottom-Left Quadrant)
True Positive (Bottom-Right Quadrant)

True means that the values were accurately predicted, False means that there was an error or wrong prediction.

Now that we have made a Confusion Matrix, we can calculate different measures to quantify the quality of the model. First, lets look at Accuracy.


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Hierarchical Clustering

Hierarchical clustering is an unsupervised learning method for clustering data points. The algorithm builds clusters by measuring the dissimilarities between data. Unsupervised learning means that a model does not have to be trained, and we do not need a "target" variable. This method can be used on any data to visualize and interpret the relationship between individual data points.

Here we will use hierarchical clustering to group data points and visualize the clusters using both a dendrogram and scatter plot.


How does it work?

We will use Agglomerative Clustering, a type of hierarchical clustering that follows a bottom up approach. We begin by treating each data point as its own cluster. Then, we join clusters together that have the shortest distance between them to create larger clusters. This step is repeated until one large cluster is formed containing all of the data points.

Hierarchical clustering requires us to decide on both a distance and linkage method. We will use euclidean distance and the Ward linkage method, which attempts to minimize the variance between clusters.

Example

Start by visualizing some data points:

import numpy as np
import matplotlib.pyplot as plt

x = [4, 5, 10, 4, 3, 11, 14 , 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

plt.scatter(x, y)
plt.show()

Result


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Logistic Regression

Logistic regression aims to solve classification problems. It does this by predicting categorical outcomes, unlike linear regression that predicts a continuous outcome.

In the simplest case there are two outcomes, which is called binomial, an example of which is predicting if a tumor is malignant or benign. Other cases have more than two outcomes to classify, in this case it is called multinomial. A common example for multinomial logistic regression would be predicting the class of an iris flower between 3 different species.

Here we will be using basic logistic regression to predict a binomial variable. This means it has only two possible outcomes.


How does it work?

In Python we have modules that will do the work for us. Start by importing the NumPy module.

import numpy

Store the independent variables in X.

Store the dependent variable in y.

Below is a sample dataset:

#X represents the size of a tumor in centimeters.
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1,1)

#Note: X has to be reshaped into a column from a row for the LogisticRegression() function to work.
#y represents whether or not the tumor is cancerous (0 for "No", 1 for "Yes").
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

We will use a method from the sklearn module, so we will have to import that module as well:

from sklearn import linear_model

From the sklearn module we will use the LogisticRegression() method to create a logistic regression object.

This object has a method called fit() that takes the independent and dependent values as parameters and fills the regression object with data that describes the relationship:

logr = linear_model.LogisticRegression()
logr.fit(X,y)

Now we have a logistic regression object that is ready to whether a tumor is cancerous based on the tumor size:

#predict if tumor is cancerous where the size is 3.46mm:
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))

Example

See the whole example in action:

import numpy
from sklearn import linear_model

#Reshaped for Logistic function.
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1,1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X,y)

#predict if tumor is cancerous where the size is 3.46mm:
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))
print(predicted)

Result


 [0]
 

We have predicted that a tumor with a size of 3.46mm will not be cancerous.


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Grid Search

The majority of machine learning models contain parameters that can be adjusted to vary how the model learns. For example, the logistic regression model, from sklearn, has a parameter C that controls regularization,which affects the complexity of the model.

How do we pick the best value for C? The best value is dependent on the data used to train the model.


How does it work?

One method is to try out different values and then pick the value that gives the best score. This technique is known as a grid search. If we had to select the values for two or more parameters, we would evaluate all combinations of the sets of values thus forming a grid of values.

Before we get into the example it is good to know what the parameter we are changing does. Higher values of C tell the model, the training data resembles real world information, place a greater weight on the training data. While lower values of C do the opposite.


Using Default Parameters

First let's see what kind of results we can generate without a grid search using only the base parameters.

To get started we must first load in the dataset we will be working with.

from sklearn import datasets
iris = datasets.load_iris()

Next in order to create the model we must have a set of independent variables X and a dependant variable y.

X = iris['data']
y = iris['target']

Now we will load the logistic model for classifying the iris flowers.

from sklearn.linear_model import LogisticRegression

Creating the model, setting max_iter to a higher value to ensure that the model finds a result.

Keep in mind the default value for C in a logistic regression model is 1, we will compare this later.

In the example below, we look at the iris data set and try to train a model with varying values for C in logistic regression.

logit = LogisticRegression(max_iter = 10000)

After we create the model, we must fit the model to the data.

print(logit.fit(X,y))

To evaluate the model we run the score method.

print(logit.score(X,y))

Example

from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()

X = iris['data']
y = iris['target']

logit = LogisticRegression(max_iter = 10000)

print(logit.fit(X,y))

print(logit.score(X,y))

With the default setting of C = 1, we achieved a score of 0.973.

Let's see if we can do any better by implementing a grid search with difference values of 0.973.


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Categorical Data

When your data has categories represented by strings, it will be difficult to use them to train machine learning models which often only accepts numeric data.

Instead of ignoring the categorical data and excluding the information from our model, you can tranform the data so it can be used in your models.

Take a look at the table below, it is the same data set that we used in the chapter.

Example

import pandas as pd

cars = pd.read_csv('data.csv')
print(cars.to_string())

Result

             Car       Model  Volume  Weight  CO2
  0       Toyoty        Aygo    1000     790   99
  1   Mitsubishi  Space Star    1200    1160   95
  2        Skoda      Citigo    1000     929   95
  3         Fiat         500     900     865   90
  4         Mini      Cooper    1500    1140  105
  5           VW         Up!    1000     929  105
  6        Skoda       Fabia    1400    1109   90
  7     Mercedes     A-Class    1500    1365   92
  8         Ford      Fiesta    1500    1112   98
  9         Audi          A1    1600    1150   99
  10     Hyundai         I20    1100     980   99
  11      Suzuki       Swift    1300     990  101
  12        Ford      Fiesta    1000    1112   99
  13       Honda       Civic    1600    1252   94
  14      Hundai         I30    1600    1326   97
  15        Opel       Astra    1600    1330   97
  16         BMW           1    1600    1365   99
  17       Mazda           3    2200    1280  104
  18       Skoda       Rapid    1600    1119  104
  19        Ford       Focus    2000    1328  105
  20        Ford      Mondeo    1600    1584   94
  21        Opel    Insignia    2000    1428   99
  22    Mercedes     C-Class    2100    1365   99
  23       Skoda     Octavia    1600    1415   99
  24       Volvo         S60    2000    1415   99
  25    Mercedes         CLA    1500    1465  102
  26        Audi          A4    2000    1490  104
  27        Audi          A6    2000    1725  114
  28       Volvo         V70    1600    1523  109
  29         BMW           5    2000    1705  114
  30    Mercedes     E-Class    2100    1605  115
  31       Volvo        XC70    2000    1746  117
  32        Ford       B-Max    1600    1235  104
  33         BMW         216    1600    1390  108
  34        Opel      Zafira    1600    1405  109
  35    Mercedes         SLK    2500    1395  120
  

In the multiple regression chapter, we tried to predict the CO2 emitted based on the volume of the engine and the weight of the car but we excluded information about the car brand and model.

The information about the car brand or the car model might help us make a better prediction of the CO2 emitted.


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


K-means

K-means is an unsupervised learning method for clustering data points. The algorithm iteratively divides data points into K clusters by minimizing the variance in each cluster.

Here, we will show you how to estimate the best value for K using the elbow method, then use K-means clustering to group the data points into clusters.


How does it work?

First, each data point is randomly assigned to one of the K clusters. Then, we compute the centroid (functionally the center) of each cluster, and reassign each data point to the cluster with the closest centroid. We repeat this process until the cluster assignments for each data point are no longer changing.

K-means clustering requires us to select K, the number of clusters we want to group the data into. The elbow method lets us graph the inertia (a distance-based metric) and visualize the point at which it starts decreasing linearly. This point is referred to as the "eblow" and is a good estimate for the best value for K based on our data.

Example

Start by visualizing some data points:

import matplotlib.pyplot as plt

x = [4, 5, 10, 4, 3, 11, 14 , 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

plt.scatter(x, y)
plt.show()

Result


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Bagging

Methods such as Decision Trees, can be prone to overfitting on the training set which can lead to wrong predictions on new data.

Bootstrap Aggregation (bagging) is a ensembling method that attempts to resolve overfitting for classification or regression problems. Bagging aims to improve the accuracy and performance of machine learning algorithms. It does this by taking random subsets of an original dataset, with replacement, and fits either a classifier (for classification) or regressor (for regression) to each subset. The predictions for each subset are then aggregated through majority vote for classification or averaging for regression, increasing prediction accuracy.


Evaluating a Base Classifier

To see how bagging can improve model performance, we must start by evaluating how the base classifier performs on the dataset. If you do not know what decision trees are review the lesson on decision trees before moving forward, as bagging is an continuation of the concept.

We will be looking to identify different classes of wines found in Sklearn's wine dataset.

Let's start by importing the necessary modules.

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

Next we need to load in the data and store it into X (input features) and y (target). The parameter as_frame is set equal to True so we do not lose the feature names when loading the data. (sklearn version older than 0.23 must skip the as_frame argument as it is not supported)

data = datasets.load_wine(as_frame = True)

X = data.data
y = data.target

In order to properly evaluate our model on unseen data, we need to split X and y into train and test sets. For information on splitting data, see the Train/Test lesson.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 22)

With our data prepared, we can now instantiate a base classifier and fit it to the training data.

dtree = DecisionTreeClassifier(random_state = 22)
dtree.fit(X_train,y_train)

Result:

DecisionTreeClassifier(random_state=22)

We can now predict the class of wine the unseen test set and evaluate the model performance.

y_pred = dtree.predict(X_test)

print("Train data accuracy:",accuracy_score(y_true = y_train, y_pred = dtree.predict(X_train)))
print("Test data accuracy:",accuracy_score(y_true = y_test, y_pred = y_pred))

Result:

Train data accuracy: 1.0
Test data accuracy: 0.8222222222222222

Example

Import the necessary data and evaluate base classifier performance.

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

data = datasets.load_wine(as_frame = True)

X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 22)

dtree = DecisionTreeClassifier(random_state = 22)
dtree.fit(X_train,y_train)

y_pred = dtree.predict(X_test)

print("Train data accuracy:",accuracy_score(y_true = y_train, y_pred = dtree.predict(X_train)))
print("Test data accuracy:",accuracy_score(y_true = y_test, y_pred = y_pred))

The base classifier performs reasonably well on the dataset achieving 82% accuracy on the test dataset with the current parameters (Different results may occur if you do not have the random_state parameter set).

Now that we have a baseline accuracy for the test dataset, we can see how the Bagging Classifier out performs a single Decision Tree Classifier.


ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


Cross Validation

When adjusting models we are aiming to increase overall model performance on unseen data. Hyperparameter tuning can lead to much better performance on test sets. However, optimizing parameters to the test set can lead information leakage causing the model to preform worse on unseen data. To correct for this we can perform cross validation.

To better understand CV, we will be performing different methods on the iris dataset. Let us first load in and separate the data.

from sklearn import datasets

X, y = datasets.load_iris(return_X_y=True)

There are many methods to cross validation, we will start by looking at k-fold cross validation.


K-Fold

The training data used in the model is split, into k number of smaller sets, to be used to validate the model. The model is then trained on k-1 folds of training set. The remaining fold is then used as a validation set to evaluate the model.

As we will be trying to classify different species of iris flowers we will need to import a classifier model, for this exercise we will be using a DecisionTreeClassifier. We will also need to import CV modules from sklearn.

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import KFold, cross_val_score

With the data loaded we can now create and fit a model for evaluation.

clf = DecisionTreeClassifier(random_state=42)

Now let's evaluate our model and see how it performs on each k-fold.

k_folds = KFold(n_splits = 5)

scores = cross_val_score(clf, X, y, cv = k_folds)

It is also good pratice to see how CV performed overall by averaging the scores for all folds.

Example

Run k-fold CV:

from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import KFold, cross_val_score

X, y = datasets.load_iris(return_X_y=True)

clf = DecisionTreeClassifier(random_state=42)

k_folds = KFold(n_splits = 5)

scores = cross_val_score(clf, X, y, cv = k_folds)

print("Cross Validation Scores: ", scores)
print("Average CV Score: ", scores.mean())
print("Number of CV Scores used in Average: ", len(scores))

ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


AUC - ROC Curve

In classification, there are many different evaluation metrics. The most popular is accuracy, which measures how often the model is correct. This is a great metric because it is easy to understand and getting the most correct guesses is often desired. There are some cases where you might consider using another evaluation metric.

Another common metric is AUC, area under the receiver operating characteristic (ROC) curve. The Reciever operating characteristic curve plots the true positive (TP) rate versus the false positive (FP) rate at different classification thresholds. The thresholds are different probability cutoffs that separate the two classes in binary classification. It uses probability to tell us how well a model separates the classes.


Imbalanced Data

Suppose we have an imbalanced data set where the majority of our data is of one value. We can obtain high accuracy for the model by predicting the majority class.

Example

import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score, roc_curve

n = 10000
ratio = .95
n_0 = int((1-ratio) * n)
n_1 = int(ratio * n)

y = np.array([0] * n_0 + [1] * n_1)
# below are the probabilities obtained from a hypothetical model that always predicts the majority class
# probability of predicting class 1 is going to be 100%
y_proba = np.array([1]*n)
y_pred = y_proba > .5

print(f'accuracy score: {accuracy_score(y, y_pred)}')
cf_mat = confusion_matrix(y, y_pred)
print('Confusion matrix')
print(cf_mat)
print(f'class 0 accuracy: {cf_mat[0][0]/n_0}')
print(f'class 1 accuracy: {cf_mat[1][1]/n_1}')

ADVERTISEMENT


On this page, W3schools.com collaborates with , to deliver digital training content to our students.


KNN

KNN is a simple, supervised machine learning (ML) algorithm that can be used for classification or regression tasks - and is also frequently used in missing value imputation. It is based on the idea that the observations closest to a given data point are the most "similar" observations in a data set, and we can therefore classify unforeseen points based on the values of the closest existing points. By choosing K, the user can select the number of nearby observations to use in the algorithm.

Here, we will show you how to implement the KNN algorithm for classification, and show how different values of K affect the results.


How does it work?

K is the number of nearest neighbors to use. For classification, a majority vote is used to determined which class a new observation should fall into. Larger values of K are often more robust to outliers and produce more stable decision boundaries than very small values (K=3 would be better than K=1, which might produce undesirable results.

Example

Start by visualizing some data points:

import matplotlib.pyplot as plt

x = [4, 5, 10, 4, 3, 11, 14 , 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

plt.scatter(x, y, c=classes)
plt.show()

Result


ADVERTISEMENT

Python MySQL


Python can be used in database applications.

One of the most popular databases is MySQL.


MySQL Database

To be able to experiment with the code examples in this tutorial, you should have MySQL installed on your computer.

You can download a MySQL database at .


Install MySQL Driver

Python needs a MySQL driver to access the MySQL database.

In this tutorial we will use the driver "MySQL Connector".

We recommend that you use PIP to install "MySQL Connector".

PIP is most likely already installed in your Python environment.

Navigate your command line to the location of PIP, and type the following:

Download and install "MySQL Connector":

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>python -m pip install mysql-connector-python

Now you have downloaded and installed a MySQL driver.


Test MySQL Connector

To test if the installation was successful, or if you already have "MySQL Connector" installed, create a Python page with the following content:

demo_mysql_test.py:

import mysql.connector

If the above code was executed with no errors, "MySQL Connector" is installed and ready to be used.



Create Connection

Start by creating a connection to the database.

Use the username and password from your MySQL database:

demo_mysql_connection.py:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword"
)

print(mydb)

Now you can start querying the database using SQL statements.


Python MySQL Create Database


Creating a Database

To create a database in MySQL, use the "CREATE DATABASE" statement:

Example

create a database named "mydatabase":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword"
)

mycursor = mydb.cursor()

mycursor.execute("CREATE DATABASE mydatabase")

If the above code was executed with no errors, you have successfully created a database.


Check if Database Exists

You can check if a database exist by listing all databases in your system by using the "SHOW DATABASES" statement:

Example

Return a list of your system's databases:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword"
)

mycursor = mydb.cursor()

mycursor.execute("SHOW DATABASES")

for x in mycursor:
  print(x)

Or you can try to access the database when making the connection:

Example

Try connecting to the database "mydatabase":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

If the database does not exist, you will get an error.


Python MySQL Create Table


Creating a Table

To create a table in MySQL, use the "CREATE TABLE" statement.

Make sure you define the name of the database when you create the connection

Example

Create a table named "customers":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("CREATE TABLE customers (name VARCHAR(255), address VARCHAR(255))")

If the above code was executed with no errors, you have now successfully created a table.


Check if Table Exists

You can check if a table exist by listing all tables in your database with the "SHOW TABLES" statement:

Example

Return a list of your system's databases:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SHOW TABLES")

for x in mycursor:
  print(x)


Primary Key

When creating a table, you should also create a column with a unique key for each record.

This can be done by defining a PRIMARY KEY.

We use the statement "INT AUTO_INCREMENT PRIMARY KEY" which will insert a unique number for each record. Starting at 1, and increased by one for each record.

Example

Create primary key when creating the table:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("CREATE TABLE customers (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255), address VARCHAR(255))")

If the table already exists, use the ALTER TABLE keyword:

Example

Create primary key on an existing table:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("ALTER TABLE customers ADD COLUMN id INT AUTO_INCREMENT PRIMARY KEY")

Python MySQL Insert Into Table


Insert Into Table

To fill a table in MySQL, use the "INSERT INTO" statement.

Example

Insert a record in the "customers" table:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
val = ("John", "Highway 21")
mycursor.execute(sql, val)

mydb.commit()

print(mycursor.rowcount, "record inserted.")

Important!: Notice the statement: mydb.commit(). It is required to make the changes, otherwise no changes are made to the table.


Insert Multiple Rows

To insert multiple rows into a table, use the executemany() method.

The second parameter of the executemany() method is a list of tuples, containing the data you want to insert:

Example

Fill the "customers" table with data:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
val = [
  ('Peter', 'Lowstreet 4'),
  ('Amy', 'Apple st 652'),
  ('Hannah', 'Mountain 21'),
  ('Michael', 'Valley 345'),
  ('Sandy', 'Ocean blvd 2'),
  ('Betty', 'Green Grass 1'),
  ('Richard', 'Sky st 331'),
  ('Susan', 'One way 98'),
  ('Vicky', 'Yellow Garden 2'),
  ('Ben', 'Park Lane 38'),
  ('William', 'Central st 954'),
  ('Chuck', 'Main Road 989'),
  ('Viola', 'Sideway 1633')
]

mycursor.executemany(sql, val)

mydb.commit()

print(mycursor.rowcount, "was inserted.")


Get Inserted ID

You can get the id of the row you just inserted by asking the cursor object.

Note: If you insert more than one row, the id of the last inserted row is returned.

Example

Insert one row, and return the ID:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
val = ("Michelle", "Blue Village")
mycursor.execute(sql, val)

mydb.commit()

print("1 record inserted, ID:", mycursor.lastrowid)

Python MySQL Select From


Select From a Table

To select from a table in MySQL, use the "SELECT" statement:

Example

Select all records from the "customers" table, and display the result:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT * FROM customers")

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Note: We use the fetchall() method, which fetches all rows from the last executed statement.


Selecting Columns

To select only some of the columns in a table, use the "SELECT" statement followed by the column name(s):

Example

Select only the name and address columns:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT name, address FROM customers")

myresult = mycursor.fetchall()

for x in myresult:
  print(x)


Using the fetchone() Method

If you are only interested in one row, you can use the fetchone() method.

The fetchone() method will return the first row of the result:

Example

Fetch only one row:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT * FROM customers")

myresult = mycursor.fetchone()

print(myresult)

Python MySQL Where


Select With a Filter

When selecting records from a table, you can filter the selection by using the "WHERE" statement:

Example

Select record(s) where the address is "Park Lane 38": result:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT * FROM customers WHERE address ='Park Lane 38'"

mycursor.execute(sql)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Wildcard Characters

You can also select the records that starts, includes, or ends with a given letter or phrase.

Use the %  to represent wildcard characters:

Example

Select records where the address contains the word "way":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT * FROM customers WHERE address LIKE '%way%'"

mycursor.execute(sql)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)


Prevent SQL Injection

When query values are provided by the user, you should escape the values.

This is to prevent SQL injections, which is a common web hacking technique to destroy or misuse your database.

The mysql.connector module has methods to escape query values:

Example

Escape query values by using the placholder %s method:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT * FROM customers WHERE address = %s"
adr = ("Yellow Garden 2", )

mycursor.execute(sql, adr)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Python MySQL Order By


Sort the Result

Use the ORDER BY statement to sort the result in ascending or descending order.

The ORDER BY keyword sorts the result ascending by default. To sort the result in descending order, use the DESC keyword.

Example

Sort the result alphabetically by name: result:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT * FROM customers ORDER BY name"

mycursor.execute(sql)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

ORDER BY DESC

Use the DESC keyword to sort the result in a descending order.

Example

Sort the result reverse alphabetically by name:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT * FROM customers ORDER BY name DESC"

mycursor.execute(sql)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Python MySQL Delete From By


Delete Record

You can delete records from an existing table by using the "DELETE FROM" statement:

Example

Delete any record where the address is "Mountain 21":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "DELETE FROM customers WHERE address = 'Mountain 21'"

mycursor.execute(sql)

mydb.commit()

print(mycursor.rowcount, "record(s) deleted")

Important!: Notice the statement: mydb.commit(). It is required to make the changes, otherwise no changes are made to the table.

Notice the WHERE clause in the DELETE syntax: The WHERE clause specifies which record(s) that should be deleted. If you omit the WHERE clause, all records will be deleted!



Prevent SQL Injection

It is considered a good practice to escape the values of any query, also in delete statements.

This is to prevent SQL injections, which is a common web hacking technique to destroy or misuse your database.

The mysql.connector module uses the placeholder %s to escape values in the delete statement:

Example

Escape values by using the placeholder %s method:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "DELETE FROM customers WHERE address = %s"
adr = ("Yellow Garden 2", )

mycursor.execute(sql, adr)

mydb.commit()

print(mycursor.rowcount, "record(s) deleted")

Python MySQL Drop Table


Delete a Table

You can delete an existing table by using the "DROP TABLE" statement:

Example

Delete the table "customers":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "DROP TABLE customers"

mycursor.execute(sql)

Drop Only if Exist

If the table you want to delete is already deleted, or for any other reason does not exist, you can use the IF EXISTS keyword to avoid getting an error.

Example

Delete the table "customers" if it exists:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "DROP TABLE IF EXISTS customers"

mycursor.execute(sql)

Python MySQL Update Table


Update Table

You can update existing records in a table by using the "UPDATE" statement:

Example

Overwrite the address column from "Valley 345" to "Canyon 123":

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "UPDATE customers SET address = 'Canyon 123' WHERE address = 'Valley 345'"

mycursor.execute(sql)

mydb.commit()

print(mycursor.rowcount, "record(s) affected")

Important!: Notice the statement: mydb.commit(). It is required to make the changes, otherwise no changes are made to the table.

Notice the WHERE clause in the UPDATE syntax: The WHERE clause specifies which record or records that should be updated. If you omit the WHERE clause, all records will be updated!



Prevent SQL Injection

It is considered a good practice to escape the values of any query, also in update statements.

This is to prevent SQL injections, which is a common web hacking technique to destroy or misuse your database.

The mysql.connector module uses the placeholder %s to escape values in the delete statement:

Example

Escape values by using the placeholder %s method:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "UPDATE customers SET address = %s WHERE address = %s"
val = ("Valley 345", "Canyon 123")

mycursor.execute(sql, val)

mydb.commit()

print(mycursor.rowcount, "record(s) affected")

Python MySQL Limit


Limit the Result

You can limit the number of records returned from the query, by using the "LIMIT" statement:

Example

Select the 5 first records in the "customers" table:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT * FROM customers LIMIT 5")

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Start From Another Position

If you want to return five records, starting from the third record, you can use the "OFFSET" keyword:

Example

Start from position 3, and return 5 records:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

mycursor.execute("SELECT * FROM customers LIMIT 5 OFFSET 2")

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Python MySQL Join


Join Two or More Tables

You can combine rows from two or more tables, based on a related column between them, by using a JOIN statement.

Consider you have a "users" table and a "products" table:

users

{ id: 1, name: 'John', fav: 154},
{ id: 2, name: 'Peter', fav: 154},
{ id: 3, name: 'Amy', fav: 155},
{ id: 4, name: 'Hannah', fav:},
{ id: 5, name: 'Michael', fav:}

products

{ id: 154, name: 'Chocolate Heaven' },
{ id: 155, name: 'Tasty Lemons' },
{ id: 156, name: 'Vanilla Dreams' }

These two tables can be combined by using users' fav field and products' id field.

Example

Join users and products to see the name of the users favorite product:

import mysql.connector

mydb = mysql.connector.connect(
  host="localhost",
  user="yourusername",
  password="yourpassword",
  database="mydatabase"
)

mycursor = mydb.cursor()

sql = "SELECT
  users.name AS user,
  products.name AS favorite
  FROM users
  INNER JOIN products ON users.fav = products.id"

mycursor.execute(sql)

myresult = mycursor.fetchall()

for x in myresult:
  print(x)

Note: You can use JOIN instead of INNER JOIN. They will both give you the same result.



LEFT JOIN

In the example above, Hannah, and Michael were excluded from the result, that is because INNER JOIN only shows the records where there is a match.

If you want to show all users, even if they do not have a favorite product, use the LEFT JOIN statement:

Example

Select all users and their favorite product:

sql = "SELECT
  users.name AS user,
  products.name AS favorite
  FROM users
  LEFT JOIN products ON users.fav = products.id"

RIGHT JOIN

If you want to return all products, and the users who have them as their favorite, even if no user have them as their favorite, use the RIGHT JOIN statement:

Example

Select all products, and the user(s) who have them as their favorite:

sql = "SELECT
  users.name AS user,
  products.name AS favorite
  FROM users
  RIGHT JOIN products ON users.fav = products.id"

Note: Hannah and Michael, who have no favorite product, are not included in the result.


Python MongoDB


Python can be used in database applications.

One of the most popular NoSQL database is MongoDB.


MongoDB

MongoDB stores data in JSON-like documents, which makes the database very flexible and scalable.

To be able to experiment with the code examples in this tutorial, you will need access to a MongoDB database.

You can download a free MongoDB database at .

Or get started right away with a MongoDB cloud service at .


PyMongo

Python needs a MongoDB driver to access the MongoDB database.

In this tutorial we will use the MongoDB driver "PyMongo".

We recommend that you use PIP to install "PyMongo".

PIP is most likely already installed in your Python environment.

Navigate your command line to the location of PIP, and type the following:

Download and install "PyMongo":

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>python -m pip install pymongo

Now you have downloaded and installed a mongoDB driver.


Test PyMongo

To test if the installation was successful, or if you already have "pymongo" installed, create a Python page with the following content:

demo_mongodb_test.py:

import pymongo

If the above code was executed with no errors, "pymongo" is installed and ready to be used.


Python MongoDB Create Database


Creating a Database

To create a database in MongoDB, start by creating a MongoClient object, then specify a connection URL with the correct ip address and the name of the database you want to create.

MongoDB will create the database if it does not exist, and make a connection to it.

Example

Create a database called "mydatabase":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")

mydb = myclient["mydatabase"]

Important: In MongoDB, a database is not created until it gets content!

MongoDB waits until you have created a collection (table), with at least one document (record) before it actually creates the database (and collection).


Check if Database Exists

Remember: In MongoDB, a database is not created until it gets content, so if this is your first time creating a database, you should complete the next two chapters (create collection and create document) before you check if the database exists!

You can check if a database exist by listing all databases in you system:

Example

Return a list of your system's databases:

print(myclient.list_database_names())

Or you can check a specific database by name:

Example

Check if "mydatabase" exists:

dblist = myclient.list_database_names()
if "mydatabase" in dblist:
  print("The database exists.")

Python MongoDB Create Collection


A collection in MongoDB is the same as a table in SQL databases.

Creating a Collection

To create a collection in MongoDB, use database object and specify the name of the collection you want to create.

MongoDB will create the collection if it does not exist.

Example

Create a collection called "customers":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]

mycol = mydb["customers"]

Important: In MongoDB, a collection is not created until it gets content!

MongoDB waits until you have inserted a document before it actually creates the collection.


Check if Collection Exists

Remember: In MongoDB, a collection is not created until it gets content, so if this is your first time creating a collection, you should complete the next chapter (create document) before you check if the collection exists!

You can check if a collection exist in a database by listing all collections:

Example

Return a list of all collections in your database:

print(mydb.list_collection_names())

Or you can check a specific collection by name:

Example

Check if the "customers" collection exists:

collist = mydb.list_collection_names()
if "customers" in collist:
  print("The collection exists.")

Python MongoDB Insert Document


A document in MongoDB is the same as a record in SQL databases.

Insert Into Collection

To insert a record, or document as it is called in MongoDB, into a collection, we use the insert_one() method.

The first parameter of the insert_one() method is a dictionary containing the name(s) and value(s) of each field in the document you want to insert.

Example

Insert a record in the "customers" collection:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mydict = { "name": "John", "address": "Highway 37" }

x = mycol.insert_one(mydict)

Return the _id Field

The insert_one() method returns a InsertOneResult object, which has a property, inserted_id, that holds the id of the inserted document.

Example

Insert another record in the "customers" collection, and return the value of the _id field:

mydict = { "name": "Peter", "address": "Lowstreet 27" }

x = mycol.insert_one(mydict)

print(x.inserted_id)

If you do not specify an _id field, then MongoDB will add one for you and assign a unique id for each document.

In the example above no _id field was specified, so MongoDB assigned a unique _id for the record (document).



Insert Multiple Documents

To insert multiple documents into a collection in MongoDB, we use the insert_many() method.

The first parameter of the insert_many() method is a list containing dictionaries with the data you want to insert:

Example

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mylist = [
  { "name": "Amy", "address": "Apple st 652"},
  { "name": "Hannah", "address": "Mountain 21"},
  { "name": "Michael", "address": "Valley 345"},
  { "name": "Sandy", "address": "Ocean blvd 2"},
  { "name": "Betty", "address": "Green Grass 1"},
  { "name": "Richard", "address": "Sky st 331"},
  { "name": "Susan", "address": "One way 98"},
  { "name": "Vicky", "address": "Yellow Garden 2"},
  { "name": "Ben", "address": "Park Lane 38"},
  { "name": "William", "address": "Central st 954"},
  { "name": "Chuck", "address": "Main Road 989"},
  { "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

#print list of the _id values of the inserted documents:
print(x.inserted_ids)

The insert_many() method returns a InsertManyResult object, which has a property, inserted_ids, that holds the ids of the inserted documents.


Insert Multiple Documents, with Specified IDs

If you do not want MongoDB to assign unique ids for you document, you can specify the _id field when you insert the document(s).

Remember that the values has to be unique. Two documents cannot have the same _id.

Example

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mylist = [
  { "_id": 1, "name": "John", "address": "Highway 37"},
  { "_id": 2, "name": "Peter", "address": "Lowstreet 27"},
  { "_id": 3, "name": "Amy", "address": "Apple st 652"},
  { "_id": 4, "name": "Hannah", "address": "Mountain 21"},
  { "_id": 5, "name": "Michael", "address": "Valley 345"},
  { "_id": 6, "name": "Sandy", "address": "Ocean blvd 2"},
  { "_id": 7, "name": "Betty", "address": "Green Grass 1"},
  { "_id": 8, "name": "Richard", "address": "Sky st 331"},
  { "_id": 9, "name": "Susan", "address": "One way 98"},
  { "_id": 10, "name": "Vicky", "address": "Yellow Garden 2"},
  { "_id": 11, "name": "Ben", "address": "Park Lane 38"},
  { "_id": 12, "name": "William", "address": "Central st 954"},
  { "_id": 13, "name": "Chuck", "address": "Main Road 989"},
  { "_id": 14, "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

#print list of the _id values of the inserted documents:
print(x.inserted_ids)

Python MongoDB Find


In MongoDB we use the find() and find_one() methods to find data in a collection.

Just like the SELECT statement is used to find data in a table in a MySQL database.

Find One

To select data from a collection in MongoDB, we can use the find_one() method.

The find_one() method returns the first occurrence in the selection.

Example

Find the first document in the customers collection:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

x = mycol.find_one()

print(x)

Find All

To select data from a table in MongoDB, we can also use the find() method.

The find() method returns all occurrences in the selection.

The first parameter of the find() method is a query object. In this example we use an empty query object, which selects all documents in the collection.

No parameters in the find() method gives you the same result as SELECT * in MySQL.

Example

Return all documents in the "customers" collection, and print each document:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

for x in mycol.find():
  print(x)


Return Only Some Fields

The second parameter of the find() method is an object describing which fields to include in the result.

This parameter is optional, and if omitted, all fields will be included in the result.

Example

Return only the names and addresses, not the _ids:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

for x in mycol.find({},{ "_id": 0, "name": 1, "address": 1 }):
  print(x)

You are not allowed to specify both 0 and 1 values in the same object (except if one of the fields is the _id field). If you specify a field with the value 0, all other fields get the value 1, and vice versa:

Example

This example will exclude "address" from the result:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

for x in mycol.find({},{ "address": 0 }):
  print(x)

Example

You get an error if you specify both 0 and 1 values in the same object (except if one of the fields is the _id field):

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

for x in mycol.find({},{ "name": 1, "address": 0 }):
  print(x)

Python MongoDB Query


Filter the Result

When finding documents in a collection, you can filter the result by using a query object.

The first argument of the find() method is a query object, and is used to limit the search.

Example

Find document(s) with the address "Park Lane 38":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": "Park Lane 38" }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x)

Advanced Query

To make advanced queries you can use modifiers as values in the query object.

E.g. to find the documents where the "address" field starts with the letter "S" or higher (alphabetically), use the greater than modifier: {"$gt": "S"}:

Example

Find documents where the address starts with the letter "S" or higher:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": { "$gt": "S" } }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x)

Filter With Regular Expressions

You can also use regular expressions as a modifier.

Regular expressions can only be used to query strings.

To find only the documents where the "address" field starts with the letter "S", use the regular expression {"$regex": "^S"}:

Example

Find documents where the address starts with the letter "S":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": { "$regex": "^S" } }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x)

Python MongoDB Sort


Sort the Result

Use the sort() method to sort the result in ascending or descending order.

The sort() method takes one parameter for "fieldname" and one parameter for "direction" (ascending is the default direction).

Example

Sort the result alphabetically by name:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mydoc = mycol.find().sort("name")

for x in mydoc:
  print(x)

Sort Descending

Use the value -1 as the second parameter to sort descending.

sort("name", 1) #ascending
sort("name", -1) #descending

Example

Sort the result reverse alphabetically by name:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mydoc = mycol.find().sort("name", -1)

for x in mydoc:
  print(x)

Python MongoDB Delete Document


Delete Document

To delete one document, we use the delete_one() method.

The first parameter of the delete_one() method is a query object defining which document to delete.

Note: If the query finds more than one document, only the first occurrence is deleted.

Example

Delete the document with the address "Mountain 21":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": "Mountain 21" }

mycol.delete_one(myquery)

Delete Many Documents

To delete more than one document, use the delete_many() method.

The first parameter of the delete_many() method is a query object defining which documents to delete.

Example

Delete all documents were the address starts with the letter S:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": {"$regex": "^S"} }

x = mycol.delete_many(myquery)

print(x.deleted_count, " documents deleted.")

Delete All Documents in a Collection

To delete all documents in a collection, pass an empty query object to the delete_many() method:

Example

Delete all documents in the "customers" collection:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

x = mycol.delete_many({})

print(x.deleted_count, " documents deleted.")

Python MongoDB Drop Collection


Delete Collection

You can delete a table, or collection as it is called in MongoDB, by using the drop() method.

Example

Delete the "customers" collection:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mycol.drop()

The drop() method returns true if the collection was dropped successfully, and false if the collection does not exist.


Python MongoDB Update


Update Collection

You can update a record, or document as it is called in MongoDB, by using the update_one() method.

The first parameter of the update_one() method is a query object defining which document to update.

Note: If the query finds more than one record, only the first occurrence is updated.

The second parameter is an object defining the new values of the document.

Example

Change the address from "Valley 345" to "Canyon 123":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": "Valley 345" }
newvalues = { "$set": { "address": "Canyon 123" } }

mycol.update_one(myquery, newvalues)

#print "customers" after the update:
for x in mycol.find():
  print(x)

Update Many

To update all documents that meets the criteria of the query, use the update_many() method.

Example

Update all documents where the address starts with the letter "S":

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": { "$regex": "^S" } }
newvalues = { "$set": { "name": "Minnie" } }

x = mycol.update_many(myquery, newvalues)

print(x.modified_count, "documents updated.")

Python MongoDB Limit


Limit the Result

To limit the result in MongoDB, we use the limit() method.

The limit() method takes one parameter, a number defining how many documents to return.

Consider you have a "customers" collection:

Customers

{'_id': 1, 'name': 'John', 'address': 'Highway37'}
{'_id': 2, 'name': 'Peter', 'address': 'Lowstreet 27'}
{'_id': 3, 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': 4, 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': 5, 'name': 'Michael', 'address': 'Valley 345'}
{'_id': 6, 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': 7, 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': 8, 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': 9, 'name': 'Susan', 'address': 'One way 98'}
{'_id': 10, 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': 11, 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': 12, 'name': 'William', 'address': 'Central st 954'}
{'_id': 13, 'name': 'Chuck', 'address': 'Main Road 989'}
{'_id': 14, 'name': 'Viola', 'address': 'Sideway 1633'}

Example

Limit the result to only return 5 documents:

import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myresult = mycol.find().limit(5)

#print the result:
for x in myresult:
  print(x)

Python Reference


This section contains a Python reference documentation.


Python Reference


Module Reference


Python Built in Functions


Python has a set of built-in functions.

Function Description
Returns the absolute value of a number
Returns True if all items in an iterable object are true
Returns True if any item in an iterable object is true
Returns a readable version of an object. Replaces none-ascii characters with escape character
Returns the binary version of a number
Returns the boolean value of the specified object
Returns an array of bytes
Returns a bytes object
Returns True if the specified object is callable, otherwise False
Returns a character from the specified Unicode code.
classmethod() Converts a method into a class method
Returns the specified source as an object, ready to be executed
Returns a complex number
Deletes the specified attribute (property or method) from the specified object
Returns a dictionary (Array)
Returns a list of the specified object's properties and methods
Returns the quotient and the remainder when argument1 is divided by argument2
Takes a collection (e.g. a tuple) and returns it as an enumerate object
Evaluates and executes an expression
Executes the specified code (or object)
Use a filter function to exclude items in an iterable object
Returns a floating point number
Formats a specified value
Returns a frozenset object
Returns the value of the specified attribute (property or method)
Returns the current global symbol table as a dictionary
Returns True if the specified object has the specified attribute (property/method)
hash() Returns the hash value of a specified object
help() Executes the built-in help system
Converts a number into a hexadecimal value
Returns the id of an object
Allowing user input
Returns an integer number
Returns True if a specified object is an instance of a specified object
Returns True if a specified class is a subclass of a specified object
Returns an iterator object
Returns the length of an object
Returns a list
Returns an updated dictionary of the current local symbol table
Returns the specified iterator with the specified function applied to each item
Returns the largest item in an iterable
Returns a memory view object
Returns the smallest item in an iterable
Returns the next item in an iterable
Returns a new object
Converts a number into an octal
Opens a file and returns a file object
Convert an integer representing the Unicode of the specified character
Returns the value of x to the power of y
Prints to the standard output device
property() Gets, sets, deletes a property
Returns a sequence of numbers, starting from 0 and increments by 1 (by default)
repr() Returns a readable version of an object
Returns a reversed iterator
Rounds a numbers
Returns a new set object
Sets an attribute (property/method) of an object
Returns a slice object
Returns a sorted list
staticmethod() Converts a method into a static method
Returns a string object
Sums the items of an iterator
Returns an object that represents the parent class
Returns a tuple
Returns the type of an object
Returns the __dict__ property of an object
Returns an iterator, from two or more iterators


Python String Methods


Python has a set of built-in methods that you can use on strings.

Note: All string methods returns new values. They do not change the original string.

Method Description
Converts the first character to upper case
Converts string into lower case
Returns a centered string
Returns the number of times a specified value occurs in a string
Returns an encoded version of the string
Returns true if the string ends with the specified value
Sets the tab size of the string
Searches the string for a specified value and returns the position of where it was found
Formats specified values in a string
format_map() Formats specified values in a string
Searches the string for a specified value and returns the position of where it was found
Returns True if all characters in the string are alphanumeric
Returns True if all characters in the string are in the alphabet
Returns True if all characters in the string are ascii characters
Returns True if all characters in the string are decimals
Returns True if all characters in the string are digits
Returns True if the string is an identifier
Returns True if all characters in the string are lower case
Returns True if all characters in the string are numeric
Returns True if all characters in the string are printable
Returns True if all characters in the string are whitespaces
Returns True if the string follows the rules of a title
Returns True if all characters in the string are upper case
Converts the elements of an iterable into a string
Returns a left justified version of the string
Converts a string into lower case
Returns a left trim version of the string
Returns a translation table to be used in translations
Returns a tuple where the string is parted into three parts
Returns a string where a specified value is replaced with a specified value
Searches the string for a specified value and returns the last position of where it was found
Searches the string for a specified value and returns the last position of where it was found
Returns a right justified version of the string
Returns a tuple where the string is parted into three parts
Splits the string at the specified separator, and returns a list
Returns a right trim version of the string
Splits the string at the specified separator, and returns a list
Splits the string at line breaks and returns a list
Returns true if the string starts with the specified value
Returns a trimmed version of the string
Swaps cases, lower case becomes upper case and vice versa
Converts the first character of each word to upper case
Returns a translated string
Converts a string into upper case
Fills the string with a specified number of 0 values at the beginning

Note: All string methods returns new values. They do not change the original string.


Learn more about strings in our .



Python List/Array Methods


Python has a set of built-in methods that you can use on lists/arrays.

Method Description
Adds an element at the end of the list
Removes all the elements from the list
Returns a copy of the list
Returns the number of elements with the specified value
Add the elements of a list (or any iterable), to the end of the current list
Returns the index of the first element with the specified value
Adds an element at the specified position
Removes the element at the specified position
Removes the first item with the specified value
Reverses the order of the list
Sorts the list

Note: Python does not have built-in support for Arrays, but Python Lists can be used instead.


Learn more about lists in our .

Learn more about arrays in our .


Python Dictionary Methods


Python has a set of built-in methods that you can use on dictionaries.

Method Description
Removes all the elements from the dictionary
Returns a copy of the dictionary
Returns a dictionary with the specified keys and value
Returns the value of the specified key
Returns a list containing a tuple for each key value pair
Returns a list containing the dictionary's keys
Removes the element with the specified key
Removes the last inserted key-value pair
Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
Updates the dictionary with the specified key-value pairs
Returns a list of all the values in the dictionary

Learn more about dictionaries in our .



Python Tuple Methods


Python has two built-in methods that you can use on tuples.

Method Description
Returns the number of times a specified value occurs in a tuple
Searches the tuple for a specified value and returns the position of where it was found

Learn more about tuples in our .



Python Set Methods


Python has a set of built-in methods that you can use on sets.

Method Description
Adds an element to the set
Removes all the elements from the set
Returns a copy of the set
Returns a set containing the difference between two or more sets
Removes the items in this set that are also included in another, specified set
Remove the specified item
Returns a set, that is the intersection of two or more sets
Removes the items in this set that are not present in other, specified set(s)
Returns whether two sets have a intersection or not
Returns whether another set contains this set or not
Returns whether this set contains another set or not
Removes an element from the set
Removes the specified element
Returns a set with the symmetric differences of two sets
inserts the symmetric differences from this set and another
Return a set containing the union of sets
Update the set with another set, or any other iterable

Learn more about sets in our .



Python File Methods


Python has a set of methods available for the file object.

Method Description
Closes the file
detach() Returns the separated raw stream from the buffer
Returns a number that represents the stream, from the operating system's perspective
Flushes the internal buffer
Returns whether the file stream is interactive or not
Returns the file content
Returns whether the file stream can be read or not
Returns one line from the file
Returns a list of lines from the file
Change the file position
Returns whether the file allows us to change the file position
Returns the current file position
Resizes the file to a specified size
Returns whether the file can be written to or not
Writes the specified string to the file
Writes a list of strings to the file

Learn more about the file object in our .



Python Keywords


Python has a set of keywords that are reserved words that cannot be used as variable names, function names, or any other identifiers:

Keyword Description
A logical operator
To create an alias
For debugging
To break out of a loop
To define a class
To continue to the next iteration of a loop
To define a function
To delete an object
Used in conditional statements, same as else if
Used in conditional statements
Used with exceptions, what to do when an exception occurs
Boolean value, result of comparison operations
Used with exceptions, a block of code that will be executed no matter if there is an exception or not
To create a for loop
To import specific parts of a module
To declare a global variable
To make a conditional statement
To import a module
To check if a value is present in a list, tuple, etc.
To test if two variables are equal
To create an anonymous function
Represents a null value
To declare a non-local variable
A logical operator
A logical operator
A null statement, a statement that will do nothing
To raise an exception
To exit a function and return a value
Boolean value, result of comparison operations
To make a try...except statement
To create a while loop
with Used to simplify exception handling
yield To end a function, returns a generator


Python Built-in Exceptions


Built-in Exceptions

The table below shows built-in exceptions that are usually raised in Python:

Exception Description
ArithmeticError Raised when an error occurs in numeric calculations
AssertionError Raised when an assert statement fails
AttributeError Raised when attribute reference or assignment fails
Exception Base class for all exceptions
EOFError Raised when the input() method hits an "end of file" condition (EOF)
FloatingPointError Raised when a floating point calculation fails
GeneratorExit Raised when a generator is closed (with the close() method)
ImportError Raised when an imported module does not exist
IndentationError Raised when indentation is not correct
IndexError Raised when an index of a sequence does not exist
KeyError Raised when a key does not exist in a dictionary
KeyboardInterrupt Raised when the user presses Ctrl+c, Ctrl+z or Delete
LookupError Raised when errors raised cant be found
MemoryError Raised when a program runs out of memory
NameError Raised when a variable does not exist
NotImplementedError Raised when an abstract method requires an inherited class to override the method
OSError Raised when a system related operation causes an error
OverflowError Raised when the result of a numeric calculation is too large
ReferenceError Raised when a weak reference object does not exist
RuntimeError Raised when an error occurs that do not belong to any specific exceptions
StopIteration Raised when the next() method of an iterator has no further values
SyntaxError Raised when a syntax error occurs
TabError Raised when indentation consists of tabs or spaces
SystemError Raised when a system error occurs
SystemExit Raised when the sys.exit() function is called
TypeError Raised when two different types are combined
UnboundLocalError Raised when a local variable is referenced before assignment
UnicodeError Raised when a unicode problem occurs
UnicodeEncodeError Raised when a unicode encoding problem occurs
UnicodeDecodeError Raised when a unicode decoding problem occurs
UnicodeTranslateError Raised when a unicode translation problem occurs
ValueError Raised when there is a wrong value in a specified data type
ZeroDivisionError Raised when the second operator in a division is zero

Python Glossary


This is a list of all the features explained in the Python Tutorial.


Feature Description
Indentation refers to the spaces at the beginning of a code line
Comments are code lines that will not be executed
How to insert comments on multiple lines
Variables are containers for storing data values
How to name your variables
How to assign values to multiple variables
Use the print statement to output variables
How to combine strings
Global variables are variables that belongs to the global scope
Python has a set of built-in data types
How to get the data type of an object
How to set the data type of an object
There are three numeric types in Python
The integer number type
The floating number type
The complex number type
How to convert from one number type to another
How to create a random number
How to specify a certain data type for a variable
How to create string literals
How to assign a string value to a variable
How to create a multiline string
Strings in Python are arrays of bytes representing Unicode characters
How to slice a string
How to use negative indexing when accessing a string
How to get the length of a string
How to check if a string contains a specified phrase
How to combine two strings
How to use escape characters
True or False
Evaluate a value or statement and return either True or False
Functions that return a Boolean value
Use operator to perform operations in Python
Arithmetic operator are used to perform common mathematical operations
Assignment operators are use to assign values to variables
Comparison operators are used to compare two values
Logical operators are used to combine conditional statements
Identity operators are used to see if two objects are in fact the same object
Membership operators are used to test is a sequence is present in an object
Bitwise operators are used to compare (binary) numbers
A list is an ordered, and changeable, collection
How to access items in a list
How to change the value of a list item
How to loop through the items in a list
How use a list comprehensive
How to check if a specified item is present in a list
How to determine the length of a list
How to add items to a list
How to remove list items
How to copy a list
How to join two lists
A tuple is an ordered, and unchangeable, collection
How to access items in a tuple
How to change the value of a tuple item
How to loop through the items in a tuple
How to check if a specified item is present in a tuple
How to determine the length of a tuple
How to create a tuple with only one item
How to remove tuple items
How to join two tuples
A set is an unordered, and unchangeable, collection
How to access items in a set
How to add items to a set
How to loop through the items in a set
How to check if a item exists
How to determine the length of a set
How to remove set items
How to join two sets
A dictionary is an unordered, and changeable, collection
How to access items in a dictionary
How to change the value of a dictionary item
How to loop through the items in a tuple
How to check if a specified item is present in a dictionary
How to determine the length of a dictionary
How to add an item to a dictionary
How to remove dictionary items
How to copy a dictionary
A dictionary within a dictionary
How to write an if statement
If statements in Python relies on indentation (whitespace at the beginning of a line)
elif is the same as "else if" in other programming languages
How to write an if...else statement
How to write an if statement in one line
How to write an if...else statement in one line
Use the and keyword to combine if statements
Use the or keyword to combine if statements
Use the not keyword to reverse the condition
How to write an if statement inside an if statement
Use the pass keyword inside empty if statements
How to write a while loop
How to break a while loop
How to stop the current iteration and continue wit the next
How to use an else statement in a while loop
How to write a for loop
How to loop through a string
How to break a for loop
How to stop the current iteration and continue wit the next
How to loop through a range of values
How to use an else statement in a for loop
How to write a loop inside a loop
Use the pass keyword inside empty for loops
How to create a function in Python
How to call a function in Python
How to use arguments in a function
To deal with an unknown number of arguments in a function, use the * symbol before the parameter name
How to use keyword arguments in a function
To deal with an unknown number of keyword arguments in a function, use the * symbol before the parameter name
How to use a default parameter value
How to pass a list as an argument
How to return a value from a function
Use the pass statement in empty functions
Functions that can call itself is called recursive functions
How to create anonymous functions in Python
Learn when to use a lambda function or not
Lists can be used as Arrays
Arrays are variables that can hold more than one value
How to access array items
How to get the length of an array
How to loop through array elements
How to add elements from an array
How to remove elements from an array
Python has a set of Array/Lists methods
A class is like an object constructor
How to create a class
The __init__() function is executed when the class is initiated
Methods in objects are functions that belongs to the object
The self parameter refers to the current instance of the class
How to modify properties of an object
How to modify properties of an object
How to delete an object
Use the pass statement in empty classes
How to create a parent class
How to create a child class
How to create the __init__() function
The super() function make the child class inherit the parent class
How to add a property to a class
How to add a method to a class
An iterator is an object that contains a countable number of values
What is the difference between an iterator and an iterable
How to loop through the elements of an iterator
How to create an iterator
How to stop an iterator
When does a variable belong to the global scope?
The global keyword makes the variable global
How to create a module
How to use variables in a module
How to rename a module
How to import built-in modules
List all variable names and function names in a module
How to import only parts from a module
How to work with dates in Python
How to output a date
How to create a date object
How to format a date object into a readable string
The datetime module has a set of legal format codes
How to work with JSON in Python
How to parse JSON code in Python
How to convert a Python object in to JSON
How to format JSON output with indentations and line breaks
How to sort JSON
How to import the regex module
The re module has a set of functions
Metacharacters are characters with a special meaning
A backslash followed by a a character has a special meaning
A set is a set of characters inside a pair of square brackets with a special meaning
The Match Object is an object containing information about the search and the result
How to install PIP
How to download and install a package with PIP
How to remove a package with PIP
How to handle errors in Python
How to handle more than one exception
How to use the else keyword in a try statement
How to use the finally keyword in a try statement
How to raise an exception in Python


Python Random Module


Python has a built-in module that you can use to make random numbers.

The random module has a set of methods:

Method Description
Initialize the random number generator
Returns the current internal state of the random number generator
Restores the internal state of the random number generator
Returns a number representing the random bits
Returns a random number between the given range
Returns a random number between the given range
Returns a random element from the given sequence
Returns a list with a random selection from the given sequence
Takes a sequence and returns the sequence in a random order
Returns a given sample of a sequence
Returns a random float number between 0 and 1
Returns a random float number between two given parameters
Returns a random float number between two given parameters, you can also set a mode parameter to specify the midpoint between the two other parameters
betavariate() Returns a random float number between 0 and 1 based on the Beta distribution (used in statistics)
expovariate() Returns a random float number based on the Exponential distribution (used in statistics)
gammavariate() Returns a random float number based on the Gamma distribution (used in statistics)
gauss() Returns a random float number based on the Gaussian distribution (used in probability theories)
lognormvariate() Returns a random float number based on a log-normal distribution (used in probability theories)
normalvariate() Returns a random float number based on the normal distribution (used in probability theories)
vonmisesvariate() Returns a random float number based on the von Mises distribution (used in directional statistics)
paretovariate() Returns a random float number based on the Pareto distribution (used in probability theories)
weibullvariate() Returns a random float number based on the Weibull distribution (used in statistics)



Example

Make a request to a web page, and print the response text:

import requests

x = requests.get('https://w3schools.com/python/demopage.htm')

print(x.text)

Definition and Usage

The requests module allows you to send HTTP requests using Python.

The HTTP request returns a with all the response data (content, encoding, status, etc).


Download and Install the Requests Module

Navigate your command line to the location of PIP, and type the following:

C:UsersYour NameAppDataLocalProgramsPythonPython36-32Scripts>pip install requests

Syntax

requests.methodname(params)

Methods

Method Description
Sends a DELETE request to the specified url
Sends a GET request to the specified url
Sends a HEAD request to the specified url
patch(url, data, args) Sends a PATCH request to the specified url
Sends a POST request to the specified url
put(url, data, args) Sends a PUT request to the specified url
request(method, url, args) Sends a request of the specified method to the specified url

Python statistics Module


Python statistics Module

Python has a built-in module that you can use to calculate mathematical statistics of numeric data.

The statistics module was new in Python 3.4.


Statistics Methods

Method Description
Calculates the harmonic mean (central location) of the given data
Calculates the mean (average) of the given data
Calculates the median (middle value) of the given data
Calculates the median of grouped continuous data
Calculates the high median of the given data
Calculates the low median of the given data
Calculates the mode (central tendency) of the given numeric or nominal data
Calculates the standard deviation from an entire population
Calculates the standard deviation from a sample of data
Calculates the variance of an entire population
Calculates the variance from a sample of data

Python math Module


Python math Module

Python has a built-in module that you can use for mathematical tasks.

The math module has a set of methods and constants.


Math Methods

Method Description
Returns the arc cosine of a number
Returns the inverse hyperbolic cosine of a number
Returns the arc sine of a number
Returns the inverse hyperbolic sine of a number
Returns the arc tangent of a number in radians
Returns the arc tangent of y/x in radians
Returns the inverse hyperbolic tangent of a number
Rounds a number up to the nearest integer
Returns the number of ways to choose k items from n items without repetition and order
Returns a float consisting of the value of the first parameter and the sign of the second parameter
Returns the cosine of a number
Returns the hyperbolic cosine of a number
Converts an angle from radians to degrees
Returns the Euclidean distance between two points (p and q), where p and q are the coordinates of that point
Returns the error function of a number
Returns the complementary error function of a number
Returns E raised to the power of x
Returns Ex - 1
Returns the absolute value of a number
Returns the factorial of a number
Rounds a number down to the nearest integer
Returns the remainder of x/y
Returns the mantissa and the exponent, of a specified number
Returns the sum of all items in any iterable (tuples, arrays, lists, etc.)
Returns the gamma function at x
Returns the greatest common divisor of two integers
Returns the Euclidean norm
Checks whether two values are close to each other, or not
Checks whether a number is finite or not
Checks whether a number is infinite or not
Checks whether a value is NaN (not a number) or not
Rounds a square root number downwards to the nearest integer
Returns the inverse of which is x * (2**i) of the given numbers x and i
Returns the log gamma value of x
Returns the natural logarithm of a number, or the logarithm of number to base
Returns the base-10 logarithm of x
Returns the natural logarithm of 1+x
Returns the base-2 logarithm of x
Returns the number of ways to choose k items from n items with order and without repetition
Returns the value of x to the power of y
Returns the product of all the elements in an iterable
Converts a degree value into radians
Returns the closest value that can make numerator completely divisible by the denominator
Returns the sine of a number
Returns the hyperbolic sine of a number
Returns the square root of a number
Returns the tangent of a number
Returns the hyperbolic tangent of a number
Returns the truncated integer parts of a number

Math Constants

Constant Description
Returns Euler's number (2.7182...)
Returns a floating-point positive infinity
Returns a floating-point NaN (Not a Number) value
Returns PI (3.1415...)
Returns tau (6.2831...)

Python cmath Module


Python cmath Module

Python has a built-in module that you can use for mathematical tasks for complex numbers.

The methods in this module accepts int, float, and complex numbers. It even accepts Python objects that has a __complex__() or __float__() method.

The methods in this module almost always return a complex number. If the return value can be expressed as a real number, the return value has an imaginary part of 0.

The cmath module has a set of methods and constants.


cMath Methods

Method Description
Returns the arc cosine value of x
Returns the hyperbolic arc cosine of x
Returns the arc sine of x
Returns the hyperbolic arc sine of x
Returns the arc tangent value of x
Returns the hyperbolic arctangent value of x
Returns the cosine of x
Returns the hyperbolic cosine of x
Returns the value of Ex, where E is Euler's number (approximately 2.718281...), and x is the number passed to it
Checks whether two values are close, or not
Checks whether x is a finite number
Check whether x is a positive or negative infinty
Checks whether x is NaN (not a number)
Returns the logarithm of x to the base
Returns the base-10 logarithm of x
Return the phase of a complex number
Convert a complex number to polar coordinates
Convert polar coordinates to rectangular form
Returns the sine of x
Returns the hyperbolic sine of x
Returns the square root of x
Returns the tangent of x
Returns the hyperbolic tangent of x

cMath Constants

Constant Description
Returns Euler's number (2.7182...)
Returns a floating-point positive infinity value
Returns a complex infinity value
Returns floating-point NaN (Not a Number) value
Returns coplext NaN (Not a Number) value
Returns PI (3.1415...)
Returns tau (6.2831...)


Learn how to remove duplicates from a List in Python.


Example

Remove any duplicates from a List:

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)

Example Explained

First we have a List that contains duplicates:

A List with Duplicates

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)

Create a dictionary, using the List items as keys. This will automatically remove any duplicates because dictionaries cannot have duplicate keys.

Create a Dictionary

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)

Then, convert the dictionary back into a list:

Convert Into a List

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)

Now we have a List without any duplicates, and it has the same order as the original List.

Print the List to demonstrate the result

Print the List

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)


Create a Function

If you like to have a function where you can send your lists, and get them back without duplicates, you can create a function and insert the code from the example above.

Example

def my_function(x):
  return list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Example Explained

Create a function that takes a List as an argument.

Create a Function

def my_function(x):
  return list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Create a dictionary, using this List items as keys.

Create a Dictionary

def my_function(x):
  return list(
dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Convert the dictionary into a list.

Convert Into a List

def my_function(x):
  return
list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Return the list

Return List

def my_function(x):
 
return list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Call the function, with a list as a parameter:

Call the Function

def my_function(x):
  return list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)

Print the result:

Print the Result

def my_function(x):
  return list(dict.fromkeys(x))

mylist = my_function(["a", "b", "a", "c", "c"])

print(mylist)


Learn how to reverse a String in Python.


There is no built-in function to reverse a String in Python.

The fastest (and easiest?) way is to use a slice that steps backwards, -1.

Example

Reverse the string "Hello World":

txt = "Hello World"[::-1]
print(txt)

Example Explained

We have a string, "Hello World", which we want to reverse:

The String to Reverse

txt = "Hello World"[::-1]
print(txt)

Create a slice that starts at the end of the string, and moves backwards.

In this particular example, the slice statement [::-1] means start at the end of the string and end at position 0, move with the step -1, negative one, which means one step backwards.

Slice the String

txt = "Hello World"[::-1]
print(txt)

Now we have a string txt that reads "Hello World" backwards.

Print the String to demonstrate the result

Print the List

txt = "Hello World"[::-1]
print(txt)


Create a Function

If you like to have a function where you can send your strings, and return them backwards, you can create a function and insert the code from the example above.

Example

def my_function(x):
  return x[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt)

Example Explained

Create a function that takes a String as an argument.

Create a Function

def my_function(x):
  return x[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt)

Slice the string starting at the end of the string and move backwards.

Slice the String

def my_function(x):
  return x
[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt)

Return the backward String

Return the String

def my_function(x):
 
return x[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt )

Call the function, with a string as a parameter:

Call the Function

def my_function(x):
  return x[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt)

Print the result:

Print the Result

def my_function(x):
  return x[::-1]

mytxt = my_function("I wonder how this text looks like backwards")

print(mytxt)


Learn how to add two numbers in Python.


Use the + operator to add two numbers:

Example

x = 5
y = 10
print(x + y)

Add Two Numbers with User Input

In this example, the user must input two numbers. Then we print the sum by calculating (adding) the two numbers:

Example

x = input("Type a number: ")
y = input("Type another number: ")

sum = int(x) + int(y)

print("The sum is: ", sum)

Python Examples


Python Syntax


Python Variables


Python Numbers


Python Casting



Python Strings


Python Operators


Python Lists


Python Tuples


Python Sets


Python Dictionaries


Python If ... Else


Python While Loop


Python For Loop


Python Functions


Python Lambda


Python Arrays


Python Classes and Objects


Python Iterators


Python Modules


Python Dates


Python Math


Python JSON


Python RegEx


Python PIP


Python Try Except


Python File Handling


Python MySQL


Python MongoDB



Python Online Compiler


Python Compiler (Editor)

With our online Python compiler, you can edit Python code, and view the result in your browser.


Example

print("Hello, World!")

x = "Python"
y = "is"
z = "awesome"
print(x, y, z)
Hello, World!
Python is awesome

Click on the "Try it Yourself" button to see how it works.


Publish Your Code

If you want to create your own website or build Python applications, check out .

is a website-building tool that enables you to create and share your own website. You can also get a Python server, allowing you to develop and host your Python applications with ease.

Note: This includes Python libraries such as: Django, Pandas, NumPy, SciPy and more.

You can change the website's look and how it works by editing the code right in your web browser.

It's easy to use and doesn't require any setup:

The code editor is packed with features to help you achieve more:

  • Templates: Start from scratch or use a template
  • Cloud-based: no installations required. You only need your browser
  • Terminal & Log: debug and troubleshoot your code easily
  • File Navigator: switch between files inside the code editor
  • And much more!

Learn Faster

Practice is key to mastering coding, and the best way to put your Python knowledge into practice is by getting practical with code.

Use to build, test and deploy code.

The code editor lets you write and practice different types of computer languages. It includes Python, but you can use it for other languages too.

New languages are added all the time:

Languages

If you don't know Python, we suggest that you read our from scratch.


Easy Package Management

Get an overview of your packages and easily add or delete frameworks and libraries. Then, with just one click, you can make changes to your packages without manual installation.


Build Powerful Websites

You can also use the code editor in to build frontend or full-stack websites from scratch.

Or you can use the 60+ templates available and save time:

Photographer website template
Blog website template
Webshop template
Tutor website template

Create your Spaces account today and explore them all!


Share Your Website With The World

Host and publish your websites in no time with .

W3Schools subdomain and SSL certificate are included for free with . An SSL certificate makes your website safe and secure. It also helps people trust your website and makes it easier to find it online.

Want a custom domain for your website?

You can buy a domain or transfer an existing one and connect it to your space.


How Does It Work?

Get started in a few clicks with .



Python Exercises


You can test your Python skills with W3Schools' Exercises.


Exercises

We have gathered a variety of Python exercises (with answers) for each Python Chapter.

Try to solve an exercise by filling in the missing parts of a code. If you're stuck, hit the "Show Answer" button to see what you've done wrong.

Count Your Score

You will get 1 point for each correct answer. Your score and total score will always be displayed.

Start Python Exercises

Good luck!

If you don't know Python, we suggest that you read our from scratch.


Kickstart your career

Get certified by completing the course

w3schools CERTIFIED . 2023

Python Quiz


You can test your Python skills with W3Schools' Quiz.


The Test

The test contains 25 questions and there is no time limit.

The test is not official, it's just a nice way to see how much you know, or don't know, about Python.

Count Your Score

You will get 1 point for each correct answer. At the end of the Quiz, your total score will be displayed. Maximum score is 25 points.

Start the Quiz

Good luck!

If you don't know Python, we suggest that you read our from scratch.


Kickstart your career

Get certified by completing the course

w3schools CERTIFIED . 2023

W3Schools Python Bootcamp


Python Bootcamp

Learn with W3Schools.

Live online learning sessions.

Duration: 3 Weeks.

w3schools BOOTCAMP . 2023

In this mini bootcamp, you will learn the fundamentals of Python. This powerful programming language is widely used in web development, data science and artificial intelligence, and many other fields.

You will learn the fundamentals of Python over a 3 weeks period together with a live instructor and an interactive cohort of engaged learners!

You Will Learn
  • The fundamentals of the Python language & syntax. Fundamentals including variables and datatypes, through to control structures, arrays, functions, objects, and more.
  • To write scripts for automating tasks using Python.
  • To create and serve web applications.
  • To write programs for mathematical and scientific computing using Python and relevant libraries.
  • Use object-oriented programming and third-party libraries.
  • To build secure and reliable applications.

3 Reasons to Join the Bootcamp

1. Live Online Instruction

Learn directly from experienced instructors through live online learning sessions.

2. Live Sessions

Learn three evenings per week between 7pm and 9pm, available in multiple time zones.

3. Affordable & Flexible

One of the most affordable instructor-led online bootcamps, with flexible payment options.

How it Works

1. Application and Enrollment:
Reserve your seat and enroll in the bootcamp by paying the bootcamp fee of only $395. The price includes the exam fee and 18 hours with live instructor.

2. Complete the Bootcamp:
After enrollment, you will be placed in a learning cohort with other students. You will go through the course material together and complete assignments and projects with the help of an experienced instructor.

The bootcamp covers the fundamentals of Python. The curriculum used is the . The bootcamp will help you prepare for the W3Schools Python exam so that you can document and validate your competence.

Throughout the bootcamp, you will receive support from your cohort and the W3Schools team to help you grow your skill set.

3. Certification and Job Application:
Upon completing the bootcamp, you will have passed the W3Schools Python exam and obtained the Certified Python Developer Title and an extra acknowledgement that you were a part of the bootcamp on the certification. This certification demonstrates that you know the fundamentals of Python.



W3Schools Python Certificate


w3schools CERTIFIED . 2023   

W3Schools offers an Online Certification Program.

The perfect solution for busy professionals who need to balance work, family, and career building.

More than 50 000 certificates already issued!


w3schools CERTIFIED . 2023

W3Schools offers an Online Certification Program.

The perfect solution for busy professionals who need to balance work, family, and career building.

More than 50 000 certificates already issued!

Document your skills
Improve your career
Study at your own pace
Save time and money
Known brand
Trusted by top companies

Who Should Consider Getting Certified?

Any student or professional within the digital industry.

Certifications are valuable assets to gain trust and demonstrate knowledge to your clients, current or future employers on a ever increasing competitive market.

W3Schools is Trusted by Top Companies

W3Schools has over two decades of experience with teaching coding online.

Our certificates are recognized and valued by companies looking to employ skilled developers.

Save Time and Money

Show the world your coding skills by getting a certification.

The prices is a small fraction compared to the price of traditional education.

Document and validate your competence by getting certified!

Exam overview

Fee: 95 USD

Number of questions: 70

Requirement to pass: 75% correct answers

Time limit: 70 minutes

Number of attempts to pass: Two

Exam deadline: None

Certification Expiration: None

Format: Online, multiple choice


Advance Faster in Your Career

Getting a certificate proves your commitment to upgrading your skills.

The certificate can be added as credentials to your CV, Resume, LinkedIn profile, and so on.

It gives you the credibility needed for more responsibilities, larger projects, and a higher salary.

Knowledge is power, especially in the current job market.

Documentation of your skills enables you to advance your career or helps you to start a new one.


How Does It Work?

  • Study for free at W3Schools.com
  • Study at your own speed
  • Test your skills with W3Schools online quizzes
  • Apply for your certificate by paying an exam fee
  • Take your exam online, at any time, and from any location

Get Your Certificate and Share It With The World

Example certificate:

Each certificate gets a unique link that can be shared with others.

Validate your certification with the link or QR code.

Check how it looks like in this .

Share your certificate on LinkedIn in the Certifications section in just one click!


Document Your Skills

Getting a certificate proves your commitment to upgrade your skills, gives you the credibility needed for more responsibilities, larger projects, and a higher salary.



Looking to add multiple users?

Are you an educator, manager or business owner looking for courses or certifications?

We are working with schools, companies and organizations from all over the world.



Login
ADS CODE