Python professionals: A Common Scenario

I have lived in Bangalore, the Silicon Valley of India, for the last four years, and there is a saying I often overhear when I travel to other cities. It is funny at times, but speaking as someone who has been a Bangalore citizen for a long time, it is not entirely false.

“If I throw a stone at a random crowd in Bangalore, 8 out of 10 times it would be an IT professional who gets hit by it.”

On a similar note, as an IT professional working in the same city who knows the industry, the technologies that are flourishing and the skills likely to be in demand, allow me to alter a few words of this saying and frame another sentence:

“If I go around any organization asking which programming language helps most with Data Science, 8 out of 10 professionals would instantly say Python.”

Python is widely regarded as the preferred programming language for data science among professionals in the field. There are other options such as R and SAS, but Python stands out because of a combination of features. Speaking from the viewpoint of a professional, below are some of the points that make Python so popular today:

  • Python is a high-level, general-purpose programming language that is easy to learn and apply.
  • Python code reads more like English than the heavier syntax of languages such as Java, C++ or C#. In short, the language is intuitive and easy to follow because its keywords are plain English words.
  • A code base written in Python is usually easy to maintain and debug, since an operation that takes several lines in other languages often needs only a few in Python.
  • Python does not stand alone: it is backed by global communities that keep developing libraries and support functions. Python is also open source, which means that anyone across the globe can contribute and make it more useful and convenient.
  • Learning Python for data science is made easier by the overwhelming support from these communities, research institutes and students across the globe.
  • Python offers libraries for graphical interfaces as well as packages for data visualization. A complete package, if I may say so.
  • Python is robust and versatile across use cases. It supports web development, software development, game development and more, a breadth that data science competitors such as R and SAS do not match.

One drawback often cited for Python in data science is that it offers fewer highly specialized statistical and mathematical routines than R, where such methods are omnipresent. For the majority of applications in the corporate sector, what Python has to offer is more than sufficient, so this drawback is often overlooked when planning a data science project. The same does not always hold on a research-oriented platform: many statisticians and researchers still prefer R because of the availability of highly mathematical and sophisticated algorithms in their fields.

Still, one does come across professionals in the IT landscape who are more inclined towards R or even SAS, and there is nothing wrong with that. It all depends on the type of applications they deal with on a regular basis and on their area of expertise. Frankly speaking, there is no single right answer as to which programming language has more advantages than its competitors.

This article is dedicated to some fun facts about the Python programming language, especially Python for Data Science. But before we get into the details, let’s take a closer look at Python from its birth to the present.

Python as we know it

Python is a high-level programming language created by Guido van Rossum and first released in 1991. It is robust, and I consider it one of the few complete programming languages, with applications ranging from web development, server configuration, software development and maintenance, and data analytics to general scripting. Some of the areas of software development where Python comes in handy are as follows:

  • Python is often used for server side configurations in a web development project.
  • Python is used to create workflows.
  • Python can connect to a variety of databases, query and update data within those databases.
  • Python is capable of handling big data and complex mathematical operations.
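As a minimal sketch of the database point above, the snippet below uses the standard-library sqlite3 module with an in-memory database. The table name employees, its columns and the sample rows are made up for illustration; other database systems need their own driver packages.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employees (name, city) VALUES (?, ?)",
    [("Asha", "Bangalore"), ("Ravi", "Pune")],  # hypothetical sample rows
)

# Update one row, then query the table back.
conn.execute("UPDATE employees SET city = ? WHERE name = ?", ("Bangalore", "Ravi"))
rows = conn.execute("SELECT name, city FROM employees ORDER BY name").fetchall()
conn.close()

print(rows)
```

The question marks are placeholders that sqlite3 fills in safely, which avoids building SQL strings by hand.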

Python’s capabilities in data analytics range from computing the simplest summary statistics of a data set to building more complex and sophisticated models that draw deeper insights from the data.
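At the simple end of that range, summary statistics need nothing beyond the standard library. The sketch below uses the statistics module; the list of daily order counts is invented sample data, not taken from any real data set.

```python
import statistics

# Hypothetical sample: daily order counts for one week.
orders = [12, 15, 11, 19, 15, 22, 18]

summary = {
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "mode": statistics.mode(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
}
print(summary)
```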

Now that we have seen some background on the Python programming language, let’s dive straight into some interesting facts about it.

Top Facts about Python for Data Science

  • Python was a time pass project: What if I told you that the programming language so widely used in Data Science today was the brainchild of a programmer who was looking for a hobby project to get through his holidays? That was indeed the case. Renowned computer programmer Guido van Rossum was looking for a project to keep him occupied over the Christmas holidays in 1989. He wanted to write an interpreter for a new scripting language that would appeal to Unix and C hackers. In 1991 the first version of Python was released, and the rest is history.
  • Python is not the snake: Contrary to popular belief, the programming language was not named after the famous non-venomous snake. Python got its name from Monty Python, the British comedy group behind the BBC television series Monty Python’s Flying Circus, which first aired in 1969 and ran into the 1970s. Guido was a big fan of the show and decided to name the language after it.
  • The Zen of Python: Python has its own poem that sets out guiding principles followed by programmers coding in the language. The Zen of Python was written by Tim Peters, a major contributor to the open source project, and it lays out the philosophy of the Python language. All you have to do is type the command import this and the Zen of Python is displayed on your screen, as in the snippet below:

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
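A small extra for the curious: in CPython's standard library, the this module stores the poem ROT13-encoded in an attribute named s. That is an implementation detail rather than a documented API, so treat the sketch below as an illustration that may not hold on other implementations.

```python
import codecs
import contextlib
import io

# Importing `this` prints the poem, so capture stdout to keep the output tidy.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# In CPython, `this.s` holds the ROT13-encoded text (an implementation detail).
zen = codecs.decode(this.s, "rot_13")
print(zen.splitlines()[0])
```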

  • Python is preferred over the French Language: In a survey conducted in the United Kingdom in 2015, the results were striking: 6 out of 10 parents wanted their children to learn Python rather than French. 75% of the children who were asked a similar set of questions said they would rather learn to control a computer and program a robot than learn a modern foreign language.
  • Different types of Python: Python comes in various flavors. Before we dive into them, let us spend some time understanding how these flavors come into existence. A lesser known fact about programming languages is that a language is first of all a specification: a set of rules, written down in plain English, that its designers agree on when deciding what to include and what to leave out. Python source code is likewise just text that humans can read on the screen. Consider the code snippet below:

>>> def func():
...     print("Python for Data Science is a great way to solve problems!!!")

As simple as it may seem to any programmer, this function prints the one line inside the print statement. To a machine, however, this text is completely alien, so it has to be translated into instructions the machine can execute, ultimately strings of 0s and 1s, or bits. This is done by the interpreter, which is itself written in an existing programming language such as C, Java or C#. Such a program is called an implementation of the language, and depending on the language used to write the interpreter, various flavors of Python come into existence.

Some of these flavors of Python are as follows,

  • CPython: Written in C. This is the reference interpreter and the most commonly used implementation of the Python language.
  • Jython: Written in Java. It compiles Python programs into Java bytecode, which is run by the Java Virtual Machine.
  • IronPython: Written in C# and built on the .NET framework.
  • Brython: A flavor of Python that runs in web browsers.
  • RubyPython: Strictly speaking a bridge rather than a full implementation; it lets Ruby programs embed and call a Python interpreter.
  • PyPy: Written in RPython, a restricted subset of Python itself.
  • MicroPython: A flavor of Python that runs on microcontrollers.
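To check which flavor is running a given script, the standard library can report it. This is a small sketch; the printed values depend on the interpreter used, so no particular output is assumed here.

```python
import platform
import sys

# Name of the implementation running this script, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()
print(implementation, platform.python_version())

# sys.implementation.name gives a lowercase identifier for the implementation.
print(sys.implementation.name)
```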
  • Python uses no braces: Unlike many modern programming languages, Python relies on indentation and whitespace to delimit blocks of code. Consider the function below written in Python:

>>> def add(a, b):
...     return a + b

This is a simple function that calculates the sum of two numbers and returns it to the caller, which can store it in a variable or pass it on to other functions. The same code in Java would look something like this:

int add(int a, int b) {
    return a + b;
}

The above function does the same job as the one written in Python. The only difference is that Java uses curly braces to group lines of code together, while Python relies on indentation, which makes the code a bit cleaner in appearance and a little less daunting. The downside is that programmers have to be extremely careful about how they indent their code. If the indentation is not properly maintained, the program may raise an error or give unexpected results.
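To see how indentation alone can change a result, here is a small sketch with two hypothetical functions that differ only in the indentation of one line.

```python
def total_inside(values):
    total = 0
    for v in values:
        total += v
        total += 1      # indented: runs once per element
    return total


def total_outside(values):
    total = 0
    for v in values:
        total += v
    total += 1          # dedented: runs once, after the loop
    return total


print(total_inside([1, 2, 3]), total_outside([1, 2, 3]))
```

Both versions are valid Python, so the interpreter cannot warn about the difference; only the results diverge.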

Python developers have a very good sense of humor, and the line of code below is a well-known Easter egg. If you try executing it, you get a very witty exception from the developers of the language:

>>> from __future__ import braces
SyntaxError: not a chance

  • Python supports multiple returns: Python lets a single function hand back several values at once, something that languages such as Java and C do not offer directly. Under the hood the values are packed into one tuple, which the caller can unpack into separate variables. Consider the following code snippet:

>>> def func_multi_return():
...     return 10, 20, 30
...
>>> val1, val2, val3 = func_multi_return()
>>> val1, val2, val3
(10, 20, 30)
>>> val1
10
>>> val2
20
>>> val3
30

This is not possible in programming languages like Java and C. You have the option of returning an array or a composite data type instead, but you cannot return multiple values from a function. For example, the code snippet below is syntactically wrong in Java:

int funcMultiReturn() {
    return (10, 20, 30);
}

The code snippet below, which returns an array of values, is perfectly correct in Java and in many other programming languages with a similar syntax:

int[] funcArrayReturn() {
    // The return type must be int[] because an array is returned
    int[] myArray = new int[3];
    myArray[0] = 10;
    myArray[1] = 20;
    myArray[2] = 30;
    return myArray;
}
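Back in Python, it can help to see that the returned values really are one tuple object, and that the values can be given names when a function returns several of them. The sketch below is illustrative: the Stats class and the min_max function are hypothetical names, not part of any library.

```python
from typing import NamedTuple


def func_multi_return():
    # The commas build one tuple; the function still returns a single object.
    return 10, 20, 30


class Stats(NamedTuple):
    # Hypothetical container giving each returned value a name.
    lowest: int
    highest: int


def min_max(values):
    return Stats(lowest=min(values), highest=max(values))


result = func_multi_return()
print(type(result).__name__, result)

stats = min_max([4, 9, 2])
print(stats.lowest, stats.highest)
```

A named tuple still unpacks like a plain tuple, so callers can write lowest, highest = min_max(values) if they prefer.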

  • Multiple assignments: Python supports multiple assignments in one statement, which lets the programmer shorten the code by eliminating separate lines for each assignment. Consider the code snippet below, which is perfectly valid Python:

>>> val1, val2 = 10, 20
>>> val1
10
>>> val2
20

This comes in handy when writing common routines such as swapping two values, which takes more work in languages like Java and C. Let us consider the problem of swapping values in Python and in Java.

The code snippet below shows how one can swap values in Python:

>>> val1, val2 = 10, 20
>>> val1
10
>>> val2
20
>>> val1, val2 = val2, val1
>>> val1
20
>>> val2
10

In Java, the same functionality is achieved by writing the swapping logic with a temporary variable. The code snippet in Java looks like this:

public static void main(String[] args) {
    int val1 = 10, val2 = 20, val3 = 0;
    // Swap the two values through the temporary variable val3
    val3 = val1;
    val1 = val2;
    val2 = val3;
    // This statement shows the swapped values when the code is run
    System.out.println(val1 + " " + val2);
}

  • Chain Comparison: Another trait that makes Python so intuitive is its support for what is called chain comparison. Python allows the programmer to chain several comparisons together without writing the logical operator and between them; an expression such as a < b < c is evaluated as a < b and b < c. This adds to the readability of the code and makes it easier to read and debug. Below is a code snippet taken from Python:

>>> 1 < 2 < 3
True
>>> 1 < 2 > 1.5
True
>>> 1 > 2 < 3
False

The same comparisons in Java look similar, but notice the presence of the logical AND operator &&. Such syntax makes the code look more complicated, especially in large code bases.

public static void main(String[] args) {
    // This statement displays true as the output when the code is run
    System.out.println((1 < 2) && (2 < 3));
    // This statement displays true as the output when the code is run
    System.out.println((1 < 2) && (2 > 1.5));
    // This statement displays false as the output when the code is run
    System.out.println((1 > 2) && (2 < 3));
}
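In everyday Python code, chained comparisons show up most often as range checks. The sketch below uses two hypothetical helpers, in_range and middle, and also illustrates that the middle operand of a chain is evaluated only once.

```python
def in_range(value, low, high):
    # Chained form; equivalent to: low <= value and value <= high
    return low <= value <= high


calls = []


def middle():
    # Hypothetical helper that records each time it is evaluated.
    calls.append(1)
    return 2


chained = 1 < middle() < 3       # middle() is evaluated once
print(in_range(5, 1, 10), in_range(11, 1, 10), chained, len(calls))
```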

  • else statement in a loop: One feature that makes Python stand out is that it allows an else clause on the looping constructs for and while. This is not the case with most modern programming languages such as Java and C. Consider the code snippet below, which is perfectly valid in Python:

>>> for val in range(10):
...     if val == 50:
...         print("Value Found")
...         break
... else:
...     print("Value not found")

Let me now take some time to explain how this works. The else clause is executed only when the loop runs to completion without encountering a break statement; it is skipped if the loop is left through break or terminates abruptly because of an error or an exception. In the above example, the if-clause that searches for the value 50 is evaluated 10 times, once for each value produced by range, and the break statement under the if is never executed. Thus the output is simply “Value not found”. Such constructs are particularly useful when writing search code within an application or a module of the program.
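The same pattern can be wrapped into a small search helper. This is a minimal sketch; the function name find_index and the convention of returning -1 for a missing value are choices made for illustration.

```python
def find_index(items, target):
    for index, item in enumerate(items):
        if item == target:
            break            # found: the else clause is skipped
    else:
        return -1            # loop ran to completion without a break
    return index


print(find_index([10, 20, 30], 20), find_index([10, 20, 30], 50))
```

An empty list also falls through to the else clause, because a loop that never runs never hits break.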

  • Python knows infinity: One of the things we were told during our school days is that infinity is not a number we can compute with. Python, however, lets us define infinity in our programs as a floating-point value. Let us consider the following code snippet written in Python:

>>> positive_infinity = float('Inf')
>>> if 9999999999999999999 > positive_infinity:
...     print('9999999999999999999 is greater')
... else:
...     print("Positive Infinity is greater")
...
>>> negative_infinity = float('-Inf')
>>> if -999999999999999999 < negative_infinity:
...     print("-999999999999999999 is lesser")
... else:
...     print("Negative Infinity is lesser")
...

When we run this code snippet, the following is shown as the output:

Positive Infinity is greater
Negative Infinity is lesser

This is the expected result as per mathematics.

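Many modern programming languages handle floating-point infinity in a similar way; Java, for example, provides Double.POSITIVE_INFINITY and Double.NEGATIVE_INFINITY. What differs is integer arithmetic: dividing an integer by zero in Java throws an exception instead of producing infinity, and no further operations can be carried out on the result once that happens.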
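For day-to-day use, the standard math module exposes the same value as math.inf. The sketch below shows a few properties of floating-point infinity in Python; the variable names are only for illustration.

```python
import math

positive_infinity = float("inf")

# math.inf is the same value as float("inf").
print(positive_infinity == math.inf)

# Any finite number, however large, compares below infinity.
print(10**100 < positive_infinity)

# Arithmetic keeps infinity infinite; inf - inf is "not a number".
print(positive_infinity + 1 == positive_infinity)
print(math.isnan(positive_infinity - positive_infinity))
```

A common use is as the starting value when searching for a minimum, since every real measurement compares below it.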

  • Python is an interpreted language: Languages such as C rely on a compiler that takes the source code and converts it into machine code consisting of strings of zeroes and ones, while the Java compiler compiles the source code into something called bytecode. Python is usually described as interpreted because there is no separate compile step for the programmer to run. Behind the scenes, the standard interpreter first compiles the source into bytecode, which it may cache in a .pyc file, and that bytecode is then executed by a virtual machine to produce the output.
  • The underscore has a memory power in Python: Many of us who are somewhat familiar with Python may already know that Python statements can be executed either in the interactive shell or from a source code file ending in the .py extension. What many professionals using Python do not know is that, in the interactive shell, the underscore (_) retrieves the result of the last expression that was evaluated. Consider the following code snippet written in Python:

>>> val1 = 10
>>> val2 = 20
>>> val1 + val2
30
>>> val3 = _
>>> val3
30

This feature comes in handy especially while working with Jupyter notebooks, which let programmers use both an interactive console and full-fledged programming constructs in the same window.

With this we come to the end of our facts about Python for Data Science. There are similar facts about many of the libraries that Python has to offer for a variety of purposes, but this is a good place to stop, as these are some of the basic fun facts about Python that many programmers are unaware of.

Summing it all up

In summary, Python is widely seen as a programming language of the future and is among the most popular languages in the fields of machine learning, data analytics and artificial intelligence. Python is known for being easy to learn and for its closeness to the conversational English we use every day. It is well equipped to handle big and complex data sets, to build an understanding of the data and to draw out the insights the data has to show us. Python is one of the most sought-after programming languages, and professionals possessing skills in this domain are often highly paid.

This article did not focus on the technical aspects of the programming language but aimed to highlight some of the fun facts that make Python for Data Science more lovable. There are fun facts about packages and libraries in Python too, which will be shared in another article.

Here are the thirteen facts about Python for Data Science that we saw in this article:

  • Python was a time pass project
  • Python is not the snake
  • The Zen of Python
  • Python is preferred over French
  • Different types of Python
  • Python uses no braces
  • Python supports multiple returns
  • Multiple assignments
  • Chain Comparison
  • else statement in a loop
  • Python knows infinity
  • Python is an interpreted language
  • The underscore has a memory power in Python