Monday, November 27, 2017

Password Verification using a Regular Expression

A student asked me to explain this wonderful regular expression

^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{6,}$
There are two things you need to understand well before attempting to determine what this regular expression matches: the use of ?= and the meaning of \\S . I am assuming that you can understand the rest of the regular expression.

The latter is the easier one to understand. It represents a non-whitespace character. The usual regular expression token is \S, but here it is preceded by an extra backslash because the pattern is written as a string literal in the programming language (Java in this case), where a backslash must itself be escaped with another backslash.

So, the more regular, regular expression would be
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{6,}$
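The same double-backslash effect can be seen outside Java. As a minimal sketch in Python (used here only for illustration; the variable names are made up for this example), an ordinary string literal needs \\S, while a raw string lets you write \S directly, and both hand the same pattern to the regex engine.

```python
import re

# In an ordinary string literal the backslash must be escaped, as in Java.
escaped = "(?=\\S+$)"
# A raw string literal passes the backslash through untouched.
raw = r"(?=\S+$)"

# Both literals produce exactly the same pattern text.
same_pattern = (escaped == raw)

# \S+$ must reach the end of the string using only non-whitespace characters,
# so a string containing a space is rejected.
no_space_ok = re.match(raw, "abc") is not None
with_space_ok = re.match(raw, "a c") is not None

print(same_pattern, no_space_ok, with_space_ok)
```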
Now, let's understand ?=

The technical term for ?= is "Positive Lookahead". It basically means that the string "should contain the specified pattern from this point onwards, but the matched characters are not consumed". In simple terms, because each lookahead here starts with .* and sits right after the ^ anchor, it validates that the string contains at least one character from the set we are interested in, irrespective of its location in the string. Thus, (?=.*[0-9]) succeeds for any string that has zero or more occurrences of any character followed by a digit. In much simpler terms, we expect the string to contain a digit.
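To see the "not consumed" part in action, here is a small sketch using Python's re module (an assumption for illustration, since the original expression is used from Java; the variable names are hypothetical). The lookahead succeeds when a digit exists somewhere ahead, yet the match itself is empty and ends at position 0.

```python
import re

digit_lookahead = re.compile(r"(?=.*[0-9])")

hit = digit_lookahead.match("abc7def")    # a digit exists somewhere ahead
miss = digit_lookahead.match("abcdef")    # no digit anywhere

# The lookahead succeeded, but nothing was consumed: the match is empty.
consumed_text = hit.group(0)
end_position = hit.end()

# Because nothing is consumed, several lookaheads can be stacked and each one
# inspects the whole string, so the order of the characters does not matter.
both = re.compile(r"(?=.*[0-9])(?=.*[a-z])")
digit_first = both.match("7a") is not None
letter_first = both.match("a7") is not None

print(repr(consumed_text), end_position, miss, digit_first, letter_first)
```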

So, to understand the entire regular expression, we first need to break it down into smaller chunks. You will notice that there are 5 "positive lookahead" blocks. If we eliminate them, we are left with
^.{6,}$
This matches a string that has at least 6 characters. The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        PASS
      Six Digits              123456    PASS
      Six Alphabets           abcdef    PASS
      Six Upper case          ABCDEF    PASS
      Six Special             #$%^&+    PASS
      Alphanumeric            abc456    PASS
      Alpha Upper numeric     abCD56    PASS
      With space              ab D56    PASS
      With Space and Special  a C+56    PASS
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS


Now, let's add the first "positive lookahead" block
^(?=.*[0-9]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string has at least one digit, as specified in the first "positive lookahead" block. The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        FAIL
      Six Digits              123456    PASS
      Six Alphabets           abcdef    FAIL
      Six Upper case          ABCDEF    FAIL
      Six Special             #$%^&+    FAIL
      Alphanumeric            abc456    PASS
      Alpha Upper numeric     abCD56    PASS
      With space              ab D56    PASS
      With Space and Special  a C+56    PASS
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS


Now, let's add the second "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        FAIL
      Six Digits              123456    FAIL
      Six Alphabets           abcdef    FAIL
      Six Upper case          ABCDEF    FAIL
      Six Special             #$%^&+    FAIL
      Alphanumeric            abc456    PASS
      Alpha Upper numeric     abCD56    PASS
      With space              ab D56    PASS
      With Space and Special  a C+56    PASS
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS


Similarly, let's add the third "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        FAIL
      Six Digits              123456    FAIL
      Six Alphabets           abcdef    FAIL
      Six Upper case          ABCDEF    FAIL
      Six Special             #$%^&+    FAIL
      Alphanumeric            abc456    FAIL
      Alpha Upper numeric     abCD56    PASS
      With space              ab D56    PASS
      With Space and Special  a C+56    PASS
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS


Now, let's add the fourth "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
  • at least one special character from the set @#$%^&+= as specified in the fourth "positive lookahead" block
The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        FAIL
      Six Digits              123456    FAIL
      Six Alphabets           abcdef    FAIL
      Six Upper case          ABCDEF    FAIL
      Six Special             #$%^&+    FAIL
      Alphanumeric            abc456    FAIL
      Alpha Upper numeric     abCD56    FAIL
      With space              ab D56    FAIL
      With Space and Special  a C+56    PASS
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS


Finally, let's add the fifth "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
  • at least one special character from the set @#$%^&+= as specified in the fourth "positive lookahead" block
  • no whitespace character (space, tab and so on) anywhere in the string, as specified in the fifth "positive lookahead" block
The below table should help you visualise this better.
      String Type             String    Result
      Six Spaces                        FAIL
      Six Digits              123456    FAIL
      Six Alphabets           abcdef    FAIL
      Six Upper case          ABCDEF    FAIL
      Six Special             #$%^&+    FAIL
      Alphanumeric            abc456    FAIL
      Alpha Upper numeric     abCD56    FAIL
      With space              ab D56    FAIL
      With Space and Special  a C+56    FAIL
      All types               a2C$eF    PASS
      Seven Characters        a2C&eF7   PASS
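The step-by-step build-up above can be replayed with a short script. This is only a sketch: it uses Python's re module rather than Java (the two engines agree on this particular expression as far as I know), and the names LOOKAHEADS, SAMPLES, build_pattern and results_for are hypothetical helpers invented for the example.

```python
import re

# The five lookahead blocks, in the order they are added above.
LOOKAHEADS = [
    r"(?=.*[0-9])",        # at least one digit
    r"(?=.*[a-z])",        # at least one lowercase letter
    r"(?=.*[A-Z])",        # at least one uppercase letter
    r"(?=.*[@#$%^&+=])",   # at least one special character
    r"(?=\S+$)",           # no whitespace anywhere
]

# The sample strings from the tables, in the same order (first is six spaces).
SAMPLES = ["      ", "123456", "abcdef", "ABCDEF", "#$%^&+", "abc456",
           "abCD56", "ab D56", "a C+56", "a2C$eF", "a2C&eF7"]


def build_pattern(stage):
    """Return the pattern that uses only the first `stage` lookahead blocks."""
    return "^" + "".join(LOOKAHEADS[:stage]) + ".{6,}$"


def results_for(stage):
    """Return PASS or FAIL for every sample string at the given stage."""
    pattern = re.compile(build_pattern(stage))
    return ["PASS" if pattern.match(s) else "FAIL" for s in SAMPLES]


if __name__ == "__main__":
    for stage in range(len(LOOKAHEADS) + 1):
        print(build_pattern(stage))
        for sample, result in zip(SAMPLES, results_for(stage)):
            print("    %-10r %s" % (sample, result))
```

Stage 0 corresponds to the bare ^.{6,}$ and stage 5 to the complete expression, so each stage's output can be compared against the matching table.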


Tuesday, August 29, 2017

[Solved] Python Quiz : 001

I came across a very simple Python Quiz question in a newsgroup, so I thought of expanding on it.

Assume you are attending an interview and this question is asked. What would be your answer ?

[ Remember : you need to give this answer from memory and not by trying out this code :-) ]

>>> def foo(num) : print num; return num
...
>>>
>>> foo(5) * foo(1) * foo(2)
5
1
2
10
>>> foo(5) ** foo(1) ** foo(2)

And, as in any good interview, you are asked to explain your answer as well :-)

Post your answers in the comments below
    • There are two parts to this problem.
      1> Order of the execution of the functions
      2> Order of evaluation of the expression that involves **

      From the example mentioned in the earlier part of the question, it's very evident that Python invokes these functions from left to right.

      So, the expression to evaluate, after the functions return their values, would be
      >>> 5 ** 1 ** 2

      ** or exponentiation is the only binary arithmetic operator in Python that is right-associative.

      The reason [ as far as I can understand ] for this behaviour is that this is how it is handled in mathematics as well, where a stacked exponent is read from the top down. It causes confusion only because the programming language evaluates almost every other operator using left-association.

      This link gives a very good explanation on why right-associativity is preferred : https://core.tcl.tk/tips/doc/trunk/tip/274.md

      Thus,
      >>> 5 ** 1 ** 2
      evaluates as
      >>> 5 ** (1 ** 2)
      >>> 5 ** ( 1 )
      >>> 5

      CONGRATULATIONS !!! to all those who got the result and the explanation right :-)
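Since 5 ** 1 ** 2 gives 5 under either grouping, the two orders are easier to tell apart with different numbers. Here is a minimal sketch, written in Python 3 syntax (the session above is Python 2), where the calls list is a hypothetical helper added to record the invocation order.

```python
calls = []  # records the order in which foo is invoked


def foo(num):
    calls.append(num)
    return num


result = foo(2) ** foo(3) ** foo(2)

# The operands are still evaluated left to right, so calls is [2, 3, 2],
# but the exponentiation itself groups from the right.
right_grouped = 2 ** (3 ** 2)   # 2 ** 9
left_grouped = (2 ** 3) ** 2    # 8 ** 2

print(calls, result, right_grouped, left_grouped)
```

Here the result agrees with the right-grouped value and not with the left-grouped one.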

Monday, July 24, 2017

6 different ways to reverse a string in Python

A lot of interviewers seem to have a fascination for this often asked question :

"How to reverse a string in Python ?"

There are, of course, many possible answers to this. Here are some that might come in handy for your understanding and learning.

1. Old School while loop method

>>> x = "hello"
>>> idx = len(x)
>>> rev = list()
>>> while idx > 0:
...         rev.append( x[idx-1] )
...         idx -= 1
...
>>> print ''.join(rev)
olleh

2. Using Python's reversed function with a for loop
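A minimal sketch of what this heading describes, assuming the same sample string as in method 1. The names rev and rev_str are only illustrative, and print is called with a single parenthesised argument so the snippet behaves the same in Python 2 and Python 3.

```python
x = "hello"
rev = []

# reversed() yields the characters of x from last to first.
for ch in reversed(x):
    rev.append(ch)

rev_str = ''.join(rev)
print(rev_str)
```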


Monday, July 17, 2017

The Curious Case of Python String Slicing

We were studying Python strings and we tried to understand Python String Slicing through an example. We found a very strange [at least at first look] result when we tried to use String Slicing for creating a copy of the string.

To better understand the scenario and the result, let's start with the basics of integers and lists with respect to slicing.

>>> m = 10         # Integer m initialised to 10
>>> n = m           # Another integer n storing the same value as in m
>>> print m, n
10 10

>>> id(m), id(n)
(140204825415168, 140204825415168)   # <-- Both have the same id
A variable name in Python is a "tag" name associated with a memory location.
In the above scenario, when we initialised the variable m, Python associated the tag name m with the memory location [ 140204825415168 ] ( For this discussion, assume that id represents the memory location )
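The "tag" idea can be checked with a short sketch. It assumes CPython, where id happens to be the object's memory address (the language itself only promises a unique identifier), and the names same_before and same_after are made up for this example.

```python
m = 10
n = m                       # n is a second tag on the same object as m
same_before = (id(m) == id(n)) and (m is n)

m = m + 1                   # rebinding moves the tag m to a different object
same_after = (m is n)       # n still tags the original object holding 10

print(same_before, same_after, m, n)
```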