Monday, November 27, 2017

Password Verification using a Regular Expression

A student asked me to explain this wonderful regular expression

^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{6,}$
There are two things you need to understand well before attempting to work out what this regular expression matches. One is the use of ?= and the other is \\S . I will assume that you understand the rest of the regular expression.

The latter is the easier one to understand. \S represents a non-whitespace character. It appears here with an extra backslash because the regular expression is written as a string literal in a programming language, Java in this case, where a backslash is itself an escape character and has to be escaped with another backslash.

So, the more regular, regular expression would be
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\S+$).{6,}$
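To make the escaping concrete, here is a minimal sketch in Java. The class and method names (EscapeDemo, hasNoWhitespace) are made up for this illustration and are not part of the original expression.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // In Java source, "\\S+" is a three-character string: backslash, S, plus.
    // The regex engine therefore sees \S+ (one or more non-whitespace characters).
    static final String NON_WHITESPACE = "\\S+";

    // True when the whole string consists of non-whitespace characters only.
    static boolean hasNoWhitespace(String s) {
        return Pattern.matches(NON_WHITESPACE, s);
    }

    public static void main(String[] args) {
        System.out.println(NON_WHITESPACE.length());  // 3
        System.out.println(hasNoWhitespace("abc"));   // true
        System.out.println(hasNoWhitespace("a c"));   // false
    }
}
```

Writing "\S+" with a single backslash would not compile in Java, because \S is not a valid escape sequence in a string literal.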
Now, let's understand ?=

The technical term for ?= is "Positive Lookahead". It asserts that the pattern inside the parentheses matches starting at the current position, but the characters it looks at are not consumed. In simple terms, because each lookahead here sits right after ^ and begins with .* , it is used to validate that a string contains one of the characters we are interested in, irrespective of its location in the string. Thus, (?=.*[0-9]) succeeds for any string that has zero or more occurrences of any character followed by a digit. In much simpler terms, we expect the string to contain a digit.
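The "not consumed" part can be observed directly. The sketch below is only an illustration; LookaheadDemo and consumedLength are hypothetical names. It compares how many characters a match covers with and without the lookahead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Returns how many characters the first match of regex covers in s,
    // or -1 when the regex does not match at all.
    static int consumedLength(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        return m.find() ? m.end() - m.start() : -1;
    }

    public static void main(String[] args) {
        // The lookahead finds the digit but the match itself is zero characters wide.
        System.out.println(consumedLength("^(?=.*[0-9])", "abc4xyz"));  // 0
        // Without the lookahead, the same sub-pattern consumes everything up to the digit.
        System.out.println(consumedLength("^.*[0-9]", "abc4xyz"));      // 4
        // No digit anywhere, so the lookahead fails.
        System.out.println(consumedLength("^(?=.*[0-9])", "abcxyz"));   // -1
    }
}
```

Because the lookahead leaves the position at the start of the string, several lookaheads can be placed one after another and each one inspects the whole string again.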

So, to understand the entire regular expression, we need to first break it down into smaller chunks. You will notice that there are 5 "positive lookahead" blocks. If we eliminate them, we are left with
^.{6,}$
This matches a string that has at least 6 characters. The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  PASS
    Six Digits              123456        PASS
    Six Alphabets           abcdef        PASS
    Six Upper case          ABCDEF        PASS
    Six Special             #$%^&+        PASS
    Alphanumeric            abc456        PASS
    Alpha Upper numeric     abCD56        PASS
    With space              ab D56        PASS
    With Space and Special  a C+56        PASS
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS
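A few rows of this table can be checked with a short Java sketch. LengthOnly and atLeastSix are names invented for this example.

```java
class LengthOnly {
    // With the lookahead blocks removed, only the length check remains.
    static boolean atLeastSix(String s) {
        return s.matches("^.{6,}$");
    }

    public static void main(String[] args) {
        System.out.println(atLeastSix("      "));  // true: spaces count as characters
        System.out.println(atLeastSix("ab D56"));  // true
        System.out.println(atLeastSix("abcde"));   // false: only five characters
    }
}
```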


Now, let's add the first "positive lookahead" block
^(?=.*[0-9]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string has at least one digit, as specified in the first "positive lookahead" block. The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  FAIL
    Six Digits              123456        PASS
    Six Alphabets           abcdef        FAIL
    Six Upper case          ABCDEF        FAIL
    Six Special             #$%^&+        FAIL
    Alphanumeric            abc456        PASS
    Alpha Upper numeric     abCD56        PASS
    With space              ab D56        PASS
    With Space and Special  a C+56        PASS
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS


Now, let's add the second "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" of the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  FAIL
    Six Digits              123456        FAIL
    Six Alphabets           abcdef        FAIL
    Six Upper case          ABCDEF        FAIL
    Six Special             #$%^&+        FAIL
    Alphanumeric            abc456        PASS
    Alpha Upper numeric     abCD56        PASS
    With space              ab D56        PASS
    With Space and Special  a C+56        PASS
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS


Similarly, let's add the third "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" of the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  FAIL
    Six Digits              123456        FAIL
    Six Alphabets           abcdef        FAIL
    Six Upper case          ABCDEF        FAIL
    Six Special             #$%^&+        FAIL
    Alphanumeric            abc456        FAIL
    Alpha Upper numeric     abCD56        PASS
    With space              ab D56        PASS
    With Space and Special  a C+56        PASS
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS


Now, let's add the fourth "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" of the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
  • at least one special character from the set @#$%^&+= , as specified in the fourth "positive lookahead" block
The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  FAIL
    Six Digits              123456        FAIL
    Six Alphabets           abcdef        FAIL
    Six Upper case          ABCDEF        FAIL
    Six Special             #$%^&+        FAIL
    Alphanumeric            abc456        FAIL
    Alpha Upper numeric     abCD56        FAIL
    With space              ab D56        FAIL
    With Space and Special  a C+56        PASS
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS


Finally, let's add the fifth "positive lookahead" block
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{6,}$
This matches a string that has at least 6 characters. It also validates that the string meets "all" of the following conditions.
  • at least one digit, as specified in the first "positive lookahead" block
  • at least one lowercase letter, as specified in the second "positive lookahead" block
  • at least one uppercase letter, as specified in the third "positive lookahead" block
  • at least one special character from the set @#$%^&+= , as specified in the fourth "positive lookahead" block
  • no whitespace character (space, tab and so on) anywhere in the string, as specified in the fifth "positive lookahead" block
The table below should help you visualise this better.
    String Type             String        Result
    Six Spaces              (six spaces)  FAIL
    Six Digits              123456        FAIL
    Six Alphabets           abcdef        FAIL
    Six Upper case          ABCDEF        FAIL
    Six Special             #$%^&+        FAIL
    Alphanumeric            abc456        FAIL
    Alpha Upper numeric     abCD56        FAIL
    With space              ab D56        FAIL
    With Space and Special  a C+56        FAIL
    All types               a2C$eF        PASS
    Seven Characters        a2C&eF7       PASS
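To check the final table yourself, the complete expression can be run against the same sample strings. This is a minimal sketch; the class name PasswordCheck and the method isValid are assumptions made for this example, and real code may want additional checks.

```java
import java.util.regex.Pattern;

class PasswordCheck {
    // The full expression, written as a Java string literal (note the \\S).
    static final Pattern PASSWORD = Pattern.compile(
        "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])(?=\\S+$).{6,}$");

    // True when the candidate satisfies the length check and all five lookahead blocks.
    static boolean isValid(String candidate) {
        return PASSWORD.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // The same strings as in the tables, in the same order.
        String[] samples = {
            "      ", "123456", "abcdef", "ABCDEF", "#$%^&+", "abc456",
            "abCD56", "ab D56", "a C+56", "a2C$eF", "a2C&eF7"
        };
        for (String s : samples) {
            System.out.println("[" + s + "] " + (isValid(s) ? "PASS" : "FAIL"));
        }
    }
}
```

Only the last two samples, a2C$eF and a2C&eF7, print PASS, which agrees with the table above.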

