PHP 2 Flashcards

1
Q

How do you obtain the IP address of a client?

A

$_SERVER['REMOTE_ADDR'];

Please note this is not safe to rely on: it may not be the IP of the client but that of a proxy, and forwarded-address headers such as X-Forwarded-For can be faked by the client.

2
Q

How do you determine the type of web browser that the user is using?

A

$_SERVER['HTTP_USER_AGENT'];

3
Q

What constant in PHP stores the current file name?

A

__FILE__

4
Q

What is one way of getting the current file name?

A

$current_file = basename($_SERVER['PHP_SELF']);

5
Q

What global variable (not constant) holds the name of the current file?

A

$_SERVER['PHP_SELF']

6
Q

In a regular expression, what does \b match?

A

Word boundaries. Very useful for picking out the first letter of each word, for example:
/(\b[a-z])/i
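
As a rough illustration of a word-boundary match, the sketch below capitalises the first letter of each word. The sample sentence and variable names are made up for the example.

```php
<?php
// Capitalise the first letter of every word by matching a letter that
// directly follows a word boundary.
$sentence = 'hello big world';
$capitalised = preg_replace_callback(
    '/\b[a-z]/i',
    function (array $match): string {
        return strtoupper($match[0]);
    },
    $sentence
);
// $capitalised now holds 'Hello Big World'
```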

7
Q

What would one way be to retrieve a web page and print the source out with line numbers?

A

Use file() to obtain an array of the lines, then print them out with their line numbers, passing each line through htmlspecialchars().
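
A minimal sketch of this approach, assuming the lines have already been read. The helper name numberedSource and the sample lines are hypothetical; in real use the array would come from file(), which needs allow_url_fopen enabled to read a URL and returns false on failure.

```php
<?php
// Hypothetical helper: takes lines as file() would return them and
// produces numbered, HTML-escaped output.
function numberedSource(array $lines): string
{
    $out = '';
    foreach ($lines as $index => $line) {
        // Line numbers are 1-based; escaping makes the browser show the markup as text.
        $out .= sprintf("%4d: %s\n", $index + 1, htmlspecialchars(rtrim($line, "\r\n")));
    }
    return $out;
}

// Stand-in for: $lines = file('http://example.com/');
$lines = ["<html>\n", "<b>Hi</b>\n"];
$numbered = numberedSource($lines);
```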

8
Q

What is the difference between sort and usort?

A

usort uses a user defined comparison function

9
Q

If you unset an element in an array, are the integer keys reordered?

A

No - it is common to use unset to remove an element from an array, but the elements after it keep their original keys, leaving a gap. Use array_values to renumber the keys (sort also renumbers them, but reorders the values as well).
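
A short sketch of the gap left by unset and one way to close it. The sample array is made up for the example.

```php
<?php
$items = ['a', 'b', 'c', 'd'];
unset($items[1]);                     // removes 'b', leaves a gap at key 1
$keysAfterUnset = array_keys($items); // keys are now 0, 2, 3
$reindexed = array_values($items);    // same values, keys renumbered from 0
```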

10
Q

By default json_decode returns data in objects, how do you make it return the data in an associative array?

A

Set the second parameter to true.

11
Q

Why use array_walk_recursive over array_walk?

A

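In some cases an array will contain another array (e.g. decoded nested JSON). array_walk does not descend into nested arrays - the callback receives the inner array itself - whereas array_walk_recursive visits every leaf value.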

12
Q

You can sort an array by key using ksort - how would you sort the same array in reverse?

A

krsort

13
Q

You can sort an array by value using asort, how would you sort the same array in reverse?

A

arsort

14
Q

You’ve been asked to add up all the elements in an array. Do you write a function or use a function built into PHP?

A

Use an inbuilt function, namely array_sum

15
Q

What function can be used to make an array of arrays?

A

array_map - if you pass the callback function as null, it will create an array of arrays from a given set of arrays.
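
A small sketch of the null-callback behaviour. The two input arrays are made up for the example.

```php
<?php
$names = ['superman', 'batman'];
$colours = ['red', 'black'];
// With null as the callback, array_map pairs up the elements by position.
$pairs = array_map(null, $names, $colours);
```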

16
Q

What does is_scalar test for?

A

Whether a value is scalar or not - a scalar is an int, float, string or bool, whereas arrays, objects, resources and null are not scalars.

17
Q

How would you obtain an array of numbers that fit within a range, e.g. from 0 to 16 in steps of 4?

A

range(0,16,4);

18
Q

How could you create a random list of numbers in a range, without calling rand()?

A

$a = range(1, 10);
shuffle($a);

$a is now an array containing the numbers 1 to 10 in a random order

19
Q

How would you take a string and convert it into an array of characters so they can be traversed?

A

$array_char = str_split($str);

20
Q

How could you use substr to check each character in a string?

A
// Assumes $str holds the string and $search_char the character to look for
$length = strlen($str);
for ($x = 0; $x < $length; $x++) {
   if (substr($str, $x, 1) === $search_char) {
      // Do stuff
   }
}
21
Q

How would you tell curl to inform a server that your application can receive compressed data?

A

$header[] = "Accept-Encoding: compress, gzip";

curl_setopt($curl_session, CURLOPT_HTTPHEADER, $header);

22
Q

Why is it important that a webbot emulate a standard browser?

A

Firstly, it reduces the risk of sites detecting the presence of a webbot. Secondly, in cases where you want compressed files, many servers check for a valid web client before sending compressed data.

23
Q

What test can you do to determine if the incoming file is compressed?

A

if (stristr($http_header, "zip")) {
   $compressed = TRUE;
}

(This assumes you’ve extracted the header into $http_header)

24
Q

What function could you use to decompress a file in PHP?

A

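gzuncompress($data);

Note that it takes the compressed data as a string, not a file name, and expects zlib-format data (as produced by gzcompress). For gzip-encoded data use gzdecode($data).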

25
Q

I have an interface with a number of constants, how do I access them?

A

interfaceName::ConstName;

26
Q

What is one reason for implementing the __unset magic method?

A

__unset is called when unset() is used on an inaccessible or non-existent property, so you can use it to unset or manage associated data (e.g. a database connection) when that property is unset.

27
Q

What is a good use for the __sleep magic method?

A

If an object is serialized, it may be necessary to disconnect any objects that should not be serialized along with it - including things like database connections.

28
Q

If you use the __sleep magic method, what must it return?

A

An array of the names of the object's properties that you wish to be serialised.

29
Q

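What has largely replaced the __sleep / __wakeup magic methods?

A

PHP provides an interface called Serializable. (Note that as of PHP 8.1, implementing Serializable without also providing the __serialize / __unserialize magic methods is deprecated.)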

30
Q

What is one advantage of the Serializable interface over the old method using magic methods?

A

It is type hintable (because interfaces can be used as type hints).

31
Q

What is one important use for the __toString magic method?

A

Templating engines - being able to print out the name of the object is important in this kind of engine.

32
Q

What is one purpose of the __invoke magic method?

A

To allow an object to be used like a function.

33
Q

Name some uses of the __clone magic method.

A

By making it private, you can prevent an object from being copied. You can also use it to reset database connections or other resources on the new copy (__clone runs on the copy, just after the object has been cloned).

34
Q
Given the following classes:
class myClass{}
class secondClass extends myClass{}
class thirdClass extends secondClass{}
is the following statement true?
$obj = new thirdClass();
$result = $obj instanceof myClass;
A

Yes - due to inheritance, the child class inherits all the attributes of the parent class, and is therefore an instance of the parent class (which allows typehinting to work with child objects)

35
Q

When passing an object and using typehinting, what is one major consideration?

A

It is possible to pass a parent object that may not have all the methods the more specialised child has. This means you can call a method that does not exist and cause a fatal error. Using an interface (that should guarantee a method exists) will help.

36
Q

Can an interface define a protected or private method?

A

No

37
Q

Anonymous functions are also known as…

A

closures

38
Q

Internally, is an anonymous function really a function?

A

No, it is an object of type Closure

39
Q

What is one use for the __invoke magic method of an object?

A

In functions that take a callback, PHP sometimes requires an array with the object and a string defining the method to call as a callback. Using __invoke, you can pass the object directly, and the system will cause the __invoke method on the object to be called. As long as __invoke has been implemented, the correct code should be called.
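
A minimal sketch comparing the two callback styles. The Doubler class is hypothetical and exists only for this example.

```php
<?php
// Hypothetical class whose instances can be called like a function.
class Doubler
{
    public function __invoke(int $n): int
    {
        return $n * 2;
    }
}

// Array-style callback: object plus method name.
$viaArray = array_map([new Doubler(), '__invoke'], [1, 2, 3]);
// With __invoke implemented, the object itself is a valid callback.
$viaObject = array_map(new Doubler(), [1, 2, 3]);
```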

40
Q

Exceptions that rise to the top level are logged by PHP. What about exceptions that are caught and handled?

A

They are not logged automatically - you need to log them yourself.

41
Q

Is it possible to register multiple autoload functions using spl_autoload_register?

A

Yes - it behaves like a queue, it will call the first registered function first and so on.

42
Q

What is wrong with the following code?

foreach ($bindings as $key => $val) {
$stm->bindParam($key, $val);
}

A

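bindParam binds the variable by reference, and its value is only read when the statement is executed. Without the ampersand, every parameter is bound to the same loop variable $val, so they all end up with the last value. The loop needs to take $val by reference.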

// NOTE the ampersand
foreach ($bindings as $key => &$val) {
  $stm->bindParam($key, $val);
}
43
Q

What is an alternative to using bindParam in a loop - and what would be different about the loop?

A

You can use PDOStatement::bindValue($key, $val). The loop would not need to pass by reference.

// Notice - no & on the $val
foreach ($bindings as $key => $val) {
  $stm->bindValue($key, $val);
}
44
Q

What function would you use to remove any empty array elements?

A

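array_filter($array);

Note that with no callback this removes every element that evaluates to false, so 0, '0' and false are removed along with null and empty strings.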
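
A short sketch of the default behaviour next to a stricter callback. The sample values and the choice of what counts as "empty" are assumptions for illustration. Note that the original keys are preserved.

```php
<?php
$values = ['a', '', null, 0, '0', 'b'];

// No callback: every falsy value is dropped, including 0 and '0'.
$truthyOnly = array_filter($values);

// Custom callback: only empty strings and null are treated as empty.
$nonEmpty = array_filter($values, function ($value): bool {
    return $value !== '' && $value !== null;
});
```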

45
Q

If you are throwing an exception in one class, and the calling class doesn’t catch the exception, what is one possible reason that it doesn’t get caught?

A

If you are using namespaces, you will need to either import the class with a 'use' statement, or explicitly use the fully qualified name of the exception class.

catch (\InvalidArgumentException $error) { }

While you’ve probably remembered to do this in the throw statement (it will throw an error if you don’t), an unqualified name in the catch block fails silently: it refers to a class in the current namespace, which never matches.

46
Q

What is one reason to avoid using the .inc file extension for class files?

A

Unless the web server is configured to parse .inc files as PHP, it will serve them as plain text, which (if sensitive information was stored there) would be a security risk.

47
Q

What is one way to store meta data that can be used to configure the behaviour of a class?

A

Use PHPDoc syntax to store the meta data.

48
Q

Given a function that takes a key, and deletes it, what is wrong with this code:

public function delete_key($key)
{
   if ($key) {
      unset($this->data[$key]);
   }
}
A

A key in an array could well be 0, and if that is the case, the if ($key) test will fail - because 0 == false under loose comparison. Therefore the key would never be deleted.

if (false !== $key)

Would be better.
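
As an alternative sketch (not the card's own fix), the check can ask whether the key actually exists. The KeyStore class and its method names are hypothetical.

```php
<?php
// Hypothetical container used only to illustrate the key check.
class KeyStore
{
    private $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function deleteKey($key): void
    {
        // array_key_exists treats 0 as an ordinary key, unlike if ($key).
        if (array_key_exists($key, $this->data)) {
            unset($this->data[$key]);
        }
    }

    public function keys(): array
    {
        return array_keys($this->data);
    }
}

$store = new KeyStore(['first', 'second']);
$store->deleteKey(0);
$remaining = $store->keys();
```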

49
Q

What is YAML?

A

It is a human readable data serialisation language (the name stands for YAML Ain't Markup Language). It takes concepts from C, Perl, Python and XML, and uses the data format from electronic mail.

50
Q

If you get the error 'browscap ini directive not set' when using the function get_browser(), what is that referring to?

A

In the php.ini configuration file you have to point the browscap directive to a browscap.ini file (which must be downloaded as it doesn’t come with PHP by default).

51
Q

What is one clean way to test whether an element exists in an array, and if it does assign it to a variable - WITHOUT using the ternary ? operator?

A

isset($array['element']) && $this->val = $array['element'];

NOTE: This works because && short-circuits: if the left hand side is false, the right hand side is never evaluated, so the assignment only happens when the element exists.

52
Q

What method would you use to get the last element of an array?

A

end($myArray);

53
Q

Why do you need to filter input AND escape output?

A

Users could mount an XSS (Cross Site Scripting) attack. Basically, a user sends another user a link with a page variable already filled in - the variable will be set to a malicious script. If the other user clicks on this link while logged into the site, and the page outputs the variable unescaped, the script will execute with their privileges, allowing the malicious user to do things like grab cookie data, or run code as that user.

54
Q

What two functions in combination are useful for dealing with XSS attacks?

A

filter_input

htmlspecialchars

55
Q

If you are not using namespaces, but are using the composer autoloader - how would you tell composer where to look for classes?

A
In the composer.json file use the following
{
   "autoload": {
       "classmap": ["src/"]
   }
}
56
Q

What is SSO?

A

Single Sign On - an enterprise system that allows one sign in to access all content on a site. Also RSO - Reduced Sign On - which may have multiple layers of authentication complexity depending on what the user is trying to access.

57
Q

Why should you never leave an empty catch block?

A

Swallowing an exception means that the error is never seen - PHP ONLY logs exceptions if they are unhandled. So you either need to leave them unhandled (bad) - or handle them, which at a minimum means logging them.

58
Q

What is a good way to handle non-fatal exceptions in PHP?

A

Catch the error - log the error, and store an error message (possibly in the session). Then redirect to an error page that can give the user information about what the error was.

59
Q

What does the 'compact' function do?

A

It takes variable names (or an array of them), and for each name that has a corresponding variable in scope, it adds an entry to the result array with the variable name as key and the variable data as value. It is the opposite of extract.

$name = 'superman';
$hat = 'red';
$myvars = ['name', 'hat'];
$result = compact($myvars);

results in:

[
   'name' => 'superman',
   'hat' => 'red'
]

60
Q

What should you always set in a PHP 5+ app?

A

date_default_timezone_set('UTC')

(or your own location, such as 'Asia/Tokyo').

This is because the date and time functions in PHP require this information to be set.

61
Q

What is the difference between && and AND?

A

The AND operator has a lower precedence than the && operator - lower even than the assignment operator '='.

62
Q

What would be the result of this?
$first = true;
$second = false;
$truthiness = $first AND $second;

A

$truthiness will equal TRUE, which may not be what is expected. The reason being - AND has a different precedence to &&. The statement is evaluated as:

($truthiness = $first) AND $second;

whereas with && it would be evaluated as:

$truthiness = ($first && $second);

Because '=' has a higher precedence than AND, but not a higher precedence than &&.

63
Q

How do you ensure your app is using UTF-8 characters?

A

Call

mb_internal_encoding('UTF-8');

At the beginning of the script.

64
Q

How can you be sure you are outputting UTF-8 data to the browser?

A

Call

mb_http_output('UTF-8');

At the beginning of the script.

65
Q

If you are storing UTF-8 strings in a database, what encoding / collation should they use?

A

utf8mb4 (in MySQL, where the older utf8 character set stores at most three bytes per character)

66
Q

In a GET request, form arguments are in the query string, where are they in a POST request?

A

In the request body

67
Q

The request needs a …………… header that tells the server the size of the content to expect in the body?

A

Content-Length

68
Q

What CURLOPT_ constant is used to specify a custom request type?

A

curl_setopt($c, CURLOPT_CUSTOMREQUEST, 'PUT');

69
Q

How do you use curl_setopt to specify the HTTP header?

A

curl_setopt($c, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));

70
Q

Once you have sent a PUT request, how do you read it on the server side?

A

There is no $_PUT superglobal array in PHP - instead you can read the raw request body from the php://input stream.

$data = file_get_contents("php://input");

71
Q

How would you parse the input from a PUT request?

A

Generally, you can use:

parse_str($inputdata, $outputarray);

Don’t forget to pass $outputarray: before PHP 8.0, omitting it writes the variables into the local scope (BAD), and since PHP 8.0 the second argument is required.

72
Q

To upload a file with cURL, what three options are very useful?

A

CURLOPT_PUT - set to true
CURLOPT_INFILE - set to a file handle of that file
CURLOPT_INFILESIZE - set to size of file

73
Q

What cURL option is useful for dealing with cookies?

A

CURLOPT_COOKIEJAR

74
Q

Considering a setup where .htaccess routing is not configured - is the following a valid url?
http://localhost/books.php/mybook

A

Yes - books.php will be accessed - and mybook will be accessible through the $_SERVER['PATH_INFO'] global variable.

75
Q

What is a simple PHP way to test if a string is JSON?

A

Call json_decode on the string. Then call json_last_error and check if there was an error.

json_decode($str);
return json_last_error() === JSON_ERROR_NONE;
76
Q

Would the following resource URL be thought of as RESTful?

/client/add

A

No - resources are best thought of as nouns. The given example is using a URL to describe an action.

77
Q

If urls should not describe actions in a RESTful system, what does describe the actions?

A

The HTTP request type (i.e. GET, POST, PUT, DELETE)

78
Q

HTTP Verbs tell the server…

A

what action to take on the given url resource.

79
Q

In a RESTful system, can data be modified on the server as a result of a GET request?

A

No - it should not be modified.

80
Q

POST requests should not use the resource at the given URL, rather…

A

an additional ID is supplied after the resource, to indicate which resource is to be modified.

81
Q

What is the difference between safe and unsafe HTTP methods?

A

Safe methods are those that don’t modify resources at all. Those that modify data are called unsafe methods. GET (out of POST/DELETE/GET/PUT) is the only safe method.

82
Q

What is an ‘idempotent’ method?

A

It is a method that achieves the same result no matter how many times it is repeated. Usually the idempotent methods in HTTP are GET/PUT/DELETE.

(i.e. a GET will always return the same results (given no changes); a repeated PUT with the same data leaves the resource in the same state, no matter how many times it is called; and DELETE obviously cannot be achieved more than once on a given resource. Once deleted - it’s deleted.)

83
Q

What method in HTTP is not considered idempotent?

A

POST

84
Q

Is the idempotent/non-idempotent behaviour of the methods guaranteed by the HTTP spec?

A

No - the spec defines which methods should be idempotent, but it cannot enforce it. It is up to the programmer to make sure the methods behave in this way.

85
Q

The content type of the HTTP response is configured where?

A

The HTTP header

86
Q

What is the benefit of modifying the HTTP header to specify the content type (beyond getting the correct content type)?

A

You can change the content type based on who is consuming it - e.g. text for a browser, JSON for an application.

87
Q

What is one of the first things you do when you receive a request for a resource in REST?

A

Determine what request type it was using $_SERVER['REQUEST_METHOD']

88
Q

What is the second thing that we do when receiving a request for a resource in REST?

A

Determine what resource was requested, using $_SERVER['REQUEST_URI']

89
Q

What PHP function do you use to notify the requesting client about the status of their request?

A

header();

e.g. it could be:
header('HTTP/1.1 405 Method Not Allowed');
header('Allow: GET, PUT, DELETE');
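
One way to keep this decision testable is to work out the header lines first and send them afterwards. The function name responseHeadersFor and the list of allowed methods are assumptions for this sketch.

```php
<?php
// Hypothetical helper: returns the header lines to send instead of
// calling header() directly, so the logic can be checked in isolation.
function responseHeadersFor(string $method): array
{
    $allowed = ['GET', 'PUT', 'DELETE'];
    if (!in_array($method, $allowed, true)) {
        return [
            'HTTP/1.1 405 Method Not Allowed',
            'Allow: ' . implode(', ', $allowed),
        ];
    }
    return ['HTTP/1.1 200 OK'];
}

// In a real script the method would come from $_SERVER['REQUEST_METHOD'].
$rejected = responseHeadersFor('POST');
$accepted = responseHeadersFor('GET');
```

Each returned line would then be passed to header() before any output is sent.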

90
Q

Why would the following print:
“10Number: Yeah?” instead of
“Number: Yeah?10” ?

class foobar {
    protected $foobar = 10;
    public function chicken()  {
        echo $this->foobar;
    }
}
$myObject = new foobar();
echo 'Number: ' . "Yeah?" . $myObject->chicken();
A

Because echo is being used instead of return in the method chicken. The concatenated string has to be fully evaluated before the echo statement at the bottom can print it, so chicken() is called first, resulting in 10 being echoed, then the rest of the string being printed.

If it had been returned, it would be appended to the end as expected.
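
The difference can be observed by capturing the output of both variants. In this sketch the class and the method names echoed and returned are made up, and output buffering is used only to collect what was printed.

```php
<?php
class Foobar
{
    protected $foobar = 10;

    public function echoed()
    {
        echo $this->foobar;    // printed immediately; the method returns null
    }

    public function returned()
    {
        return $this->foobar;  // handed back to the caller
    }
}

$obj = new Foobar();

ob_start();
echo 'Number: ' . $obj->echoed();
$withEcho = ob_get_clean();    // '10Number: '

ob_start();
echo 'Number: ' . $obj->returned();
$withReturn = ob_get_clean();  // 'Number: 10'
```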

91
Q

What will the following look like?

class MyObject {
   protected $foobar = 10;
}

echo json_encode(new MyObject());

A

{}

The json_encode call will not access variables that are protected / private.
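
If the protected data does need to appear in the JSON, one option is the built-in JsonSerializable interface. The class names Hidden and Exposed are hypothetical.

```php
<?php
class Hidden
{
    protected $foobar = 10;
}

class Exposed implements JsonSerializable
{
    protected $foobar = 10;

    // json_encode calls this and encodes whatever it returns.
    public function jsonSerialize(): array
    {
        return ['foobar' => $this->foobar];
    }
}

$hidden = json_encode(new Hidden());   // '{}'
$exposed = json_encode(new Exposed()); // '{"foobar":10}'
```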

92
Q

What is the purpose of the array_unique function?

A

It takes an array, and returns a new version of the array with no duplicate entries.

93
Q

If you use a trait on a class, and it has a method with the same name as a method on that class - which method gets called?

A

The class has precedence in this case, so the method on the class will be called.

94
Q

If you use a trait on a class that extends another class - and that class has a method with the same name as the one on the trait, which one gets called?

A

The method on the trait takes precedence, therefore it is the one that is called.
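
Both precedence rules (the class's own method beats the trait, the trait beats an inherited method) can be seen side by side in a small sketch. All names here are made up for the example.

```php
<?php
trait Greets
{
    public function hello(): string
    {
        return 'trait';
    }
}

class Base
{
    public function hello(): string
    {
        return 'base';
    }
}

// The trait method overrides the one inherited from Base.
class ChildUsesTrait extends Base
{
    use Greets;
}

// A method defined on the class itself overrides the trait method.
class OwnMethod
{
    use Greets;

    public function hello(): string
    {
        return 'class';
    }
}

$fromChild = (new ChildUsesTrait())->hello(); // 'trait'
$fromOwn = (new OwnMethod())->hello();        // 'class'
```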

95
Q

In MVC, the model is actually a what?

A

Layer

96
Q

How should the M communicate with the V and C?

A

Via services

97
Q

If you extend from a base class that has a constructor that expects a value, and your child constructor doesn’t pass that value on, what error do you get?

A

No error - it allows it - but it does mean the parent will not be initialised correctly.

98
Q

If you extend from a base class that has a constructor which expects a parameter, and your child class does not implement a constructor at all, what error do you get (assuming you pass the parameter when calling new)?

A

None - if you provide a parameter when creating the object. The child class inherits its parent’s constructor and that will be used to initialise the object.

99
Q

If you extend from a base class that has a constructor that expects a parameter, and you do not pass any parameters when creating the child object, what error do you get?

A

Fatal error - the parent class is expecting a parameter and did not receive it (an uncaught ArgumentCountError as of PHP 7.1).

100
Q

If you extend from a base class that has a constructor that expects a parameter, but your child class has a constructor that takes no parameter, what error do you get?

A

None - it is up to you to pass on that value (using parent::__construct) - or else the object is created without the parent constructor being run.

101
Q

In order to execute a global regex search using preg_match, what do you need to do?

A

Use the function preg_match_all - preg_match does not take 'g' as a global modifier.

102
Q

In order to run a case insensitive search using preg_match, what do you do?

A

Use the 'i' modifier on the regex, e.g. '/car/i'
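
A brief sketch of the 'i' modifier, with preg_match_all shown alongside for a global search. The sample text is made up.

```php
<?php
$text = 'Car, car, CAR';

// preg_match stops at the first match.
preg_match('/car/i', $text, $first);

// preg_match_all finds every match; the 'i' modifier ignores case.
$insensitiveCount = preg_match_all('/car/i', $text, $all);

// Without 'i', only the lower-case occurrence matches.
$sensitiveCount = preg_match_all('/car/', $text);
```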

103
Q

Regular expressions are…

A

eager (i.e. want to return a match as soon as possible)

104
Q

The ‘.’ character in regex will match any character except…

A

newline

105
Q

What is wrong with the following regex? /9.00/

A

It will match 9.00, which is likely what was intended (e.g. 9.00 dollars). The error is that it will also match 9500 and 9-00, because the unescaped dot matches any character.

The way to fix it is to escape the meta character.

/9\.00/
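
The effect of escaping the dot can be checked directly. The test strings are taken from the card above.

```php
<?php
// Unescaped dot: matches any character in that position.
$looseMatchesPrice = preg_match('/9.00/', '9.00');
$looseMatchesOther = preg_match('/9.00/', '9500');

// Escaped dot: matches only a literal full stop.
$strictMatchesPrice = preg_match('/9\.00/', '9.00');
$strictMatchesOther = preg_match('/9\.00/', '9500');
```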

106
Q

Are quotation marks meta characters in regex?

A

No, they do not need to be escaped.

107
Q

Why do you need to escape forward slashes in a regex when you are trying to match a file path?

A

Because in a language where you need to enclose the regex in forward slashes, it will read the forward slash as the end of the regex (prematurely).

/foobar/foochicken/chicke.txt

Will need the regex:
/\/foobar\/foochicken\/chicke\.txt/

108
Q

What character is used to search for tabs?

A

\t

109
Q

What three possibilities would you need to search for with new lines?

A

\n
\r
\r\n

110
Q

Will /gr[ea]t/ match the word “great”?

A

No - [ea] only matches one character, either e or a.

111
Q

When is the '-' character a meta character?

A

Only when it’s in a character set between two characters, e.g. [1-9]

112
Q

What is the issue with [50-99]?

A

It won’t give you the numbers 50-99. It is a set containing 5, the range 0-9, and 9, so it still just matches a single digit from 0-9.

113
Q

How do you negate a character set? Say /[abcde]/

A

/[^abcde]/

This will mean it will match anything but a,b,c,d,e

114
Q

Will this match the word “seem”?

/see[^k]/

A

Yes

115
Q

Will this match the word “see”? (assuming there is no space after the word see)?
/see[^k]/

A

No - there still must be a character in that last spot, just not the letter k.

116
Q

Metacharacters inside a character set need to be escaped, true or false?

A

False.

[abc.ef] - The dot in this case would only match '.'s. It is not acting as a wildcard. Therefore it does not need to be escaped.

The exceptions are ] - ^ and \ - This is because they are characters used specifically in character sets.

117
Q

When trying to match a sequence like
file-1 file01 file\1 and file_1,
What is wrong with the following pattern?
/file[0-\_]1/

A

The dash and the \ will need to be escaped, it should be:

/file[0\-\\_]1/

118
Q

What is the difference between \d and \D

A

\d is a digit, \D is not a digit.

119
Q

With regards to \w, word characters, what is the difference between '-' and '_'?

A

'-' is considered punctuation and is not a word character, however '_' is a word character.

120
Q

Does \d\d\d\d match 1984?

A

Yes

121
Q

Does \w\w\w\w match 1984?

A

Yes - digits are considered word characters, but note: not all word characters are digits.

122
Q

When would you use \s over just a simple space?

A

If you wanted to also capture a tab or line return character.

123
Q

Can you use shortcuts in character sets?

A

Yes

[\d\s] is valid

124
Q

What is the difference between [^\d\s] and [\D\S]?

A

The first says
Is the character neither a digit nor a space (so you would get letters and punctuation)
The second says
Is the character not a digit OR not a space
Since no character is both a digit and a space, in this case you get every character

123 abc (first one would match only a, b and c)
123 abc (second one would match all those characters: a digit fails \D but passes \S, and a space fails \S but passes \D)
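
The two sets can be compared on the sample string from the card. The variable names are made up for this sketch.

```php
<?php
$subject = '123 abc';

// Negated set: characters that are neither digits nor whitespace.
$neitherCount = preg_match_all('/[^\d\s]/', $subject, $neither);

// Set of negated shortcuts: characters that are non-digit OR non-whitespace.
$eitherCount = preg_match_all('/[\D\S]/', $subject, $either);

$neitherText = implode('', $neither[0]); // 'abc'
$eitherText = implode('', $either[0]);   // '123 abc'
```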
125
Q

How do the repetition operators work “*+?”?

A

They affect the preceding character (or set or group)
* Zero or more times
+ One or more times
? Zero or one times

126
Q

What is unique about the + repetition operator?

A

It is the only operator of the three that requires the character to occur at least once.

127
Q

Does this match three digits or more?

/\d\d\d\d*/

A

Yes - because the * operator says “Zero or more times” so even though there are four \d’s, it will match 3 or more digit characters.

128
Q

What is another way to write “/\d\d\d\d*/”?

A

/\d\d\d+/

129
Q

How would you match color and colour?

A

/colou?r/

130
Q

What regex could you use to capture all words that end in ‘s’?

A

/\w+s\b/

(the \b word boundary makes sure the 's' is the last letter of the word)

131
Q

What is the difference between {4} and {4,} in a regular expression?

A

The first repetition quantifier specifies that the preceding value must occur exactly 4 times. The second specifies that it can occur 4 or more times.

132
Q

What is another way of writing \d*

A

\d{0,}

133
Q

What is another way to write \d+

A

\d{1,}

134
Q

Repetition quantifiers are….

A

greedy

135
Q

What does a greedy expression mean?

A

It tries to match the longest possible string

136
Q

Given the following word “filename.jpg” and this expression:
/.+\.jpg/
The plus is said to be greedy, but it will “give back” .jpg, what does this mean?

A

The .+ section of the expression could match all of the filename, but then nothing would be left for \.jpg. So the engine backtracks: the ‘.+’ quantifier gives up characters one at a time until the \.jpg portion can make its match.

137
Q

Repetition quantifiers are greedy, and give back as little as possible. Given the string “page 266” and the expression:
/.*[0-9]+/
What portion is matched by the first part of the expression, ‘.*’?

A

“page 26”

Even though the 266 is matched by the second repetition quantifier, it will only give back what it needs to to make the match, which is the last character (digit).

138
Q

How much of this string:
“foobar”, “foobaz”, “chicken”
will the following match
/”.+”, “.+”/

A

All of it:
“foobar”, “foobaz”, “chicken”
Because the repetition quantifiers are greedy

139
Q

What is the effect of placing a ‘?’ meta character after a repetition quantifier meta character?

A

It makes the quantifier lazy (as opposed to greedy)
*?
+?
??
{min,max}?

140
Q

What is the nickname for the following expression?

+?

A

The lazy plus

141
Q

What is the nickname for the following expression?

*?

A

The lazy star

142
Q

What is the difference between a lazy strategy and a greedy strategy?

A

In a greedy strategy, the regex engine will try to match AS MUCH AS POSSIBLE, before giving control to the next part of the expression.

In a lazy strategy, the regex engine will match AS LITTLE AS POSSIBLE before giving control to the next part of the expression.
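
A short PHP sketch of the two strategies, assuming a quoted-list string like the one used in an earlier card (written here with straight quotes). The variable names are illustrative.

```php
<?php
$subject = '"foobar", "foobaz", "chicken"';

// Greedy: .+ takes as much as it can, so the match runs to the last quote.
preg_match('/".+"/', $subject, $greedy);

// Lazy: .+? takes as little as it can, so the match stops at the first closing quote.
preg_match('/".+?"/', $subject, $lazy);

echo $greedy[0], PHP_EOL; // "foobar", "foobaz", "chicken"
echo $lazy[0], PHP_EOL;   // "foobar"
```
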

143
Q

Is the following expression valid?

/apples??/

A

Yes - but on its own it is meaningless. The first ? specifies the s can occur zero or one times. The second ? specifies the first ? should be lazy, so it will prefer ‘zero’ times. So regardless of whether it comes across the word “apple” or “apples” it will match just the “apple” portion.

144
Q

When using lazy expressions, how does the engine evaluate something like:

/.*?[0-9]+/

A

Given a word like:
foobar888
The first part of the expression (lazy because of the ?) defers to the second part ([0-9]+). The wildcard may occur 0 or more times, and a lazy quantifier starts by trying 0. The second part then looks for a digit, finds “f” and fails, so control goes back to the first part, which takes one more character and hands over again. This repeats until .*? has consumed “foobar”; [0-9]+ then matches “888”, and the overall match is “foobar888”.

145
Q

If /.*?[0-9]+/ matches the string “page 266”, what would /.*?[0-9]*?/ match?

A

Nothing useful: an empty (zero-length) match at the start of the string. Both elements of the second regex may match 0 or more times, and both are lazy, so at the first position each is satisfied with matching nothing and the engine returns.

146
Q

Is using lazy expressions faster than using greedy expressions?

A

No - there is no way to optimise for this ahead of time, since you don’t know what order the data will be in. From an algorithm point of view, it does basically the same thing just reversing the priority.

147
Q

Is /.+/ faster than /.*/?

A

Yes - the /.*/ method requires that the engine backtracks to the very first character it tested in that round, to determine whether the second possibility (that the character didn’t exist) is true. This doesn’t occur with /.+/ because it has to occur at least once.

148
Q

What is faster than /.+/?

A

If you at least know the range that the number of characters may occupy, i.e. /.{3,6}/ - this becomes faster as you are able to limit the checking.

149
Q

Is /.+/ faster than /[a-zA-Z]+/ ?

A

No - the second one is faster as it is a more refined search (i.e. within limits)

150
Q

Write an expression that matches words that start with an uppercase letter and end in s?

A

/\b[A-Z][a-z]+s\b/

151
Q

What is one way to improve efficiency when searching for words in strings?

A

Search for whole words - use word boundaries etc.

152
Q

Can a grouping meta character be used inside of a character set?

A

No, character sets are for characters - grouping meta character does not make sense in this context.

153
Q

Can a grouping meta character contain character sets?

A

yes - you can group multiple character sets together into a group.

154
Q

How would you match the words “dependent” and “independent”?

A

/(in)?dependent/

155
Q

Is /runs?/ and /run(s)?/ the same thing?

A

yes - but the second version may be clearer to read.

156
Q

Is “([A-Z][0-9])” The same as “[A-Z][0-9]”

A

It will match the same things - the use of the brackets however does instruct the engine to remember this pattern for later use.

157
Q

Does the alternation character require the grouping meta characters to work?

A

No - but it’s best to do it, otherwise it could change the meaning or be hard to read.

158
Q

With the alternation character, “|” what order does the match get done in?

A

Left to right

abc|def - would try to match abc first then def

159
Q

is /applejuice|sauce/ The same as /apple(juice|sauce)/

A

No - the first one would only match applejuice or sauce. The second one would match applejuice or applesauce.

160
Q

Is the alternating (or | character) good for finding misspelled words?

A

Yes - you can do something like:

w(ei|ie)rd

This would match weird or wierd

161
Q

Does the following expression:
/apple|orange/
match
apple|orange

A

No - the | character is working as the OR (alternation) metacharacter in this context. To make that match you would need to escape the character.

/apple\|orange/

162
Q

If you have a pattern /abc|def/
and global search is not enabled - what is matched in the following string?
abcdef

A

Just the abc portion. At that point the match is complete, and it moves on.

163
Q

What is allowed in the following expression ( | )?

(i.e. grouping and alternation)?

A

Basically anything, even additional alternations

164
Q

You are asked to match a string “peanut” or “peanutbutter” but your expression should prefer “peanutbutter” if it exists. Why is the following expression a bad choice, and what is the right expression?

/(peanut|peanutbutter)/

A

It is the wrong choice because alternation is tried left to right and the engine stops at the first alternative that succeeds. Therefore with the following data:

“peanutbutter”

it will match the first part of the word (“peanut”) and never give you back the preferred “peanutbutter”. You should write it like this:

peanut(butter)?

165
Q

When using the or operator, does the engine scan the whole string for the first option?

A

No - in fact it checks the first character, if it doesn’t match, then it uses the second option. If it runs out of options, THEN it moves onto the next character. That can be seen here:

abcdefxyz

xyz|abc|def

This will return “abc” and not the expected xyz, because it doesn’t parse the whole string for each option.
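
A small PHP sketch of how alternation is evaluated, reusing the strings from this card and the peanut card above. The variable names are illustrative.

```php
<?php
// Alternatives are tried left to right at each position in the subject,
// so the earliest position that matches any alternative wins.
preg_match('/xyz|abc|def/', 'abcdefxyz', $first);
echo $first[0], PHP_EOL; // abc

// At a given position the first alternative that succeeds is used,
// even if a later alternative would have matched more text.
preg_match('/peanut|peanutbutter/', 'peanutbutter', $short);
preg_match('/peanut(butter)?/', 'peanutbutter', $long);
echo $short[0], PHP_EOL; // peanut
echo $long[0], PHP_EOL;  // peanutbutter
```
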

166
Q

What is the best way to order your options for efficiency?

A

Put the simplest, most efficient option first. i.e.

/\w+\d{2,5}|export\d{2}/

It’s best to reverse it:

/export\d{2}|\w+\d{2,5}/

Because export will quickly discard many matches as it can only match a very explicit set of characters. The second option would need to do a lot of going forwards and backtracking, as the \w+ will match a lot of potential items before it realises there is no digit in the string.

167
Q

If you have the following expression:
/(aa|bb|cc){3}/

Does this mean it will only match “aaaaaa”?

A

No - the alternative chosen on one repetition doesn't affect the next in any way, so it could match
aabbcc
aaaaaa
cccccc
bbcccc
etc.

168
Q

Are the following strings matched by the pattern /(\d\d|[A-Z][A-Z]){3}/

aabbcc
1122cc
11aacc
cccc11
123456

A

Only 123456 is matched (three pairs of digits). The others all contain lowercase letter pairs, and the character range explicitly states they should be uppercase characters.

169
Q

Are the following strings matched by the pattern /(\d\d|[A-Z][A-Z]){3}/

AABBCC
1122CC
11AACC
CCCC11
123456
A

Yes, they are all matched, remember, when a match occurs when using an or in a grouping, the first match doesn’t affect the second one. So the first match is made, then it moves on - and starts again at the next position.

170
Q

You wanted to write a very simple expression that found all the words and spaces in a document. What is wrong with the following expression?

(\w )+

A

You are using the grouping metacharacters. This would only find a single word character followed by a space, repeated one or more times.

171
Q

You wanted to write a very simple expression that found all the words and spaces in a document. What is wrong with the following expression?

[\w]+

A

This expression will find one or more word characters, which is good - but fails to find spaces.

[\w ]+

Add a space into the expression and it will find all of the words that have a space between them. I.e. “sweet corn” becomes a single match.

172
Q

What is the difference between the ^ $ characters and the \A and \Z characters?

A

Only the way it handles the new line characters.

In multiline mode, ^ and $ will start matching line breaks, however \A and \Z behave exactly like ^ and $ in single line mode.

173
Q

What are the two uses for the caret ^ character?

A

The negative character set (i.e. only used when inside a set of square brackets) and beginning of line.

174
Q

What is an important distinction between anchor characters (i.e. the ^ and $ characters) and normal characters?

A

They have zero width - they refer to a position and not to a character.

175
Q

What is another way to write the expression /^apple/?

A

/\Aapple/

176
Q

What is one caveat of using \A and \Z anchor characters?

A

They aren’t widely supported (i.e. not in Javascript) - although they are available in PHP.

177
Q

There is a subtle difference between these two expressions, what are they?
[a-z .]
^[a-z]$

A

The first one will match any character between a-z, a space and a period. This means a sentence like this:
“you smell like cheese.”

Would match everything but the period. But it would still match.

The second expression REQUIRES that every character from beginning to end is matched, and because there is no period in the second expression, the sentence “you smell like cheese.” would not match. The big difference would be in something like preg_match_all: the first one would return an array of each match, but the second one would return a single array element with the full match, because in order to match, the string has to be correct from beginning to end.

178
Q

If you are matching a string “you eat chicken” - and in regexpal you may write something like:
/[a-z ]/
This highlights all the characters showing a match, but why isn’t it what you want?

A

Because it has matched each individual character. And this means preg_match_all would return every instance of a character as an array.

Using /[a-z]+/

would be better, as it would match entire words. This is because the expression [a-z]+ is greedy, and will move through each character until it gets to a space or something else that doesn’t fit the expression. At that point, that’s the match, and that is what is returned. i.e.

“foobar….” - only “foobar” would match, and it would be returned as a single array element.

179
Q

What is matched from the following sentences if you use the pattern:

/^[a-z]+$/

chicken
Big chicken
big chicken

A

Just chicken - since it will only match lower case characters, and only words with no spaces or caps.

180
Q

Does the expression:
/^[a-z]$/

mean match any word that starts and ends with a lowercase letter?

A

No - it means find any one letter words that are on a single line (i.e. a sentence made of one letter)

181
Q

How would you just grab the first word in a sentence, if you knew it was going to be a set of uppercase characters, 3 characters long? i.e.
AAA Foobar
CCC Chiken

A

^[A-Z]{3}\s

The trailing \s requires whitespace after the three letters, which makes sure you don’t pick up words like BBBc

182
Q

Can you use the anchor characters inside a grouping expression?

A

Yes - you could do something like:

(^\s{1,4}|\s{1,4}$)

Which would find all of the space characters (from 1-4) at the beginning or the end of a line.

183
Q

Why might the following expression fail to match the following words?

/[a-z ]+$/
chicken
foobar
foobaz

A

If there are line breaks after each one - then they will not match. ^$\A\Z does not match line breaks

This is called single line mode.

NOTE: The \Z option will not help in this case

184
Q

Using preg_match_all, what is the difference in output for:

[a-z]+
[a-z]+?

on

“I am lazy”

A

The first one will return an array with:
“am”, “lazy”
The second will return “a”, “m”, “l”, “a”, “z”, “y”, because it is set to be lazy, rather than greedy. As soon as it gets a match it says “Yup, I’m done” rather than keeping on going until it can’t get a match.
Neither matches the uppercase “I”, since the set only covers a-z; add the i modifier to include it.
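
A quick PHP check of this card, assuming the string is written exactly as “I am lazy” with a capital I. The variable names are illustrative.

```php
<?php
$subject = 'I am lazy';

// Greedy: each match runs to the end of a run of lowercase letters.
preg_match_all('/[a-z]+/', $subject, $greedy);

// Lazy: each match stops after a single letter.
preg_match_all('/[a-z]+?/', $subject, $lazy);

// With the i modifier the uppercase "I" is matched as well.
preg_match_all('/[a-z]+/i', $subject, $caseInsensitive);

echo implode(',', $greedy[0]), PHP_EOL;          // am,lazy
echo implode(',', $lazy[0]), PHP_EOL;            // a,m,l,a,z,y
echo implode(',', $caseInsensitive[0]), PHP_EOL; // I,am,lazy
```
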

185
Q

What is single line mode in a regex engine?

A

The default setting. The anchors ^ and $ refer to the start and end of the whole string rather than of each line, so when using $ in particular it can cause problems: the expression won’t match at the invisible return character at the end of each line.

186
Q

If you use the beginning and end anchors - why might you find you aren’t getting a match?

A

Because of the end of line characters. You need to turn the regex engine into multiline mode. In PHP, you can simply add the m modifier after the closing delimiter of the regex for preg_match and preg_match_all, e.g. /^[a-z]+$/m
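
A minimal PHP sketch of the m modifier, assuming three words separated by "\n" line breaks. The variable names are illustrative.

```php
<?php
$subject = "chicken\nfoobar\nfoobaz";

// Without m, ^ and $ refer to the whole string, which contains line breaks,
// so a pattern of only lowercase letters cannot span it.
$withoutM = preg_match_all('/^[a-z]+$/', $subject, $default);

// With m, ^ and $ also match at the start and end of every line.
$withM = preg_match_all('/^[a-z]+$/m', $subject, $multiline);

echo $withoutM, PHP_EOL;                   // 0
echo implode(',', $multiline[0]), PHP_EOL; // chicken,foobar,foobaz
```
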

187
Q

Why are the begin and end anchors different to other matches?

A

You are looking at an entire string - which is why multiline mode should probably be on.

188
Q

What are the two word boundary meta characters?

A

\b - Word boundary
\B - Not a word boundary

NOTE: These are positional characters - and do not match an actual character.

189
Q

What are the conditions that form a word boundary?

A

Boundaries form at the end of a string and the beginning of a string.

Anywhere the characters change from/to a word character to a non-word character.

Word characters being anything [A-Za-z0-9_]

190
Q

Write a regex that would find all the words in a sentence using word boundaries?

A

/\b\w+\b/

191
Q

How many word boundaries exist in the phrase “top-notch”?

A

Four. Remember - the hyphen is not a word character, so it forms a boundary.

One before the t in top,
one after the h in notch
one after the p in top
one before the n in notch.

192
Q

Write a regex that splits the following sentence into words (using word boundaries) - but treats ‘top-notch’ as a single word.

“the chicken was top-notch”

A

\b[A-Za-z\-]+\b

NOTE that we include the hyphen in the characters accepted within the word. It is escaped here so it is not read as a range; a hyphen placed first or last in the set would also be taken literally.

193
Q

What would \B\w+\B match in the string

“This is a test”

A

hi es

Because they are characters that are not at word boundaries.
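
A short PHP sketch of \b and \B, reusing the sample sentences from the cards above. The variable names are illustrative.

```php
<?php
// \b...\b: whole words. The hyphen is not a word character,
// so "top-notch" is split into two words.
preg_match_all('/\b\w+\b/', 'the chicken was top-notch', $words);

// \B...\B: runs of word characters that neither start nor end at a boundary.
preg_match_all('/\B\w+\B/', 'This is a test', $inner);

echo implode(',', $words[0]), PHP_EOL; // the,chicken,was,top,notch
echo implode(',', $inner[0]), PHP_EOL; // hi,es
```
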

194
Q

How would you extract the word “important” using non-word boundaries from this text?
AAAimportant

A

\Bimportant

195
Q

Would the word “important” be extracted from the following string: “**important” if the following regex was used:

\B\w+

A

No - the word “mportant” would be extracted. That’s because the capital B is saying “Not at a word boundary”. And because the transition between * and i forms a word boundary, it doesn’t match. i and m ARE word characters, and therefore not a word boundary, and therefore matches.

196
Q

When you use the following regex “\b[\w’]+\b” on the word “summer’s”, what will it return?

A

The full word “summer’s”. Please note - the word boundaries still exist - but because the + is greedy, it will continue to take characters that are allowed (the ‘ is allowed due to it being in the []).

197
Q

What would be returned in the following string “summer’s” if the following regex were used:

“\b[\w’]+?\b”

A

summer
’
s

That’s because there is a word boundary on each side of the ‘ character, between r and s, so the apostrophe comes back as a match of its own. Because the expression has been made lazy, and not greedy, the regex engine will check after each character to see if it’s reached a word boundary, rather than being greedy until it has no characters left that match.

NOTE: To match “summer’s” as one word, remove the lazy ?

198
Q

How can word boundaries improve speed and efficiency?

A

By using a regex that has word boundaries - the engine can quickly discard matches as it moves along the string because the first thing it checks is the word boundary condition.

\b\w+s\b

I.e. this will find a word that has a boundary on each side and ends with s. The word boundaries mean it doesn’t have to parse partial parts of the string. i.e. if we have the word chicken. The engine might see there is a boundary at c - check each character to the end of the word chicken. See there is no s at the end, then backtrack to the letter h. But it doesn’t have to parse through to the end of this string again - because there is no boundaries at h, or i, or c etc. It can just move along until it finds another character with a boundary.

199
Q

Is a space a word boundary?

A

No. A word boundary is a position - not a character. It is zero length.

200
Q

Does the string “apples and oranges” get matched by the following regex?

/apples\band\boranges/

A

No - because \b (word boundary) is not equal to a space. There is a boundary there, but this doesn’t match because the spaces are not taken into account.

201
Q

Does the string “apples and oranges” get matched by the following regex?

/apples \band\b oranges/

A

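Yes - the spaces are matched literally and each \b sits at a real boundary (between the space and “a”, and between “d” and the space). There are actually four boundaries between the words apples and oranges, so the following, which states all of them, matches as well.

/apples\b \band\b \boranges/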

202
Q

What is a back reference?

A

Whenever you use parenthesis around an expression, this creates a back reference, where the actual matched data is stored.

203
Q

How do you access the data in a back reference?

A

You can use the backslash character and a number

\1
\9

204
Q

What are the two main ways that backreferences can be used?

A

1. In the same expression as the group
2. Referenced later (usually in a programming language)

205
Q

Can backreferences be used inside character classes?

A

No - they are two fundamentally different concepts. While a character class refers to ‘types’ of characters that can be matched, a backreference actually refers to specific data.

206
Q

How might you match the string “apples to apples” using a back reference?

A

/(apples) to \1/

The first word in parenthesis is the back reference, the \1 refers to the data stored in there.

207
Q

How would you match the following string (using 3 back references). “abcdefefcdab”

A

/(ab)(cd)(ef)\3\2\1/

208
Q

Write a regex expression that will match names where the second name contains the first name within it (and a fixed suffix, like ‘son’)

Steve Stevenson
John Johnson
Billiam Billiamson

A

/\b([A-Z][a-z]+)\b\s\b\1n?son\b/

209
Q

Write a regex that finds duplicate words

Paris in the
the spring

A

\b(\w+)\s+\1\b

NOTE: In PHP, preg_match stops after the first match, so use preg_match_all if you need every duplicate. The \s+ also matches the line break, so a repeat split across two lines is found.

210
Q

What is captured in the following expression if the string is “b”
/(a?)b/

A

The group captures “” (an empty string) and the overall match is “b”.

NOTE: Optional operators will result in an empty string being captured if an exact match is not present.

211
Q

What is the gotcha with optional characters in back references?

A

An optional character inside parentheses will capture a result regardless of whether it is there or not. So if the letter ‘a’ is present, the capture will be ‘a’. If the a is not present, then the capture is ‘’ - i.e. zero length.

/(a?)b\1c/

Would match “abac”, but it would also match bc.

212
Q

What is the difference between the way the optional grouping works here

(A?)B
(A)?B

A

In the first example, the backreference \1 would contain an empty string if the A was not present.

In the second example, if A is not found, nothing is recorded, as the A was not optional. The Whole GROUP was optional, but not the A in and of itself.

213
Q

What is odd (with regards to how Javascript handles the following) expression?

/(A)?B\1/

B
ABA

A

It would seem that it should match both strings, because if A is missing, then logically the backreference would be empty too.

However, JavaScript is the exception here: it treats the unset group as empty and matches ‘B’. Most other engines, PHP's PCRE included, do NOT match ‘B’ with this expression.
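
A PHP sketch of the difference between an optional character inside a group and an optional group, as PCRE handles it. The anchors, the trailing c and the variable names are assumptions added for illustration.

```php
<?php
// (a?) always takes part in the match, so \1 is set (possibly to '').
$optionalInside = '/^(a?)b\1c$/';
// (a)? may not take part at all, which leaves \1 unset.
$optionalGroup = '/^(a)?b\1c$/';

$insideAbac = preg_match($optionalInside, 'abac'); // 1
$insideBc   = preg_match($optionalInside, 'bc');   // 1: \1 is an empty string
$groupAbac  = preg_match($optionalGroup, 'abac');  // 1
$groupBc    = preg_match($optionalGroup, 'bc');    // 0: a reference to an unset group fails

var_dump($insideAbac, $insideBc, $groupAbac, $groupBc);
```
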

214
Q

In PHP you can use a backreference from 10-99. But how would you do this if you wanted to use a backreference \1 next to a literal 1? That would look like backreference \11 and confuse the regex engine.

A

You can use a different notation for those occurrences in the replacement string of preg_replace

${1}1

215
Q

When you are using preg_replace, you may write an expression that takes in a little bit more than required, to guarantee you are in the right part of the file.

But this means you must…

A

Write the extra bits back into the file. So you must capture them in groups.

216
Q

To modify a csv that has a list of presidents so the first name and last name are put in separate fields and reversed, how would you do this in PHP?

1, John Adams, 1232

needs to be

1, Adams, John, 1232

A
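$file_data = file_get_contents('us_presidents.csv');
// Groups: 1 = number, 2 = first name, 3 = last name, 4 = year; \s* allows a space after each comma
$regex = '/^(\d{1,2}),\s*([\w .]+?) ([\w ]+?),\s*(\d{1,4})/m';
$replace = '\1, \3, \2, \4';
$result = preg_replace($regex, $replace, $file_data);
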
217
Q

What does the ?: meta character do?

A

If you place this within a group, for example:

(?:\w+)

This tells the regex engine that this group is a non-capturing group.

218
Q

Why are non-capturing groups useful?

A

It’s good for optimisation, and it’s also useful if your regex engine only goes from \1 - \9 backreferences. Save spaces.

219
Q

Which of the following strings:

oranges and apples to oranges
oranges and apples to apples

Is matched by /(?:oranges) and (apples) to \1/

A

oranges and apples to apples.

The reason being the first group is non capturing, so the second group becomes referenced by the number \1.

220
Q

What’s wrong with the following regex

/(?:oranges) and (apples) to \2/

A

There is no group two (the first group is non-capturing), so it cannot match as expected. PCRE treats a reference to a group that does not exist as an error in the pattern.

221
Q

The syntax for a “non-capturing group” uses two characters, the ? and :. What does the ? mean in this context?

A

It’s a character that says - “give this group a different meaning”. The “:” actually defines the group as “non-capturing”. The point being, there are other characters that work with the “?” metacharacter to give different results.

222
Q

What are the two main types of lookaround assertions?

A

Lookahead and lookbehind

223
Q

There are two types of lookahead, and lookbehind assertions. What are they?

A

Positive and negative

224
Q

What metacharacters define a positive lookahead assertion?

A

?=

225
Q

How many characters do the look ahead assertions match?

A

0, they are zero length.

226
Q

You want to find all the words in a text that end with a comma. And you use this regex:

\b[A-Za-z’]+\b,

What is the flaw in this?

A

It returns the comma as well. You should use a look ahead assertion:

\b[A-Za-z’]+\b(?=,)

This will return all words that end with a comma, by looking ahead, determining a comma exists, then doing the match. But only matching the word itself.
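
A minimal PHP sketch comparing the two patterns, assuming a made-up sample sentence and straight apostrophes. The variable names are illustrative.

```php
<?php
$subject = 'Apples, pears and plums, please';

// The comma is part of the pattern, so it is part of each match.
preg_match_all('/\b[A-Za-z\']+\b,/', $subject, $withComma);

// The lookahead only checks that a comma follows; it consumes nothing.
preg_match_all('/\b[A-Za-z\']+\b(?=,)/', $subject, $wordsOnly);

echo implode(' ', $withComma[0]), PHP_EOL; // Apples, plums,
echo implode(' ', $wordsOnly[0]), PHP_EOL; // Apples plums
```
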

227
Q

What would the difference between the following two regex’s be? The intention is to have a regex that finds words that end in a comma, but don’t return the comma.

\b[A-Za-z’]+\b(?=,)
\b[A-Za-z’]+\b(?:,)

A

The second one will match and return the comma.

228
Q

What is the difference between:

(?=,)
(?:,)

A

The first regex is a look ahead assertion - it checks to see if the ‘,’ is there - but doesn’t match it. It is zero width.

The second regex is a non-capturing group. It actually matches the ‘,’ so is NOT zero width.

229
Q

What is the major differences between the regex’s

/(?=seashore)sea/
/sea(?=shore)/

A

They will both match the same thing. But the first one will do the look ahead first, and not start the scan for the word sea if it doesn’t find seashore.

The second one will scan for sea, then do the look ahead if it finds it.

230
Q

Which is likely to be more efficient?

/(?=seashore)sea/
/sea(?=shore)/

A

The second one /sea(?=shore)/
The reason being the first regex tests the lookahead string “seashore” at every position it tries. When the lookahead succeeds, it is zero width, so the engine rewinds to where the lookahead started and matches “sea” a second time.

With the second one, it will do the lookahead from the point it finds the word “sea” - so no rewinding.

231
Q

Even though this regular expression may be less efficient, why could it be a desirable way of writing one?

/(?=seashore)sea/

A

Because the regex will rewind after matching “seashore” and THEN start the search for sea.

This allows you to do multiple checks within a single regex.

232
Q

This matches US phone numbers
/\d{3}-\d{3}-\d{4}/
And this matches a string made up only of the digits 0 to 5 and hyphens.
^[0-5-]+$

Right now, you’d have to do two passes. How would you make these a single pass regex?

A

(?=^[0-5-]+$)\d{3}-\d{3}-\d{4}

The look ahead assertion works here because it checks to see if the entire string satisfies the assertion. It does, it rewinds, then checks to see if it matches the following regex. It does. So it returns the phone number.
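
A short PHP sketch of the combined pattern. The two phone numbers are made-up sample data and the variable names are illustrative.

```php
<?php
// The lookahead checks the whole string first (only 0-5 and hyphens),
// then the engine returns to the start and matches the phone number.
$pattern = '/(?=^[0-5-]+$)\d{3}-\d{3}-\d{4}/';

$allLowDigits = preg_match($pattern, '555-123-4010', $match); // 1
$hasHighDigit = preg_match($pattern, '555-123-4567');         // 0: contains 6 and 7

echo $match[0], PHP_EOL; // 555-123-4010
```
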

233
Q

(?=.*4321) would check for what?

A

It would check to see if the string 4321 is somewhere in the string. Keep in mind, the .* is saying “I’m a wildcard, just look for anything followed by…” 4321.

234
Q

Why wouldn’t the following expression match the word “light”

\b(?=gh)\w+\b

A

Because we are using word boundaries. It’s saying, we are looking for a word boundary, followed by the characters “gh” (in a lookahead assertion). Of course, the gh in light does not occur on a word boundary.

235
Q

What meta characters are used for negative look ahead assertions?

A

?!

(?!regex) /* Make sure you don’t put a space after ! */

236
Q

How would you write a basic expression using a negative look ahead assertion, so that the pattern matches “online”, only if it isn’t followed by the word “training”?

A

/online(?! training)/

237
Q

What would be one way to search through a text for words that may contain apostrophes, but are NOT followed by a comma or a period?

A

\b[A-Za-z’]+?\b(?![,.])

238
Q

How would you use a negative look ahead to determine the last time the word “black” was used in a string?

“The black dog ran through the black night after the black car”

A

\bblack\b(?!.*black)

Or using capture groups

(\bblack\b)(?!.*\1)

The reason this works is that it’s testing for the word “black”, that doesn’t have the word “black” after it - which logically is the last time it will occur in a string.
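
A PHP sketch that checks the negative lookahead really lands on the last occurrence, using the sentence from this card. PREG_OFFSET_CAPTURE and strrpos are used only to compare positions; the variable names are illustrative.

```php
<?php
$subject = 'The black dog ran through the black night after the black car';

// Match "black" only when no further "black" appears later in the string.
preg_match('/\bblack\b(?!.*black)/', $subject, $match, PREG_OFFSET_CAPTURE);

$matchedOffset = $match[0][1];
$lastOffset = strrpos($subject, 'black');

var_dump($matchedOffset === $lastOffset); // bool(true)
```
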

239
Q

What are the meta characters for look behind assertions?

A

?<=

?<!

240
Q

How does the following parse the word “baseball”

(?<=base)ball

A

For each character, as it moves along the string, it looks backwards and says “Is the word base there”. If it is, then it will check for ball - looking forward.

241
Q

/(?

A

/ball(?

242
Q

Are look behind assertions supported in JavaScript or PHP?

A

PHP supports them. JavaScript originally did not, but has supported them since ES2018.

243
Q

What are some of the limitations of using look behind assertions?

A

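Because of the complexity of look behind assertions (from a processing point of view), they are limited to:

fixed strings,
character classes
Alternation (each alternative must be a fixed length. PCRE, which PHP uses, lets top-level alternatives differ in length from one another, while some other engines require them all to be the same length)
You cannot use unbounded repetition, or optional expressions.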

244
Q

Look ahead assertions can be very complex, while look behind assertions….

A

must be very simple.

245
Q

What is wrong with the following expression?

(?<=ben|jenn)(jamin|ny)

A

In the look behind assertion, there is an alternation. Either “ben” or “jenn”. The idea being that a string of either “jamin” or “ny” will be found, then the lookbehind assertion will check to see if they are preceded by “ben” or “jenn” - (i.e. “Benjamin, Benny, Jennny”).

The issue here is that the “jenn” string is longer than the “ben” string. Many engines require every alternative in a look behind to be the same length and reject this pattern. PCRE, the engine behind PHP's preg functions, is more lenient: top-level alternatives may differ in length as long as each one is itself fixed length.

246
Q

Write a reg ex to match a number “54.00” and make sure it doesn’t have a $ sign in front of it?

A

(?<![$\d])\d+\.\d\d

247
Q

Given the expression (?<![$\d])\d+\.\d\d
What would happen if you change the matching part of the expression, to a positive lookahead?

(?<![$\d])(?=\d+\.\d\d)

A

It would give you a zero width match, however, this would allow you to do inserts - so this is very useful. It would “position” the cursor just in front of the first match.

34 54.00 $54.00

Which would be directly in front of the 54.00 in the string above.

248
Q

Write a regex that would put commas in the appropriate places in a numerical string.

A

(?<=\d)(?=(\d\d\d)+(?!\d))

(?<=\d) == Look behind assertion, make sure there is a number in front of the digits
(\d\d\d)+ == Find all the three digits grouped
(?!\d) == Make sure it doesn’t have a digit after it
(?=(\d\d\d)+(?!\d)) == Turn previous two tests into an assertion

The entire thing is a zero width match - so its useful in preg_replace
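
A minimal PHP sketch of using this zero width match with preg_replace. The sample numbers and variable names are illustrative.

```php
<?php
// Zero width match: a position with a digit behind it and a whole
// number of three-digit groups (then no more digits) ahead of it.
$pattern = '/(?<=\d)(?=(\d\d\d)+(?!\d))/';

// Nothing is consumed, so the replacement is inserted at each position.
$large = preg_replace($pattern, ',', '1234567');
$small = preg_replace($pattern, ',', '999');

echo $large, PHP_EOL; // 1,234,567
echo $small, PHP_EOL; // 999
```
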

249
Q

How do you represent a Unicode character?

A

“U+” followed by the code point as a hex number, usually four digits (e.g. U+00E9)

250
Q

How would you match the word “café” with unicode?

A

Noting the e is actually an é - which is represented by the unicode character U+00E9

/caf\u00E9/

/caf\x{00E9}/u (in php)

Note the use of \u to indicate it’s a unicode character. NOTE: IN PHP YOU MUST USE \x and enclose the unicode value in {}’s

251
Q

What is one of the gotchas with matching a word like “café” using regex?

A

The word can be stored as 4 or 5 separate characters. I.e. “cafe”, “café” (with the single character é), and “café” where the “é” is made of two separate characters (e plus a combining accent).

252
Q

How would you match the é character if it was made of two separate characters?

A

caf\x{0065}\x{0301}
or
caf\u0065\u0301 (depending on regex engine)

253
Q

What wildcard can you use to search for unicode characters, which may be made up of multiple characters?

A

\X

NOTE: also matches end of line characters
Only supported in Perl and PHP

/caf\X/ matches both “cafe” and “café”
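
A PHP sketch of the three spellings, assuming PHP 7 or later (for the "\u{...}" string escape) and the u modifier so PCRE treats the text as UTF-8. The anchors and variable names are added for illustration.

```php
<?php
$plain       = 'cafe';
$precomposed = "caf\u{00E9}";   // é as the single code point U+00E9
$combining   = "cafe\u{0301}";  // e followed by the combining acute accent U+0301

// An explicit code point only matches the spelling that uses it.
$matchesPrecomposed = preg_match('/^caf\x{00E9}$/u', $precomposed);
$missesCombining    = preg_match('/^caf\x{00E9}$/u', $combining);
$matchesCombining   = preg_match('/^caf\x{0065}\x{0301}$/u', $combining);

// \X matches one whole character as displayed, however many code points it uses.
$clusterResults = [];
foreach ([$plain, $precomposed, $combining] as $word) {
    $clusterResults[] = preg_match('/^caf\X$/u', $word);
}

var_dump($matchesPrecomposed, $missesCombining, $matchesCombining, $clusterResults);
```
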

254
Q

How do the unicode property wildcards work?

A

You can use a property wild card to search for specific classes of character, via their properties (for those characters that have properties). i.e.

\p{L} - any letter
\p{M} - any mark (i.e. character that has a mark)
\p{Z} - Any white space separator
\p{S} - Symbol
\p{N} - Number
\p{P} - Punctuation
\p{C} - Other (control characters, unassigned code points, etc.)

NOTE: \P (uppercase P) does the opposite. I.e. NOT a letter.

255
Q

How would you write a regex to validate US postal codes, knowing that there are two different formats
00000
00000-0000

A

^\d{5}(?:-\d{4})?$

NOTE: It is important to use the line anchors - so that you only get matches for things that ENTIRELY match the data
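
A small PHP sketch of an anchored postal code check, using ? so the four-digit extension appears at most once. The sample values and variable names are illustrative.

```php
<?php
$zipPattern = '/^\d{5}(?:-\d{4})?$/';

$results = [];
foreach (['12345', '12345-6789', '1234', '12345-6789-0000', 'x12345'] as $candidate) {
    // preg_match returns 1 for a match and 0 for no match.
    $results[$candidate] = preg_match($zipPattern, $candidate);
}

print_r($results);
```
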

256
Q

Why might you have to escape forward slashes in a regex that is to be used in a programming language?

A

Because the programming language generally requires the forward slashes as part of the regex itself, around the outside. If you don’t escape the forward slashes (for example, when matching a protocol for a URL) - it will get confused as to where the end of the regex is.

257
Q

What is the problem with having an expression like this?

[/#?]?.*

A

The first character group is optional, yet we are specifying that only forward slash, hash and question mark are valid. But then the next character is a wild card. So - if there is no character that matches in that spot, the wild card will be used anyway, making that character grouping redundant.

258
Q

What is wrong with the following regex, and how would you write one that matches all of these?
^\d*\.?\d*$

5.1
314.22323
0.123
.345
23

A

Because all of the elements are optional, the pattern also matches an empty string, so preg_match_all in multiline mode will in fact match on empty lines. The better way to handle this would be to split it into two, using the alternation operator.

^(?:\d*\.\d+|\d+)$

259
Q

What is one gotcha when pattern matching for the unicode value for the ¥ symbol?

A

There are two unicode values for the ¥ symbol: \u00A5 and \uFFE5 (the full width ¥ symbol).

260
Q

If you are writing a regex for currency, what is one caveat to do with currency symbols you may have to take into account?

A

Some currency symbols are written at the end of the monetary value. So it’s not a simple case of having an optional currency symbol at the beginning and another at the end - that would also match a wrong value that has a currency symbol on both ends.

In order to write this regex, you’d have to duplicate the number pattern and use an alternation: one branch with the symbol at the start, and one with it at the end.

261
Q

When trying to match a cell in an ip address, you might be tempted to use this:
^[0-255] etc. Why won’t this work?

A

This is a character class - so it is read as the range 0 to 2, plus the characters 5 and 5. It matches a single character that is 0, 1, 2 or 5.

This is a number as “string” problem.

262
Q

Which is more optimal for the regex engine?

/0?[0-9][0-9]?/ or /0?[0-9]?[0-9]/

A

The first one. With the optional digit last, the mandatory [0-9] matches first and the greedy optional one simply takes another digit if there is one. This reduces the amount of backtracking the engine has to do.

263
Q

What is the caveat with trying to match number ranges in regex?

A

There are no numbers as such, everything is a string. So you may have to break the range down into the possible string shapes, and match those, rather than the actual number range. For instance, numbers 0-255 would be done with the following expression (no spaces, unless you use the x flag):

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
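
A sketch of the octet pattern used for a whole IPv4 address. The helper name isIpv4 is hypothetical; for real code PHP's built-in filter_var() with FILTER_VALIDATE_IP is usually the simpler choice.

```php
<?php
// One octet, treated as text: 250-255, 200-249, then 0-199 (leading zeros allowed).
$octet = '(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)';

// Hypothetical helper: four octets separated by literal dots.
function isIpv4(string $value, string $octet): bool
{
    return preg_match('/^' . $octet . '(?:\.' . $octet . '){3}$/', $value) === 1;
}

var_dump(isIpv4('192.168.0.1', $octet)); // bool(true)
var_dump(isIpv4('256.1.1.1', $octet));   // bool(false)
```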

264
Q

Write the portion of a regex that could be used to match a month (1-12, or 01-12)

A

(?:0?[1-9]|1[0-2])

NOTE: We are treating this as a number to string problem. There are no number ranges.

265
Q

How would you write a regex that would match a number between 10-29?

A

[12][0-9]

266
Q

What is the following testing for?

cat[[:>:]]

A

Any word that ends with “cat”. It will match: cat tomcat

But won’t match: certificate catty catalicious

Note this syntax is not available everywhere. It is similar to the word boundary \b, but only matches at the end of a word. It is the POSIX-style version and is supported by some versions of MySQL and later PHP versions.

267
Q

If you have to write a regex that requires the same match at both ends (think of an HTML tag) - how do you resolve the following issue? (note: the angle brackets are left out because Brainscape strips them)

(strong|em)(.*?)(strong|em)

NOTE: The issue is that this will match HTML tags that aren’t balanced - i.e. strong-em or em-strong.

A

You can use a back reference.

(strong|em)(.*?)\1

The \1 will match whichever of the two was found by the first group, therefore creating a balanced match.
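
A small sketch with the angle brackets put back in, using # as the delimiter so the closing slash needs no escaping. The sample strings are invented for illustration.

```php
<?php
// \1 must repeat whatever group 1 captured, so the tags have to match.
$pattern = '#<(strong|em)>(.*?)</\1>#';

$balanced   = preg_match($pattern, '<em>hello</em>', $m);
$unbalanced = preg_match($pattern, '<strong>hello</em>');

var_dump($balanced, $m[1], $m[2]); // int(1), "em", "hello"
var_dump($unbalanced);             // int(0)
```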

268
Q

What is the issue with the following:

^( ([a-z])\1 | [a-z] )?

A

It may not give you the expected result, because the \1 back reference is not actually referring to the capture closest to it. Groups are numbered by their opening parenthesis, so the outer group (the one wrapping the alternation) is group 1, and that is what the \1 is referring to (which is NOT intuitive). To prevent this, simply make the outer group a non-capturing group. (The spaces are only there for readability, and need the x flag.)

^(?: ([a-z])\1 | [a-z] )?

269
Q

What is one performance consideration when ordering your lookaheads, for say a password check where there must be one lower case character, and one digit?

A

It’s best to put the lookahead that is most likely to fail up front. It saves time on processing. In the case of digits and lowercase letters, most people will use lowercase letters first, so if you check for the digit first (the most likely to fail), then you are saving computational cycles.
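
A sketch of such a check with the digit lookahead placed first. The function name and the minimum length of 8 are assumptions added for the example.

```php
<?php
// Hypothetical helper: needs at least one digit, one lowercase letter, 8+ characters.
// The digit lookahead comes first because it is the one more likely to fail.
function isAcceptablePassword(string $password): bool
{
    return preg_match('/^(?=.*\d)(?=.*[a-z]).{8,}$/', $password) === 1;
}

var_dump(isAcceptablePassword('password1')); // bool(true)
var_dump(isAcceptablePassword('password'));  // bool(false), no digit
```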

270
Q

If you pass an object to a function, and modify its state in the function, does that persist outside the function?

A

Yes - an object is “passed by reference” if you like, so the original will in fact be modified.

(In actual fact, the variable holds an object identifier, i.e. a handle. The handle is copied when it is passed, but both copies point to the same object.)

271
Q

What is the primary reason for using the clone keyword in PHP?

A

When you pass an object to a function, it is essentially being passed by reference (technically it is a copy of the object handle - but the effect is the same). This means a change made in that function is visible everywhere that object is used.

Using clone allows you to pass a copy of that object, that can be changed in that function - and not affect the original copy.
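
A minimal sketch of the difference, using a made-up Counter class. Note that clone makes a shallow copy: properties that hold other objects still point at the same objects unless __clone copies them too.

```php
<?php
// Hypothetical class, just to have some state to modify.
class Counter
{
    public $count = 0;
}

function increment(Counter $counter): void
{
    $counter->count++; // modifies the object the caller passed in
}

$original = new Counter();
increment($original);     // the original is changed: count is now 1

$copy = clone $original;  // independent copy, starting at count 1
increment($copy);         // only the copy changes: count is now 2

echo $original->count, ' ', $copy->count, PHP_EOL; // 1 2
```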

272
Q

What is one argument for making all “private” methods - protected instead?

A

If a developer finds a problem with your code, and needs to override the method - they will not be able to, if it is marked private.

273
Q

Private methods are automatically considered…

A

final

274
Q

Abstract methods cannot be marked as…

A

private or final

275
Q

If an API function is declared protected final, what can a developer do with it?

A

They can access it in a sub class, but they cannot override it. They cannot access it outside of the classes (it is essentially private to the outside world).

276
Q

If there are issues associated with using protected / private for object methods - why not just mark everything public?

A

Because even if you have a convention (such as underscores to denote private) - once someone uses an exposed function, your API becomes very difficult to update, because that user will expect the function to continue to operate correctly. Effectively you are stuck with an API you can never change.

277
Q

If you have an object that can only output a string (no way to access the data directly, and it doesn’t return a string) - and you need to modify it inside another object, what is one method by which you can get the data?

A

You can use ob_start().

ob_start(); // Start buffering output
$myObj->printData(); // The output is captured instead of being sent
$data = ob_get_clean(); // Get the buffer contents, then discard the buffer

print strip_tags($data); // This modifies the string by removing the tags

278
Q

You can add abstract functions to traits, do you need to declare the trait as abstract?

A

No - because traits are not directly instantiable, it is not required.

279
Q

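Does a trait need to use the $this variable to access its variables?

A

Yes - the trait’s properties become part of the class that uses it, so they are accessed through $this like any other property.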

280
Q

Given two traits, A and B, with the same function name “printString” - how do you resolve the conflict?

A

use A, B {
A::printString insteadof B;
B::printString as world;
}

NOTE: You MUST resolve the conflict (using insteadof) - you can’t just alias the A and B methods to different names.
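
A runnable sketch of the same resolution. The class name Greeter and the method bodies are assumptions added so the example has something to print.

```php
<?php
trait A
{
    public function printString(): string { return 'Hello'; }
}

trait B
{
    public function printString(): string { return 'World'; }
}

// Hypothetical class using both traits.
class Greeter
{
    use A, B {
        A::printString insteadof B; // A's version wins the name printString
        B::printString as world;    // B's version stays reachable as world()
    }
}

$greeter = new Greeter();
echo $greeter->printString(), ' ', $greeter->world(), PHP_EOL; // Hello World
```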

281
Q

How do you redefine the visibility of a trait method?

A

trait A {
public function printString() { /* Do stuff here */ }
}

use A {
printString as protected;
}

You can alias too:

use A {
printString as protected superPrintString;
}

NOTE: With an alias, the original printString keeps its original visibility; only the alias superPrintString is protected.

It’s probably recommended that you don’t redefine trait method visibility though - as it’s confusing

282
Q

How can you determine in PHP if a call to the server was AJAX?

A
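/* AJAX check: relies on the X-Requested-With header, which libraries such as jQuery add. It is not sent by default and can be spoofed by the client. */
if(!empty($_SERVER['HTTP_X_REQUESTED_WITH']) && strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) == 'xmlhttprequest') {
	/* special ajax here */
	die($content);
}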
283
Q

What is the problem with placing complex object startup code in the __construct method?

A

Being a magic method, this is largely invisible to an external consumer of this object, which can make the object difficult to use. It’s also very hard to test an object like this. You should leave __construct to simple assignment where possible.

284
Q

What is the __destruct method useful for?

A

One useful use case is for caching - you can easily store the state of the object as it is destroyed.

Also useful for closing database connections, or other connections.

285
Q

What would be one use case for the __unset magic method?

A

If unset() is called on an inaccessible (private / protected) or non-existent object property, the __unset magic method is called. If for instance that property was a database connection, this method could be used to unset any other connections that had a dependency on that connection.

286
Q

What is the __sleep magic method used for?

A

When an object is serialised, this method is run by PHP first (if it is defined). This is the developer’s chance to clean up any connections, data etc., i.e. you may want to drop any connections to outside objects that would otherwise be serialised too. It must return an array of the names of the properties that should be serialised.

287
Q

What is the purpose of __wakeup magic method?

A

When an object is serialised, PHP calls the __sleep method (if it exists) and any connections, resources, links to outside objects should be freed in that method.

__wakeup is called when the object is unserialised, and gives the developer the opportunity to reverse this process (e.g. reopen connections).
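
A sketch of the pair working together. The Report class and its in-memory stream are made up for the example; the stream stands in for a connection that cannot be serialised.

```php
<?php
// Hypothetical class holding a resource that must not be serialised.
class Report
{
    public $title;
    private $handle;

    public function __construct(string $title)
    {
        $this->title = $title;
        $this->open();
    }

    private function open(): void
    {
        $this->handle = fopen('php://memory', 'r+');
    }

    // Return only the property names that should be serialised.
    public function __sleep(): array
    {
        return ['title'];
    }

    // Reopen the resource when the object is unserialised.
    public function __wakeup(): void
    {
        $this->open();
    }

    public function isOpen(): bool
    {
        return is_resource($this->handle);
    }
}

$serialised = serialize(new Report('Q1'));
$restored   = unserialize($serialised);

var_dump($restored->title, $restored->isOpen());
```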

288
Q

The __sleep and __wakeup methods have been largely supplanted by what SPL interface?

A

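The Serializable interface

interface Serializable {
public function serialize();
public function unserialize($serialized_string);
}

NOTE: As of PHP 8.1 the Serializable interface is itself deprecated in favour of the __serialize() and __unserialize() magic methods (available since PHP 7.4).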

289
Q

What is the purpose of the __invoke magic method?

A

You can call an object as if it was a function.

290
Q

If you take the output of var_export (for an object), what function would you pass it through to get another object with the same properties?

A

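$new_obj = eval('return ' . var_export($old_object, true) . ';');

NOTE: var_export needs true as its second argument to return the code as a string instead of printing it, and eval needs a return statement to hand back a value. For your own classes the exported code calls the static ClassName::__set_state() method, so the class must define it.

Note it only gives you properties, not methods, or correct visibility specifiers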
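
A sketch of the round trip for a user-defined class. The Point class is hypothetical; it defines __set_state because that is the static method the exported code calls.

```php
<?php
// Hypothetical class used to demonstrate the round trip.
class Point
{
    public $x;
    public $y;

    public function __construct($x = 0, $y = 0)
    {
        $this->x = $x;
        $this->y = $y;
    }

    // var_export() emits a call to this method with the properties as an array.
    public static function __set_state(array $properties)
    {
        return new self($properties['x'], $properties['y']);
    }
}

$old  = new Point(3, 4);
$code = var_export($old, true);           // PHP source as a string
$new  = eval('return ' . $code . ';');    // run it to build a second object

var_dump($new);
```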

291
Q

What are some good use cases for the __clone method?

A
Prevent cloning (i.e. singleton)
Reset database connections (or other resources) on the new copy
292
Q

What function can be used to determine what traits a class uses?

A

class_uses()

NOTE: It will not return any traits of a parent class.

293
Q

What is the potential risk of using the PHP function extract()?

A

Extract will import the symbols / variables from an array into the current symbol table. This means if used in a method of a class, it will add the variables from the array into the method’s scope (not the class scope). This could result in existing variables being overwritten.

294
Q

If you are using the extract function, what steps could you take to make sure it doesn’t overwrite data?

A

You can pass, as the second argument, the constant EXTR_SKIP, which will mean it won’t overwrite the variables should the name already exist.

295
Q

When is using extract a security risk?

A

When using it on untrusted, external data (i.e. user input, or $_GET, $_FILES etc).

296
Q

How could you force a prefix on the variable names that extract could generate?

A

Pass the constant EXTR_PREFIX_ALL as the second argument, then the prefix as the third. All created variables are named prefix_key (the prefix and the key are joined by an underscore).
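
A sketch of EXTR_SKIP and EXTR_PREFIX_ALL side by side. The function name render and the array contents are assumptions made up for the example.

```php
<?php
// Hypothetical function with a local variable that extract() could clobber.
function render(array $data): array
{
    $title = 'Default title';

    // EXTR_SKIP: keys that collide with existing variables are ignored.
    extract($data, EXTR_SKIP);

    // EXTR_PREFIX_ALL: every key becomes $in_<key>, so nothing can collide.
    extract($data, EXTR_PREFIX_ALL, 'in');

    return [$title, $in_title, $in_body];
}

$result = render(['title' => 'Injected', 'body' => 'Text']);
var_dump($result); // 'Default title', 'Injected', 'Text'
```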

297
Q

What would an alternative to using extract be?

A

Explicitly “importing” the variables.

$myval = $outsidearray['myval'];

298
Q

In older versions of PHP (pre 5.3), you could use an object as a callback by passing an array containing a reference of the object, and a string referring to the method to be called. What is the more common practice now?

A

Use the magic method __invoke.

By simply passing the object, the function treating it as a callback will trigger the magic __invoke method on the object. (The array form still works, and closures are another common choice.)
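
A sketch of an invokable object used as a callback, next to the older array form. The Multiplier class is made up for illustration.

```php
<?php
// Hypothetical invokable class.
class Multiplier
{
    private $factor;

    public function __construct(int $factor)
    {
        $this->factor = $factor;
    }

    public function __invoke(int $value): int
    {
        return $value * $this->factor;
    }
}

$double = new Multiplier(2);

// The object itself is the callback; array_map triggers __invoke.
$viaInvoke = array_map($double, [1, 2, 3]);

// The array form (object plus method name) gives the same result.
$viaArray = array_map([$double, '__invoke'], [1, 2, 3]);

echo $double(21), PHP_EOL; // 42
```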

299
Q

What is the syntactic difference between a closure and an anonymous function?

A

It’s essentially the use of the “use” keyword - i.e. including external variables into the scope of the function (this becomes a closure).

300
Q

If you use var_dump to examine a closure what data type will it appear to be?

A

An object. Closures are instances of the built-in Closure class, which can be called like a function (as with __invoke).

301
Q

What’s the purpose of the Closure::bindTo method?

A

You can bind a closure to a new object and class scope; bindTo returns a new anonymous function with that object’s data available. The bound object determines what the value of “$this” will be within the function.
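
A minimal sketch of bindTo, using a made-up Account class. The second argument sets the class scope, which is what lets the closure read the private property.

```php
<?php
// Hypothetical class with private state.
class Account
{
    private $balance = 100;
}

// This closure refers to $this, so it is only usable once bound to an object.
$peek = function () {
    return $this->balance;
};

// bindTo returns a NEW closure: $this is the Account, scope is the Account class.
$bound = $peek->bindTo(new Account(), Account::class);

echo $bound(), PHP_EOL; // 100
```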

302
Q

Exceptions should be caught by the layer that…

A

called the layer that generated it. I.e. if your database layer generated the exception, then the layer that called the database should handle it.

303
Q

Before an exception reaches the user it should be…

A

handled. It can of course bubble up towards the user, as long as it is being “bubbled” up by your own code handling it at each layer (from the layer that triggers it up to the controller), not by letting PHP handle it. However - if at that point the error is fatal - a useful message should be produced.

304
Q

What should you do when an exception is raised?

A

log it - it should be logged for further study (remember, this is an exceptional situation).

305
Q

Why is logging exceptions easy?

A

Because the exception classes implement __toString. This means it’s easy to cast the data from the exception to a string.

error_log($myexception);

306
Q

Exceptions should never be…

A

…silenced. Don’t catch an exception and then ignore it.

307
Q

Exceptions should not be thrown when…

A

…there is no intention of resolving it.

308
Q

Is it possible to raise an exception while handling an exception?

A
Yes, you can do so within nested try / catch blocks.
try {
   try {
      throw new ExceptionA('message 1');
   } catch (ExceptionA $e) {
      throw new ExceptionB('New exception', 0, $e);
   }
} catch (ExceptionB $e) {
   var_dump($e->getPrevious()->getMessage());
}

// NOTE we are passing the previous exception as the third argument to the new exception. ExceptionA and ExceptionB are assumed to extend Exception.

309
Q

Do exceptions have to be “fatal”?

A

No - if you can handle the error properly, handle it.