Elements Of Computational Thinking Flashcards
Abstraction
Abstraction is one of the most important principles in Computer Science and is a critical part of computational thinking. It is the process of removing excessive details to arrive at a representation of a problem that consists of only the key features. Abstraction often involves analysing what is relevant to a given scenario and simplifying a problem based on this information. This is called representational abstraction.
Another form of abstraction involves grouping together similarities within a problem to identify what kind of problem it is. This is called abstraction by generalisation and allows certain problems to be categorised as being of a particular type. Thus a common solution can be used to solve these problems.
Data abstraction
Data abstraction is a subcategory of abstraction in which details about how data is being stored are hidden. As a result, programmers can make use of abstract data structures such as stacks and queues without concerning themselves with how these structures are implemented.
Uses for Abstraction
Programmers can also perform functions such as pushing and popping items to and from a stack without having any knowledge about the code used to implement this functionality. This is called procedural abstraction and is also used in decomposition. It models what a subroutine does without considering how this is done. Once a procedure has been coded, it can be reused as a black-box.
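Both ideas can be sketched in a few lines. In this illustrative example (not tied to any particular library), a stack exposes push and pop operations while hiding the implementation detail that a Python list actually stores the items:

```python
class Stack:
    """A stack as an abstract data structure: callers see only
    push and pop, not how the items are actually stored."""

    def __init__(self):
        self._items = []  # hidden implementation detail: a Python list

    def push(self, item):
        self._items.append(item)  # place item on top of the stack

    def pop(self):
        return self._items.pop()  # remove and return the top item

# The caller treats the stack as a black box:
s = Stack()
s.push(1)
s.push(2)
print(s.pop())  # → 2 (last in, first out)
```

If the internal list were later swapped for a linked list, code using `push` and `pop` would not need to change, which is exactly the point of procedural abstraction.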
Very large, complex problems make use of multiple levels of abstraction, where each level performs a different role. The highest levels of abstraction are closest to the user and are usually responsible for providing an interface for the user to interact with hardware whereas the lowest levels of abstraction are responsible for actually performing these tasks through the execution of machine code.
The need for abstraction
At its core, abstraction allows non-experts to make use of a range of systems or models by hiding information that is too complex or irrelevant to the system’s purpose.
Abstraction enables more efficient design during software development, as programmers can focus on elements that need to be built into the software rather than worrying about unnecessary details. This reduces the time that needs to be spent on the project. Removing wasteful details early on also prevents the program from becoming unnecessarily large.
Layers of abstraction are used within networking and programming languages. Programming languages can be separated out into a spectrum of high and low-level languages. Low-level languages such as assembly code and machine code directly interact with computer systems but are more difficult to write. Programming using machine code requires having an understanding of the functions specific binary codes perform and although assembly code is easier to memorise, it still requires programmers to know the mnemonics associated with the instruction set specific to the processor. High-level languages provide an abstraction for the machine code that is in fact executed when a program is run. This makes the process of developing programs easier, as syntax in high-level languages parallels natural language and is considerably easier to learn and use compared to low-level languages. This has also made coding accessible to non-specialists.
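The gap in abstraction level can be seen by writing the same addition both ways. One readable line of a high-level language corresponds to several low-level instructions (sketched here as hypothetical LMC-style mnemonics, one per step):

```python
# High-level: intent is clear from the syntax alone
first = 5
second = 7
total = first + second

# A hypothetical low-level equivalent, in LMC-style mnemonics:
#   LDA first    ; load the first value into the accumulator
#   ADD second   ; add the second value to it
#   STA total    ; store the result back to memory
print(total)  # → 12
```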
The need for abstraction (TCP/IP)
The TCP/IP model is an abstraction for how networks function, separated into four layers of abstraction: application, transport, internet and link. Each layer deals with a different part of the communication process, and separating these stages out makes them simpler to understand. Each layer does not need to know how the other layers work. Outgoing communication can be visualised as going down these layers, while incoming information can be imagined as going up them. However, it is also important to ensure compatibility between these layers, so standards must be agreed in advance. The TCP/IP model uses a set of protocols, which means that each layer can be dealt with individually, with details about the other layers being hidden.
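The layering idea can be sketched in a highly simplified form. The header strings below are made up for illustration and are nothing like real protocol headers, but they show each layer wrapping the data it receives on the way down and unwrapping it on the way up:

```python
# Outgoing data travels down the layers: each layer wraps the data
# with its own (made-up) header, hiding its details from the others.
def send(data):
    segment = "TCP|" + data     # transport layer
    packet = "IP|" + segment    # internet layer
    frame = "ETH|" + packet     # link layer
    return frame

# Incoming data travels back up: each layer strips its own header.
def receive(frame):
    packet = frame.removeprefix("ETH|")
    segment = packet.removeprefix("IP|")
    return segment.removeprefix("TCP|")

print(receive(send("GET /index.html")))  # → GET /index.html
```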
The difference between abstraction and reality
Abstraction is a simplified representation of reality. Real-world entities may be represented using computational structures such as tables and databases. Real-world values are often stored as variables.
Object-oriented programming makes use of objects, which are also an abstraction for real-world entities. In object-oriented programming, abstraction considers the functionality, interface and properties of entities. Attributes are an abstraction for the characteristics of an object while methods are an abstraction for the actions a real-world object is able to perform.
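For example, a real-world book might be modelled as an object. This hypothetical class, sketched for illustration, shows attributes abstracting the book's characteristics and methods abstracting the actions that can be performed on it:

```python
class Book:
    def __init__(self, title, author):
        # Attributes: an abstraction of the book's characteristics
        self.title = title
        self.author = author
        self.on_loan = False

    # Methods: an abstraction of the actions a real-world
    # book can be subjected to
    def borrow(self):
        self.on_loan = True

    def return_to_library(self):
        self.on_loan = False

b = Book("Dracula", "Bram Stoker")
b.borrow()
print(b.on_loan)  # → True
```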
When devising an abstract model given a scenario, you must consider:
- What is the problem that needs to be solved by the model?
  - Can the problem be solved computationally?
  - What are the key features of the problem?
- How will the model be used?
  - What sort of format does the model need to be displayed in? Consider factors such as convenience, affordability and ease of access.
- Who will the model be used by?
  - How many people will be using the model?
  - What level of expertise do they have in the subject/discipline associated with the problem?
- Which parts of the problem are relevant based on the target audience and the purpose of the model?
  - Remove sections that are not relevant to the problem that needs solving.
  - Remove details that will confuse the audience.
Identify the inputs and outputs for a given situation
Designing a solution entails thinking ahead about the different components of a problem and how they will be handled in the best way possible. Thinking ahead allows developers to consider problems or difficulties that may arise when the software is used. By taking these factors into account at an early stage, developers can devise strategies to make programs easy and intuitive to use.
At their core, all computational problems consist of inputs which are processed to produce an output. Inputs include any data that is required to solve the problem, entered into the system by the user. Often, the order in which data is input and the method of input must also be taken into consideration. Outputs are the results that are passed back once the inputs have been processed and the problem solved. Designers must decide on a suitable data type, structure and method to use in order to present the solution, given the scenario.
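As a minimal sketch, a program that averages a set of test scores makes the input → process → output structure explicit (the scores are hard-coded here in place of user input):

```python
# Input: the data required to solve the problem (hard-coded for
# illustration; in practice this might be entered by the user)
scores = [72, 85, 90, 64]

# Process: compute the result from the inputs
average = sum(scores) / len(scores)

# Output: present the result in a suitable format
print(f"Average score: {average:.2f}")  # → Average score: 77.75
```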
Preconditions
Preconditions are requirements which must be met before a program can be executed. If the preconditions are not met, the program will fail to execute or return an invalid answer. Specifying preconditions means that a subroutine can safely expect the arguments passed to it to meet certain criteria, as defined by the preconditions. Preconditions can be tested for within the code but are more often included in the documentation accompanying a particular subroutine, library or program.
When preconditions are included within the documentation, it is the user’s responsibility to ensure that inputs meet the requirements they specify. A common example of this is the factorial function, which is only defined for non-negative integers. Rather than checking within the function that the argument passed to it is non-negative, this requirement is specified in the documentation accompanying the function. Including preconditions within the documentation reduces the length and complexity of the program, as well as saving the time needed to debug and maintain a longer program.
The purpose of preconditions is to ensure that the necessary checks are carried out before the execution of a subroutine, either by the user or as part of the subroutine. By explicitly ensuring these conditions are met, subroutines are made more reusable.
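The factorial example might look like the following sketch, with the precondition stated in the docstring and, optionally, also checked in code:

```python
def factorial(n):
    """Return n! (n factorial).

    Precondition: n is a non-negative integer. If the assert below
    were omitted, meeting this precondition would be entirely the
    caller's responsibility, as stated in this documentation.
    """
    assert isinstance(n, int) and n >= 0, "precondition violated: n must be a non-negative integer"
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial(5))  # → 120
```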
Caching
Caching is the process of storing instructions or values in cache memory after they have been used, as they may be used again. This saves the time that would otherwise be needed to retrieve the instructions from secondary storage again. Caching is very common in the storage of web pages: the pages a user frequently accesses are cached, so the next time one of them is accessed, the content can be loaded with minimal delay. This also means images and text do not have to be downloaded multiple times, freeing up bandwidth for other tasks on the network.
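A minimal sketch of the idea, using a dictionary as the cache and a made-up `download` function standing in for a slow network fetch:

```python
cache = {}

def download(url):
    # Placeholder for a slow operation, such as fetching a page
    # over the network or reading from secondary storage.
    return f"<contents of {url}>"

def fetch_page(url):
    if url in cache:
        return cache[url]    # cache hit: no slow retrieval needed
    page = download(url)     # cache miss: do the slow work once
    cache[url] = page        # store the result for next time
    return page

fetch_page("example.com/home")         # first access: slow path
print(fetch_page("example.com/home"))  # second access: served from cache
```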
A more advanced variation of caching and thinking ahead is prefetching, in which algorithms predict which instructions are likely to be fetched soon. The instructions and data that are likely to be used are then loaded into cache before they are needed. By thinking ahead in this way, less time is spent waiting for instructions to be loaded into RAM from the hard disk.
Clearly, one of the biggest limitations of this approach is the accuracy of the algorithms used in prefetching: they can only provide an informed prediction as to which instructions are likely to be used, and there is no guarantee that the prediction will be right. Similarly, the effectiveness of caching depends on how well a caching algorithm manages the cache. Larger caches take longer to search, which limits how large a cache can usefully be. In general, this form of thinking ahead can be difficult to implement, but can significantly improve performance when implemented effectively.
Reusable Program Components
Commonly used functions are often packaged into libraries for reuse. Teams working on large projects that are likely to make use of certain components in multiple places might also choose to put together a library so these components can be reused. Reusable components include implementations of abstract data structures such as queues and stacks as well as classes and subroutines. When designing a piece of software, the problem is decomposed: it is broken down into smaller, simpler tasks. This allows developers to think ahead about how each task can be solved, and identify where program components developed in the past, or externally-sourced program components, can be reused to simplify the development process.
Reusable components are more reliable than newly-coded components, as they have already been tested and any bugs dealt with. This saves time, money and resources. Subroutines can then simply be reused with different arguments to produce a variety of outputs. Producing well-tested, reusable components means that they can also be reused in future projects, saving development costs. However, it may not always be possible to integrate existing components developed by third parties due to compatibility issues with the rest of the software. This may mean these components need to be modified to work with existing software, which can sometimes be more costly and time-consuming than developing them in-house.
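As a small illustration, a single well-tested subroutine (a hypothetical helper, not from any real library) can be written once and then reused with different arguments wherever it is needed:

```python
def mean(values):
    """A reusable component: written and tested once, called anywhere."""
    return sum(values) / len(values)

# Reused with different arguments to produce a variety of outputs:
print(mean([3, 4, 5]))     # → 4.0
print(mean([10.5, 20.5]))  # → 15.5
```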
Identify the components of a problem
In computer science, thinking procedurally makes the task of writing a program a lot simpler by breaking a problem down into smaller parts which are easier to understand and consequently, easier to design.
The first stage of thinking procedurally in software development involves taking the problem defined by the user and breaking it down into its component parts, in a process called problem decomposition. In this process, a large, complex problem is continually broken down into smaller subproblems which can be solved more easily. By separating the problem into sections, it becomes more feasible to manage and can be divided between a group of people according to the skill sets of different individuals.
This process requires software developers to consider a problem in terms of the underlying subproblems that need to be solved to achieve the desired result.
Problems are commonly decomposed using top-down design:
This is also known as stepwise refinement, and is the preferred method used to approach very large problems, as it breaks problems down into levels. Higher levels provide an overview of a problem, while lower levels specify in detail the components of this problem.
The aim of using top-down design is to keep splitting problems into subproblems until each subproblem can be represented as a single task and ideally a self-contained module or subroutine. Each task can then be solved and developed as a subroutine by a different person. Once programmed, subroutines can also be tested separately, before being brought together and finally integrated.
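Taking a hypothetical book reservation system as an example, top-down design might yield one stub subroutine per single task, each of which could be written and tested separately before being integrated:

```python
# Hypothetical decomposition of a book reservation system into
# single-task subroutines, each of which a different programmer
# could develop and test independently.

def search_catalogue(title, catalogue):
    """Subproblem 1: find the record for a given title."""
    return catalogue.get(title)

def is_available(record):
    """Subproblem 2: check whether any copies are free."""
    return record["copies_free"] > 0

def reserve(record):
    """Subproblem 3: record the reservation."""
    record["copies_free"] -= 1

# Top level: the overall problem, expressed in terms of its parts.
def reserve_book(title, catalogue):
    record = search_catalogue(title, catalogue)
    if record is not None and is_available(record):
        reserve(record)
        return True
    return False

catalogue = {"Dracula": {"copies_free": 1}}
print(reserve_book("Dracula", catalogue))  # → True
print(reserve_book("Dracula", catalogue))  # → False (no copies left)
```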
Identify the components of a solution
This is the stage in which the details about how each component is implemented are considered. You will be able to see below how separating out these components has made it easier to identify a feasible and programmable solution.
In the same way that we broke down the problem, we must also build up to its solution. In order to identify the components of the solution, each programmer must evaluate the component of the problem allocated to them and assess how it can best be solved. Going back to our previous example involving the book reservation system, we need to consider the lowest-level components.
Order of steps needed to solve a problem
When constructing the final solution from the solutions to the problem components, the order in which operations are performed becomes important. Some programs might require certain inputs to be entered by the user before any processing can be carried out. These inputs may also need to be validated before they can be passed on to the next subroutines, which must also be taken into consideration.
It might be possible for several subroutines to be executed simultaneously within a program, and programmers must identify where this is possible by looking at the data and inputs the subroutine requires. Some subroutines will require data from other subroutines before they are able to execute, and so will be unable to execute simultaneously. In this case, programmers should determine the order in which subroutines are executed, as well as how they interact with each other, based on their role in solving the problem.
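A short sketch of this ordering: the input must be read and validated before the processing subroutine is allowed to run (the validation rule here is a made-up example):

```python
def validate(raw):
    """Made-up validation rule: a whole number from 1 to 100."""
    return raw.isdigit() and 1 <= int(raw) <= 100

def process(value):
    return value * value  # placeholder for the real processing step

def run(raw_input):
    # Order matters: validate first, and only then pass the data
    # on to the processing subroutine.
    if not validate(raw_input):
        return "Invalid input, please try again"
    return process(int(raw_input))

print(run("12"))   # → 144
print(run("abc"))  # → Invalid input, please try again
```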