2.1.2 Thinking Ahead Flashcards by T Isl

What is thinking ahead?

It requires us to take a step back and consider what the solution may look like, what it would entail, what conditions it would work under and how it will work with other, wider systems
It allows us to consider how the solution might be built and to make implementation decisions, such as how the problem will be broken down

How do you identify different inputs and outputs?

To fully understand the problem, it must be broken down into smaller problems using top-down design techniques
- This allows a modular solution
- For each module, identifying the inputs and outputs is the first stage in developing a solution

What are the important considerations to make when deciding the inputs and outputs?

Data type - Ensure the correct type of data is passed to the module
Structure of the data - Will it be primitive data or a data structure, e.g. a string or a list?
Validation - Does the data need a particular format or an acceptable range?
Number of arguments - Does the module require multiple parameters to be sent to it?
Where the input comes from and where the output goes - Will the data be input automatically or entered manually? Will the output be stored in a database or a simple text file?
Preconditions - Does something need to happen before the problem can be solved?

When handling user input what has to be done?

Validation will need to be coded into the module to ensure the data is in the correct format for the program to process
This is to prevent incorrect data entering the system
Bugs may occur if invalid data is input, and these may allow hackers to bypass security
The problem is bigger if modules are daisy-chained (the output of one is the input of another)
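
As an illustration, a format check and a range check on user input might look like the sketch below. The function name validate_age and the 0 to 120 range are assumptions made for this example, not part of the original notes.

```python
def validate_age(raw):
    """Return the age as an int, or raise ValueError if the input is invalid."""
    text = raw.strip()
    if not text.isdigit():          # format check: digits only
        raise ValueError("age must be a whole number")
    age = int(text)
    if not 0 <= age <= 120:         # range check (assumed acceptable range)
        raise ValueError("age must be between 0 and 120")
    return age


age = validate_age(" 42 ")          # surrounding spaces are stripped first
```

Rejecting bad data at the point of entry means that any module further down the chain can rely on receiving a valid integer.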

How do you use cache?

When requesting data, the request will go via the cache.
- The cache is smaller and faster than the original data source and holds a subset of the original data.
If the requested data is in cache then the copy in cache is returned - a cache hit
However, if the data isn’t in cache it must be copied from the original data source into cache and then returned to the user - a cache miss
- A cache miss is slow, even slower than going directly to the data source, because the cache has to be checked first
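
The hit and miss paths described above can be sketched as follows. This is a minimal illustration, assuming the data source behaves like a dictionary; the class name ReadCache and its attributes are made up for the example.

```python
class ReadCache:
    """Small in-memory cache sitting in front of a slower data source."""

    def __init__(self, source):
        self.source = source      # the original (slow) data source
        self.entries = {}         # the subset of data currently held in cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:   # cache hit: return the cached copy
            self.hits += 1
            return self.entries[key]
        self.misses += 1          # cache miss: copy from the source into cache
        value = self.source[key]
        self.entries[key] = value
        return value


backing_store = {"a": 1, "b": 2}  # stands in for a database or a disk
cache = ReadCache(backing_store)
first = cache.get("a")            # miss: fetched from backing_store
second = cache.get("a")           # hit: served from the cache
```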

What is a hit ratio/hit percentage?

The number of times a cache is hit or missed determines its overall performance
This is measured by the hit rate (or hit ratio): the number of hits as a proportion of all cache accesses (hits plus misses), expressed as a percentage
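
As a worked example, the calculation can be sketched like this, assuming the hit ratio is taken as hits divided by total accesses. The function name hit_ratio is hypothetical.

```python
def hit_ratio(hits, misses):
    """Return the hit ratio as a percentage of all cache accesses."""
    total = hits + misses
    if total == 0:                # no accesses yet, so avoid dividing by zero
        return 0.0
    return 100.0 * hits / total


ratio = hit_ratio(90, 10)         # 90 hits out of 100 accesses
```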

What is cache management - reading?

When a cache miss occurs, an algorithm is required to decide which data entry, if any, should be ejected from the cache to make way for the new data
The choice of algorithm directly impacts the efficiency of the cache and how quickly the cache can be managed.

What are the different algorithms that can be used to decide which data entry is taken out of cache?

Clairvoyant Algorithm
Least Recently Used Algorithm
Using a counter

What is the Clairvoyant Algorithm?

It swaps out the entry that will not be used for the longest amount of time
This requires information about the future use of the cache, so it is not realistically implementable in general-purpose computers
- At any given point, this algorithm will always make the right choice for a swap to ensure the hit rate is as high as possible
It can be approximated by analysing the past use of the cache
The performance of any caching algorithm can be measured by comparing its hit rate to that of the clairvoyant algorithm
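
Although it cannot be used live, the clairvoyant choice can be simulated when the whole access sequence is recorded in advance, which is how it serves as a benchmark. The sketch below is one way to do that; the function name clairvoyant_hits is hypothetical and the capacity is assumed to be at least 1.

```python
def clairvoyant_hits(accesses, capacity):
    """Count hits when the full future access sequence is known in advance."""
    cache = set()
    hits = 0
    for i, key in enumerate(accesses):
        if key in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            future = accesses[i + 1:]

            def next_use(entry):
                # Position of the entry's next use; never used again sorts last.
                return future.index(entry) if entry in future else len(future)

            # Eject the entry that will not be needed for the longest time.
            cache.remove(max(cache, key=next_use))
        cache.add(key)
    return hits


trace = ["a", "b", "c", "a", "b", "d", "a"]
best_possible = clairvoyant_hits(trace, capacity=2)
```

Running a real algorithm over the same trace and comparing its hit count with best_possible shows how far it is from the ideal.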

What is the Least Recently Used Algorithm?

It swaps out the least recently used page or entry
This uses historic information rather than predicting the future
One way to implement LRU is to use a linked list
- Front of list = most recently used and back = least recently used
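
A minimal LRU sketch is shown below. Instead of a hand-written linked list it uses Python's collections.OrderedDict, which keeps entries in order and can move one to either end; that substitution, and the class name LRUCache, are choices made for this example.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # front = most recently used

    def get(self, key):
        if key not in self.entries:
            return None                # cache miss
        self.entries.move_to_end(key, last=False)   # move to the front
        return self.entries[key]

    def put(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=True)         # eject from the back
        self.entries[key] = value
        self.entries.move_to_end(key, last=False)   # new or updated: to the front


lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")        # "a" becomes the most recently used entry
lru.put("c", 3)     # cache is full, so "b" (least recently used) is ejected
```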

How is the counter used as a caching algorithm?

On every access to the cache, the current value of the counter is assigned to the accessed entry and the counter is then incremented by 1
- That way, the entry with the lowest counter value is the least recently used, and it is the one removed
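
The counter approach could be sketched as follows. The class name CounterCache and the two dictionaries are assumptions for illustration; a real cache would store the counter value alongside each entry in whatever structure it already uses.

```python
class CounterCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.counter = 0
        self.values = {}
        self.stamps = {}    # key -> counter value at the entry's last access

    def _touch(self, key):
        self.stamps[key] = self.counter   # assign the current counter value
        self.counter += 1                 # then increment the counter by 1

    def get(self, key):
        if key not in self.values:
            return None                   # cache miss
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            oldest = min(self.stamps, key=self.stamps.get)   # lowest counter
            del self.values[oldest]
            del self.stamps[oldest]
        self.values[key] = value
        self._touch(key)


cc = CounterCache(2)
cc.put("a", 1)      # "a" stamped 0
cc.put("b", 2)      # "b" stamped 1
cc.get("a")         # "a" restamped 2
cc.put("c", 3)      # "b" has the lowest stamp, so it is removed
```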

What are the 2 ways of cache writing management?

When making changes to the underlying data source of a cache there are 2 main strategies
- One is to update both the cache and the data source at once (known as write through)
- The other is to initially update the cache and only update the underlying data source when the data is ejected from cache (known as write back)

What is Write Through?

This method updates both the cache and the data source at the same time
It is simple to implement; however, all data writes will be slow as the data has to be written to both
This is particularly problematic if a large number of writes occur
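
A minimal write-through sketch, assuming a dictionary stands in for the data source (the class name WriteThroughCache is hypothetical):

```python
class WriteThroughCache:
    def __init__(self, source):
        self.source = source    # the underlying (slow) data source
        self.entries = {}

    def write(self, key, value):
        self.entries[key] = value   # update the cache...
        self.source[key] = value    # ...and the data source in the same step


store = {}
wt = WriteThroughCache(store)
wt.write("x", 1)    # both copies are updated immediately
```

Every write pays the cost of the slow source, which is why this strategy suffers when writes are frequent.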

What is Write Back?

This method initially updates only the cache, and updates the underlying data source only when the data is ejected from the cache
Write back performs better if the hit rate is high; however, it is more complex to implement
Every cache entry is given a dirty flag.
- When an entry is first added to cache the dirty flag is false. When an update occurs the flag is set to true
When the entry is ejected, the flag is checked. An entry ejected with the flag set to true must be saved to the backing store; an entry whose flag is still false can simply be discarded.
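
The dirty-flag mechanism can be sketched as below, again assuming a dictionary for the backing store. The class and method names (WriteBackCache, load, write, eject) are invented for this example.

```python
class WriteBackCache:
    def __init__(self, source):
        self.source = source
        self.entries = {}       # key -> [value, dirty flag]

    def load(self, key):
        # Entry first added to cache: the dirty flag starts as False.
        self.entries[key] = [self.source[key], False]

    def write(self, key, value):
        # Update only the cache and mark the entry as dirty.
        self.entries[key] = [value, True]

    def eject(self, key):
        value, dirty = self.entries.pop(key)
        if dirty:               # only dirty entries are saved back
            self.source[key] = value


disk = {"x": 1}
wb = WriteBackCache(disk)
wb.load("x")
wb.write("x", 2)
before_eject = disk["x"]    # the backing store still holds the old value
wb.eject("x")
after_eject = disk["x"]     # the dirty entry was written back on ejection
```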

What are the 3 types of caching?

CPU cache
Disk caching
Web caching

What is CPU cache?

CPU cache is a hardware cache used by the central processing unit of a computer to reduce the time spent retrieving data from main memory
Cache is smaller, faster memory, located close to the processor core

What is Disk caching (VM) (Paging)?

Many operating systems take advantage of large amounts of unused memory by implementing a page cache
- This pre-fetches commonly used data from the hard drive, such as regularly accessed applications, and stores it as pages in RAM
- As hard drives are much slower than memory, this significantly increases performance
- Adding more memory to a computer can increase performance as more pages can be cached

What is web caching?

Web browsers cache images and files downloaded over HTTP
This helps reduce the bandwidth required, as files are accessed on the local computer rather than over the internet
- Web caches work differently from the other types, as they store files on the hard drive rather than in memory
If the cached page has been updated on the server, the updated page has to be retrieved and the old entry is deleted. News sites won’t want their homepages cached.

What are the benefits of caching?

Faster access to data when a cache hit occurs
Can reduce bandwidth requirements and loads on other resources
A second source of information can be accessed if the master source fails

What are the drawbacks of caching?

Increased complexity: on a cache miss, accessing the backing store is slower than it would be without the cache
Stale data may be returned when the original data has changed
The time taken to find data in cache is linked to the size of the cache: larger caches take longer to search

Why is reusable code needed?

Prebuilt code speeds up the development process
The code will have been tested and there is a single place to update it and fix bugs - this helps make the code base more stable
The code will have been developed by an expert who has spent a long time making it as efficient as possible
Any updates to the reusable code mean that improvements are picked up automatically, without the need to change any of the code that uses it
- If the code is dynamically linked to the library then there is no need to recompile

What is reusable code?

Commonly used instructions are packaged into libraries for reuse
Teams working on large projects are likely to make use of certain components in multiple places, so these are written as reusable code
Reusable components include implementations of abstract data structures such as queues and stacks, as well as classes and subroutines
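
For example, Python's standard library already supplies tested queue and stack behaviour, so neither has to be written from scratch. The snippet below is only an illustration; the variable names and sample values are made up.

```python
from collections import deque

# Queue (first in, first out) using the library's deque
queue = deque()
queue.append("job1")
queue.append("job2")
first_out = queue.popleft()    # removes the item that was added first

# Stack (last in, first out) using a built-in list
stack = []
stack.append("page1")
stack.append("page2")
last_in = stack.pop()          # removes the item that was added last
```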

HW Q) Explain how programmers make use of reusable components when developing large systems

Development time can be reduced by using code that has been tried and tested
As the code has been thoroughly tested the final product should be more reliable
Software is modular and can be shared through the use of library software

HW Q) Explain what the term ‘hit ratio’ means for caching and why it is important to keep it high

The hit ratio shows the number of cache hits in comparison to cache misses
When a cache miss occurs, data has to be copied from the main store to the cache, which may require an existing entry to be ejected.
This process is slow and will reduce the overall speed of the system.
The more cache hits, the faster the system will perform; therefore it’s crucial to keep the hit ratio high

HW Q) Explain why it is important to use abstraction when creating reusable components

* If modules destined for reuse were made specific to the implementation they were first used in, then it would be harder to use them in a different context.
* Abstraction is the process of separating the idea from the implementation; this is key for reusing code in different contexts

HW Q) A student decided to create a revision app for an exam. Explain how the student could use abstraction in this scenario

* The app could be made more generic so that it can be used for any subject.
* By considering what the questions in different subjects have in common and removing subject-specific styles, it would be possible to create an app which could be used for any subject

HW Q) Most programming APIs are written abstractly. Explain why this is a good practice

* APIs are used in many different scenarios and applications
* The designers of the API have no knowledge of how their code will be used and therefore need to make it as abstract as possible
* If the code was too specific, it would prevent others from using the API effectively

What are the key considerations when dealing with web cache?

* Freshness - A fresh file is served from the cache without checking the server. How long a file stays fresh is controlled by the server and the client
* Validation - If the file isn’t fresh any more, the Last-Modified header is checked to see whether the file must be fetched again
* Invalidation - The file held in cache is marked as invalid, meaning it will have to be fetched from the server again
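
The three checks can be sketched as a single decision, shown below. This is a simplified model, not how any particular browser works: the CachedFile fields and the must_fetch function are assumptions, times are plain numbers of seconds, and a real client would only contact the server for the modification time once the copy is no longer fresh.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    body: str
    fetched_at: float       # when the copy was stored, in seconds
    max_age: float          # how long the copy counts as fresh, in seconds
    last_modified: float    # server's modification time when the copy was fetched
    valid: bool = True      # set to False when the entry is invalidated


def must_fetch(entry, now, server_last_modified):
    """Decide whether the cached copy has to be fetched from the server again."""
    if not entry.valid:                             # invalidation
        return True
    if now - entry.fetched_at < entry.max_age:      # freshness: use cache as is
        return False
    return server_last_modified > entry.last_modified   # validation


page = CachedFile("<html>", fetched_at=0, max_age=60, last_modified=0)
fresh = must_fetch(page, now=30, server_last_modified=100)       # still fresh
unchanged = must_fetch(page, now=90, server_last_modified=0)     # validated, reuse
changed = must_fetch(page, now=90, server_last_modified=50)      # fetch again
```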

How do you make classes more abstract?

* Develop software libraries first and then make use of these to develop multiple applications.
* Inheritance reuse - allows common methods and attributes to be placed in a superclass. Code can inherit and extend functionality
* Component reuse - when a fully encapsulated set of classes is produced which the developer can reuse
* Framework reuse - more common in larger projects. A collection of classes which provide basic functionality for a specific problem domain.
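
Inheritance reuse, for instance, might look like the sketch below, where the shared attribute and method live in a superclass and a subclass supplies only what differs. The Shape and Rectangle classes are invented for illustration.

```python
class Shape:
    """Superclass holding the attributes and methods common to all shapes."""

    def __init__(self, name):
        self.name = name

    def area(self):
        raise NotImplementedError("subclasses provide their own area")

    def describe(self):
        # Written once here and inherited by every subclass.
        return f"{self.name} with area {self.area()}"


class Rectangle(Shape):
    """Subclass that inherits describe() and extends Shape with its own area()."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


summary = Rectangle(2, 3).describe()
```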