Diploma exam Flashcards

Question

Describe data classification methods and methods of training classifiers

Answer 1

Classification methods:

* DNN (deep neural network)
* K-nearest neighbours (KNN)
* KNN center
* decision trees
* SVM - decisions based on a plane separating the classes
* Bayesian classification - works based on the conditional probability of each known feature ...?

All of the above methods are first trained and then tested. Ways of splitting the data for training and testing:

* half by half
* cross validation
* k-fold validation
* leave one out
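
The train-then-test schemes listed above can be illustrated with a small sketch. It is not a reference implementation: the toy data set, the function names and the choice of a plain k-nearest-neighbours classifier are all assumptions made for the example, and the folds are built without shuffling to keep the result reproducible.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, k):
    """Split the indices 0..n-1 into k folds of near-equal size (no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validate(data, folds, k_neighbours=3):
    """Train on all folds but one, test on the held-out fold, average the accuracy."""
    accuracies = []
    for test_idx in k_fold_indices(len(data), folds):
        held_out = set(test_idx)
        train = [item for i, item in enumerate(data) if i not in held_out]
        test = [data[i] for i in test_idx]
        correct = sum(knn_predict(train, x, k_neighbours) == y for x, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Toy data set (made up for illustration): two well separated 2-D classes.
DATA = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"), ((0.3, 0.2), "a"),
        ((5.0, 5.1), "b"), ((5.2, 4.9), "b"), ((4.8, 5.0), "b"), ((5.1, 5.3), "b")]

two_fold = cross_validate(DATA, folds=2)               # each half is tested once
four_fold = cross_validate(DATA, folds=4)              # k-fold with k = 4
leave_one_out = cross_validate(DATA, folds=len(DATA))  # k equals the number of samples
print(two_fold, four_fold, leave_one_out)
```

Leave one out is simply k-fold with as many folds as samples. A plain half-by-half split would train on one half and test on the other only once, while the two-fold run here tests each half in turn.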

Answer 2

1. **business understanding**
   * **reports** for understanding business goals
2. **data understanding**
   * creating and using documentation
   * **graphs** (histograms, scatter plots)
   * **regression** (test of correlations between the values)
3. **data preparation**
   * **cleaning data** (removing missing values)
   * **data transformation** (normalisation, merging, concatenation, focusing on important things)
   * process of **grouping the data**, keeping only the fields that matter for this research (reducing the number of dimensions)
4. **modeling**
   * choosing the model
   * **test scenarios** for the purpose of model validation (association rules, decision trees, random forest (a group of trees which together help to make a more informed decision), clustering methods, time series analysis, Bayesian classification, KNN - k-nearest neighbours)
5. **evaluating results** - if the results are not good enough, go back to step 1
   * **ROC curve**, **confusion matrix**, **MSE**, **cross validation**, k-fold, half to half, leave one out
6. **deployment of the model**
   * **final report**, **implementation plan**, monitoring and maintenance plan
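
Two of the evaluation measures named above, the confusion matrix and MSE, are simple enough to compute by hand. The sketch below is only an illustration: the labels, the numbers and the function names are invented for the example. The true and false positive rates it derives give one point on a ROC curve.

```python
def confusion_matrix(actual, predicted, positive):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


def mse(actual, predicted):
    """Mean squared error between numeric targets and predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up classification results.
ACTUAL = ["spam", "spam", "ham", "ham", "spam", "ham"]
PREDICTED = ["spam", "ham", "ham", "spam", "spam", "ham"]

cm = confusion_matrix(ACTUAL, PREDICTED, positive="spam")
accuracy = (cm["TP"] + cm["TN"]) / len(ACTUAL)
true_positive_rate = cm["TP"] / (cm["TP"] + cm["FN"])   # y-axis of a ROC curve
false_positive_rate = cm["FP"] / (cm["FP"] + cm["TN"])  # x-axis of a ROC curve

# Made-up regression results.
error = mse([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(cm, accuracy, true_positive_rate, false_positive_rate, error)
```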

Answer 3

Advantages:

* flexibility (good for storing **data with different parameters** for each object, as in the case of e-commerce)
* appropriate for **horizontal scaling** - NoSQL databases are constructed so that it is easy to distribute the data between many servers, which helps with effective access to and security of the data
* distributed architecture
* cost effective - it can use **multiple cheap servers** instead of investing in one expensive server
* because it is able to hold complex tables, it can store overhead structures which are well suited for the purpose of **performance**

Applications due to scalability:

* **web applications** - horizontal scaling capabilities make it appropriate for handling a fluctuating number of users
* e-commerce
* Big Data
* **IoT data** - these devices together generate a huge amount of data, so they require a scalable solution for storing it

Applications due to the data structures (DS) they operate on:

* **e-commerce** - appropriate when the attributes of the diverse products in the catalog change constantly, as it is able to cope with a flexible DS
* **Big Data** - good for big data that consists of many different types of data which are constantly changing or being extended with new types of data

Answer 4

CC-TURA

* Consistency
* Completeness
* Timeliness - data is current enough to be used
* Understandability
* Relevance
* Accuracy

Answer 5

Goods: Tangible items that are produced, sold, and used, such as electronics, clothing, or furniture.

Services: Intangible offerings that involve performing tasks or providing expertise for others, like consulting services, healthcare, or restaurant dining.

G/S

* can be stored /
* tangible /
* / consumed instantly
* transfer from one place to another is possible

Answer 6

**Monopoly**

**The short and long run are the same** due to the lack of competition. The MR curve typically lies below D, which makes the MC = MR intersection fall below the optimal quantity level. As there is no competition, the price is then pushed to its limit, so it is set at a higher level than in competitive markets.

---

The company's strategy regarding price and quantity does not change, because there are no competitors on the market. However, the profit of a monopoly may change if it **works on reducing its costs**. Depending on the price of the good:

* above ATC - profit
* equal to ATC - break-even point
* between ATC and AVC - operating at a loss
* below AVC - shut down

---

**Perfect Market**

short run - companies may make a profit as well as a loss by setting a higher or lower price than the competition.

---

long run - a price that is too high eventually leads to earning no profit, as other firms produce exactly the same good. A price that is too low, if it is manageable without making a loss, eventually causes other firms to lower their prices, as they sell the same good; if it means operating at a loss, it causes the company to fail.
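
The monopoly point described above (quantity chosen where MC = MR, price read off the demand curve) can be checked with a small numeric sketch. The linear demand curve P = a - b*Q, the constant marginal cost and all numbers are assumptions made for the example; they do not come from the notes.

```python
def monopoly_outcome(a, b, mc):
    """Quantity and price where MR = MC, for demand P = a - b*Q and constant MC."""
    quantity = (a - mc) / (2 * b)  # MR = a - 2*b*Q, set equal to MC
    price = a - b * quantity       # the price is read off the demand curve
    return quantity, price


def competitive_outcome(a, b, mc):
    """Quantity and price where P = MC for the same demand curve."""
    quantity = (a - mc) / b
    return quantity, mc


# Hypothetical market: P = 100 - 2*Q, marginal cost of 20 per unit.
q_m, p_m = monopoly_outcome(100, 2, 20)
q_c, p_c = competitive_outcome(100, 2, 20)
print(q_m, p_m)  # monopoly: lower quantity, higher price
print(q_c, p_c)  # competitive benchmark
```

With these numbers the monopoly sells 20 units at a price of 60, while the competitive benchmark is 40 units at a price of 20, which matches the claim that the monopoly price is higher and the quantity lower.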

Answer 7

Sender-Message-Channel-Receiver

* sender - encodes the message
* channel - the medium that transfers it, with potential interruptions
* receiver - decodes the message

written formal

* scientific papers, books, articles
* required by law or other regulations
* slowest medium

verbal formal

* conferences, business presentations, official meetings, job interviews

written informal

* typical emails

verbal informal

* solving simple cases
* talking
* solving basic cases
* fastest medium

communication methods:

* interactive: two or more sides, multidirectional
* push: sent to a specific receiver
* pull: sent once and available to multiple receivers

Answer 8

**Economic resources owned or controlled by an individual or company which have the potential to provide future benefits.**

* tangible - office, furnishings, cash, machinery, vehicles
* intangible - patents, shares, trademarks, copyright

current - **below one year**

* cash
* inventory: finished goods ready for sale
* raw materials used in production
* goods and materials that a company holds for the purpose of resale or for use in its production process

noncurrent - assets expected to provide benefits beyond one year

* equipment - tangible assets that a company uses in its operations to generate revenue or facilitate its business activities
* property
* long-term investments

Answer 9

define research problem

* main research problem
* specific research problems - supplementary questions in relation to the main research problem
* research objectives - formulated for the purpose of evaluating the main research problem

---

prepare research design

* establish available data sources
* based on them, create a questionnaire which will be able to address the research objectives

---

data collection

* establish the target group
* find ways to collect enough responses from the target group
* data processing and coding (e.g. from a descriptive scale to a numeric one)

---

analyzing and interpreting data

* analyzing, calculating statistics and averages, preparing comparisons and graphs
* interpreting the results of the analysis in the context of the research objectives: are they fulfilled or not

---

creating research report

* its purpose is to help the company decide whether or not to invest in the idea it had
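
The coding step mentioned above (from a descriptive scale to a numeric one), followed by a simple average, might look like the sketch below. The five-point scale, its numeric codes and the sample answers are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Hypothetical five-point descriptive scale and its numeric codes.
SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_responses(responses, scale=SCALE):
    """Translate descriptive answers into numeric codes, ignoring case and outer spaces."""
    return [scale[answer.strip().lower()] for answer in responses]


# Made-up questionnaire answers to a single question.
answers = ["Agree", "neutral", "Strongly agree", "disagree ", "agree"]
codes = code_responses(answers)
average = mean(codes)
print(codes, average)
```

An answer outside the scale raises a `KeyError` here, which is a deliberate choice for the sketch: unexpected responses should be noticed during data processing rather than silently dropped.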

Answer 10

* NPV - net present value (evaluates whether the project would earn money or not)
* PP - payback period
* DPP - discounted payback period
* IRR - internal rate of return: the discount rate at which NPV equals zero, i.e. the break-even rate of return; the project is successful as long as the cost of capital stays below it
* MIRR - modified IRR; it solves two IRR problems:
  * the multiple IRR problem
  * the reinvestment assumption: whereas the regular IRR assumes that the **cash flows from each project are reinvested at the IRR itself**, the MIRR assumes that cash flows are **reinvested at the cost of capital**
* PI (profitability index) = discounted cash inflows / cash outflows
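
The measures above can be tried out on a small example. This is a rough sketch under stated assumptions: the cash flows and the 10% cost of capital are invented, the payback functions return whole periods only, and the IRR is found by simple bisection, which assumes that NPV changes sign exactly once between the two bounds.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] is the (negative) outlay at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def payback_period(cash_flows, rate=0.0):
    """First period in which the cumulative cash flow is no longer negative.

    With rate > 0 the flows are discounted first (DPP). Returns None if the
    outlay is never recovered.
    """
    cumulative = 0.0
    for t, cf in enumerate(cash_flows):
        cumulative += cf / (1 + rate) ** t
        if cumulative >= 0:
            return t
    return None


def profitability_index(rate, cash_flows):
    """Discounted cash inflows divided by discounted cash outflows."""
    inflows = sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows) if cf > 0)
    outflows = -sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows) if cf < 0)
    return inflows / outflows


def irr(cash_flows, low=0.0, high=1.0, tol=1e-7):
    """Rate at which NPV is zero, by bisection between `low` and `high`."""
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid   # NPV still positive: the root lies at a higher rate
        else:
            high = mid
    return (low + high) / 2


# Hypothetical project: outlay of 1000, then 400 per year for four years.
FLOWS = [-1000, 400, 400, 400, 400]
COST_OF_CAPITAL = 0.10

project_npv = npv(COST_OF_CAPITAL, FLOWS)
project_pp = payback_period(FLOWS)
project_dpp = payback_period(FLOWS, COST_OF_CAPITAL)
project_pi = profitability_index(COST_OF_CAPITAL, FLOWS)
project_irr = irr(FLOWS)
print(project_npv, project_pp, project_dpp, project_pi, project_irr)
```

For this project the NPV is positive, PI is above 1 and the IRR is above the 10% cost of capital, so all three measures point the same way. Discounting pushes the payback from period 3 to period 4.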

Answer 11

CLI / GUI / NUI

* less user friendly
* text interaction (keyboard mainly)
* low level operations
* lower flexibility higher, more complex tasks which can be done using that

Answer 12

FUGI CD

* F - flowchart
* U - UML
* G - Gantt chart
* I - IDEF0
* C - Coloured Petri Nets
* D - Data Flow Diagram

https://docs.google.com/document/d/1dZogcIPGFFFDJO0Gxu9ZnKyDv7ll1dXCxj6Zq74ciD0/edit?usp=sharing

Answer 13

low FPS in CS on CCC map

* F - figure-ground
* P - proximity
* S - similarity
* C - common fate
* S - symmetry
* C - common region
* C - closure
* C - continuity

Answer 14

CIA

* Confidentiality - information is kept private and can be accessed only by authorized users; threat: data breaches
* Integrity - ensures the accuracy and reliability of the data; threat: "data manipulation" attacks
* Availability - data is accessible at any time; threat: DoS attacks