Module Two - Languages of Data Science Flashcards
Name some key features of the Caffe framework
High Performance: It can process over 60 million images per day on a single GPU.
CNN Focus: Caffe concentrates on Convolutional Neural Networks (CNNs), which are commonly used in image-related tasks like classification, detection, and segmentation.
Layer-Based Architecture: Models in Caffe are built as a series of layers, with each layer representing a specific operation (e.g., convolution, pooling, activation).
Cross-Platform Compatibility
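The layer-based idea above can be sketched in plain Python. This is only a toy illustration and not Caffe's API: every function name here is hypothetical, and the "layers" operate on simple lists rather than images.

```python
# Toy illustration only: these functions are hypothetical and are NOT Caffe's API.

def scale_layer(values, factor=2.0):
    """Multiply every input by a constant (stand-in for a learned operation)."""
    return [v * factor for v in values]

def relu_layer(values):
    """Activation: clamp negative values to zero."""
    return [max(0.0, v) for v in values]

def max_pool_layer(values, window=2):
    """Pooling: keep the largest value in each non-overlapping window."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]

def run_network(layers, inputs):
    """Pass the data through each layer in order."""
    data = inputs
    for layer in layers:
        data = layer(data)
    return data

# A "model" is just an ordered series of layers.
network = [scale_layer, relu_layer, max_pool_layer]
output = run_network(network, [-1.0, 3.0, 0.5, -2.0])
print(output)  # [6.0, 1.0]
```

Each layer does one operation, and the model is the ordered list of layers, which is the point the flashcard makes.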
Python libraries: Pandas
Pandas is an open-source Python library that provides data structures and data analysis tools, primarily for manipulating and analyzing structured data in the form of Data Frames and Series.
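A minimal sketch of a DataFrame and a Series, assuming pandas is installed. The names and scores are made-up sample data.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [82, 91, 77],
})

# Selecting a single column of a DataFrame gives a Series.
scores = df["score"]
mean_score = scores.mean()

# Filter rows with a boolean condition, then take one column as a list.
top_names = df[df["score"] > 80]["name"].tolist()
print(top_names)  # ['Ana', 'Ben']
```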
TensorFlow
TensorFlow is an open-source machine learning framework developed by Google that facilitates the creation, training, and deployment of deep learning models through a flexible and comprehensive ecosystem of tools and libraries.
Why does it matter that SQL is an American National Standards Institute (ANSI) standard?
SQL is an American National Standards Institute (ANSI) standard, which means that if you learn SQL and use it with one database, you can apply your SQL knowledge to many other databases.
What is the WEKA software suite intended for?
WEKA (Waikato Environment for Knowledge Analysis) is a popular open-source software suite for data mining and machine learning. Developed at the University of Waikato in New Zealand, WEKA provides a collection of algorithms and tools for various data analysis tasks, making it a widely used tool in both academic and industrial settings.
Python libraries: SciPy
SciPy is an open-source Python library used for scientific and technical computing, offering a wide range of functionalities for optimization, integration, interpolation, eigenvalue problems, and other advanced mathematical operations built on the NumPy library.
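A short sketch of two of the tasks named above, integration and optimization, assuming SciPy is installed. The functions being integrated and minimized are arbitrary examples.

```python
from scipy import integrate, optimize

# Integration: area under f(x) = x**2 between 0 and 1 (the exact answer is 1/3).
area, error_estimate = integrate.quad(lambda x: x ** 2, 0, 1)

# Optimization: find the x that minimizes (x - 2)**2 (the exact answer is x = 2).
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2)

print(area, result.x)
```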
Python libraries: NumPy
NumPy is a powerful open-source Python library for numerical computing that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently.
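A minimal sketch of array operations, assuming NumPy is installed. The numbers are arbitrary.

```python
import numpy as np

# A 2 x 3 array (two rows, three columns).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Arithmetic applies to every element at once, with no explicit loop.
doubled = matrix * 2

# Sum down each column (axis=0).
column_sums = matrix.sum(axis=0)

# Matrix multiplication of the array with its transpose gives a 2 x 2 result.
product = matrix @ matrix.T

print(column_sums)  # [5 7 9]
print(product)      # [[14 32]
                    #  [32 77]]
```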
Python libraries: PyTorch
PyTorch is an open-source machine learning library for Python that provides a flexible framework for building and training deep learning models using dynamic computation graphs and tensor operations.
Can Python be used for NLP?
Yes. Python can also be used for Natural Language Processing (NLP) using the Natural Language Toolkit (NLTK).
What are the main characteristics of the Julia language?
Julia is a compiled language designed at MIT for high-performance numerical analysis and computational science.
Julia provides speedy development like Python or R, while producing programs that run as fast as C or Fortran programs.
It is compiled (just-in-time), which means that Julia code runs on the processor as native machine code rather than being interpreted.
It can call C, Go, Java, MATLAB, R, Fortran, and Python libraries, and has refined parallelism.
What is JuliaDB?
Developed with Julia, JuliaDB is a Data Science package for working with large persistent data sets.
Who are the typical users of R Lang?
Statisticians, mathematicians, and data miners use R for developing statistical software, graphing, and data analysis.
R Language’s array-oriented syntax makes it easier to translate from math to code for learners with no or minimal programming background.
Name some frequently used Python libraries for Data Science
For data science, you can use Python’s scientific computing libraries like Pandas, NumPy, SciPy, and Matplotlib.
Why is SQL different from other software development languages?
SQL is different from other software development languages because it is a non-procedural language: you describe the data you want rather than the steps to retrieve it.
SQL stands for Structured Query Language.
It was designed for managing data in relational databases.
SQL is an ANSI standardized language.
If you learn SQL and use it with one database, you can easily apply your SQL knowledge to many other databases.
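A small sketch of what "non-procedural" looks like in practice, using Python's built-in sqlite3 module as one example database. The table, column names, and rows are made up for illustration. The SELECT statement states which rows are wanted, and the database decides how to find them.

```python
import sqlite3

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Made-up table and rows for illustration.
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", "Data"), ("Ben", "Sales"), ("Cy", "Data")],
)

# The query describes the result, not the steps to compute it.
rows = conn.execute(
    "SELECT name FROM employees WHERE department = 'Data' ORDER BY name"
).fetchall()
names = [row[0] for row in rows]
conn.close()

print(names)  # ['Ana', 'Cy']
```

The same SELECT statement should run largely unchanged on other relational databases, which is the portability the card describes.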
As of 2024, which other deep learning frameworks have superseded Caffe?
TensorFlow and PyTorch have largely superseded Caffe for broader machine learning applications.
While Caffe is great for CNNs and computer vision, it lacks the flexibility and ease of use for other types of deep learning models like RNNs, which are more easily handled by frameworks like TensorFlow and PyTorch.