Snowflake Features & Architecture Flashcards
User Defined Function (UDF)
User defined functions (UDF) are schema-level objects that enable users to write their own functions in four different languages: SQL, JavaScript, Python, and Java
-UDFs accept 0 or more parameters
-UDFs can return scalar or tabular results (UDTF)
-UDFs can be called as part of a SQL statement.
-UDFs can be overloaded.
JavaScript UDF
-JavaScript is specified with the language parameter.
-Enables use of high-level programming language features.
-JavaScript UDFs can refer to themselves recursively.
-Snowflake data types are mapped to JavaScript data types.
Java UDF
-Snowflake boots up a JVM to execute function written in Java.
-Snowflake currently supports writing UDFs in Java versions 8.x, 9.x, 10.x, and 11.x
-Java UDFs can specify their definition as in-line code or a pre-compiled jar file.
-Java UDFs cannot be designated as secure.
Create or Replace External Function
Create or Replace API Integration
External Function Call Lifecycle
External Function Limitations
-Slower due to not being to optimize outside process
-Scalar only
-Not shareable
-Less secure
-Egress charges
Stored Procedure
In Relational Database Management Systems (RDMS) stored procedures are named collections of SQL statements often containing procedural logic.
Snowflake Stored Procedure Options
- JavaScript
- Snowflake Scripting - SQL
- Snowpark - Python, Java, or Scala
Stored Procedure: JavaScript
Stored Procedures & UDFs
Sequences - Inset into a Table
Task
A task is an object used to schedule the execution of a SQL command or a stored procedure.
Task Workflow
Tree of Tasks
Streams
A stream is an object created to view & track DML changes to a source table - inserts, updates, & deletes
Streams Code
Billing Options
On-Demand - Pay for usage as you go
Capacity - Pay for usage upfront
Billing - Services Billed
- Virtual Warehouse Services - Credits
- Cloud Services - Credits
- Serverless Services - Credits
- Storage - $ Dollar Value
- Data Transfer -$ Dollar Value
Virtual Warehouse Services Billing
- Credit calculated based on size of virtual warehouse
- Credit calculated on per second basis while a virtual warehouse is in ‘started’ state
- Credit calculated with a minimum of 60 seconds
Cloud Services Billing
- Credits calculated at a rate of 4.4 credits per compute hour
- Only cloud services that exceeds 10% of the daily usage of the compute resources are billed
- This is called the Cloud Services Adjustment
Serverless Services Billing
- Each serverless feature has it’s own credit rate per compute-hour
- Serverless features are composed of both compute services and cloud services
- Cloud Services Adjustment does not apply to cloud services usage when used by serverless features
Data Storage Billing
- Data storage is calculated monthly based on the average number of on-disk bytes per day in the following locations:
- Database Tables
- Internal Stages
-Costs calculated based on a flat dollar value rate per terabyte (TB) based on: - Capacity or On-demand
- Cloud provider
- Region
Data Transfer Billing
- Data transfer charges apply when moving data from one region to another or from one cloud platform to another
- Unloading data from Snowflake using COPY INTO <location> command</location>
- Replicating data to a Snowflake account in a different region or cloud platform
- External functions transferring data out of and into Snowflake
Connectors and Drivers Available
- Python
- Go
- PHP
- .NET
- NodeJS
- Spark
- Kafka
- JDBC
- ODBC
Snowflake Partner Tools Categories
- Business Intelligence
- Data Integration
- Security & Governance
- SQL Development & Management
- Machine Learning & Data Science
Snowflake Partner Connect
a feature to expedite connectivity with partnered tools
Snowflake Scripting
-an extension to Snowflake SQL that adds support for procedural logic
- It’s used to write store procedures and procedural code outside of a stored procedure
Branching Construct
Looping Constructs
CURSOR
RESULTSET
Snowpark API Languages
- Java
- Scala
- Python
Snowpark API Methods
.select()
.join()
.group_by()
.distinct()
.drop()
.union()
.sort()
Snowpark API Abstraction
DateFrame
Snowpark
- Snowpark API
-Lazily-evaluated/executed
-Pushdown computation