Getting and cleaning data - XML Flashcards

1
Q

XML THE ESSENTIALS

A
2
Q

XML ARCHITECTURE

A
3
Q

XML FIRST STEPS

A
  • library(XML)
  • Assign the URL to a variable to shorten code lines
    > fileUrl <- "http://www.w3schools.com/xml/simple.xml"
  • Load document into memory
    > doc <- xmlTreeParse(fileUrl, useInternalNodes = TRUE)
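
A minimal sketch of the same loading step that runs without network access. It assumes a small inline string (menu_xml, a made-up two-item stand-in for simple.xml) is parsed instead of the URL; asText = TRUE tells xmlTreeParse that its first argument is XML content rather than a file name.

```r
library(XML)

# Hypothetical stand-in for simple.xml: two <food> entries only
menu_xml <- paste0(
  "<breakfast_menu>",
  "<food><name>Belgian Waffles</name><price>$5.95</price></food>",
  "<food><name>French Toast</name><price>$4.50</price></food>",
  "</breakfast_menu>"
)

# asText = TRUE: the first argument is XML content, not a path or URL
doc <- xmlTreeParse(menu_xml, asText = TRUE, useInternalNodes = TRUE)

class(doc)  # includes "XMLInternalDocument"
```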
4
Q

XML EXPLORATION DRILLING

A
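
A minimal sketch of how a parsed document is usually explored before drilling into it. This is an illustration, not the card's original answer; menu_xml is a made-up stand-in for simple.xml, and rootNode is the variable name the later cards rely on.

```r
library(XML)

# Hypothetical stand-in for simple.xml
menu_xml <- paste0(
  "<breakfast_menu>",
  "<food><name>Belgian Waffles</name><price>$5.95</price></food>",
  "<food><name>French Toast</name><price>$4.50</price></food>",
  "</breakfast_menu>"
)
doc <- xmlTreeParse(menu_xml, asText = TRUE, useInternalNodes = TRUE)

rootNode <- xmlRoot(doc)  # top-level element wrapping the whole document
xmlName(rootNode)         # "breakfast_menu"
names(rootNode)           # names of the child elements, here "food" twice
xmlSize(rootNode)         # number of children, here 2
```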
5
Q

XML EXTRACTING DRILLING - 1

A
  • Extract and display 1st section
    > rootNode[[1]]
    Belgian Waffles
    $5.95
    Two of our famous Belgian Waffles with plenty of real maple syrup
    650
6
Q

XML EXTRACTING DRILLING - 2

A
  • Extract subsection 1 of section 1
    > rootNode[[1]][[1]]
    Belgian Waffles
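
A minimal sketch of the two drilling steps (section, then subsection), assuming an inline stand-in for simple.xml (menu_xml is made up) and a root node obtained with xmlRoot. Double-bracket indexing selects a child node by position.

```r
library(XML)

# Hypothetical stand-in for simple.xml
menu_xml <- paste0(
  "<breakfast_menu>",
  "<food><name>Belgian Waffles</name><price>$5.95</price></food>",
  "<food><name>French Toast</name><price>$4.50</price></food>",
  "</breakfast_menu>"
)
doc <- xmlTreeParse(menu_xml, asText = TRUE, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)

first_food <- rootNode[[1]]       # section 1: the first <food> element
first_name <- rootNode[[1]][[1]]  # subsection 1 of section 1: its <name> child

xmlName(first_food)  # "food"
xmlName(first_name)  # "name"
```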
7
Q

xmlValue

A

Function used to extract the text content of an XML node.

e.g. xmlSApply(rootNode[[1]][[1]], xmlValue)

Extracts and displays the content of the node corresponding to subsection 1 of section 1 of the document.
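
A minimal sketch of xmlValue on a single node and through xmlSApply, assuming an inline stand-in for simple.xml (menu_xml is made up). xmlSApply applies the function to each child of the node it is given, so it returns one value per child.

```r
library(XML)

# Hypothetical stand-in for simple.xml
menu_xml <- paste0(
  "<breakfast_menu>",
  "<food><name>Belgian Waffles</name><price>$5.95</price></food>",
  "<food><name>French Toast</name><price>$4.50</price></food>",
  "</breakfast_menu>"
)
doc <- xmlTreeParse(menu_xml, asText = TRUE, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)

# Text of one node: subsection 1 of section 1
one_value <- xmlValue(rootNode[[1]][[1]])   # "Belgian Waffles"

# On a parent node, xmlValue concatenates the text of all descendants
whole <- xmlValue(rootNode[[1]])            # "Belgian Waffles$5.95"

# One value per child of the first <food> element
vals <- xmlSApply(rootNode[[1]], xmlValue)
vals
```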

8
Q

USING XPATH AS NODE POINTER -1

A
  • nodename Selects all nodes with the name “nodename”
  • / Selects from the root node
  • // Selects nodes in the document from the current node that
    match the selection no matter where they are
  • . Selects the current node
  • .. Selects the parent of the current node
  • @ Selects attributes
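
The operators above can be tried from R with xpathSApply. This is a sketch under assumptions: books_xml is a made-up three-book document, not one from the source, and first_book is a hypothetical variable used as the "current node" for the relative expressions.

```r
library(XML)

# Hypothetical sample document
books_xml <- paste0(
  "<bookstore>",
  "<book><title lang='en'>Everyday Italian</title><price>30.00</price></book>",
  "<book><title>Learning XML</title><price>39.95</price></book>",
  "<book><title lang='fr'>Le Petit Prince</title><price>49.99</price></book>",
  "</bookstore>"
)
doc <- xmlTreeParse(books_xml, asText = TRUE, useInternalNodes = TRUE)
first_book <- xmlRoot(doc)[[1]]

titles  <- xpathSApply(doc, "/bookstore/book/title", xmlValue)  # /  path from the root
prices  <- xpathSApply(doc, "//price", xmlValue)                # // any depth
current <- xpathSApply(first_book, "./title", xmlValue)         # .  relative to first_book
parent  <- xpathSApply(first_book[[1]], "..", xmlName)          # .. parent of its <title>
langs   <- xpathSApply(doc, "//title/@lang")                    # @  attribute values
```

Here current holds "Everyday Italian", parent holds "book", and langs holds the values "en" and "fr".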
9
Q

USING XPATH AS NODE POINTER -2

A
  • bookstore Selects all nodes with the name “bookstore”
  • /bookstore Selects the root element bookstore
  • bookstore/book Selects all book elements that are children of bookstore
  • //book Selects all book elements no matter where they are in the document
  • bookstore//book Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
  • //@lang Selects all attributes that are named lang

http://www.w3schools.com/xpath/xpath_syntax.asp
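
A short sketch of the difference between child and descendant selection, assuming a made-up document in which one book is nested inside a hypothetical shelf element. The paths are written in absolute form because they are evaluated against the whole document.

```r
library(XML)

# Hypothetical sample: two books directly under bookstore, one inside <shelf>
books_xml <- paste0(
  "<bookstore>",
  "<book><title lang='en'>Everyday Italian</title></book>",
  "<book><title lang='fr'>Le Petit Prince</title></book>",
  "<shelf><book><title>Learning XML</title></book></shelf>",
  "</bookstore>"
)
doc <- xmlTreeParse(books_xml, asText = TRUE, useInternalNodes = TRUE)

n_root       <- length(getNodeSet(doc, "/bookstore"))        # 1: the root element
n_children   <- length(getNodeSet(doc, "/bookstore/book"))   # 2: direct children only
n_anywhere   <- length(getNodeSet(doc, "//book"))            # 3: anywhere in the document
n_descendant <- length(getNodeSet(doc, "/bookstore//book"))  # 3: any depth under bookstore
langs        <- xpathSApply(doc, "//@lang")                  # values "en" and "fr"
```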

10
Q

USING XPATH AS NODE POINTER -3

A
  • /bookstore/book[1] Selects the first book element that is the child of the bookstore element.
  • /bookstore/book[last()-1] Selects the last but one book element that is the child of the bookstore element
  • /bookstore/book[position()<3] Selects the first two book elements that are children of the bookstore element
  • //title[@lang] Selects all the title elements that have an attribute named lang
  • //title[@lang='en'] Selects all the title elements that have an attribute named lang with a value of 'en'
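
The predicates above can be checked against a small document. This is a sketch assuming a made-up three-book sample (books_xml is not from the source); the comments give the title each expression selects for this particular sample.

```r
library(XML)

# Hypothetical sample document
books_xml <- paste0(
  "<bookstore>",
  "<book><title lang='en'>Everyday Italian</title><price>30.00</price></book>",
  "<book><title>Learning XML</title><price>39.95</price></book>",
  "<book><title lang='fr'>Le Petit Prince</title><price>49.99</price></book>",
  "</bookstore>"
)
doc <- xmlTreeParse(books_xml, asText = TRUE, useInternalNodes = TRUE)

# First book: "Everyday Italian"
first <- xpathSApply(doc, "/bookstore/book[1]/title", xmlValue)
# Last but one book: "Learning XML"
penult <- xpathSApply(doc, "/bookstore/book[last()-1]/title", xmlValue)
# First two books: "Everyday Italian" "Learning XML"
first_two <- xpathSApply(doc, "/bookstore/book[position()<3]/title", xmlValue)
# Titles with a lang attribute: "Everyday Italian" "Le Petit Prince"
with_lang <- xpathSApply(doc, "//title[@lang]", xmlValue)
# Titles whose lang is 'en': "Everyday Italian"
english <- xpathSApply(doc, "//title[@lang='en']", xmlValue)
```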
11
Q

USING XPATH AS NODE POINTER -4

A

/bookstore/book[price>35.00] Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00

/bookstore/book[price>35.00]/title Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00

/bookstore/* Selects all the child nodes of the bookstore element

//* Selects all elements in the document

//title[@*] Selects all title elements which have any attribute
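
A sketch of the value test and the wildcards, assuming a made-up three-book sample (books_xml is not from the source). The element count for //* is specific to this sample: one bookstore, three book, three title and three price elements.

```r
library(XML)

# Hypothetical sample document
books_xml <- paste0(
  "<bookstore>",
  "<book><title lang='en'>Everyday Italian</title><price>30.00</price></book>",
  "<book><title>Learning XML</title><price>39.95</price></book>",
  "<book><title lang='fr'>Le Petit Prince</title><price>49.99</price></book>",
  "</bookstore>"
)
doc <- xmlTreeParse(books_xml, asText = TRUE, useInternalNodes = TRUE)

# Titles of books priced above 35.00: "Learning XML" "Le Petit Prince"
expensive <- xpathSApply(doc, "/bookstore/book[price>35.00]/title", xmlValue)
# Names of all children of bookstore: "book" three times
children <- xpathSApply(doc, "/bookstore/*", xmlName)
# Number of elements in the whole document: 10 for this sample
n_all <- length(getNodeSet(doc, "//*"))
# Titles carrying at least one attribute: "Everyday Italian" "Le Petit Prince"
any_attr <- xpathSApply(doc, "//title[@*]", xmlValue)
```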

12
Q

USING XPATH AS NODE POINTER -5

A

//book/title | //book/price Selects all the title AND price elements of all book elements

//title | //price Selects all the title AND price elements in the document

/bookstore/book/title | //price Selects all the title elements of the book elements of the bookstore element AND all the price elements in the document
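
A sketch of the | operator, which returns the union of the nodes matched by either path. It assumes a made-up three-book sample (books_xml is not from the source) and counts the results by element name rather than relying on their order.

```r
library(XML)

# Hypothetical sample document
books_xml <- paste0(
  "<bookstore>",
  "<book><title lang='en'>Everyday Italian</title><price>30.00</price></book>",
  "<book><title>Learning XML</title><price>39.95</price></book>",
  "<book><title lang='fr'>Le Petit Prince</title><price>49.99</price></book>",
  "</bookstore>"
)
doc <- xmlTreeParse(books_xml, asText = TRUE, useInternalNodes = TRUE)

# Element names of every node matched by either side of the union
both <- xpathSApply(doc, "//book/title | //book/price", xmlName)
table(both)  # three "price" and three "title" for this sample

# Same node set here, because every price sits inside a book
mixed <- xpathSApply(doc, "/bookstore/book/title | //price", xmlName)
```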

13
Q

XPATH EXAMPLES

A

xpathSApply is part of the XML package

xpathSApply(rootNode, "//name", xmlValue)
[1] "Belgian Waffles" "Strawberry Belgian Waffles" "Berry-Berry Belgian Waffles"
[4] "French Toast" "Homestyle Breakfast"

xpathSApply(rootNode, "//price", xmlValue)
[1] "$5.95" "$7.95" "$8.95" "$4.50" "$6.95"
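
A possible next step, sketched under assumptions: the extracted prices are character strings, so the leading "$" has to be removed before they can be used as numbers. menu_xml is a made-up two-item stand-in for simple.xml, and the variable names are hypothetical.

```r
library(XML)

# Hypothetical stand-in for simple.xml
menu_xml <- paste0(
  "<breakfast_menu>",
  "<food><name>Belgian Waffles</name><price>$5.95</price></food>",
  "<food><name>French Toast</name><price>$4.50</price></food>",
  "</breakfast_menu>"
)
doc <- xmlTreeParse(menu_xml, asText = TRUE, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)

food_names <- xpathSApply(rootNode, "//name", xmlValue)
prices     <- xpathSApply(rootNode, "//price", xmlValue)

# fixed = TRUE treats "$" as a literal character, not a regex anchor
price_num <- as.numeric(sub("$", "", prices, fixed = TRUE))

menu <- data.frame(name = food_names, price = price_num)
menu
```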
